cdna nucleotide sequence: Topics by Science.gov

Sample records for cdna nucleotide sequence

Hop stunt viroid: molecular cloning and nucleotide sequence of the complete cDNA copy.

PubMed Central

Ohno, T; Takamatsu, N; Meshi, T; Okada, Y

1983-01-01

The complete cDNA of hop stunt viroid (HSV) has been cloned by the method of Okayama and Berg (Mol. Cell. Biol. 2, 161-170 (1982)) and the complete nucleotide sequence has been established. The covalently closed circular single-stranded HSV RNA consists of 297 nucleotides. The secondary structure predicted for HSV contains 67% of its residues base-paired. The native HSV can possess an extended rod-like structure characteristic of viroids previously established. The central region of the native HSV has a similar structure to the conserved region found in all viroids sequenced so far except for avocado sunblotch viroid. The sequence homologous to the 5'-end of U1a RNA is also found in the sequence of HSV but not in the central conserved region. PMID:6312412
Sequence of a cDNA encoding pancreatic preprosomatostatin-22.

PubMed Central

Magazin, M; Minth, C D; Funckes, C L; Deschenes, R; Tavianini, M A; Dixon, J E

1982-01-01

We report the nucleotide sequence of a precursor to somatostatin that upon proteolytic processing may give rise to a hormone of 22 amino acids. The nucleotide sequence of a cDNA from the channel catfish (Ictalurus punctatus) encodes a precursor to somatostatin that is 105 amino acids (Mr, 11,500). The cDNA coding for somatostatin-22 consists of 36 nucleotides in the 5' untranslated region, 315 nucleotides that code for the precursor to somatostatin-22, 269 nucleotides at the 3' untranslated region, and a variable length of poly(A). The putative preprohormone contains a sequence of hydrophobic amino acids at the amino terminus that has the properties of a "signal" peptide. A connecting sequence of approximately 57 amino acids is followed by a single Arg-Arg sequence, which immediately precedes the hormone. Somatostatin-22 is homologous to somatostatin-14 in 7 of the 14 amino acids, including the Phe-Trp-Lys sequence. Hybridization selection of mRNA, followed by its translation in a wheat germ cell-free system, resulted in the synthesis of a single polypeptide having a molecular weight of approximately 10,000 as estimated on Na-DodSO4/polyacrylamide gels. PMID:6127673
Human somatostatin I: sequence of the cDNA.

PubMed Central

Shen, L P; Pictet, R L; Rutter, W J

1982-01-01

RNA has been isolated from a human pancreatic somatostatinoma and used to prepare a cDNA library. After prescreening, clones containing somatostatin I sequences were identified by hybridization with an anglerfish somatostatin I-cloned cDNA probe. From the nucleotide sequence of two of these clones, we have deduced an essentially full-length mRNA sequence, including the preprosomatostatin coding region, 105 nucleotides from the 5' untranslated region and the complete 150-nucleotide 3' untranslated region. The coding region predicts a 116-amino acid precursor protein (Mr, 12,727) that contains somatostatin-14 and -28 at its COOH terminus. The predicted amino acid sequence of human somatostatin-28 is identical to that of somatostatin-28 isolated from the porcine and ovine species. A comparison of the amino acid sequences of human and anglerfish preprosomatostatin I indicated that the COOH-terminal region encoding somatostatin-14 and the adjacent 6 amino acids are highly conserved, whereas the remainder of the molecule, including the signal peptide region, is more divergent. However, many of the amino acid differences found in the pro region of the human and anglerfish proteins are conservative changes. This suggests that the propeptides have a similar secondary structure, which in turn may imply a biological function for this region of the molecule. PMID:6126875
cDNA encoding a polypeptide including a hevein sequence

DOEpatents

Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

1993-02-16

A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a pu... Government rights: This application was funded under Department of Energy Contract DE-AC02-76ER01338. The U.S. Government has certain rights under this application and any patent issuing thereon.
Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.

PubMed

Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro

2010-05-07

Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.
cDNA encoding a polypeptide including a hevein sequence

DOEpatents

Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

1995-03-21

A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.
Cost-Effective Sequencing of Full-Length cDNA Clones Powered by a De Novo-Reference Hybrid Assembly

PubMed Central

Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka

2010-01-01

Background: Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. Methodology: We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence ∼800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. Conclusions: The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only ∼US$3 per clone, demonstrating a significant advantage over previous approaches. PMID:20479877
The cDNA sequence of a neutral horseradish peroxidase.

PubMed

Bartonek-Roxå, E; Eriksson, H; Mattiasson, B

1991-02-16

A cDNA clone encoding a horseradish (Armoracia rusticana) peroxidase has been isolated and characterized. The cDNA contains 1378 nucleotides excluding the poly(A) tail and the deduced protein contains 327 amino acids, which includes a 28 amino acid leader sequence. The predicted amino acid sequence is nine amino acids shorter than the major isoenzyme belonging to the horseradish peroxidase C group (HRP-C) and the sequence shows 53.7% identity with this isoenzyme. The described clone encodes nine cysteines, of which eight correspond well with the cysteines found in HRP-C. Five potential N-glycosylation sites with the general sequence Asn-X-Thr/Ser are present in the deduced sequence. This is three fewer glycosylation sites than in the earlier described HRP-C. The shorter sequence and fewer N-glycosylation sites give the native isoenzyme a molecular weight several thousand less than that of the horseradish peroxidase C isoenzymes. Comparison with the net charge value of HRP-C indicates that the described cDNA clone encodes a peroxidase which has either the same or a slightly less basic pI value, depending on whether the encoded protein is N-terminally blocked or not. This excludes the possibility that HRP-n could belong to either the HRP-A, -D or -E groups. The low sequence identity (53.7%) with HRP-C indicates that the described clone does not belong to the HRP-C isoenzyme group and comparison of the total amino acid composition with the HRP-B group does not place the described clone within this isoenzyme group. Our conclusion is that the described cDNA clone encodes a neutral horseradish peroxidase which belongs to a new, not earlier described, horseradish peroxidase group.
[Cloning and sequence analysis of full-length cDNA of secoisolariciresinol dehydrogenase of Dysosma versipellis].

PubMed

Xu, Li; Ding, Zhi-Shan; Zhou, Yun-Kai; Tao, Xue-Fen

2009-06-01

The aim was to obtain the full-length cDNA sequence of the secoisolariciresinol dehydrogenase gene from Dysosma versipellis by RACE PCR and then to characterize the gene. The full-length cDNA sequence was obtained from Dysosma versipellis by 3'-RACE and 5'-RACE. This is the first report of the full-length cDNA sequence of secoisolariciresinol dehydrogenase in Dysosma versipellis. The acquired gene is 991 bp in full length, including a 5' untranslated region of 42 bp and a 3' untranslated region of 112 bp with poly(A). The open reading frame (ORF) encodes 278 amino acids, with a molecular weight of 29,253.3 Da and an isoelectric point of 6.328. The GenBank accession number of the nucleotide sequence is EU573789. Semi-quantitative RT-PCR analysis revealed that the secoisolariciresinol dehydrogenase gene was highly expressed in stem. Alignment of the amino acid sequence of secoisolariciresinol dehydrogenase indicated that there may be significant amino acid sequence differences among species. In conclusion, the full-length cDNA sequence of the secoisolariciresinol dehydrogenase gene from Dysosma versipellis was obtained.
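The lengths reported in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes that the 278-residue ORF is followed by one stop codon and that the stop codon is not counted in either untranslated region. The function names are hypothetical and do not come from the paper.

```python
def orf_length_nt(n_amino_acids, include_stop=True):
    """Nucleotides needed to encode n_amino_acids, optionally plus one stop codon."""
    return 3 * n_amino_acids + (3 if include_stop else 0)


def transcript_length(utr5, n_amino_acids, utr3, include_stop=True):
    """Total cDNA length as 5' UTR + ORF (with or without stop codon) + 3' UTR."""
    return utr5 + orf_length_nt(n_amino_acids, include_stop) + utr3


# Values taken from the abstract: 42 bp 5' UTR, 278 amino acids, 112 bp 3' UTR.
total = transcript_length(utr5=42, n_amino_acids=278, utr3=112)
print(total)  # 991, matching the reported full length
```

Under these assumptions the parts sum to the reported 991 bp, which suggests the published figures are internally consistent.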
cDNA encoding a polypeptide including a hevein sequence

DOEpatents

Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

1999-05-04

A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.
cDNA encoding a polypeptide including a hevein sequence

DOEpatents

Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

2000-07-04

A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.
[Complete nucleotide sequences and genome structure of two Chinese tobacco mosaic virus isolates deduced from full-length infectious cDNA clones].

PubMed

Yang, G; Liu, X G; Qiu, B S

2000-07-01

The complete nucleotide sequences of two Chinese tobacco mosaic virus (TMV) isolates, TMV-Cv (vulgare strain) and TMV-N14 (an attenuated virus originated from a tomato strain), were determined from their respective full-length infectious cDNA clones and compared with published TMV sequences. The genome of TMV-Cv contained 6395 nucleotides, in which four functional open reading frames (ORFs), coding for replicase (126 kD/183 kD), movement protein (MP, 30 kD) and coat protein (CP, 17.6 kD) respectively, could be recognized. TMV-N14 contained 6384 nucleotides in its genome. In contrast to TMV-Cv, five functional ORFs encoding the replicase 98.5 kD/126 kD/183 kD, MP (27 kD) and CP (17.6 kD), respectively, were detected in the TMV-N14 genome. TMV-Cv is 99% homologous to a Korean TMV isolate belonging to the vulgare strain at the nucleotide level. TMV-N14 is 99% homologous to a highly virulent Japanese isolate TMV-L (tomato strain) at the nucleotide level. In TMV-N14, one opal mutation (UGA) occurred in the replicase gene and one ochre mutation (UAA) in the MP gene. The former mutation created a potential, additional ORF within the replicase gene; the latter reduced the size of the MP to 27 kD. In addition, there were also 13 amino acid substitutions in the replicase gene of TMV-N14 when compared to that of TMV-L. Collectively, these changes may have significant implications in the attenuation of the virulence of TMV-N14.
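How a premature stop codon such as the ochre (UAA) mutation described above shortens a protein can be illustrated with a small translation sketch. It uses the standard genetic code; the two RNA strings are short hypothetical examples invented for illustration and are not TMV sequences.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(rna):
    """Translate an RNA string from its first base until the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":  # UAA (ochre), UAG (amber) or UGA (opal)
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical sequences: the mutant differs by one base (CAA -> UAA).
wild_type = "AUGGCUCAAGGUUAA"
mutant = "AUGGCUUAAGGUUAA"

print(translate(wild_type))  # MAQG
print(translate(mutant))     # MA
```

A single base change that creates a stop codon ends translation early, which is the mechanism by which the abstract explains the reduced 27 kD movement protein.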
Nucleotide sequence of Hungarian grapevine chrome mosaic nepovirus RNA1.

PubMed Central

Le Gall, O; Candresse, T; Brault, V; Dunez, J

1989-01-01

The nucleotide sequence of the RNA1 of Hungarian grapevine chrome mosaic virus, a nepovirus very closely related to tomato black ring virus, has been determined from cDNA clones. It is 7212 nucleotides in length excluding the 3' terminal poly(A) tail and contains a large open reading frame extending from nucleotides 216 to 6971. The presumably encoded polyprotein is 2252 amino acids in length with a molecular weight of 250 kDa. The primary structure of the polyprotein was compared with that of other viral polyproteins, revealing the same general genetic organization as that of other picorna-like viruses (comoviruses, potyviruses and picornaviruses), except that an additional protein is suspected to occupy the N-terminus of the polyprotein. PMID:2798128
cDNA encoding a polypeptide including a hevein sequence

DOEpatents

Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

1993-02-16

A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids.
Nucleotide Sequence Analysis of RNA Synthesized from Rabbit Globin Complementary DNA

PubMed Central

Poon, Raymond; Paddock, Gary V.; Heindell, Howard; Whitcome, Philip; Salser, Winston; Kacian, Dan; Bank, Arthur; Gambino, Roberto; Ramirez, Francesco

1974-01-01

Rabbit globin complementary DNA made with RNA-dependent DNA polymerase (reverse transcriptase) was used as template for in vitro synthesis of 32P-labeled RNA. The sequences of the nucleotides in most of the fragments resulting from combined ribonuclease T1 and alkaline phosphatase digestion have been determined. Several fragments were long enough to fit uniquely with the α or β globin amino-acid sequences. These data demonstrate that the cDNA was copied from globin mRNA and contained no detectable contaminants. PMID:4139714
Isolation and sequence of partial cDNA clones of human L1: homology of human and rodent L1 in the cytoplasmic region.

PubMed

Harper, J R; Prince, J T; Healy, P A; Stuart, J K; Nauman, S J; Stallcup, W B

1991-03-01

We have isolated cDNA clones coding for the human homologue of the neuronal cell adhesion molecule L1. The nucleotide sequence of the cDNA clones and the deduced primary amino acid sequence of the carboxy terminal portion of the human L1 are homologous to the corresponding sequences of mouse L1 and rat NILE glycoprotein, with an especially high sequence identity in the cytoplasmic regions of the proteins. There is also protein sequence homology with the cytoplasmic region of the Drosophila cell adhesion molecule, neuroglian. The conservation of the cytoplasmic domain argues for an important functional role for this portion of the molecule.
Molecular cloning and nucleotide sequence of CYP6BF1 from the diamondback moth, Plutella xylostella

PubMed Central

Li, Hongshan; Dai, Huaguo; Wei, Hui

2005-01-01

A novel cDNA clone encoding a cytochrome P450 was screened from the insecticide-susceptible strain of Plutella xylostella (L.) (Lepidoptera: Yponomeutidae). The nucleotide sequence of the clone, designated CYP6BF1, was determined. This is the first full-length sequence of the CYP6 family from Plutella xylostella (L.). The cDNA is 1661 bp in length and contains an open reading frame from base pairs 26 to 1570, encoding a protein of 514 amino acid residues. It is similar to the other insect P450s in gene family 6, including CYP6AE1 from Depressaria pastinacella (46%). The GenBank accession number is AY971374. PMID:17119627
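The relationship between the reported ORF coordinates and the protein length can be checked as follows. This is a minimal sketch that assumes 1-based, inclusive coordinates and that the final codon of the stated range is the stop codon; the function name is hypothetical.

```python
def orf_codons(start, end):
    """Number of codons in an ORF given 1-based inclusive start and end positions."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"ORF length {length} is not a multiple of 3")
    return length // 3


# Coordinates from the abstract: ORF spans base pairs 26 to 1570.
codons = orf_codons(26, 1570)
residues = codons - 1  # assumes the last codon in the range is a stop codon

print(codons)    # 515
print(residues)  # 514, matching the reported protein length
```

The 1545 bp span gives 515 codons, which agrees with a 514-residue protein if the range includes the stop codon.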
Intervening sequences in a plant gene-comparison of the partial sequence of cDNA and genomic DNA of French bean phaseolin

NASA Astrophysics Data System (ADS)

Sun, S. M.; Slightom, J. L.; Hall, T. C.

1981-01-01

A plant gene coding for the major storage protein (phaseolin, G1-globulin) of the French bean was isolated from a genomic library constructed in the phage vector Charon 24A. Comparison of the nucleotide sequence of part of the gene with that of the cloned messenger RNA (cDNA) revealed the presence of three intervening sequences, all beginning with GT and ending with AG. The 5' and 3' boundaries of intervening sequences IVS-A (88 base pairs) and IVS-B (124 base pairs) are similar to those described for animal and viral genes, but the 3' boundary of IVS-C (129 base pairs) shows some differences. A sequence of 185 amino acids deduced from the cloned DNAs represents about 40% of a phaseolin polypeptide.

The cDNA sequence of mouse Pgp-1 and homology to human CD44 cell surface antigen and proteoglycan core/link proteins.

PubMed

Wolffe, E J; Gause, W C; Pelfrey, C M; Holland, S M; Steinberg, A D; August, J T

1990-01-05

We describe the isolation and sequencing of a cDNA encoding mouse Pgp-1. An oligonucleotide probe corresponding to the NH2-terminal sequence of the purified protein was synthesized by the polymerase chain reaction and used to screen a mouse macrophage lambda gt11 library. A cDNA clone with an insert of 1.2 kilobases was selected and sequenced. In Northern blot analysis, only cells expressing Pgp-1 contained mRNA species that hybridized with this Pgp-1 cDNA. The nucleotide sequence of the cDNA has a single open reading frame that yields a protein-coding sequence of 1076 base pairs followed by a 132-base pair 3'-untranslated sequence that includes a putative polyadenylation signal but no poly(A) tail. The translated sequence comprises a 13-amino acid signal peptide followed by a polypeptide core of 345 residues corresponding to an Mr of 37,800. Portions of the deduced amino acid sequence were identical to those obtained by amino acid sequence analysis from the purified glycoprotein, confirming that the cDNA encodes Pgp-1. The predicted structure of Pgp-1 includes an NH2-terminal extracellular domain (residues 14-265), a transmembrane domain (residues 266-286), and a cytoplasmic tail (residues 287-358). Portions of the mouse Pgp-1 sequence are highly similar to that of the human CD44 cell surface glycoprotein implicated in cell adhesion. The protein also shows sequence similarity to the proteoglycan tandem repeat sequences found in cartilage link protein and cartilage proteoglycan core protein which are thought to be involved in binding to hyaluronic acid.
Cloning, sequencing, and expression of cDNA for human β-glucuronidase

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oshima, A.; Kyle, J.W.; Miller, R.D.

1987-02-01

The authors report here the cDNA sequence for human placental β-glucuronidase (β-D-glucuronoside glucuronosohydrolase, EC 3.2.1.31) and demonstrate expression of the human enzyme in transfected COS cells. They also sequenced a partial cDNA clone from human fibroblasts that contained a 153-base-pair deletion within the coding sequence and found a second type of cDNA clone from placenta that contained the same deletion. Nuclease S1 mapping studies demonstrated two types of mRNAs in human placenta that corresponded to the two types of cDNA clones isolated. The NH2-terminal amino acid sequence determined for human spleen β-glucuronidase agreed with that inferred from the DNA sequence of the two placental clones, beginning at amino acid 23, suggesting a cleaved signal sequence of 22 amino acids. When transfected into COS cells, plasmids containing either placental clone expressed an immunoprecipitable protein that contained N-linked oligosaccharides as evidenced by sensitivity to endoglycosidase F. However, only transfection with the clone containing the 153-base-pair segment led to expression of human β-glucuronidase activity. These studies provide the sequence for the full-length cDNA for human β-glucuronidase, demonstrate the existence of two populations of mRNA for β-glucuronidase in human placenta, only one of which specifies a catalytically active enzyme, and illustrate the importance of expression studies in verifying that a cDNA is functionally full-length.
Complete nucleotide sequence of spring beauty latent virus, a bromovirus infectious to Arabidopsis thaliana.

PubMed

Fujisaki, K; Hagihara, F; Kaido, M; Mise, K; Okuno, T

2003-01-01

Spring beauty latent virus (SBLV), a bromovirus, systemically and efficiently infected Arabidopsis thaliana, whereas the well-studied bromoviruses brome mosaic virus (BMV) and cowpea chlorotic mottle virus (CCMV) did not infect and poorly infected A. thaliana, respectively. We constructed biologically active cDNA clones of SBLV genomic RNAs and determined their complete nucleotide sequences. Interestingly, SBLV RNA3 contains both the box B motif in the intercistronic region, as does BMV, and the subgenomic promoter-like sequence in the 5' noncoding region, as does CCMV. Sequence comparisons of SBLV, BMV, CCMV, and broad bean mottle virus demonstrated that SBLV is closely related to BMV and CCMV.
Cloning and sequence analysis of Hemonchus contortus HC58cDNA.

PubMed

Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li

2007-06-01

The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with an open reading frame of 717 bp encoding a precursor of 239 amino acids, corresponding to a protein of approximately 27 kDa. Analysis of the amino acid sequence revealed conserved residues of cysteine, histidine and asparagine, the occluding loop pattern, the hemoglobinase motif and the glutamine of the oxyanion hole characteristic of cathepsin B-like proteases (CBL). Comparison of the predicted amino acid sequences showed that the protein shared 33.5-58.7% identity with cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA was a member of the papain family.
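The residue count and approximate mass quoted in this abstract can be reproduced with a back-of-the-envelope estimate. The sketch assumes an average residue mass of about 110 Da, a common rule of thumb rather than a value from the paper, so the result is only a rough figure; the names are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average mass per amino acid residue


def estimate_mass_kda(n_residues):
    """Rough protein mass in kDa from residue count, using an average residue mass."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Value from the abstract: a 717 bp open reading frame.
n_residues = 717 // 3
print(n_residues)                     # 239
print(estimate_mass_kda(n_residues))  # roughly 26.3 kDa
```

The estimate is close to the reported value of approximately 27 kDa; an exact mass would require the actual amino acid composition.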
Sequence of a cDNA and expression of the gene encoding a putative epidermal chitin synthase of Manduca sexta.

PubMed

Zhu, Yu-Cheng; Specht, Charles A; Dittmer, Neal T; Muthukrishnan, Subbaratnam; Kanost, Michael R; Kramer, Karl J

2002-11-01

Glycosyltransferases are enzymes that synthesize oligosaccharides, polysaccharides and glycoconjugates. One type of glycosyltransferase is chitin synthase, a very important enzyme in biology, which is utilized by insects, fungi, and other invertebrates to produce chitin, a polysaccharide of beta-1,4-linked N-acetylglucosamine. Chitin is an important component of the insect's exoskeletal cuticle and gut lining. To identify and characterize a chitin synthase gene of the tobacco hornworm, Manduca sexta, degenerate primers were designed from two highly conserved regions in fungal and nematode chitin synthase protein sequences and then used to amplify a similar region from Manduca cDNA. A full-length cDNA of 5152 nucleotides was assembled for the putative Manduca chitin synthase gene, MsCHS1, and sequencing of genomic DNA verified the contiguity of the sequence. The MsCHS1 cDNA has an ORF of 4692 nucleotides that encodes a transmembrane protein of 1564 amino acid residues with a mass of approximately 179 kDa (GenBank no. AY062175). It is most similar, over its entire length of protein sequence, to putative chitin synthases from other insects and nematodes, with 68% identity to enzymes from both the blow fly, Lucilia cuprina, and the fruit fly, Drosophila melanogaster. The similarity with fungal chitin synthases is restricted to the putative catalytic domain, and the MsCHS1 protein has, at equivalent positions, several amino acids that are essential for activity as revealed by mutagenesis of the fungal enzymes. A 5.3-kb transcript of MsCHS1 was identified by northern blot hybridization of RNA from larval epidermis, suggesting that the enzyme functions to make chitin deposited in the cuticle. Further examination by RT-PCR showed that MsCHS1 expression is regulated in the epidermis, with the amount of transcript increasing during phases of cuticle deposition.
Sequence characterization of cDNA sequence of encoding of an antimicrobial Peptide with no disulfide bridge from the Iranian mesobuthus eupeus venomous glands.

PubMed

Farajzadeh-Sheikh, Ahmad; Jolodar, Abbas; Ghaemmaghami, Shamsedin

2013-01-01

Scorpion venom glands produce antimicrobial peptides (AMPs) that can rapidly kill a broad range of microbes and have additional activities that affect the quality and effectiveness of innate responses and inflammation. In this study, we report the identification of a cDNA sequence encoding a cysteine-free antimicrobial peptide from the venom glands of the Iranian scorpion Mesobuthus eupeus. Total RNA was extracted from the venom glands, and cDNA was synthesized using a modified oligo(dT) primer. The cDNA was used as the template for semi-nested RT-PCR. PCR products were sequenced directly and the results were compared with the GenBank database. A 213 bp cDNA fragment encoding the entire coding region of an antimicrobial toxin from M. eupeus venom glands was isolated. The coding region was 210 bp long and contained an open reading frame of 70 amino acids with a predicted molecular mass of 7970.48 Da and a theoretical pI of 9.10. The open reading frame encodes a precursor of 70 amino acid residues, including a signal peptide of 23 residues, a propeptide of 7 residues, and a mature peptide of 34 residues with no disulfide bridge. The peptide has detectable sequence identity to the Lesser Asian Mesobuthus eupeus MeVAMP-2 (98%), MeVAMP-9 (60%) and several previously described AMPs from other scorpion venoms, including Mesobuthus martensii (94%) and Buthus occitanus israelis (82%). The secondary structure of the peptide consisted mainly of α-helix, which is generally conserved among previously reported scorpion counterparts. The phylogenetic analysis showed that the Iranian MeAMP-like toxin was similar but not identical to the venom antimicrobial peptides from the Lesser Asian scorpion Mesobuthus eupeus.
Complete nucleotide sequences of the coat protein messenger RNAs of brome mosaic virus and cowpea chlorotic mottle virus.

PubMed Central

Dasgupta, R; Kaesberg, P

1982-01-01

The nucleotide sequences of the subgenomic coat protein messengers (RNA4's) of two related bromoviruses, brome mosaic virus (BMV) and cowpea chlorotic mottle virus (CCMV), have been determined by direct RNA and cDNA sequencing without cloning. BMV RNA4 is 876 b long including a 5' noncoding region of nine nucleotides and a 3' noncoding region of 300 nucleotides. CCMV RNA4 is 824 b long, including a 5' noncoding region of 10 nucleotides and a 3' noncoding region of 244 nucleotides. The encoded coat proteins are similar in length (188 amino acids for BMV and 189 amino acids for CCMV) and display about 70% homology in their amino acid sequences. Length difference between the two RNAs is due mostly to a single deletion, in CCMV with respect to BMV, of about 57 b immediately following the coding region. Allowing for this deletion the RNAs are indicate that mutations leading to divergence were constrained in the coding region primarily by the requirement of maintaining a favorable coat protein structure and in the 3' noncoding region primarily by the requirement of maintaining a favorable RNA spatial configuration. PMID:6895941
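The lengths quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming that the coding region is whatever remains after removing the two noncoding regions and that it ends with a single stop codon (the abstract does not state either point explicitly).

```python
def coding_length(total_nt, utr5_nt, utr3_nt):
    """Length left for the coding region after removing both noncoding regions."""
    return total_nt - utr5_nt - utr3_nt

# Figures quoted in the abstract:
# (total length, 5' noncoding, 3' noncoding, coat protein residues)
RNA4 = {
    "BMV": (876, 9, 300, 188),
    "CCMV": (824, 10, 244, 189),
}

def check(name):
    """Return (observed coding length, length expected from the protein size)."""
    total, utr5, utr3, residues = RNA4[name]
    observed = coding_length(total, utr5, utr3)
    # Assumption: coding region = sense codons plus one stop codon.
    expected = (residues + 1) * 3
    return observed, expected

for name in RNA4:
    observed, expected = check(name)
    print(name, observed, expected, observed == expected)
```

Under these assumptions both RNAs are internally consistent (567 = 189 × 3 for BMV and 570 = 190 × 3 for CCMV), and the 3' noncoding regions differ by 56 nucleotides, in line with the deletion of about 57 b reported for CCMV.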
Nucleotide sequences of bovine alpha S1- and kappa-casein cDNAs.

PubMed Central

Stewart, A F; Willis, I M; Mackinlay, A G

1984-01-01

The nucleotide sequences corresponding to bovine alpha S1- and kappa-casein mRNAs are presented. An unusual alpha S1-casein cDNA has been characterised whose 5' end commences upstream from its putative TATA box. The alpha S1-casein mRNA is compared to rat alpha-casein mRNA and two components of divergence are identified. Firstly, the two sequences have diverged at a high point mutation rate and the rate of amino acid replacement by this mechanism is at least as great as the rate of divergence of any other part of the mRNAs. Secondly, the protein coding sequence has been subjected to several insertion/deletion events, one of which may be an example of exon shuffling. The kappa-casein mRNA sequence verifies the proposition that it has arisen from a different ancestral gene to the other caseins. PMID:6328443
The EMBL nucleotide sequence database

PubMed Central

Stoesser, Guenter; Baker, Wendy; van den Broek, Alexandra; Camon, Evelyn; Garcia-Pastor, Maria; Kanz, Carola; Kulikova, Tamara; Lombard, Vincent; Lopez, Rodrigo; Parkinson, Helen; Redaschi, Nicole; Sterk, Peter; Stoehr, Peter; Tuli, Mary Ann

2001-01-01

The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via ftp, email and World Wide Web interfaces. EBI’s Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many specialized databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT. PMID:11125039
Sequence of the cDNA of a human dihydrodiol dehydrogenase isoform (AKR1C2) and tissue distribution of its mRNA.

PubMed Central

Shiraishi, H; Ishikura, S; Matsuura, K; Deyashiki, Y; Ninomiya, M; Sakai, S; Hara, A

1998-01-01

Human liver contains three isoforms (DD1, DD2 and DD4) of dihydrodiol dehydrogenase with 20alpha- or 3alpha-hydroxysteroid dehydrogenase activity; the dehydrogenases belong to the aldo-oxo reductase (AKR) superfamily. cDNA species encoding DD1 and DD4 have been identified. However, four cDNA species with more than 99% sequence identity have been cloned and are compatible with a partial amino acid sequence of DD2. In this study we have isolated a cDNA clone encoding DD2, which was confirmed by comparison of the properties of the recombinant and hepatic enzymes. This cDNA showed differences of one, two, four and five nucleotides from the previously reported four cDNA species for a dehydrogenase of human colon carcinoma HT29 cells, human prostatic 3alpha-hydroxysteroid dehydrogenase, a human liver 3alpha-hydroxysteroid dehydrogenase-like protein and chlordecone reductase-like protein respectively. Expression of mRNA species for the five similar cDNA species in 20 liver samples and 10 other different tissue samples was examined by reverse transcriptase-mediated PCR with specific primers followed by diagnostic restriction with endonucleases. All the tissues expressed only one mRNA species corresponding to the newly identified cDNA for DD2: mRNA transcripts corresponding to the other cDNA species were not detected. We suggest that the new cDNA is derived from the principal gene for DD2, which has been named AKR1C2 by a new nomenclature for the AKR superfamily. It is possible that some of the other cDNA species previously reported are rare allelic variants of this gene. PMID:9716498
Cloning and Expression of cDNA for Rat Heme Oxygenase

NASA Astrophysics Data System (ADS)

Shibahara, Shigeki; Muller, Rita; Taguchi, Hayao; Yoshida, Tadashi

1985-12-01

Two cDNA clones for rat heme oxygenase have been isolated from a rat spleen cDNA library in λ gt11 by immunological screening using a specific polyclonal antibody. One of these clones has an insert of 1530 nucleotides that contains the entire protein-coding region. To confirm that the isolated cDNA encodes heme oxygenase, we transfected monkey kidney cells (COS-7) with the cDNA carried in a simian virus 40 vector. The heme oxygenase was highly expressed in endoplasmic reticulum of transfected cells. The nucleotide sequence of the cloned cDNA was determined and the primary structure of heme oxygenase was deduced. Heme oxygenase is composed of 289 amino acids and has one hydrophobic segment at its carboxyl terminus, which is probably important for the insertion of heme oxygenase into endoplasmic reticulum. The cloned cDNA was used to analyze the induction of heme oxygenase in rat liver by treatment with CoCl2 or with hemin. RNA blot analysis showed that both CoCl2 and hemin increased the amount of hybridizable mRNA, suggesting that these substances may act at the transcriptional level to increase the amount of heme oxygenase.
Construction and EST sequencing of full-length, drought stress cDNA libraries for common beans (Phaseolus vulgaris L.)

PubMed Central

2011-01-01

Background Common bean is an important legume crop with only a moderate number of short expressed sequence tags (ESTs) made with traditional methods. The goal of this research was to use full-length cDNA technology to develop ESTs that would overlap with the beginning of open reading frames and therefore be useful for gene annotation of genomic sequences. The library was also constructed to represent genes expressed under drought, low soil phosphorus and high soil aluminum toxicity. We also undertook comparisons of the full-length cDNA library to two previous non-full clone EST sets for common bean. Results Two full-length cDNA libraries were constructed: one for the drought tolerant Mesoamerican genotype BAT477 and the other one for the acid-soil tolerant Andean genotype G19833 which has been selected for genome sequencing. Plants were grown in three soil types using deep rooting cylinders subjected to drought and non-drought stress and tissues were collected from both roots and above ground parts. A total of 20,000 clones were selected robotically, half from each library. Then, nearly 10,000 clones from the G19833 library were sequenced with an average read length of 850 nucleotides. A total of 4,219 unigenes were identified consisting of 2,981 contigs and 1,238 singletons. These were functionally annotated with gene ontology terms and placed into KEGG pathways. Compared to other EST sequencing efforts in common bean, about half of the sequences were novel or represented the 5' ends of known genes. Conclusions The present full-length cDNA libraries add to the technological toolbox available for common bean and our sequencing of these clones substantially increases the number of unique EST sequences available for the common bean genome. All of this should be useful for both functional gene annotation, analysis of splice site variants and intron/exon boundary determination by comparison to soybean genes or with common bean whole-genome sequences. In addition the
cDNA nucleotide sequence coding for stearoyl-CoA desaturase and its expression in the zebrafish (Danio rerio) embryo.

PubMed

Hsieh, S L; Liu, R W; Wu, C H; Cheng, W T; Kuo, Ching-Ming

2003-12-01

A cDNA sequence of stearoyl-CoA desaturase (SCD) was determined from zebrafish (Danio rerio) and compared to the corresponding genes in several teleosts. Zebrafish SCD cDNA has a size of 1,061 bp, encodes a polypeptide of 325 amino acids, and shares 88, 85, 84, and 83% similarities with tilapia (Oreochromis mossambicus), grass carp (Ctenopharyngodon idella), common carp (Cyprinus carpio), and milkfish (Chanos chanos), respectively. This 1,061 bp sequence specifies a protein that, in common with other fatty acid desaturases, contains three histidine boxes, believed to be involved in catalysis. These observations suggested that SCD genes are highly conserved. In addition, an oligonucleotide probe complementary to zebrafish SCD mRNA was hybridized to mRNA of approximately 396 bases with Northern blot analysis. The Northern blot and RT-PCR analyses showed that the SCD mRNA was expressed predominantly in the liver, intestine, gill, and muscle, while a lower level was found in the brain. Furthermore, we utilized whole-mount in situ hybridization and real-time quantitative RT-PCR to identify expression of the zebrafish SCD gene at five different stages of development. This revealed that very high levels of transcripts were found in zebrafish at all stages during embryogenesis and early development. Copyright 2003 Wiley-Liss, Inc.
Nucleotide sequence of the L1 ribosomal protein gene of Xenopus laevis: remarkable sequence homology among introns.

PubMed Central

Loreni, F; Ruberti, I; Bozzoni, I; Pierandrei-Amaldi, P; Amaldi, F

1985-01-01

Ribosomal protein L1 is encoded by two genes in Xenopus laevis. The comparison of two cDNA sequences shows that the two L1 gene copies (L1a and L1b) have diverged in many silent sites and very few substitution sites; moreover a small duplication occurred at the very end of the coding region of the L1b gene which thus codes for a product five amino acids longer than that coded by L1a. Quantitatively the divergence between the two L1 genes confirms that a whole genome duplication took place in Xenopus laevis approximately 30 million years ago. A genomic fragment containing one of the two L1 gene copies (L1a), with its nine introns and flanking regions, has been completely sequenced. The 5' end of this gene has been mapped within a 20-pyrimidine stretch as already found for other vertebrate ribosomal protein genes. Four of the nine introns have a 60-nucleotide sequence with 80% homology; within this region some boxes, one of which is 16 nucleotides long, are 100% homologous among the four introns. This feature of L1a gene introns is interesting since we have previously shown that the activity of this gene is regulated at a post-transcriptional level and it involves the block of the normal splicing of some intron sequences. PMID:3841512
Nucleotide sequences encoding a thermostable alkaline protease

DOEpatents

Wilson, David B.; Lao, Guifang

1998-01-01

Nucleotide sequences, derived from a thermophilic actinomycete microorganism, which encode a thermostable alkaline protease are disclosed. Also disclosed are variants of the nucleotide sequences which encode a polypeptide having thermostable alkaline proteolytic activity. Recombinant thermostable alkaline protease or recombinant polypeptide may be obtained by culturing in a medium a host cell genetically engineered to contain and express a nucleotide sequence according to the present invention, and recovering the recombinant thermostable alkaline protease or recombinant polypeptide from the culture medium.
Isolation and characterization of a cDNA clone for the complete protein coding region of the delta subunit of the mouse acetylcholine receptor.

PubMed Central

LaPolla, R J; Mayne, K M; Davidson, N

1984-01-01

A mouse cDNA clone has been isolated that contains the complete coding region of a protein highly homologous to the delta subunit of the Torpedo acetylcholine receptor (AcChoR). The cDNA library was constructed in the vector lambda 10 from membrane-associated poly(A)+ RNA from BC3H-1 mouse cells. Surprisingly, the delta clone was selected by hybridization with cDNA encoding the gamma subunit of the Torpedo AcChoR. The nucleotide sequence of the mouse cDNA clone contains an open reading frame of 520 amino acids. This amino acid sequence exhibits 59% and 50% sequence homology to the Torpedo AcChoR delta and gamma subunits, respectively. However, the mouse nucleotide sequence has several stretches of high homology with the Torpedo gamma subunit cDNA, but not with delta. The mouse protein has the same general structural features as do the Torpedo subunits. It is encoded by a 3.3-kilobase mRNA. There is probably only one, but at most two, chromosomal genes coding for this or closely related sequences. PMID:6096870
Nucleotide sequences encoding a thermostable alkaline protease

DOEpatents

Wilson, D.B.; Lao, G.

1998-01-06

Nucleotide sequences, derived from a thermophilic actinomycete microorganism, which encode a thermostable alkaline protease are disclosed. Also disclosed are variants of the nucleotide sequences which encode a polypeptide having thermostable alkaline proteolytic activity. Recombinant thermostable alkaline protease or recombinant polypeptide may be obtained by culturing in a medium a host cell genetically engineered to contain and express a nucleotide sequence according to the present invention, and recovering the recombinant thermostable alkaline protease or recombinant polypeptide from the culture medium. 3 figs.
Acetylcholinesterase of the sand fly, Phlebotomus papatasi (Scopoli): cDNA sequence, baculovirus expression, and biochemical properties

PubMed Central

2013-01-01

Background Millions of people and domestic animals around the world are affected by leishmaniasis, a disease caused by various species of flagellated protozoans in the genus Leishmania that are transmitted by several sand fly species. Insecticides are widely used for sand fly population control to try to reduce or interrupt Leishmania transmission. Zoonotic cutaneous leishmaniasis caused by L. major is vectored mainly by Phlebotomus papatasi (Scopoli) in Asia and Africa. Organophosphates comprise a class of insecticides used for sand fly control, which act through the inhibition of acetylcholinesterase (AChE) in the central nervous system. Point mutations producing an altered, insensitive AChE are a major mechanism of organophosphate resistance in insects and preliminary evidence for organophosphate-insensitive AChE has been reported in sand flies. This report describes the identification of complementary DNA for an AChE in P. papatasi and the biochemical characterization of recombinant P. papatasi AChE. Methods A P. papatasi Israeli strain laboratory colony was utilized to prepare total RNA utilized as template for RT-PCR amplification and sequencing of cDNA encoding acetylcholinesterase 1 using gene specific primers and 3’-5’-RACE. The cDNA was cloned into pBlueBac4.5/V5-His TOPO, and expressed by baculovirus in Sf21 insect cells in serum-free medium. Recombinant P. papatasi acetylcholinesterase was biochemically characterized using a modified Ellman’s assay in microplates. Results A 2309 nucleotide sequence of PpAChE1 cDNA [GenBank: JQ922267] of P. papatasi from a laboratory colony susceptible to insecticides is reported with 73-83% nucleotide identity to acetylcholinesterase mRNA sequences of Culex tritaeniorhynchus and Lutzomyia longipalpis, respectively. The P. papatasi cDNA ORF encoded a 710-amino acid protein [GenBank: AFP20868] exhibiting 85% amino acid identity with acetylcholinesterases of Cx. pipiens, Aedes aegypti, and 92% amino acid identity for
Sequence verification as quality-control step for production of cDNA microarrays.

PubMed

Taylor, E; Cogdell, D; Coombes, K; Hu, L; Ramdas, L; Tabor, A; Hamilton, S; Zhang, W

2001-07-01

To generate cDNA arrays in our core laboratory, we amplified about 2300 PCR products from a human, sequence-verified cDNA clone library. As a quality-control step, we sequenced the PCR products immediately before printing. The sequence information was used to search the GenBank database to confirm the identities. Although these clones were previously sequence verified by the company, we found that only 79% of the clones matched the original database after handling. Our experience strongly indicates the necessity to sequence verify the clones at the final stage before printing on microarray slides and to modify the gene list accordingly.
Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon

PubMed Central

2011-01-01

Background Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Result We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but longer than many other dicot

Nucleotide sequence of a complementary DNA encoding pea cytosolic copper/zinc superoxide dismutase. [Pisum sativum L

DOE Office of Scientific and Technical Information (OSTI.GOV)

White, D.A.; Zilinskas, B.A.

1991-08-01

The authors now report the nucleotide sequence of the cytosolic Cu/Zn SOD cloned from a λgt11 cDNA library constructed from mRNA extracted from leaves of 7- to 10-d pea seedlings (Pisum sativum L.). The clone was isolated using a 22-base synthetic oligonucleotide complementary to the amino acid sequence CGIIGLQG. This sequence, found at the protein's carboxy terminus, is highly conserved among plant cytosolic Cu/Zn SODs but not chloroplastic Cu/Zn SODs. The 738-base pair sequence contains an open reading frame specifying 152 codons and a predicted Mr of 18,024 D. The deduced amino acid sequence is highly homologous (79-82% identity) with the sequences of other known plant cytosolic Cu/Zn SODs but less highly conserved (63-65%) when compared with several chloroplastic Cu/Zn SODs including pea (10).
Nucleotide sequence determination of guinea-pig casein B mRNA reveals homology with bovine and rat alpha s1 caseins and conservation of the non-coding regions of the mRNA.

PubMed Central

Hall, L; Laird, J E; Craig, R K

1984-01-01

Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species. PMID:6548375
Isolation, nucleotide sequence and expression of a cDNA encoding feline granulocyte colony-stimulating factor.

PubMed

Dunham, S P; Onions, D E

2001-06-21

A cDNA encoding feline granulocyte colony stimulating factor (fG-CSF) was cloned from alveolar macrophages using the reverse transcriptase-polymerase chain reaction. The cDNA is 949 bp in length and encodes a predicted mature protein of 174 amino acids. Recombinant fG-CSF was expressed as a glutathione S-transferase fusion and purified by affinity chromatography. Biological activity of the recombinant protein was demonstrated using the murine myeloblastic cell line GNFS-60, which showed an ED50 for fG-CSF of approximately 2 ng/ml. Copyright 2001 Academic Press.
Base Preferences in Non-Templated Nucleotide Incorporation by MMLV-Derived Reverse Transcriptases

PubMed Central

Zajac, Pawel; Islam, Saiful; Hochgerner, Hannah; Lönnerberg, Peter; Linnarsson, Sten

2013-01-01

Reverse transcriptases derived from Moloney Murine Leukemia Virus (MMLV) have an intrinsic terminal transferase activity, which causes the addition of a few non-templated nucleotides at the 3´ end of cDNA, with a preference for cytosine. This mechanism can be exploited to make the reverse transcriptase switch template from the RNA molecule to a secondary oligonucleotide during first-strand cDNA synthesis, and thereby to introduce arbitrary barcode or adaptor sequences in the cDNA. Because the mechanism is relatively efficient and occurs in a single reaction, it has recently found use in several protocols for single-cell RNA sequencing. However, the base preference of the terminal transferase activity is not known in detail, which may lead to inefficiencies in template switching when starting from tiny amounts of mRNA. Here, we used fully degenerate oligos to determine the exact base preference at the template switching site up to a distance of ten nucleotides. We found a strong preference for guanosine at the first non-templated nucleotide, with a greatly reduced bias at progressively more distant positions. Based on this result, and a number of careful optimizations, we report conditions for efficient template switching for cDNA amplification from single cells. PMID:24392002
Nucleotide sequences of two genomic DNAs encoding peroxidase of Arabidopsis thaliana.

PubMed

Intapruk, C; Higashimura, N; Yamamoto, K; Okada, N; Shinmyo, A; Takano, M

1991-02-15

The peroxidase (EC 1.11.1.7)-encoding gene of Arabidopsis thaliana was screened from a genomic library using a cDNA encoding a neutral isozyme of horseradish, Armoracia rusticana, peroxidase (HRP) as a probe, and two positive clones were isolated. From the comparison with the sequences of the HRP-encoding genes, we concluded that two clones contained peroxidase-encoding genes, and they were named prxCa and prxEa. Both genes consisted of four exons and three introns; the introns had consensus nucleotides, GT and AG, at the 5' and 3' ends, respectively. The lengths of each putative exon of the prxEa gene were the same as those of the HRP-basic-isozyme-encoding gene, prxC3, and coded for 349 amino acids (aa) with a sequence homology of 89% to that encoded by prxC3. The prxCa gene was very close to the HRP-neutral-isozyme-encoding gene, prxC1b, and coded for 354 aa with 91% homology to that encoded by prxC1b. The aa sequence homology was 64% between the two peroxidases encoded by prxCa and prxEa.
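The GT and AG consensus dinucleotides mentioned above are straightforward to verify once intron coordinates are known. The sketch below is a minimal check in Python; the intron sequences are made up for illustration and are not taken from prxCa or prxEa.

```python
def has_gt_ag_boundaries(intron):
    """True if an intron (sense strand, 5' to 3') starts with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")

# Hypothetical intron sequences, for illustration only.
EXAMPLE_INTRONS = {
    "canonical": "GTAAGTTTCTGATTTGCAG",
    "noncanonical": "GCAAGTTTCTGATTTGCAG",
}

for label, seq in EXAMPLE_INTRONS.items():
    print(label, has_gt_ag_boundaries(seq))
```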
The complete nucleotide sequence of RNA 3 of a peach isolate of Prunus necrotic ringspot virus.

PubMed

Hammond, R W; Crosslin, J M

1995-04-01

The complete nucleotide sequence of RNA 3 of the PE-5 peach isolate of Prunus necrotic ringspot ilarvirus (PNRSV) was obtained from cloned cDNA. The RNA sequence is 1941 nucleotides and contains two open reading frames (ORFs). ORF 1 consisted of 284 amino acids with a calculated molecular weight of 31,729 Da and ORF 2 contained 224 amino acids with a calculated molecular weight of 25,018 Da. ORF 2 corresponds to the coat protein gene. Expression of ORF 2 engineered into a pTrcHis vector in Escherichia coli results in a fusion polypeptide of approximately 28 kDa which cross-reacts with PNRSV polyclonal antiserum. Analysis of the coat protein amino acid sequence reveals a putative "zinc-finger" domain at the amino-terminal portion of the protein. Two tetranucleotide AUGC motifs occur in the 3'-UTR of the RNA and may function in coat protein binding and genome activation. ORF 1 homologies to other ilarviruses and alfalfa mosaic virus are confined to limited regions of conserved amino acids. The translated amino acid sequence of the coat protein gene shows 92% similarity to one isolate of apple mosaic virus, a closely related member of the ilarvirus group of plant viruses, but only 66% similarity to the amino acid sequence of the coat protein gene of a second isolate. These relationships are also reflected at the nucleotide sequence level. These results in one instance confirm the close similarities observed at the biophysical and serological levels between these two viruses, but on the other hand call into question the nomenclature used to describe these viruses.
Analysis of beta-carotene hydroxylase gene cDNA isolated from the American oil-palm (Elaeis oleifera) mesocarp tissue cDNA library

PubMed Central

Bhore, Subhash J; Kassim, Amelia; Loh, Chye Ying; Shah, Farida H

2010-01-01

It is well known that the nutritional quality of the American oil-palm (Elaeis oleifera) mesocarp oil is superior to that of African oil-palm (Elaeis guineensis Jacq. Tenera) mesocarp oil. Therefore, it is important to identify the genetic features underlying its superior value. This could be achieved through genome sequencing of the oil-palm. However, the genome sequence is not available in the public domain due to commercial secrecy. Hence, we constructed a cDNA library and generated expressed sequence tags (3,205) from the mesocarp tissue of the American oil-palm. We continued to annotate each of these cDNAs after submitting them to GenBank/DDBJ/EMBL. A rough analysis turned our attention to the cDNA encoding the beta-carotene hydroxylase (Chyb) enzyme. We then sequenced both strands of the cDNA clone in full using M13 forward and reverse primers. The full nucleotide and protein sequences were further analyzed and annotated using various bioinformatics tools. The analysis showed the presence of a fatty acid hydroxylase superfamily domain in the protein sequence. Multiple sequence alignment of selected Chyb amino acid sequences from other plant species and algae with E. oleifera Chyb using ClustalW, together with phylogenetic analysis, suggests that Chyb from the monocotyledonous plant species Lilium hybrid, Crocus sativus and Zea mays is the most closely related evolutionarily to E. oleifera Chyb. This study reports the annotation of E. oleifera Chyb. Abbreviations: ESTs - expressed sequence tags, EoChyb - Elaeis oleifera beta-carotene hydroxylase, MC - main cluster. PMID:21364789
RICD: a rice indica cDNA database resource for rice functional genomics.

PubMed

Lu, Tingting; Huang, Xuehui; Zhu, Chuanrang; Huang, Tao; Zhao, Qiang; Xie, Kabing; Xiong, Lizhong; Zhang, Qifa; Han, Bin

2008-11-26

The Oryza sativa L. indica subspecies is the most widely cultivated rice. During the last few years, we have collected over 20,000 putative full-length cDNAs and over 40,000 ESTs isolated from various cDNA libraries of two indica varieties Guangluai 4 and Minghui 63. A database of the rice indica cDNAs was therefore built to provide a comprehensive web data source for searching and retrieving the indica cDNA clones. Rice Indica cDNA Database (RICD) is an online MySQL-PHP driven database with a user-friendly web interface. It allows investigators to query the cDNA clones by keyword, genome position, nucleotide or protein sequence, and putative function. It also provides a series of information, including sequences, protein domain annotations, similarity search results, SNPs and InDels information, and hyperlinks to gene annotation in both The Rice Annotation Project Database (RAP-DB) and The TIGR Rice Genome Annotation Resource, expression atlas in RiceGE and variation report in Gramene of each cDNA. The online rice indica cDNA database provides cDNA resource with comprehensive information to researchers for functional analysis of indica subspecies and for comparative genomics. The RICD database is available through our website http://www.ncgr.ac.cn/ricd.
Minimap2: pairwise alignment for nucleotide sequences.

PubMed

Li, Heng

2018-05-10

Recent advances in sequencing technologies promise ultra-long reads of ∼100 kilobases (kb) on average, full-length mRNA or cDNA reads in high throughput, and genomic contigs over 100 megabases (Mb) in length. Existing alignment programs either cannot process such data at scale or do so inefficiently, which calls for the development of new alignment algorithms. Minimap2 is a general-purpose alignment program to map DNA or long mRNA sequences against a large reference database. It works with accurate short reads of ≥100 bp in length, ≥1 kb genomic reads at error rate ∼15%, full-length noisy Direct RNA or cDNA reads, and assembly contigs or closely related full chromosomes of hundreds of megabases in length. Minimap2 does split-read alignment, employs concave gap cost for long insertions and deletions (INDELs) and introduces new heuristics to reduce spurious alignments. It is 3-4 times as fast as mainstream short-read mappers at comparable accuracy, and is ≥30 times faster than long-read genomic or cDNA mappers at higher accuracy, surpassing most aligners specialized in one type of alignment. Availability: https://github.com/lh3/minimap2. Contact: hengli@broadinstitute.org.
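The "concave gap cost" mentioned in the Minimap2 abstract can be illustrated with a two-piece affine model, in which a gap is charged the cheaper of two affine penalties so that long INDELs cost less per base than short ones. The following is a minimal sketch only: the abstract does not give the functional form or the parameters, so the function name and the values q, e, q2 and e2 are assumptions chosen for illustration.

```python
def two_piece_gap_cost(length, q=4, e=2, q2=24, e2=1):
    """Cost of a gap of `length` bases under a two-piece affine model.

    q/e   : open/extend penalties of the first piece (cheap for short gaps)
    q2/e2 : open/extend penalties of the second piece (cheap for long gaps)
    All four values are illustrative assumptions, not taken from the abstract.
    """
    if length <= 0:
        return 0
    # The minimum of two affine functions is concave in the gap length.
    return min(q + length * e, q2 + length * e2)


if __name__ == "__main__":
    for n in (1, 20, 100):
        print(n, two_piece_gap_cost(n))
```

With these illustrative values the two pieces cross at a gap length of 20 bases; longer gaps are charged one unit per extra base instead of two.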
DNA Nucleotide Sequence Restricted by the RI Endonuclease

PubMed Central

Hedgpeth, Joe; Goodman, Howard M.; Boyer, Herbert W.

1972-01-01

The sequence of DNA base pairs adjacent to the phosphodiester bonds cleaved by the RI restriction endonuclease in unmodified DNA from coliphage λ has been determined. The 5′-terminal nucleotide labeled with 32P and oligonucleotides up to the heptamer were analyzed from a pancreatic DNase digest. The following sequence of nucleotides adjacent to the RI break made in λ DNA was deduced from these data and from the 3′-dinucleotide sequence and nearest-neighbor analysis obtained from repair synthesis with the DNA polymerase of Rous sarcoma virus: [Formula: see text]. The RI endonuclease cleavage of the phosphodiester bonds (indicated by arrows) generates 5′-phosphoryls and short cohesive termini of four nucleotides, pApApTpT. The most striking feature of the sequence is its symmetry. PMID:4343974
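The symmetry noted in the RI endonuclease abstract can be checked computationally: a site with two-fold rotational symmetry reads the same on both strands, i.e. it equals its own reverse complement. The sketch below uses only the four-nucleotide cohesive terminus AATT quoted in the abstract; the helper names are hypothetical, and the full recognition sequence (elided in the record as "[Formula: see text]") is not reconstructed here.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(seq):
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    print(is_palindromic("AATT"))  # cohesive terminus from the abstract
```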
Complete nucleotide and derived amino acid sequence of cDNA encoding the mitochondrial uncoupling protein of rat brown adipose tissue: lack of a mitochondrial targeting presequence.

PubMed Central

Ridley, R G; Patel, H V; Gerber, G E; Morton, R C; Freeman, K B

1986-01-01

A cDNA clone spanning the entire amino acid sequence of the nuclear-encoded uncoupling protein of rat brown adipose tissue mitochondria has been isolated and sequenced. With the exception of the N-terminal methionine the deduced N-terminus of the newly synthesized uncoupling protein is identical to the N-terminal 30 amino acids of the native uncoupling protein as determined by protein sequencing. This proves that the protein contains no N-terminal mitochondrial targeting prepiece and that a targeting region must reside within the amino acid sequence of the mature protein. Images PMID:3012461
Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations.

PubMed

Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis

2016-08-24

To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (Pearson r = 0.82, isoforms longer than 700 bp) and between Illumina HiSeq 2500 and ONT MinION (Pearson r = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full-length cDNA molecules.
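The platform agreement reported in the MinION abstract is expressed as a Pearson correlation coefficient between per-transcript abundance estimates. A minimal sketch of that calculation follows; the function name is hypothetical, any input values are placeholders rather than data from the study, and the abstract does not state whether abundances were transformed (for example, log-scaled) beforehand.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

Applied to two platforms, `x` and `y` would hold the abundance of the same transcripts in the same order.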
Construction of Infectious cDNA Clone of a Chrysanthemum stunt viroid Korean Isolate

PubMed Central

Yoon, Ju-Yeon; Cho, In-Sook; Choi, Gug-Seoun; Choi, Seung-Kook

2014-01-01

Chrysanthemum stunt viroid (CSVd), a noncoding infectious RNA molecule, causes serious economic losses of chrysanthemum for 3 or 4 years after its first infection. Monomeric cDNA clones of CSVd isolate SK1 (CSVd-SK1) were constructed in the plasmids pGEM-T easy vector and pUC19 vector. Linear positive-sense transcripts synthesized in vitro from the full-length monomeric cDNA clones of CSVd-SK1 could systemically infect tomato seedlings and chrysanthemum plants, suggesting that the linear CSVd RNA transcribed from the cDNA clones could be replicated as efficiently as circular CSVd in host species. However, direct inoculation of plasmid cDNA clones containing full-length monomeric cDNA of CSVd-SK1 failed to infect tomato and chrysanthemum, and linear negative-sense transcripts from the plasmid DNAs were not infectious in the two plant species. The cDNA sequences of progeny viroid in systemically infected tomato and chrysanthemum showed a few substitutions at a specific nucleotide position, but there were no deletions or insertions in the sequences of the CSVd progeny from tomato and chrysanthemum plants. PMID:25288987
cDNA cloning of the human peroxisomal enoyl-CoA hydratase: 3-Hydroxyacyl-CoA dehydrogenase bifunctional enzyme and localization to chromosome 3q26.3-3q28: A free left Alu arm is inserted in the 3′ noncoding region

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hoefler, G.; Forstner, M.; Hulla, W.

1994-01-01

Enoyl-CoA hydratase:3-hydroxyacyl-CoA dehydrogenase bifunctional enzyme is one of the four enzymes of the peroxisomal β-oxidation pathway. Here, the authors report the full-length human cDNA sequence and the localization of the corresponding gene on chromosome 3q26.3-3q28. The cDNA sequence spans 3779 nucleotides with an open reading frame of 2169 nucleotides. The tripeptide SKL at the carboxy terminus, known to serve as a peroxisomal targeting signal, is present. DNA sequence comparison of the coding region showed an 80% homology between human and rat bifunctional enzyme cDNA. The 3′ noncoding sequence contains 117 nucleotides homologous to an Alu repeat. Based on sequence comparison, they propose that these nucleotides are a free left Alu arm with 86% homology to the Alu-J family. RNA analysis shows one band with highest intensity in liver and kidney. This cDNA will allow in-depth studies of molecular defects in patients with defective peroxisomal bifunctional enzyme. Moreover, it will also provide a means for studying the regulation of peroxisomal β-oxidation in humans. 33 refs., 5 figs.
77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

Federal Register 2010, 2011, 2012, 2013, 2014

2012-10-29

... DEPARTMENT OF COMMERCE Patent and Trademark Office Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request... Patent applications that contain nucleotide and/or amino acid sequence disclosures must include a copy of...
Molecular Cloning and Characterization of cDNA Encoding a Putative Stress-Induced Heat-Shock Protein from Camelus dromedarius

PubMed Central

Elrobh, Mohamed S.; Alanazi, Mohammad S.; Khan, Wajahatullah; Abduljaleel, Zainularifeen; Al-Amri, Abdullah; Bazzi, Mohammad D.

2011-01-01

Heat shock proteins are ubiquitous, induced under a number of environmental and metabolic stresses, with highly conserved DNA sequences among mammalian species. Camelus dromedarius (the Arabian camel), domesticated under semi-desert environments, is well adapted to tolerate and survive severe drought and high temperatures for extended periods. This is the first report of molecular cloning and characterization of a full-length cDNA encoding a putative stress-induced heat shock HSPA6 protein (also called HSP70B′) from the Arabian camel. A full-length cDNA (2417 bp) was obtained by rapid amplification of cDNA ends (RACE) and cloned in pET-b expression vector. The sequence analysis of the HSPA6 gene showed a 1932 bp-long open reading frame encoding 643 amino acids. The complete cDNA sequence of the Arabian camel HSPA6 gene was submitted to NCBI GenBank (accession number HQ214118.1). The BLAST analysis indicated that the C. dromedarius HSPA6 gene nucleotide sequence shared high similarity (77–91%) with heat shock gene nucleotide sequences of other mammals. The deduced 643 amino acid sequence (accession number ADO12067.1) showed that the predicted protein has an estimated molecular weight of 70.5 kDa with a predicted isoelectric point (pI) of 6.0. The comparative analyses of camel HSPA6 protein sequences with other mammalian heat shock proteins (HSPs) showed high identity (80–94%). The predicted camel HSPA6 protein structure, obtained using Protein 3D structural analysis, showed high similarities with human and mouse HSPs. Taken together, this study indicates that the cDNA sequences of the HSPA6 gene and its amino acid and protein structure from the Arabian camel are highly conserved and have similarities with other mammalian species. PMID:21845074
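The relationship between the 1932 bp open reading frame and the 643 encoded amino acids in the camel HSPA6 record follows from codon arithmetic: 1932 / 3 = 644 codons, one of which is the stop codon. A minimal sketch of that check is given below; the function name is hypothetical, and whether a reported ORF length includes the stop codon is an assumption that varies between papers.

```python
def amino_acids_from_orf(orf_nt, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_nt` nucleotides.

    Assumes the ORF length is a whole number of codons; `includes_stop`
    says whether the terminal stop codon is counted in `orf_nt`.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    print(amino_acids_from_orf(1932))  # camel HSPA6 ORF from the abstract
```

The same arithmetic can be used to sanity-check the ORF and protein lengths quoted in other records in this list.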
Characterization and mapping of cDNA encoding aspartate aminotransferase in rice, Oryza sativa L.

PubMed

Song, J; Yamamoto, K; Shomura, A; Yano, M; Minobe, Y; Sasaki, T

1996-10-31

Fifteen cDNA clones, putatively identified as encoding aspartate aminotransferase (AST, EC 2.6.1.1), were isolated and partially sequenced. Together with six previously isolated clones putatively identified to encode ASTs (Sasaki, et al. 1994, Plant Journal 6, 615-624), their sequences were characterized and classified into 4 cDNA species. Two of the isolated clones, C60213 and C2079, were full-length cDNAs, and their complete nucleotide sequences were determined. C60213 was 1612 bp long and its deduced amino acid sequence showed 88% homology with that of Panicum miliaceum L. mitochondrial AST. The C60213-encoded protein had an N-terminal amino acid sequence that was characteristic of a mitochondrial transit peptide. On the other hand, C2079 was 1546 bp long and had 91% amino acid sequence homology with P. miliaceum L. cytosolic AST but lacked the transit peptide sequence. The homologies of nucleotide sequences and deduced amino acid sequences of C2079 and C60213 were 54% and 52%, respectively. C2079 and C60213 were mapped on chromosomes 1 and 6, respectively, by restriction fragment length polymorphism linkage analysis. Northern blot analysis using C2079 as a probe revealed much higher transcript levels in callus and root than in green and etiolated shoots, suggesting tissue-specific variations of AST gene expression.
A putative peroxidase cDNA from turnip and analysis of the encoded protein sequence.

PubMed

Romero-Gómez, S; Duarte-Vázquez, M A; García-Almendárez, B E; Mayorga-Martínez, L; Cervantes-Avilés, O; Regalado, C

2008-12-01

A putative peroxidase cDNA was isolated from turnip roots (Brassica napus L. var. purple top white globe) by reverse transcriptase-polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends (RACE). Total RNA extracted from mature turnip roots was used as a template for RT-PCR, using a degenerate primer designed to amplify the highly conserved distal motif of plant peroxidases. The resulting partial sequence was used to design the rest of the specific primers for 5' and 3' RACE. Two cDNA fragments were purified, sequenced, and aligned with the partial sequence from RT-PCR, and a complete overlapping sequence was obtained and labeled as BnPA (GenBank Accession No. AY423440, named as podC). The full-length cDNA is 1167 bp long and contains a 1077 bp open reading frame (ORF) encoding a deduced peroxidase polypeptide of 358 amino acids. The putative peroxidase (BnPA) showed a calculated Mr of 34 kDa and an isoelectric point (pI) of 4.5, with no significant identity with other reported turnip peroxidases. Sequence alignment showed that only three peroxidases have a significant identity with BnPA, namely AtP29a (84%) and AtPA2 (81%) from Arabidopsis thaliana, and HRPA2 (82%) from horseradish (Armoracia rusticana). Work is in progress to clone this gene into an adequate host to study the specific role and possible biotechnological applications of this alternative peroxidase source.
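A degenerate primer, as used in the turnip peroxidase record, is a mixture of oligonucleotides written with IUPAC ambiguity codes so that it can anneal to every codon variant of a conserved motif. The sketch below expands such a primer into the individual sequences it represents. The IUPAC table is standard; the function names and any primer passed to them are hypothetical, since the abstract does not give the primer sequence.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """List every unambiguous sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


def degeneracy(primer):
    """Number of distinct sequences in the primer mixture."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total
```

For example, a made-up primer such as `GGNTGYGAY` (not from the paper) has a degeneracy of 4 × 2 × 2 = 16.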
Sequencing of cDNA Clones from the Genetic Map of Tomato (Lycopersicon esculentum)

PubMed Central

Ganal, Martin W.; Czihal, Rosemarie; Hannappel, Ulrich; Kloos, Dorothee-U.; Polley, Andreas; Ling, Hong-Qing

1998-01-01

The dense RFLP linkage map of tomato (Lycopersicon esculentum) contains >300 anonymous cDNA clones. Of those clones, 272 were partially or completely sequenced. The sequences were compared at the DNA and protein level to known genes in databases. For 57% of the clones, a significant match to previously described genes was found. The information will permit the conversion of those markers to STS markers and allow their use in PCR-based mapping experiments. Furthermore, it will facilitate the comparative mapping of genes across distantly related plant species by direct comparison of DNA sequences and map positions. [cDNA sequence data reported in this paper have been submitted to the EMBL database under accession nos. AA824695–AA825005 and the dbEST_Id database under accession nos. 1546519–1546862.] PMID:9724330
Population structure of pigs determined by single nucleotide polymorphisms observed in assembled expressed sequence tags.

PubMed

Matsumoto, Toshimi; Okumura, Naohiko; Uenishi, Hirohide; Hayashi, Takeshi; Hamasima, Noriyuki; Awata, Takashi

2012-01-01

We have collected more than 190000 porcine expressed sequence tags (ESTs) from full-length complementary DNA (cDNA) libraries and identified more than 2800 single nucleotide polymorphisms (SNPs). In this study, we tentatively chose 222 SNPs observed in assembled ESTs to study pigs of different breeds; 104 were selected by comparing the cDNA sequences of a Meishan pig and samples of three-way cross pigs (Landrace, Large White, and Duroc: LWD), and 118 were selected from LWD samples. To evaluate the genetic variation between the chosen SNPs from pig breeds, we determined the genotypes for 192 pig samples (11 pig groups) from our DNA reference panel with matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Of the 222 reference SNPs, 186 were successfully genotyped. A neighbor-joining tree showed that the pig groups were classified into two large clusters, namely, Euro-American and East Asian pig populations. F-statistics and the analysis of molecular variance of Euro-American pig groups revealed that approximately 25% of the genetic variations occurred because of intergroup differences. As the F(IS) values were less than the F(ST) values, the clustering, based on the Bayesian inference, implied that there was strong genetic differentiation among pig groups and less divergence within the groups in our samples. © 2011 The Authors. Animal Science Journal © 2011 Japanese Society of Animal Science.
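The F(ST) statistic cited in the pig SNP record measures how much of the total genetic variation is due to differences between groups. A minimal sketch for a single biallelic SNP is shown below, using the heterozygosity-based definition F_ST = (H_T - H_S) / H_T with equally weighted groups. This is a simplified textbook estimator swapped in for illustration, not the procedure used in the study, and the function names and allele frequencies are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) for a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def fst_biallelic(freqs):
    """F_ST for one SNP from per-group frequencies of the same allele.

    Groups are weighted equally; no sample-size correction is applied.
    """
    if not freqs:
        raise ValueError("need at least one group")
    p_bar = sum(freqs) / len(freqs)
    h_t = expected_heterozygosity(p_bar)  # total (pooled) heterozygosity
    if h_t == 0:
        return 0.0  # locus is monomorphic across all groups
    # Mean within-group heterozygosity.
    h_s = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t
```

Identical frequencies in all groups give 0, and groups fixed for different alleles give 1.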

Sequencing and comparative genomic analysis of 1227 Felis catus cDNA sequences enriched for developmental, clinical and nutritional phenotypes

PubMed Central

2012-01-01

Background: The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results: We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's GenBank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from, other mammalian genomes. Conclusions: The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742
Nucleotide sequence and structural organization of the human vasopressin pituitary receptor (V3) gene.

PubMed

René, P; Lenne, F; Ventura, M A; Bertagna, X; de Keyzer, Y

2000-01-04

In the pituitary, vasopressin triggers ACTH release through a specific receptor subtype, termed V3 or V1b. We cloned the V3 cDNA and showed that its expression was almost exclusive to pituitary corticotrophs and some corticotroph tumors. To study the determinants of this tissue specificity, we have now cloned the gene for the human (h) V3 receptor and characterized its structure. It is composed of two exons, spanning 10 kb, with the coding region interrupted between transmembrane domains 6 and 7. We established that the transcription initiation site is located 498 nucleotides upstream of the initiator codon and showed that two polyadenylation sites may be used, of which the most downstream is used most frequently. Sequence analysis of the promoter region showed no TATA box but identified consensus binding motifs for Sp1, CREB, and half sites of the estrogen receptor binding site. However, comparison with another corticotroph-specific gene, proopiomelanocortin, did not identify common regulatory elements in the two promoters except for a short GC-rich region. Unexpectedly, hV3 gene analysis revealed that a formerly cloned 'artifactual' hV3 cDNA indeed corresponded to a spliced antisense transcript, overlapping the 5' part of the coding sequence in exon 1 and the promoter region. This transcript, hV3rev, was detected in normal pituitary and in many corticotroph tumors expressing hV3 sense mRNA and may therefore play a role in hV3 gene expression.
Cloning a Chymotrypsin-Like 1 (CTRL-1) Protease cDNA from the Jellyfish Nemopilema nomurai

PubMed Central

Heo, Yunwi; Kwon, Young Chul; Bae, Seong Kyeong; Hwang, Duhyeon; Yang, Hye Ryeon; Choudhary, Indu; Lee, Hyunkyoung; Yum, Seungshic; Shin, Kyoungsoon; Yoon, Won Duk; Kang, Changkeun; Kim, Euikyung

2016-01-01

An enzyme in a nematocyst extract of the Nemopilema nomurai jellyfish, caught off the coast of the Republic of Korea, catalyzed the cleavage of chymotrypsin substrate in an amidolytic kinetic assay, and this activity was inhibited by the serine protease inhibitor, phenylmethanesulfonyl fluoride. We isolated the full-length cDNA sequence of this enzyme, which contains 850 nucleotides, with an open reading frame of 801 nucleotides encoding 266 amino acids. A BLAST analysis of the deduced amino acid sequence showed 41% identity with human chymotrypsin-like (CTRL) and the CTRL-1 precursor. Therefore, we designated this enzyme N. nomurai CTRL-1. The primary structure of N. nomurai CTRL-1 includes a leader peptide and a highly conserved catalytic triad of His69, Asp117, and Ser216. The disulfide bonds of chymotrypsin and the substrate-binding sites are highly conserved compared with the CTRLs of other species, including mammalian species. Nemopilema nomurai CTRL-1 is evolutionarily more closely related to Actinopterygii than to Scyphozoan (Aurelia aurita) or Hydrozoan (Hydra vulgaris). The N. nomurai CTRL1 was amplified from the genomic DNA with PCR using specific primers designed based on the full-length cDNA, and then sequenced. The N. nomurai CTRL1 gene contains 2434 nucleotides and four distinct exons. The 5′ donor splice (GT) and 3′ acceptor splice sequences (AG) are wholly conserved. This is the first report of the CTRL1 gene and cDNA structures in the jellyfish N. nomurai. PMID:27399771
Cloning a Chymotrypsin-Like 1 (CTRL-1) Protease cDNA from the Jellyfish Nemopilema nomurai.

PubMed

Heo, Yunwi; Kwon, Young Chul; Bae, Seong Kyeong; Hwang, Duhyeon; Yang, Hye Ryeon; Choudhary, Indu; Lee, Hyunkyoung; Yum, Seungshic; Shin, Kyoungsoon; Yoon, Won Duk; Kang, Changkeun; Kim, Euikyung

2016-07-05

An enzyme in a nematocyst extract of the Nemopilema nomurai jellyfish, caught off the coast of the Republic of Korea, catalyzed the cleavage of chymotrypsin substrate in an amidolytic kinetic assay, and this activity was inhibited by the serine protease inhibitor, phenylmethanesulfonyl fluoride. We isolated the full-length cDNA sequence of this enzyme, which contains 850 nucleotides, with an open reading frame of 801 nucleotides encoding 266 amino acids. A BLAST analysis of the deduced amino acid sequence showed 41% identity with human chymotrypsin-like (CTRL) and the CTRL-1 precursor. Therefore, we designated this enzyme N. nomurai CTRL-1. The primary structure of N. nomurai CTRL-1 includes a leader peptide and a highly conserved catalytic triad of His(69), Asp(117), and Ser(216). The disulfide bonds of chymotrypsin and the substrate-binding sites are highly conserved compared with the CTRLs of other species, including mammalian species. Nemopilema nomurai CTRL-1 is evolutionarily more closely related to Actinopterygii than to Scyphozoan (Aurelia aurita) or Hydrozoan (Hydra vulgaris). The N. nomurai CTRL1 was amplified from the genomic DNA with PCR using specific primers designed based on the full-length cDNA, and then sequenced. The N. nomurai CTRL1 gene contains 2434 nucleotides and four distinct exons. The 5' donor splice (GT) and 3' acceptor splice sequences (AG) are wholly conserved. This is the first report of the CTRL1 gene and cDNA structures in the jellyfish N. nomurai.
Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

PubMed Central

Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn

2009-01-01

Background: Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to helping distinguish between duplicate genome regions and determine correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results: High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs and polyadenylation signal variation, and identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion: This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS).
This suggests that the remaining cDNA
Gene discovery in Eimeria tenella by immunoscreening cDNA expression libraries of sporozoites and schizonts with chicken intestinal antibodies.

PubMed

Réfega, Susana; Girard-Misguich, Fabienne; Bourdieu, Christiane; Péry, Pierre; Labbé, Marie

2003-04-02

Specific antibodies were produced ex vivo from intestinal culture of Eimeria tenella infected chickens. The specificity of these intestinal antibodies was tested against different parasite stages. These antibodies were used to immunoscreen first generation schizont and sporozoite cDNA libraries permitting the identification of new E. tenella antigens. We obtained a total of 119 cDNA clones which were subjected to sequence analysis. The sequences coding for the proteins inducing local immune responses were compared with nucleotide or protein databases and with expressed sequence tags (ESTs) databases. We identified new Eimeria genes coding for heat shock proteins, a ribosomal protein, a pyruvate kinase and a pyridoxine kinase. Specific features of other sequences are discussed.
Cloning and expression of a cDNA coding for a human monocyte-derived plasminogen activator inhibitor.

PubMed

Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P

1988-02-01

Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with the inflammatory mediators.
Cloning and expression of a cDNA coding for a human monocyte-derived plasminogen activator inhibitor.

PubMed Central

Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P

1988-01-01

Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with the inflammatory mediators. Images PMID:3257578
Cloning and expression of a cDNA coding for catalase from zebrafish (Danio rerio).

PubMed

Ken, C F; Lin, C T; Wu, J L; Shaw, J F

2000-06-01

A full-length complementary DNA (cDNA) clone encoding a catalase was amplified by the rapid amplification of cDNA ends-polymerase chain reaction (RACE-PCR) technique from zebrafish (Danio rerio) mRNA. Nucleotide sequence analysis of this cDNA clone revealed that it comprised a complete open reading frame coding for 526 amino acid residues and that the encoded protein had a molecular mass of 59 654 Da. The deduced amino acid sequence showed high similarity with the sequences of catalase from swine (86.9%), mouse (85.8%), rat (85%), human (83.7%), fruit fly (75.6%), nematode (71.1%), and yeast (58.6%). The amino acid residues for secondary structures are apparently conserved, as they are present in the mammalian species. Furthermore, the coding region of zebrafish catalase was introduced into an expression vector, pET-20b(+), and transformed into Escherichia coli expression host BL21(DE3)pLysS. A 60-kDa active catalase protein was expressed and detected by Coomassie blue staining as well as activity staining following polyacrylamide gel electrophoresis.
Nucleotide sequencing and identification of some wild mushrooms.

PubMed

Das, Sudip Kumar; Mandal, Aninda; Datta, Animesh K; Gupta, Sudha; Paul, Rita; Saha, Aditi; Sengupta, Sonali; Dubey, Priyanka Kumari

2013-01-01

The rDNA-ITS (Ribosomal DNA Internal Transcribed Spacers) fragment of the genomic DNA of 8 wild edible mushrooms (collected from Eastern Chota Nagpur Plateau of West Bengal, India) was amplified using ITS1 (Internal Transcribed Spacers 1) and ITS2 primers and subjected to nucleotide sequence determination for identification of the mushrooms. The sequences were aligned using the ClustalW software program. The aligned sequences revealed identity (homology percentage from the GenBank database) of Amanita hemibapha [CN (Chota Nagpur) 1, % identity 99 (JX844716.1)], Amanita sp. [CN 2, % identity 98 (JX844763.1)], Astraeus hygrometricus [CN 3, % identity 87 (FJ536664.1)], Termitomyces sp. [CN 4, % identity 90 (JF746992.1)], Termitomyces sp. [CN 5, % identity 99 (GU001667.1)], T. microcarpus [CN 6, % identity 82 (EF421077.1)], Termitomyces sp. [CN 7, % identity 76 (JF746993.1)], and Volvariella volvacea [CN 8, % identity 100 (JN086680.1)]. Although only 4 of the 8 mushrooms could be identified to species level, the nucleotide sequences of the rest may be relevant to further characterization. A phylogenetic tree was constructed using the Neighbor-Joining method, showing the interrelationships among the mushrooms. The determined nucleotide sequences of the mushrooms may provide additional information enriching the GenBank database, aiding molecular taxonomy and facilitating domestication and characterization of the mushrooms for human benefit.
Nucleotide sequences specific to Brucella and methods for the detection of Brucella

DOE Office of Scientific and Technical Information (OSTI.GOV)

McCready, Paula M; Radnedge, Lyndsay; Andersen, Gary L

Nucleotide sequences specific to Brucella that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.
Cloning and sequencing of the cDNA species for mammalian dimeric dihydrodiol dehydrogenases.

PubMed Central

Arimitsu, E; Aoki, S; Ishikura, S; Nakanishi, K; Matsuura, K; Hara, A

1999-01-01

Cynomolgus and Japanese monkey kidneys, dog and pig livers and rabbit lens contain dimeric dihydrodiol dehydrogenase (EC 1.3.1.20) associated with high carbonyl reductase activity. Here we have isolated cDNA species for the dimeric enzymes by reverse transcriptase-PCR from human intestine in addition to the above five animal tissues. The amino acid sequences deduced from the monkey, pig and dog cDNA species perfectly matched the partial sequences of peptides digested from the respective enzymes of these animal tissues, and active recombinant proteins were expressed in a bacterial system from the monkey and human cDNA species. Northern blot analysis revealed the existence of a single 1.3 kb mRNA species for the enzyme in these animal tissues. The human enzyme shared 94%, 85%, 84% and 82% amino acid identity with the enzymes of the two monkey strains (their sequences were identical), the dog, the pig and the rabbit respectively. The sequences of the primate enzymes consisted of 335 amino acid residues and lacked one amino acid compared with the other animal enzymes. In contrast with previous reports that other types of dihydrodiol dehydrogenase, carbonyl reductases and enzymes with either activity belong to the aldo-keto reductase family or the short-chain dehydrogenase/reductase family, dimeric dihydrodiol dehydrogenase showed no sequence similarity with the members of the two protein families. The dimeric enzyme aligned with low degrees of identity (14-25%) with several prokaryotic proteins, in which 47 residues are strictly or highly conserved. Thus dimeric dihydrodiol dehydrogenase has a primary structure distinct from the previously known mammalian enzymes and is suggested to constitute a novel protein family with the prokaryotic proteins. PMID:10477285
Cloning and sequence analysis of a full-length cDNA of SmPP1cb encoding turbot protein phosphatase 1 beta catalytic subunit

NASA Astrophysics Data System (ADS)

Qi, Fei; Guo, Huarong; Wang, Jian

2008-02-01

Reversible protein phosphorylation, catalyzed by protein kinases and phosphatases, is an important and versatile mechanism by which eukaryotic cells regulate almost all signaling processes. Protein phosphatase 1 (PP1) is the first and well-characterized member of the protein serine/threonine phosphatase family. In the present study, a full-length cDNA encoding the beta isoform of the catalytic subunit of protein phosphatase 1 (PP1cb) was isolated and sequenced for the first time from the skin tissue of the flatfish turbot Scophthalmus maximus, and designated SmPP1cb, by the rapid amplification of cDNA ends (RACE) technique. The cDNA sequence of SmPP1cb we obtained contains a 984 bp open reading frame (ORF), flanked by a complete 39 bp 5' untranslated region and a 462 bp 3' untranslated region. The ORF encodes a putative 327 amino acid protein, and the N-terminal section of this protein is highly acidic, Met-Ala-Glu-Gly-Glu-Leu-Asp-Val-Asp, a common feature of the PP1 catalytic subunit that is absent in protein phosphatase 2B (PP2B). Its calculated molecular mass is 37 193 Da and its pI is 5.8. Sequence analysis indicated that SmPP1cb is extremely conserved at both the amino acid and nucleotide levels compared with the PP1cb of other vertebrates and invertebrates. Its Kozak motif, contained in the 5'UTR around the ATG start codon, is GXXAXXGXX ATGG, which differs from the mammalian motif at two positions, A-6 and G-3, indicating the possibility of different initiation of translation in turbot. The 3'UTR of SmPP1cb is highly diverse in sequence similarity and length compared with other animals, especially zebrafish. The cloning and sequencing of the SmPP1cb gene lays a good foundation for future work on the biological functions of PP1 in the flatfish turbot.
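The lengths reported for SmPP1cb can be cross-checked with simple arithmetic. The sketch below assumes that the 984 bp ORF figure includes the stop codon, which is the reading under which 984 bp corresponds to 327 residues; the function name is illustrative only.

```python
def protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length in bp."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


# Figures quoted in the abstract (poly(A) tail not counted).
five_utr_bp, orf_bp, three_utr_bp = 39, 984, 462

print(five_utr_bp + orf_bp + three_utr_bp)  # cDNA length without poly(A): 1485
print(protein_length(orf_bp))               # 327 residues
```

Not every abstract in this listing follows the same convention: a 1509-nucleotide ORF reported as encoding 503 residues, for example, only fits if the stop codon is excluded, which is what `includes_stop=False` covers.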
cDNA, genomic sequence cloning and overexpression of ribosomal protein S25 gene (RPS25) from the Giant Panda.

PubMed

Hao, Yan-Zhe; Hou, Wan-Ru; Hou, Yi-Ling; Du, Yu-Jie; Zhang, Tian; Peng, Zheng-Song

2009-11-01

RPS25 is a component of the 40S small ribosomal subunit encoded by the RPS25 gene, which is specific to eukaryotes. Few studies have addressed the RPS25 gene in animals. The Giant Panda (Ailuropoda melanoleuca), known as a "living fossil", is of increasing concern to the world community. Studies on RPS25 of the Giant Panda could provide scientific data for inquiring into the hereditary traits of the gene and formulating a protective strategy for the Giant Panda. The cDNA of RPS25 cloned from the Giant Panda is 436 bp in size, containing an open reading frame of 378 bp encoding 125 amino acids. The genomic sequence is 1,992 bp long and was found to possess four exons and three introns. Alignment analysis indicated that the nucleotide sequence of the coding sequence shows high homology to those of Homo sapiens, Bos taurus, Mus musculus and Rattus norvegicus as determined by Blast analysis: 92.6, 94.4, 89.2 and 91.5%, respectively. Primary structure analysis revealed that the molecular weight of the putative RPS25 protein is 13.7421 kDa with a theoretical pI of 10.12. Topology prediction showed there is one N-glycosylation site, one cAMP- and cGMP-dependent protein kinase phosphorylation site, two protein kinase C phosphorylation sites and one tyrosine kinase phosphorylation site in the RPS25 protein of the Giant Panda. The RPS25 gene was overexpressed in E. coli BL21 and Western blotting of the RPS25 protein was also performed. The results indicated that the RPS25 gene can be expressed in E. coli and that the RPS25 protein fused to an N-terminal His tag gave rise to the accumulation of an expected 17.4 kDa polypeptide. The cDNA and the genomic sequence of RPS25 were cloned successfully for the first time from the Giant Panda using RT-PCR technology and Touchdown-PCR, respectively, and both were sequenced and analyzed preliminarily; then the cDNA of the RPS25 gene was overexpressed in E. coli BL21 and immunoblotted, which is the first
Deletions of fetal and adult muscle cDNA in Duchenne and Becker muscular dystrophy patients.

PubMed Central

Cross, G S; Speer, A; Rosenthal, A; Forrest, S M; Smith, T J; Edwards, Y; Flint, T; Hill, D; Davies, K E

1987-01-01

We have isolated a cDNA molecule from a human adult muscle cDNA library which is deleted in several Duchenne muscular dystrophy patients. Patient deletions have been used to map the exons across the Xp21 region of the short arm of the X chromosome. We demonstrate that a very mildly affected 61 year old patient is deleted for at least nine exons of the adult cDNA. We find no evidence for differential exon usage between adult and fetal muscle in this region of the gene. There must therefore be less essential domains of the protein structure which can be removed without complete loss of function. The sequence of 2.0 kb of the adult cDNA shows no homology to any previously described protein listed in the data banks although sequence comparison at the amino acid level suggests that the protein has a structure not dissimilar to rod structures of cytoskeletal proteins such as lamin and myosin. There are single nucleotide differences in the DNA sequence between the adult and fetal cDNAs which result in amino acid changes but none that would be predicted to change the structure of the protein dramatically. PMID:3428261
Nucleotide sequence and genetic organization of barley stripe mosaic virus RNA gamma.

PubMed

Gustafson, G; Hunter, B; Hanau, R; Armour, S L; Jackson, A O

1987-06-01

The complete nucleotide sequences of RNA gamma from the Type and ND18 strains of barley stripe mosaic virus (BSMV) have been determined. The sequences are 3164 (Type) and 2791 (ND18) nucleotides in length. Both sequences contain a 5'-noncoding region (87 or 88 nucleotides) which is followed by a long open reading frame (ORF1). A 42-nucleotide intercistronic region separates ORF1 from a second, shorter open reading frame (ORF2) located near the 3'-end of the RNA. There is a high degree of homology between the Type and ND18 strains in the nucleotide sequence of ORF1. However, the Type strain contains a 366 nucleotide direct tandem repeat within ORF1 which is absent in the ND18 strain. Consequently, the predicted translation product of Type RNA gamma ORF1 (mol wt 87,312) is significantly larger than that of ND18 RNA gamma ORF1 (mol wt 74,011). The amino acid sequence of the ORF1 polypeptide contains homologies with putative RNA polymerases from other RNA viruses, suggesting that this protein may function in replication of the BSMV genome. The nucleotide sequence of RNA gamma ORF2 is nearly identical in the Type and ND18 strains. ORF2 codes for a polypeptide with a predicted molecular weight of 17,209 (Type) or 17,074 (ND18) which is known to be translated from a subgenomic (sg) RNA. The initiation point of this sgRNA has been mapped to a location 27 nucleotides upstream of the ORF2 initiation codon in the intercistronic region between ORF1 and ORF2. The sgRNA is not coterminal with the 3'-end of the genomic RNA, but instead contains heterogeneous poly(A) termini up to 150 nucleotides long (J. Stanley, R. Hanau, and A. O. Jackson, 1984, Virology 139, 375-383). In the genomic RNA gamma, ORF2 is followed by a short poly(A) tract and a 238-nucleotide tRNA-like structure.
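The numbers reported for the two BSMV strains can be checked against each other. The sketch below uses only figures quoted in the abstract; the interpretation that the whole molecular-weight gap is due to the repeat is an assumption made for illustration, and the variable names are not from the source.

```python
# Figures quoted in the abstract above.
TYPE_LEN_NT, ND18_LEN_NT = 3164, 2791   # RNA gamma lengths
REPEAT_NT = 366                         # tandem repeat present only in Type ORF1
TYPE_MW, ND18_MW = 87_312, 74_011       # predicted ORF1 translation products


def repeat_summary() -> dict:
    """Relate the 366-nt repeat to the size difference of the ORF1 products."""
    if REPEAT_NT % 3 != 0:
        raise ValueError("repeat would shift the reading frame")
    extra_residues = REPEAT_NT // 3        # in-frame repeat adds whole codons
    mw_gap = TYPE_MW - ND18_MW
    length_gap = TYPE_LEN_NT - ND18_LEN_NT
    return {
        "extra_residues": extra_residues,
        "mw_gap": mw_gap,
        "da_per_extra_residue": mw_gap / extra_residues,
        "length_gap_nt": length_gap,
        "nt_not_explained_by_repeat": length_gap - REPEAT_NT,
    }


for key, value in repeat_summary().items():
    print(key, value)
```

Under that assumption the repeat adds 122 residues and the mass gap works out to about 109 Da per added residue, close to the roughly 110 Da commonly used as an average residue mass. The remaining 7 nt of length difference must lie outside the repeat.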
VarDetect: a nucleotide sequence variation exploratory tool

PubMed Central

Ngamphiw, Chumpol; Kulawonganunchai, Supasak; Assawamakin, Anunchai; Jenwitheesuk, Ekachai; Tongsima, Sissades

2008-01-01

Background: Single nucleotide polymorphisms (SNPs) are the most commonly studied units of genetic variation. The discovery of such variation may help to identify causative gene mutations in monogenic diseases and SNPs associated with predisposing genes in complex diseases. Accurate detection of SNPs requires software that can correctly interpret chromatogram signals to nucleotides. Results: We present VarDetect, a stand-alone nucleotide variation exploratory tool that automatically detects nucleotide variation from fluorescence based chromatogram traces. Accurate SNP base-calling is achieved using pre-calculated peak content ratios, and is enhanced by rules which account for common sequence reading artifacts. The proposed software tool is benchmarked against four other well-known SNP discovery software tools (PolyPhred, novoSNP, Genalys and Mutation Surveyor) using fluorescence based chromatograms from 15 human genes. These chromatograms were obtained from sequencing 16 two-pooled DNA samples; a total of 32 individual DNA samples. In this comparison of automatic SNP detection tools, VarDetect achieved the highest detection efficiency. Availability: VarDetect is compatible with most major operating systems such as Microsoft Windows, Linux, and Mac OSX. The current version of VarDetect is freely available at . PMID:19091032
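To illustrate the general idea of calling a heterozygous position from chromatogram peak heights, here is a minimal sketch. It is not VarDetect's algorithm: the single-ratio rule, the `het_ratio` threshold of 0.3, and the function name are all assumptions chosen for illustration. Only the IUPAC two-base ambiguity codes are standard.

```python
# Standard IUPAC codes for two-base ambiguities.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("GC"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def call_base(peaks: dict, het_ratio: float = 0.3) -> str:
    """Call one position from per-base peak heights (hypothetical rule).

    peaks maps at least two of 'A', 'C', 'G', 'T' to a peak height.
    A position is called heterozygous when the second-highest peak reaches
    het_ratio of the highest one.
    """
    ranked = sorted(peaks.items(), key=lambda item: item[1], reverse=True)
    (top_base, top_height), (second_base, second_height) = ranked[0], ranked[1]
    if top_height <= 0:
        return "N"  # no usable signal
    if second_height / top_height >= het_ratio:
        return IUPAC[frozenset((top_base, second_base))]
    return top_base


print(call_base({"A": 1000, "C": 20, "G": 30, "T": 10}))  # clean single peak
print(call_base({"A": 900, "C": 10, "G": 700, "T": 5}))   # two comparable peaks
```

A fixed threshold like this is sensitive to dye effects and pooled samples, which is presumably why tools in this area rely on calibrated ratios and artifact rules rather than one cutoff.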
LEDGF/p75 Deficiency Increases Deletions at the HIV-1 cDNA Ends.

PubMed

Bueno, Murilo T D; Reyes, Daniel; Llano, Manuel

2017-09-15

Processing of unintegrated linear HIV-1 cDNA by the host DNA repair system results in its degradation and/or circularization. As a consequence, deficient viral cDNA integration generally leads to an increase in the levels of HIV-1 cDNA circles containing one or two long terminal repeats (LTRs). Intriguingly, impaired HIV-1 integration in LEDGF/p75-deficient cells does not result in a corresponding increase in viral cDNA circles. We postulate that increased degradation of unintegrated linear viral cDNA in cells lacking the lens epithelium-derived growth factor (LEDGF/p75) accounts for this inconsistency. To evaluate this hypothesis, we characterized the nucleotide sequence spanning 2-LTR junctions isolated from LEDGF/p75-deficient and control cells. LEDGF/p75 deficiency resulted in a significant increase in the frequency of 2-LTRs harboring large deletions. Of note, these deletions were dependent on the 3' processing activity of integrase and did not originate from aberrant reverse transcription. Our findings suggest a novel role of LEDGF/p75 in protecting the unintegrated 3' processed linear HIV-1 cDNA from exonucleolytic degradation.
The nucleotide sequence and genome organization of Plasmopara halstedii virus.

PubMed

Heller-Dohmen, Marion; Göpfert, Jens C; Pfannstiel, Jens; Spring, Otmar

2011-03-17

Only very few viruses of Oomycetes have been studied in detail. Isometric virions were found in different isolates of the oomycete Plasmopara halstedii, the downy mildew pathogen of sunflower. However, complete nucleotide sequences and data on the genome organization were lacking. Viral RNA of different P. halstedii isolates was subjected to nucleotide sequencing and analysis of the viral genome. The N-terminal sequence of the viral coat protein was determined using Top-Down MALDI-TOF analysis. The complete nucleotide sequences of both single-stranded RNA segments (RNA1 and RNA2) were established. RNA1 consisted of 2793 nucleotides (nt) exclusive of its 3' poly(A) tract and a single open-reading frame (ORF1) of 2745 nt. ORF1 was framed by a 5' untranslated region (5' UTR) of 18 nt and a 3' untranslated region (3' UTR) of 30 nt. ORF1 contained motifs of RNA-dependent RNA polymerases (RdRp) and showed similarities to RdRp of Sclerophthora macrospora virus A (SmV A) and viruses within the Nodaviridae family. RNA2 consisted of 1526 nt exclusive of its 3' poly(A) tract and a second ORF (ORF2) of 1128 nt. ORF2 coded for the single viral coat protein (CP) and was framed by a 5' UTR of 164 nt and a 3' UTR of 234 nt. The deduced amino acid sequence of ORF2 was verified by nano-LC-ESI-MS/MS experiments. Top-Down MALDI-TOF analysis revealed the N-terminal sequence of the CP. The N-terminal sequence represented a region within ORF2, suggesting proteolytic processing of the CP in vivo. The CP showed similarities to CP of SmV A and viruses within the Tombusviridae family. Fragments of RNA1 (ca. 1.9 kb) and RNA2 (ca. 1.4 kb) were used to analyze the nucleotide sequence variation of virions in different P. halstedii isolates. Viral sequence variation was 0.3% or less regardless of their host's pathotypes, the geographical origin and the sensitivity towards the fungicide metalaxyl. The results showed the presence of a single and new virus type in different P. halstedii isolates
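The segment layouts reported above are internally consistent, which a short check makes explicit. The sketch below uses only the lengths given in the abstract; the `Segment` class and its field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    total_nt: int      # length exclusive of the 3' poly(A) tract
    utr5_nt: int
    orf_nt: int
    utr3_nt: int

    def is_consistent(self) -> bool:
        """True if 5' UTR + ORF + 3' UTR add up to the reported total."""
        return self.utr5_nt + self.orf_nt + self.utr3_nt == self.total_nt

    def codons(self) -> int:
        """Number of codons in the ORF; the ORF must be a multiple of 3."""
        if self.orf_nt % 3 != 0:
            raise ValueError(f"{self.name}: ORF length is not a multiple of 3")
        return self.orf_nt // 3


segments = [
    Segment("RNA1", total_nt=2793, utr5_nt=18, orf_nt=2745, utr3_nt=30),
    Segment("RNA2", total_nt=1526, utr5_nt=164, orf_nt=1128, utr3_nt=234),
]

for seg in segments:
    print(seg.name, seg.is_consistent(), seg.codons())
```

Both segments add up exactly, with 915 and 376 codons respectively. Whether those codon counts include the stop codon is not stated in the abstract.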
Purification, characterization, and cDNA cloning of a novel acidic endoglycoceramidase from the jellyfish, Cyanea nozakii.

PubMed

Horibata, Y; Okino, N; Ichinose, S; Omori, A; Ito, M

2000-10-06

Endoglycoceramidase (EC ) is an enzyme capable of cleaving the glycosidic linkage between oligosaccharides and ceramides in various glycosphingolipids. We report here the purification, characterization, and cDNA cloning of a novel endoglycoceramidase from the jellyfish, Cyanea nozakii. The purified enzyme showed a single protein band estimated to be 51 kDa on SDS-polyacrylamide gel electrophoresis. The enzyme showed a pH optimum of 3.0 and was activated by Triton X-100 and Lubrol PX but not by sodium taurodeoxycholate. This enzyme preferentially hydrolyzed gangliosides, especially GT1b and GQ1b, whereas neutral glycosphingolipids were somewhat resistant to hydrolysis by the enzyme. A full-length cDNA encoding the enzyme was cloned by 5'- and 3'-rapid amplification of cDNA ends using a partial amino acid sequence of the purified enzyme. The open reading frame of 1509 nucleotides encoded a polypeptide of 503 amino acids including a signal sequence of 25 residues and six potential N-glycosylation sites. Interestingly, the Asn-Glu-Pro sequence, which is the putative active site of Rhodococcus endoglycoceramidase, was conserved in the deduced amino acid sequences. This is the first report of the cloning of an endoglycoceramidase from a eukaryote.

37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...
Cloning and sequence analysis of a cDNA clone coding for the mouse GM2 activator protein.

PubMed Central

Bellachioma, G; Stirling, J L; Orlacchio, A; Beccari, T

1993-01-01

A cDNA (1.1 kb) containing the complete coding sequence for the mouse GM2 activator protein was isolated from a mouse macrophage library using a cDNA for the human protein as a probe. There was a single ATG located 12 bp from the 5' end of the cDNA clone followed by an open reading frame of 579 bp. Northern blot analysis of mouse macrophage RNA showed that there was a single band with a mobility corresponding to a size of 2.3 kb. We deduce from this that the mouse mRNA, in common with the mRNA for the human GM2 activator protein, has a long 3' untranslated sequence of approx. 1.7 kb. Alignment of the mouse and human deduced amino acid sequences showed 68% identity overall and 75% identity for the sequence on the C-terminal side of the first 31 residues, which in the human GM2 activator protein contains the signal peptide. Hydropathicity plots showed great similarity between the mouse and human sequences even in regions of low sequence similarity. There is a single N-glycosylation site in the mouse GM2 activator protein sequence (Asn151-Phe-Thr) which differs in its location from the single site reported in the human GM2 activator protein sequence (Asn63-Val-Thr). PMID:7689829
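Potential N-glycosylation sites such as Asn-Phe-Thr and Asn-Val-Thr are usually located by scanning a protein sequence for the Asn-X-Ser/Thr sequon, where X is any residue except proline. The sketch below is one common way to do that scan, not the authors' procedure, and the peptide used is a hypothetical toy sequence rather than the GM2 activator protein.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr. The lookahead lets
# overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str) -> list:
    """Return (1-based position of Asn, tripeptide) for each potential site."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


# Hypothetical peptide: NFT and NVT match, NPS is excluded because X is Pro.
toy_peptide = "MNFTAANPSKNVT"
print(find_sequons(toy_peptide))
```

A sequon match only marks a candidate site; whether the asparagine is actually glycosylated has to be established experimentally.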
A rapid and cost-effective method for sequencing pooled cDNA clones by using a combination of transposon insertion and Gateway technology.

PubMed

Morozumi, Takeya; Toki, Daisuke; Eguchi-Ogawa, Tomoko; Uenishi, Hirohide

2011-09-01

Large-scale cDNA-sequencing projects require an efficient strategy for mass sequencing. Here we describe a method for sequencing pooled cDNA clones using a combination of transposon insertion and Gateway technology. Our method reduces the number of shotgun clones that are unsuitable for reconstruction of cDNA sequences, and has the advantage of reducing the total costs of the sequencing project.
Vacuolar H+-ATPase 69-kilodalton catalytic subunit cDNA from developing cotton (Gossypium hirsutum) ovules

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wilkins, T.A.

1993-06-01

This study investigates the molecular events of vacuole ontogeny in rapidly elongated cotton plant cells. Within the DNA coding region, the cotton and carrot cDNA clones exhibit 82.2% nucleotide sequence homology; at the amino acid level cotton and carrot catalytic subunits exhibited 95.7% identity and 2.1% amino acid similarity. When aligned with the analogous sequences from yeast, the cotton protein shared only 60.5% amino acid identity and 12.7% similarity. 10 refs., 1 tab.
Quantum Point Contact Single-Nucleotide Conductance for DNA and RNA Sequence Identification.

PubMed

Afsari, Sepideh; Korshoj, Lee E; Abel, Gary R; Khan, Sajida; Chatterjee, Anushree; Nagpal, Prashant

2017-11-28

Several nanoscale electronic methods have been proposed for high-throughput single-molecule nucleic acid sequence identification. While many studies display a large ensemble of measurements as "electronic fingerprints" with some promise for distinguishing the DNA and RNA nucleobases (adenine, guanine, cytosine, thymine, and uracil), important metrics such as accuracy and confidence of base calling fall well below the current genomic methods. Issues such as unreliable metal-molecule junction formation, variation of nucleotide conformations, insufficient differences between the molecular orbitals responsible for single-nucleotide conduction, and lack of rigorous base calling algorithms lead to overlapping nanoelectronic measurements and poor nucleotide discrimination, especially at low coverage on single molecules. Here, we demonstrate a technique for reproducible conductance measurements on conformation-constrained single nucleotides and an advanced algorithmic approach for distinguishing the nucleobases. Our quantum point contact single-nucleotide conductance sequencing (QPICS) method uses combed and electrostatically bound single DNA and RNA nucleotides on a self-assembled monolayer of cysteamine molecules. We demonstrate that by varying the applied bias and pH conditions, molecular conductance can be switched ON and OFF, leading to reversible nucleotide perturbation for electronic recognition (NPER). We utilize NPER as a method to achieve >99.7% accuracy for DNA and RNA base calling at low molecular coverage (∼12×) using unbiased single measurements on DNA/RNA nucleotides, which represents a significant advance compared to existing sequencing methods. These results demonstrate the potential for utilizing simple surface modifications and existing biochemical moieties in individual nucleobases for a reliable, direct, single-molecule, nanoelectronic DNA and RNA nucleotide identification method for sequencing.
Partial nucleotide sequences, and routine typing by polymerase chain reaction-restriction fragment length polymorphism, of the brown trout (Salmo trutta) lactate dehydrogenase, LDH-C1*90 and *100 alleles.

PubMed

McMeel, O M; Hoey, E M; Ferguson, A

2001-01-01

The cDNA nucleotide sequences of the lactate dehydrogenase alleles LDH-C1*90 and *100 of brown trout (Salmo trutta) were found to differ at position 308 where an A is present in the *100 allele but a G is present in the *90 allele. This base substitution results in an amino acid change from aspartic acid at position 82 in the LDH-C1 100 allozyme to a glycine in the 90 allozyme. Since aspartic acid has a net negative charge whilst glycine is uncharged, this is consistent with the electrophoretic observation that the LDH-C1 100 allozyme has a more anodal mobility relative to the LDH-C1 90 allozyme. Based on alignment of the cDNA sequence with the mouse genomic sequence, a local primer set was designed, incorporating the variable position, and was found to give very good amplification with brown trout genomic DNA. Sequencing of this fragment confirmed the difference in both homozygous and heterozygous individuals. Digestion of the polymerase chain reaction products with BslI, a restriction enzyme specific for the site difference, gave one, two and three fragments for the two homozygotes and the heterozygote, respectively, following electrophoretic separation. This provides a DNA-based means of routine screening of the highly informative LDH-C1* polymorphism in brown trout population genetic studies. Primer sets presented could be used to sequence cDNA of other LDH* genes of brown trout and other species.
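Two points in the abstract above lend themselves to a small worked sketch: a single A-to-G change in the second codon position converts an aspartic acid codon into a glycine codon, and a restriction site that exists in only one allele yields one, two, or three bands depending on genotype. Everything concrete below is an assumption for illustration: the two toy allele sequences are invented, which allele carries the site is arbitrary, and the BslI recognition pattern (taken here as CCNNNNNNNGG, cut after the seventh base) should be checked against the enzyme supplier's documentation.

```python
import re

# Only the codons needed for this example (standard genetic code).
CODON_TO_AA = {"GAT": "Asp", "GAC": "Asp", "GGT": "Gly", "GGC": "Gly"}


def substitute(codon: str, index: int, base: str) -> str:
    """Return the codon with the base at 0-based index replaced."""
    return codon[:index] + base + codon[index + 1:]


def digest(seq: str, site_regex: str, cut_offset: int) -> list:
    """Cut seq at cut_offset bases into every match of site_regex."""
    cuts = sorted({m.start() + cut_offset
                   for m in re.finditer(f"(?=(?:{site_regex}))", seq)})
    bounds = [0, *cuts, len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def band_sizes(alleles: tuple, site_regex: str, cut_offset: int) -> list:
    """Distinct fragment lengths expected on a gel for a diploid genotype."""
    return sorted({len(frag) for allele in alleles
                   for frag in digest(allele, site_regex, cut_offset)},
                  reverse=True)


# A -> G at the second codon position turns Asp into Gly.
for codon in ("GAT", "GAC"):
    mutant = substitute(codon, 1, "G")
    print(codon, CODON_TO_AA[codon], "->", mutant, CODON_TO_AA[mutant])

SITE, OFFSET = "CC.{7}GG", 7          # assumed BslI pattern and cut position
CUT = "AAAACCATGCATGGGTTTT"           # toy allele carrying the site
UNCUT = "AAAACCATGCATGGATTTT"         # toy allele differing by one G -> A

print(band_sizes((UNCUT, UNCUT), SITE, OFFSET))  # one band
print(band_sizes((CUT, CUT), SITE, OFFSET))      # two bands
print(band_sizes((CUT, UNCUT), SITE, OFFSET))    # three bands
```

Counting distinct lengths mirrors what is read off a gel, where fragments of equal size from the two alleles co-migrate as one band.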
Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing.

PubMed

Hargreaves, Adam D; Mulley, John F

2015-01-01

Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0-2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5' and 3' UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species.
Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing

PubMed Central

Hargreaves, Adam D.

2015-01-01

Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0–2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5′ and 3′ UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species. PMID:26623194
Structure and characterization of a cDNA clone for phenylalanine ammonia-lyase from cut-injured roots of sweet potato

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tanaka, Yoshiyuki; Matsuoka, Makoto; Yamanoto, Naoki

A cDNA clone for phenylalanine ammonia-lyase (PAL) induced in wounded sweet potato (Ipomoea batatas Lam.) root was obtained by immunoscreening a cDNA library. The protein produced in Escherichia coli cells containing the plasmid pPAL02 was indistinguishable from sweet potato PAL as judged by Ouchterlony double diffusion assays. The Mr of its subunit was 77,000. The cells converted (14C)-L-phenylalanine into (14C)-t-cinnamic acid and PAL activity was detected in the homogenate of the cells. The activity was dependent on the presence of the pPAL02 plasmid DNA. The nucleotide sequence of the cDNA contained a 2,121-base pair (bp) open-reading frame capable of coding for a polypeptide with 707 amino acids (Mr 77,137), a 22-bp 5'-noncoding region and a 207-bp 3'-noncoding region. The results suggest that the insert DNA fully encoded the amino acid sequence for sweet potato PAL that is induced by wounding. Comparison of the deduced amino acid sequence with that of a PAL cDNA fragment from Phaseolus vulgaris revealed 78.9% homology. The sequence from amino acid residues 258 to 494 was highly conserved, showing 90.7% homology.
Illumina sequencing of green stink bug nymph and adult cDNA to identify potential RNAi gene targets

USDA-ARS's Scientific Manuscript database

Whole-body transcriptomes for nymphs and adults of the green stink bug, Acrosternum hilare (Say), were sequenced on an Illumina® Genome Analyzer IIx sequencer. The insects were collected from sites in North Carolina and Virginia, USA. The cDNA library for each sample was sequenced on one lane of an...
Interactive computer programs for the graphic analysis of nucleotide sequence data.

PubMed Central

Luckow, V A; Littlewood, R K; Rownd, R H

1984-01-01

A group of interactive computer programs have been developed which aid in the collection and graphical analysis of nucleotide and protein sequence data. The programs perform the following basic functions: a) enter, edit, list, and rearrange sequence data; b) permit automatic entry of nucleotide sequence data directly from an autoradiograph into the computer; c) search for restriction sites or other specified patterns and plot a linear or circular restriction map, or print their locations; d) plot base composition; e) analyze homology between sequences by plotting a two-dimensional graphic matrix; and f) aid in plotting predicted secondary structures of RNA molecules. PMID:6546437
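Three of the functions listed above (pattern search, base composition, and the two-dimensional homology matrix) are simple enough to sketch in a few lines. This is a modern illustration of the ideas, not a reconstruction of the 1984 programs; the function names and the exact-word matching rule in the dot matrix are assumptions.

```python
from collections import Counter


def find_sites(seq: str, site: str) -> list:
    """0-based start positions of every occurrence of site, overlaps included."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def base_composition(seq: str) -> dict:
    """Fraction of each of A, C, G, T in seq."""
    counts = Counter(seq.upper())
    return {base: counts[base] / len(seq) for base in "ACGT"}


def dot_matrix(a: str, b: str, window: int = 3) -> list:
    """Rows index a, columns index b; True where the window-length words match."""
    return [[a[i:i + window] == b[j:j + window]
             for j in range(len(b) - window + 1)]
            for i in range(len(a) - window + 1)]


seq = "GAATTCAAGAATTC"
print(find_sites(seq, "GAATTC"))      # EcoRI recognition sequence
print(base_composition(seq))
for row in dot_matrix("ACGT", "ACGT", window=2):
    print("".join("x" if hit else "." for hit in row))
```

In the dot matrix, a run of hits along a diagonal marks a stretch that the two sequences share; plotting a sequence against itself gives the main diagonal plus any internal repeats.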
WEB-server for search of a periodicity in amino acid and nucleotide sequences

NASA Astrophysics Data System (ADS)

Frenkel, F. E.; Skryabin, K. G.; Korotkov, E. V.

2017-12-01

A new web server (http://victoria.biengi.ac.ru/splinter/login.php) was designed and developed to search for periodicity in nucleotide and amino acid sequences. The web server operation is based upon a new mathematical method of searching for multiple alignments, which is founded on the optimization of position weight matrices, as well as on an implementation of two-dimensional dynamic programming. This approach allows the construction of multiple alignments of indistinctly similar amino acid and nucleotide sequences that have accumulated more than 1.5 substitutions per amino acid or nucleotide, without performing pairwise sequence comparisons. The article examines the principles of the web server operation and two examples of studying amino acid and nucleotide sequences, as well as information that could be obtained using the web server.
Acetylcholinesterase of the Sand Fly, Phlebotomus papatasi (Scopoli): cDNA Sequence, Baculovirus Expression, and Biochemical Properties

DTIC Science & Technology

2013-01-01

identity to acetylcholinesterase mRNA sequences of Culex tritaeniorhynchus and Lutzomyia longipalpis, respectively. The P. papatasi cDNA ORF encoded a...tritaeniorhynchus and Lutzomyia longipalpis, respectively. The P. papatasi cDNA ORF encoded a 710-amino acid protein [GenBank: AFP20868] exhibiting 85...improve effectiveness of pesticide application for control of the new world sand fly Lutzomyia longipalpis in chicken sheds [13]. Attempts to control
A general method to eliminate laboratory induced recombinants during massive, parallel sequencing of cDNA library.

PubMed

Waugh, Caryll; Cromer, Deborah; Grimm, Andrew; Chopra, Abha; Mallal, Simon; Davenport, Miles; Mak, Johnson

2015-04-09

Massive, parallel sequencing is a potent tool for dissecting the regulation of biological processes by revealing the dynamics of the cellular RNA profile under different conditions. Similarly, massive, parallel sequencing can be used to reveal the complexity of viral quasispecies that are often found in the RNA virus infected host. However, the production of cDNA libraries for next-generation sequencing (NGS) necessitates the reverse transcription of RNA into cDNA and the amplification of the cDNA template using PCR, which may introduce artefacts in the form of phantom nucleic acid species that can bias the composition and interpretation of original RNA profiles. Using HIV as a model we have characterised the major sources of error during the conversion of viral RNA to cDNA, namely excess RNA template and the RNaseH activity of the polymerase enzyme, reverse transcriptase. In addition we have analysed the effect of PCR cycle number on detection of recombinants and assessed the contribution of transfection of highly similar plasmid DNA to the formation of recombinant species during the production of our control viruses. We have identified RNA template concentrations, RNaseH activity of reverse transcriptase, and PCR conditions as key parameters that must be carefully optimised to minimise chimeric artefacts. Using our optimised RT-PCR conditions, in combination with our modified PCR amplification procedure, we have developed a reliable technique for accurate determination of RNA species using NGS technology.
1,4-Benzoquinone reductase from Phanerochaete chrysosporium: cDNA cloning and regulation of expression

DOE Office of Scientific and Technical Information (OSTI.GOV)

Akileswaran, L.; Brock, B.J.; Cereghino, J.L.

1999-02-01

A cDNA clone encoding a quinone reductase (QR) from the white rot basidiomycete Phanerochaete chrysosporium was isolated and sequenced. The cDNA consisted of 1,007 nucleotides and a poly(A) tail and encoded a deduced protein containing 271 amino acids. The experimentally determined eight-amino-acid N-terminal sequence of the purified QR protein from P. chrysosporium matched amino acids 72 to 79 of the predicted translation product of the cDNA. The Mr of the predicted translation product, beginning with Pro-72, was essentially identical to the experimentally determined Mr of one monomer of the QR dimer, and this finding suggested that QR is synthesized as a proenzyme. The results of in vitro transcription-translation experiments suggested that QR is synthesized as a proenzyme with a 71-amino-acid leader sequence. This leader sequence contains two potential KEX2 cleavage sites and numerous potential cleavage sites for dipeptidyl aminopeptidase. The QR activity in cultures of P. chrysosporium increased following the addition of 2-dimethoxybenzoquinone, vanillic acid, or several other aromatic compounds. An immunoblot analysis indicated that induction resulted in an increase in the amount of QR protein, and a Northern blot analysis indicated that this regulation occurs at the level of the qr mRNA.
Nucleotide sequences specific to Yersinia pestis and methods for the detection of Yersinia pestis

DOEpatents

McCready, Paula M [Tracy, CA; Radnedge, Lyndsay [San Mateo, CA; Andersen, Gary L [Berkeley, CA; Ott, Linda L [Livermore, CA; Slezak, Thomas R [Livermore, CA; Kuczmarski, Thomas A [Livermore, CA; Motin, Vladinir L [League City, TX

2009-02-24

Nucleotide sequences specific to Yersinia pestis that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.
Information capacity of nucleotide sequences and its applications.

PubMed

Sadovsky, M G

2006-05-01

The information capacity of nucleotide sequences is defined through the specific entropy of the frequency dictionary of a sequence, determined with respect to another dictionary containing the most probable continuations of shorter strings. This measure distinguishes a sequence both from a random one and from an ordered entity. A comparison of sequences based on their information capacity is studied. An order within the genetic entities is found at length scales ranging from 3 to 8. Some other applications of the developed methodology to genetics, bioinformatics, and molecular biology are discussed.
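As a rough illustration of what a frequency dictionary is, the sketch below counts the overlapping strings of length k in a sequence and computes the ordinary Shannon entropy of that distribution. It is a simplified stand-in and does not reproduce the specific-entropy measure described in the record above; the function names `frequency_dictionary` and `shannon_entropy` are hypothetical.

```python
from collections import Counter
from math import log2


def frequency_dictionary(seq, k):
    """Relative frequencies of all overlapping k-length strings in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def shannon_entropy(freqs):
    """Shannon entropy, in bits, of a frequency dictionary."""
    return -sum(p * log2(p) for p in freqs.values() if p > 0)


# Toy input (invented for illustration): four equally frequent letters.
print(shannon_entropy(frequency_dictionary("ACGT", 1)))
```

Computing the dictionaries for k = 3 through 8 would cover the length scales mentioned in the record, under the same simplifying assumptions.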
Attomole-level Genomics with Single-molecule Direct DNA, cDNA and RNA Sequencing Technologies.

PubMed

Ozsolak, Fatih

2016-01-01

With the introduction of next-generation sequencing (NGS) technologies in 2005, the domination of microarrays in genomics quickly came to an end due to NGS's superior technical performance and cost advantages. By enabling genetic analysis capabilities that were not possible previously, NGS technologies have started to play an integral role in all areas of biomedical research. This chapter outlines the low-quantity DNA and cDNA sequencing capabilities and applications developed with the Helicos single molecule DNA sequencing technology.
Human Hrs, a tyrosine kinase substrate in growth factor-stimulated cells: cDNA cloning and mapping of the gene to chromosome 17.

PubMed

Lu, L; Komada, M; Kitamura, N

1998-06-15

Hrs is a 115kDa zinc finger protein which is rapidly tyrosine phosphorylated in cells stimulated with various growth factors. We previously purified the protein from a mouse cell line and cloned its cDNA. In the present study, we cloned a human Hrs cDNA from a human placenta cDNA library by cross-hybridization, using the mouse cDNA as a probe, and determined its nucleotide sequence. The human Hrs cDNA encoded a 777-amino-acid protein whose sequence was 93% identical to that of mouse Hrs. Northern blot analysis showed that the Hrs mRNA was about 3.0kb long and was expressed in all the human adult and fetal tissues tested. In addition, we showed by genomic Southern blot analysis that the human Hrs gene was a single-copy gene with a size of about 20kb. Furthermore, the human Hrs gene was mapped to chromosome 17 by Southern blotting of genomic DNAs from human/rodent somatic cell hybrids.

Hibiscus latent Fort Pierce virus in Brazil and synthesis of its biologically active full-length cDNA clone.

PubMed

Gao, Ruimin; Niu, Shengniao; Dai, Weifang; Kitajima, Elliot; Wong, Sek-Man

2016-10-01

A Brazilian isolate of Hibiscus latent Fort Pierce virus (HLFPV-BR) was first found in a hibiscus plant in Limeira, SP, Brazil. RACE PCR was carried out to obtain the full-length sequence of HLFPV-BR, which is 6453 nucleotides long and shares more than 99.15 % complete genomic RNA nucleotide sequence identity with the HLFPV Japanese isolate. The genomic structure of HLFPV-BR is similar to that of other tobamoviruses. It includes a 5' untranslated region (UTR), followed by open reading frames encoding a 128-kDa protein and a 188-kDa readthrough protein, a 38-kDa movement protein, an 18-kDa coat protein, and a 3' UTR. Interestingly, the unique feature of a poly(A) tract is also found within its 3' UTR. Furthermore, from the total RNA extracted from the local lesions of HLFPV-BR-infected Chenopodium quinoa leaves, a biologically active, full-length cDNA clone encompassing the genome of HLFPV-BR was amplified and placed adjacent to a T7 RNA polymerase promoter. The capped in vitro transcripts from the cloned cDNA were infectious when mechanically inoculated into C. quinoa and Nicotiana benthamiana plants. This is the first report of the presence of an isolate of HLFPV in Brazil and the successful synthesis of a biologically active HLFPV-BR full-length cDNA clone.
Nucleotide sequences of Japanese isolates of citrus vein enation virus.

PubMed

Nakazono-Nagaoka, Eiko; Fujikawa, Takashi; Iwanami, Toru

2017-03-01

The genomic sequences of five Japanese isolates of citrus vein enation virus (CVEV) that induce vein enation were determined and compared with that of the Spanish isolate VE-1. The nucleotide sequences of all Japanese isolates were 5,983 nt in length. The genomic RNA of the Japanese isolates had five potential open reading frames (ORF 0, ORF 1, ORF 2, ORF 3, and ORF 5) in the positive-sense strand. The nucleotide sequence identity among the Japanese isolates and Spanish isolate VE-1 ranged from 98.0% to 99.8%. Comparison of the partial amino acid sequences of ten Japanese isolates and three Spanish isolates suggested that four amino acid residues, at positions 83, 104, and 113 in ORF 2 and position 41 in ORF 5, might be unique to some Japanese isolates.
Characterization of a cDNA encoding a protein involved in formation of the skeleton during development of the sea urchin Lytechinus pictus.

PubMed

Livingston, B T; Shaw, R; Bailey, A; Wilt, F

1991-12-01

In order to investigate the role of proteins in the formation of mineralized tissues during development, we have isolated a cDNA that encodes a protein that is a component of the organic matrix of the skeletal spicule of the sea urchin, Lytechinus pictus. The expression of the RNA encoding this protein is regulated over development and is localized to the descendants of the micromere lineage. Comparison of the sequence of this cDNA to homologous cDNAs from other species of urchin reveals that the protein is basic and contains three conserved structural motifs: a signal peptide, a proline-rich region, and an unusual region composed of a series of direct repeats. Studies on the protein encoded by this cDNA confirm the predicted reading frame deduced from the nucleotide sequence and show that the protein is secreted and not glycosylated. Comparison of the amino acid sequence to databases reveals that the repeat domain is similar to proteins that form a unique beta-spiral supersecondary structure.
Partial DNA sequencing of Douglas-fir cDNAs used in RFLP mapping

Treesearch

K.D. Jermstad; D.L. Bassoni; C.S. Kinlaw; D.B. Neale

1998-01-01

DNA sequences from 87 Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco) cDNA RFLP probes were determined. Sequences were submitted to the GenBank dbEST database and searched for similarity against nucleotide and protein databases using the BLASTn and BLASTx programs. Twenty-one sequences (24%) were assigned putative functions; 18 of which...
Complete nucleotide sequence of a monopartite Begomovirus and associated satellites infecting Carica papaya in Nepal.

PubMed

Shahid, M S; Yoshida, S; Khatri-Chhetri, G B; Briddon, R W; Natsuaki, K T

2013-06-01

Carica papaya (papaya) is a fruit crop that is cultivated mostly in kitchen gardens throughout Nepal. Leaf samples of C. papaya plants with leaf curling, vein darkening, vein thickening, and a reduction in leaf size were collected from a garden in Darai village, Rampur, Nepal in 2010. Full-length clones of a monopartite Begomovirus, a betasatellite and an alphasatellite were isolated. The complete nucleotide sequence of the Begomovirus showed the arrangement of genes typical of Old World begomoviruses with the highest nucleotide sequence identity (>99 %) to an isolate of Ageratum yellow vein virus (AYVV), confirming it as an isolate of AYVV. The complete nucleotide sequence of the betasatellite showed greater than 89 % nucleotide sequence identity to an isolate of Tomato leaf curl Java betasatellite originating from Indonesia. The sequence of the alphasatellite displayed 92 % nucleotide sequence identity to Sida yellow vein China alphasatellite. This is the first identification of these components in Nepal and the first time they have been identified in papaya.
Nucleotide sequences specific to Francisella tularensis and methods for the detection of Francisella tularensis

DOEpatents

McCready, Paula M [Tracy, CA; Radnedge, Lyndsay [San Mateo, CA; Andersen, Gary L [Berkeley, CA; Ott, Linda L [Livermore, CA; Slezak, Thomas R [Livermore, CA; Kuczmarski, Thomas A [Livermore, CA; Vitalis, Elizabeth A [Livermore, CA

2007-02-06

Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serve as a marker or signature for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.
Nucleotide sequences specific to Francisella tularensis and methods for the detection of Francisella tularensis

DOEpatents

McCready, Paula M [Tracy, CA; Radnedge, Lyndsay [San Mateo, CA; Andersen, Gary L [Berkeley, CA; Ott, Linda L [Livermore, CA; Slezak, Thomas R [Livermore, CA; Kuczmarski, Thomas A [Livermore, CA; Vitalis, Elizabeth A [Livermore, CA

2009-02-24

Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serve as a marker or signature for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.
Characterization of cDNA for human tripeptidyl peptidase II: The N-terminal part of the enzyme is similar to subtilisin

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tomkinson, B.; Jonsson, A-K

1991-01-01

Tripeptidyl peptidase II is a high molecular weight serine exopeptidase, which has been purified from rat liver and human erythrocytes. Four clones, representing 4453 bp, or 90% of the mRNA of the human enzyme, have been isolated from two different cDNA libraries. One clone, designated A2, was obtained after screening a human B-lymphocyte cDNA library with a degenerate oligonucleotide mixture. The B-lymphocyte cDNA library and a second library, obtained from human fibroblasts, were rescreened with a 147 bp fragment from the 5' part of the A2 clone, whereby three different overlapping cDNA clones could be isolated. The deduced amino acid sequence, 1196 amino acid residues, corresponding to the longest open reading frame of the assembled nucleotide sequence, was compared to sequences of current databases. This revealed a 56% similarity between the bacterial enzyme subtilisin and the N-terminal part of tripeptidyl peptidase II. The enzyme was found to be represented by two different mRNAs of 4.2 and 5.0 kilobases, respectively, which probably result from the utilization of two different polyadenylation sites. Furthermore, cDNA corresponding to both the N-terminal and C-terminal part of tripeptidyl peptidase II hybridized with genomic DNA from mouse, horse, calf, and hen, even under fairly high stringency conditions, indicating that tripeptidyl peptidase II is highly conserved.
Sequence evaluation of four specific cDNA libraries for developmental genomics of sunflower.

PubMed

Tamborindeguy, C; Ben, C; Liboz, T; Gentzbittel, L

2004-04-01

Four different cDNA libraries were constructed from sunflower protoplasts growing under embryogenic and non-embryogenic conditions: one standard library from each condition and two subtractive libraries in opposite sense. A total of 22,876 cDNA clones were obtained and 4800 ESTs were sequenced, giving rise to 2479 high quality ESTs representing a unigene set of 1502 sequences. This set was compared with ESTs represented in public databases using the programs BLASTN and BLASTX, and its members were classified according to putative function using the catalog in the Kyoto Encyclopedia of Genes and Genomes (KEGG). Some 33% of sequences failed to align with existing plant ESTs and therefore represent putative novel genes. The libraries show a low level of redundancy and, on average, 50% of the present ESTs have not been previously reported for sunflower. Several potentially interesting genes were identified, based on their homology with genes involved in animal zygotic division or plant embryogenesis. We also identified two ESTs that show significantly different levels of expression under embryogenic and non-embryogenic conditions. The libraries described here represent an original and valuable resource for the discovery of yet unknown genes putatively involved in dicot embryogenesis and improving our knowledge of the mechanisms involved in polarity acquisition by plant embryos.
Tidying Up International Nucleotide Sequence Databases: Ecological, Geographical and Sequence Quality Annotation of ITS Sequences of Mycorrhizal Fungi

PubMed Central

Tedersoo, Leho; Abarenkov, Kessy; Nilsson, R. Henrik; Schüssler, Arthur; Grelet, Gwen-Aëlle; Kohout, Petr; Oja, Jane; Bonito, Gregory M.; Veldre, Vilmar; Jairus, Teele; Ryberg, Martin; Larsson, Karl-Henrik; Kõljalg, Urmas

2011-01-01

Sequence analysis of the ribosomal RNA operon, particularly the internal transcribed spacer (ITS) region, provides a powerful tool for identification of mycorrhizal fungi. The sequence data deposited in the International Nucleotide Sequence Databases (INSD) are, however, unfiltered for quality and are often poorly annotated with metadata. To detect chimeric and low-quality sequences and assign the ectomycorrhizal fungi to phylogenetic lineages, fungal ITS sequences were downloaded from INSD, aligned within family-level groups, and examined through phylogenetic analyses and BLAST searches. By combining the fungal sequence database UNITE and the annotation and search tool PlutoF, we also added metadata from the literature to these accessions. Altogether 35,632 sequences belonged to mycorrhizal fungi or originated from ericoid and orchid mycorrhizal roots. Of these sequences, 677 were considered chimeric and 2,174 of low read quality. Information detailing country of collection, geographical coordinates, interacting taxon and isolation source was supplemented to cover 78.0%, 33.0%, 41.7% and 96.4% of the sequences, respectively. These annotated sequences are publicly available via UNITE (http://unite.ut.ee/) for downstream biogeographic, ecological and taxonomic analyses. In the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/), the annotated sequences have a special link-out to UNITE. We intend to expand the data annotation to additional genes and all taxonomic groups and functional guilds of fungi. PMID:21949797
Isolation and characterization of full-length cDNA clones coding for cholinesterase from fetal human tissues

DOE Office of Scientific and Technical Information (OSTI.GOV)

Prody, C.A.; Zevin-Sonkin, D.; Gnatt, A.

1987-06-01

To study the primary structure and regulation of human cholinesterases, oligodeoxynucleotide probes were prepared according to a consensus peptide sequence present in the active site of both human serum pseudocholinesterase and Torpedo electric organ true acetylcholinesterase. Using these probes, the authors isolated several cDNA clones from lambda gt10 libraries of fetal brain and liver origins. These include 2.4-kilobase cDNA clones that code for a polypeptide containing a putative signal peptide and the N-terminal, active site, and C-terminal peptides of human BtChoEase, suggesting that they code either for BtChoEase itself or for a very similar but distinct fetal form of cholinesterase. In RNA blots of poly(A)+ RNA from the cholinesterase-producing fetal brain and liver, these cDNAs hybridized with a single 2.5-kilobase band. Blot hybridization to human genomic DNA revealed that these fetal BtChoEase cDNA clones hybridize with DNA fragments of the total length of 17.5 kilobases, and signal intensities indicated that these sequences are not present in many copies. Both the cDNA-encoded protein and its nucleotide sequence display striking homology to parallel sequences published for Torpedo AcChoEase. These findings demonstrate extensive homologies between the fetal BtChoEase encoded by these clones and other cholinesterases of various forms and species.
Typing of canine parvovirus isolates using mini-sequencing based single nucleotide polymorphism analysis.

PubMed

Naidu, Hariprasad; Subramanian, B Mohana; Chinchkar, Shankar Ramchandra; Sriraman, Rajan; Rana, Samir Kumar; Srinivasan, V A

2012-05-01

The antigenic types of canine parvovirus (CPV) are defined based on differences in the amino acids of the major capsid protein VP2. Type specificity is conferred by a limited number of amino acid changes and in particular by a few nucleotide substitutions. PCR-based methods are not particularly suitable for typing circulating variants which differ in a few specific nucleotide substitutions. Assays for determining SNPs can efficiently detect nucleotide substitutions and can thus be adapted to identify CPV types. In the present study, CPV typing was performed by single nucleotide extension using the mini-sequencing technique. A mini-sequencing signature was established for all four CPV types (CPV2, 2a, 2b and 2c) and feline panleukopenia virus. CPV typing using the mini-sequencing reaction was performed for 13 CPV field isolates and the two vaccine strains available in our repository. All the isolates had been typed earlier by full-length sequencing of the VP2 gene. The typing results obtained from mini-sequencing matched completely with those of sequencing. Typing could be achieved with less than 100 copies of standard plasmid DNA constructs or ≤10¹ FAID₅₀ of virus by the mini-sequencing technique. The technique was also efficient for detecting multiple types in mixed infections.
Primary structure of prostaglandin G/H synthase from sheep vesicular gland determined from the complementary DNA sequence.

PubMed Central

DeWitt, D L; Smith, W L

1988-01-01

Prostaglandin G/H synthase (8,11,14-icosatrienoate, hydrogen-donor:oxygen oxidoreductase, EC 1.14.99.1) catalyzes the first step in the formation of prostaglandins and thromboxanes, the conversion of arachidonic acid to prostaglandin endoperoxides G and H. This enzyme is the site of action of nonsteroidal anti-inflammatory drugs. We have isolated a 2.7-kilobase complementary DNA (cDNA) encompassing the entire coding region of prostaglandin G/H synthase from sheep vesicular glands. This cDNA, cloned from a lambda gt 10 library prepared from poly(A)+ RNA of vesicular glands, hybridizes with a single 2.75-kilobase mRNA species. The cDNA clone was selected using oligonucleotide probes modeled from amino acid sequences of tryptic peptides prepared from the purified enzyme. The full-length cDNA encodes a protein of 600 amino acids, including a signal sequence of 24 amino acids. Identification of the cDNA as coding for prostaglandin G/H synthase is based on comparison of amino acid sequences of seven peptides comprising 103 amino acids with the amino acid sequence deduced from the nucleotide sequence of the cDNA. The molecular weight of the unglycosylated enzyme lacking the signal peptide is 65,621. The synthase is a glycoprotein, and there are three potential sites for N-glycosylation, two of them in the amino-terminal half of the molecule. The serine reported to be acetylated by aspirin is at position 530, near the carboxyl terminus. There is no significant similarity between the sequence of the synthase and that of any other protein in amino acid or nucleotide sequence libraries, and a heme binding site(s) is not apparent from the amino acid sequence. The availability of a full-length cDNA clone coding for prostaglandin G/H synthase should facilitate studies of the regulation of expression of this enzyme and the structural features important for catalysis and for interaction with anti-inflammatory drugs. Images PMID:3125548
cDNA isolated from a human T-cell library encodes a member of the protein-tyrosine-phosphatase family

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cool, D.E.; Tonks, N.K.; Charbonneau, H.

1989-07-01

A human peripheral T-cell cDNA library was screened with two labeled synthetic oligonucleotides encoding regions of a human placenta protein-tyrosine-phosphatase. One positive clone was isolated and the nucleotide sequence was determined. It contained 1,305 base pairs of open reading frame followed by a TAA stop codon and 978 base pairs of 3' untranslated end, although a poly(A)+ tail was not found. An initiator methionine residue was predicted at position 61, which would result in a protein of 415 amino acid residues. This was supported by the synthesis of a Mr 48,000 protein in an in vitro reticulocyte lysate translation system using RNA transcribed from the cloned cDNA and T7 RNA polymerase. The deduced amino acid sequence was compared to other known proteins revealing 65% identity to the low Mr PTPase 1B isolated from placenta. In view of the high degree of similarity, the T-cell cDNA likely encodes a newly discovered protein-tyrosine-phosphatase, thus expanding this family of genes.
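The figures in this record are mutually consistent if "position 61" is read as a 1-based nucleotide position within the 1,305-bp open reading frame and the stop codon is not counted; that reading is an assumption, not something the record states. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def protein_length(orf_bp, start_pos):
    """Codons from a 1-based start position to the end of an ORF.

    Assumes orf_bp excludes the stop codon and start_pos is the first
    base of the initiator codon (both assumptions for this sketch).
    """
    coding_bp = orf_bp - start_pos + 1
    if coding_bp <= 0 or coding_bp % 3 != 0:
        raise ValueError("start position is not in frame with the ORF end")
    return coding_bp // 3


# (1305 - 61 + 1) / 3 = 1245 / 3 = 415 residues
print(protein_length(1305, 61))  # prints 415
```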
Cloning and sequence analysis of complementary DNA encoding an aberrantly rearranged human T-cell gamma chain.

PubMed Central

Dialynas, D P; Murre, C; Quertermous, T; Boss, J M; Leiden, J M; Seidman, J G; Strominger, J L

1986-01-01

Complementary DNA (cDNA) encoding a human T-cell gamma chain has been cloned and sequenced. At the junction of the variable and joining regions, there is an apparent deletion of two nucleotides in the human cDNA sequence relative to the murine gamma-chain cDNA sequence, resulting simultaneously in the generation of an in-frame stop codon and in a translational frameshift. For this reason, the sequence presented here encodes an aberrantly rearranged human T-cell gamma chain. There are several surprising differences between the deduced human and murine gamma-chain amino acid sequences. These include poor homology in the variable region, poor homology in a discrete segment of the constant region precisely bounded by the expected junctions of exon CII, and the presence in the human sequence of five potential sites for N-linked glycosylation. Images PMID:3458221
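The effect described above, where a two-nucleotide deletion shifts the reading frame and creates an in-frame stop codon, can be sketched with a toy example. The sequence below is invented purely for illustration and is not the gamma-chain sequence; `translate` is a hypothetical helper that uses the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from the first base; stop after the first stop codon (*)."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


intact = "ATGGCTCATAAGGGC"         # toy sequence: ATG GCT CAT AAG GGC
deleted = intact[:6] + intact[8:]  # remove two nucleotides ("CA")

print(translate(intact))   # prints MAHKG
print(translate(deleted))  # prints MA*
```

After the deletion the third codon reads TAA, so translation terminates early, which is the kind of outcome the record attributes to the aberrant rearrangement.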
First complete genome sequence of an emerging cucumber green mottle mosaic virus isolate in North America

USDA-ARS's Scientific Manuscript database

The complete genome sequence (6,423 nt) of an emerging Cucumber green mottle mosaic virus (CGMMV) isolate on cucumber in North America was determined through deep sequencing of sRNA and rapid amplification of cDNA ends. It shares 99% nucleotide sequence identity to the Asian genotype, but only 90% t...
The maize stripe virus major noncapsid protein messenger RNA transcripts contain heterogeneous leader sequences at their 5' termini.

PubMed

Huiet, L; Feldstein, P A; Tsai, J H; Falk, B W

1993-12-01

Primer extension analyses and a PCR-based cloning strategy were used to identify and characterize 5' nucleotide sequences on the maize stripe virus (MStV) RNA4 mRNA transcripts encoding the major noncapsid protein (NCP). Direct RNA sequence analysis by primer extension showed that the NCP mRNA transcripts had 10-15 nucleotides beyond the 5' terminus of the MStV RNA4 nucleotide sequence. MStV genomic RNAs isolated from ribonucleoprotein particles (RNPs) lacked the additional 5' nucleotides. cDNA clones representing the 5' region of the mRNA transcripts were constructed, and the nucleotide sequences of the 5' regions were determined for 16 clones. Each was found to have a distinct 10-15 nucleotide sequence immediately 5' of the MStV RNA4 sequence. Eleven of 16 clones had the correct MStV RNA4 5' nucleotide sequence, while five showed minor variations at or near the 5' most MStV RNA4 nucleotide. These characteristics show strong similarities to other viral mRNA transcripts which are synthesized by cap snatching.
[cDNA library construction from panicle meristem of finger millet].

PubMed

Radchuk, V; Pirko, Ia V; Isaenkov, S V; Emets, A I; Blium, Ia B

2014-01-01

The protocol for production of full-length cDNA using the SuperScript Full-Length cDNA Library Construction Kit II (Invitrogen) was tested, and a high-quality cDNA library from meristematic tissue of finger millet panicle (Eleusine coracana (L.) Gaertn) was created. The titer of the cDNA library obtained was 3.01 × 10^5 CFU/ml on average. The average length of the cDNA inserts was about 1070 base pairs, and the efficiency of cDNA fragment insertion was 99.5%. Selective sequencing of cDNA clones from the library was performed, and the sequences of the cDNA clones were identified using BLAST searches. The results of the cDNA library analysis and selective sequencing demonstrate the good functionality and full-length character of the inserted cDNA clones. The cDNA library from meristematic tissue of finger millet panicle is a valuable source for the isolation and identification of key genes regulating metabolism and meristematic development, and for mining new molecular markers for high-quality genetic investigations and molecular breeding.
Extension of the COG and arCOG databases by amino acid and nucleotide sequences

PubMed Central

Meereis, Florian; Kaufmann, Michael

2008-01-01

Background The current versions of the COG and arCOG databases, both excellent frameworks for studies in comparative and functional genomics, do not contain the nucleotide sequences corresponding to their protein or protein domain entries. Results Using sequence information obtained from GenBank flat files covering the completely sequenced genomes of the COG and arCOG databases, we constructed NUCOCOG (nucleotide sequences containing COG databases) as an extended version including all nucleotide sequences and in addition the amino acid sequences originally utilized to construct the current COG and arCOG databases. We make available three comprehensive single XML files containing the complete databases including all sequence information. In addition, we provide a web interface as a utility suitable to browse the NUCOCOG database for sequence retrieval. The database is accessible at . Conclusion NUCOCOG offers the possibility to analyze any sequence related property in the context of the COG and arCOG framework simply by using script languages such as PERL applied to a large but single XML document. PMID:19014535
Complete sequence of HLA-B27 cDNA identified through the characterization of structural markers unique to the HLA-A, -B, and -C allelic series

DOE Office of Scientific and Technical Information (OSTI.GOV)

Szoets, H.; Reithmueller, G.; Weiss, E.

1986-03-01

Antigen HLA-B27 is a high-risk genetic factor with respect to a group of rheumatoid disorders, especially ankylosing spondylitis. A cDNA library was constructed from an autozygous B-cell line expressing HLA-B27, HLA-Cw1, and the previously cloned HLA-A2 antigen. Clones detected with an HLA probe were isolated and sorted into homology groups by differential hybridization and restriction maps. Nucleotide sequencing allowed the unambiguous assignment of cDNAs to HLA-A, -B, and -C loci. The HLA-B27 mRNA has the structural features and the codon variability typical of an HLA class I transcript but it specifies two uncommon amino acid replacements: a cysteine in position 67 and a serine in position 131. The latter substitution may have functional consequences, because it occurs in a conserved region and at a position invariably occupied by a species-specific arginine in humans and lysine in mice. The availability of the complete sequence of HLA-B27 and of the partial sequence of HLA-Cw1 allows the recognition of locus-specific sequence markers, particularly, but not exclusively, in the transmembrane and cytoplasmic domains.

37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

Code of Federal Regulations, 2011 CFR

2011-07-01

... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data shall...
Comprehensive red blood cell and platelet antigen prediction from whole genome sequencing: proof of principle

PubMed Central

Westhoff, Connie M.; Uy, Jon Michael; Aguad, Maria; Smeland‐Wagman, Robin; Kaufman, Richard M.; Rehm, Heidi L.; Green, Robert C.; Silberstein, Leslie E.

2015-01-01

BACKGROUND There are 346 serologically defined red blood cell (RBC) antigens and 33 serologically defined platelet (PLT) antigens, most of which have known genetic changes in 45 RBC or six PLT genes that correlate with antigen expression. Polymorphic sites associated with antigen expression in the primary literature and reference databases are annotated according to nucleotide positions in cDNA. This makes antigen prediction from next‐generation sequencing data challenging, since it uses genomic coordinates. STUDY DESIGN AND METHODS The conventional cDNA reference sequences for all known RBC and PLT genes that correlate with antigen expression were aligned to the human reference genome. The alignments allowed conversion of conventional cDNA nucleotide positions to the corresponding genomic coordinates. RBC and PLT antigen prediction was then performed using the human reference genome and whole genome sequencing (WGS) data with serologic confirmation. RESULTS Some major differences and alignment issues were found when attempting to convert the conventional cDNA to human reference genome sequences for the following genes: ABO, A4GALT, RHD, RHCE, FUT3, ACKR1 (previously DARC), ACHE, FUT2, CR1, GCNT2, and RHAG. However, it was possible to create usable alignments, which facilitated the prediction of all RBC and PLT antigens with a known molecular basis from WGS data. Traditional serologic typing for 18 RBC antigens was in agreement with the WGS‐based antigen predictions, providing proof of principle for this approach. CONCLUSION Detailed mapping of conventional cDNA annotated RBC and PLT alleles can enable accurate prediction of RBC and PLT antigens from whole genomic sequencing data. PMID:26634332
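The conversion of cDNA positions to genomic coordinates described in this record can be illustrated with a small sketch. It assumes an ungapped alignment given as a list of exon intervals, with cDNA position 1 at the first base of the first exon in transcript order; the function name, the interval format, and the toy coordinates are all hypothetical and are not taken from the study, whose alignments needed gene-specific handling.

```python
def cdna_to_genomic(cdna_pos, exons, strand="+"):
    """Map a 1-based cDNA position to a 1-based genomic coordinate.

    exons: (start, end) genomic intervals, 1-based and inclusive.
    strand: "+" or "-"; on "-" the transcript runs from high to low
    genomic coordinates.
    """
    if cdna_pos < 1:
        raise ValueError("cDNA positions are 1-based")
    ordered = sorted(exons)
    if strand == "-":
        ordered.reverse()
    remaining = cdna_pos
    for start, end in ordered:
        length = end - start + 1
        if remaining <= length:
            if strand == "+":
                return start + remaining - 1
            return end - remaining + 1
        remaining -= length
    raise ValueError("cDNA position lies beyond the last exon")


# Toy gene with two exons (coordinates invented for illustration).
toy_exons = [(100, 109), (200, 204)]
print(cdna_to_genomic(11, toy_exons))        # prints 200
print(cdna_to_genomic(6, toy_exons, "-"))    # prints 109
```

Insertions, deletions, or mismatches between the cDNA reference and the genome would break the simple interval arithmetic used here, which is consistent with the alignment issues the record reports for some genes.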
Complete nucleotide sequence of Alfalfa mosaic virus isolated from alfalfa (Medicago sativa L.) in Argentina.

PubMed

Trucco, Verónica; de Breuil, Soledad; Bejerman, Nicolás; Lenardon, Sergio; Giolitti, Fabián

2014-06-01

The complete nucleotide sequence of an Alfalfa mosaic virus (AMV) isolate infecting alfalfa (Medicago sativa L.) in Argentina, AMV-Arg, was determined. The virus genome has the typical organization described for AMV, and comprises 3,643, 2,593, and 2,038 nucleotides for RNA1, 2 and 3, respectively. The whole genome sequence and each encoding region were compared with those of four other isolates that have been completely sequenced from China, Italy, Spain and USA. The nucleotide identity percentages ranged from 95.9 to 99.1 % for the three RNAs and from 93.7 to 99 % for the protein 1 (P1), protein 2 (P2), movement protein and coat protein (CP) encoding regions, whereas the amino acid identity percentages of these proteins ranged from 93.4 to 99.5 %, the lowest value corresponding to P2. CP sequences of AMV-Arg were compared with those of 25 other available isolates, and a phylogenetic analysis based on the CP gene was carried out. The highest percentage of nucleotide sequence identity of the CP gene was 98.3 % with a Chinese isolate and 98.6 % at the amino acid level with four isolates, two from Italy, one from Brazil and the remaining one from China. The phylogenetic analysis showed that AMV-Arg is closely related to subgroup I of AMV isolates. To our knowledge, this is the first report of a complete nucleotide sequence of AMV from South America and the first worldwide report of a complete nucleotide sequence of AMV isolated from alfalfa as a natural host.
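As a rough illustration of how a pairwise identity percentage like those quoted above can be computed, the sketch below scores two pre-aligned sequences. The record does not state which software or gap convention was used; here gapped columns are simply excluded from the denominator, which is an assumption, and the toy sequences are invented.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are not counted
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Toy aligned fragments (not AMV sequence).
print(round(percent_identity("ATGGC-TAAC", "ATGACGTAAC"), 1))
```

Different tools count gaps and terminal overhangs differently, so values from this sketch would not necessarily reproduce the published percentages.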
Characterisation and In Silico Analysis of Interleukin-4 cDNA of Nilgai (Boselaphus tragocamelus) and Indian Buffalo (Bubalus bubalis)

PubMed Central

Saini, M.; Palai, T. K.; Das, D. K.; Hatle, K. M.; Gupta, P. K.

2013-01-01

Interleukin-4 (IL-4) produced from Th2 cells modulates both innate and adaptive immune responses. It is a common belief that wild animals possess better immunity against diseases than domestic and laboratory animals; however, the immune system of wild animals is not fully explored yet. Therefore, a comparative study was designed to explore the wildlife immunity through characterisation of IL-4 cDNA of nilgai, a wild ruminant, and Indian buffalo, a domestic ruminant. Total RNA was extracted from peripheral blood mononuclear cells of nilgai and Indian buffalo and reverse transcribed into cDNA. Respective cDNA was further cloned and sequenced. Sequences were analysed in silico and compared with their homologues available at GenBank. The deduced 135 amino acid protein of nilgai IL-4 is 95.6% similar to that of Indian buffalo. N-linked glycosylation sequence, leader sequence, Cysteine residues in the signal peptide region, and 3′ UTR of IL-4 were found to be conserved across species. Six nonsynonymous nucleotide substitutions were found in Indian buffalo compared to nilgai amino acid sequence. Tertiary structure of this protein in both species was modeled, and it was found that this protein falls under 4-helical cytokines superfamily and short chain cytokine family. Phylogenetic analysis revealed a single cluster of ruminants including both nilgai and Indian buffalo that was placed distinct from other nonruminant mammals. PMID:24348167
Nucleotide sequence composition and method for detection of Neisseria gonorrhoeae

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lo, A.; Yang, H.L.

1990-02-13

This patent describes a composition of matter that is specific for Neisseria gonorrhoeae. It comprises at least one nucleotide sequence for which the ratio of the amount of the sequence that hybridizes to chromosomal DNA of Neisseria gonorrhoeae to the amount of the sequence that hybridizes to chromosomal DNA of Neisseria meningitidis is greater than about five, the ratio being obtained by a method described in the patent.
cDNA sequence and expression of a cold-responsive gene in Citrus unshiu.

PubMed

Hara, M; Wakasugi, Y; Ikoma, Y; Yano, M; Ogawa, K; Kuboi, T

1999-02-01

A cDNA clone encoding a protein (CuCOR19), the sequence of which is similar to Poncirus COR19, of the dehydrin family was isolated from the epicarp of Citrus unshiu. The molecular mass of the predicted protein was 18,980 daltons. CuCOR19 was highly hydrophilic and contained three repeating elements including Lys-rich motifs. The gene expression in leaves increased by cold stress.
Complete cDNA sequence and amino acid analysis of a bovine ribonuclease K6 gene.

PubMed

Pietrowski, D; Förster, M

2000-01-01

The complete cDNA sequence of a ribonuclease k6 gene of Bos taurus has been determined. It codes for a protein with 154 amino acids and contains the invariant cysteine, histidine and lysine residues as well as the characteristic motifs specific to ribonuclease active sites. The deduced protein sequence is 27 residues longer than other known ribonucleases k6 and shows amino acid exchanges which could reflect a strain specificity or polymorphism within the bovine genome. Based on sequence similarity we have termed the identified gene bovine ribonuclease k6 b (brk6b).
Conserved features of eukaryotic hsp70 genes revealed by comparison with the nucleotide sequence of human hsp70.

PubMed Central

Hunt, C; Morimoto, R I

1985-01-01

We have determined the nucleotide sequence of the human hsp70 gene and 5' flanking region. The hsp70 gene is transcribed as an uninterrupted primary transcript of 2440 nucleotides composed of a 5' noncoding leader sequence of 212 nucleotides, a 3' noncoding region of 242 nucleotides, and a continuous open reading frame of 1986 nucleotides that encodes a protein with predicted molecular mass of 69,800 daltons. Upstream of the 5' terminus are the canonical TATAAA box, the sequence ATTGG that corresponds in the inverted orientation to the CCAAT motif, and the dyad sequence CTGGAAT/ATTCCCG that shares homology in 12 of 14 positions with the consensus transcription regulatory sequence common to Drosophila heat shock genes. Comparison of the predicted amino acid sequences of human hsp70 with the published sequences of Drosophila hsp70 and Escherichia coli dnaK reveals that human hsp70 is 73% identical to Drosophila hsp70 and 47% identical to E. coli dnaK. Surprisingly, the nucleotide sequences of the human and Drosophila genes are 72% identical and human and E. coli genes are 50% identical, which is more highly conserved than necessary given the degeneracy of the genetic code. The lack of accumulated silent nucleotide substitutions leads us to propose that there may be additional information in the nucleotide sequence of the hsp70 gene or the corresponding mRNA that precludes the maximum divergence allowed in the silent codon positions. PMID:3931075
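The segment lengths quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the record; whether the stated open reading frame length includes a stop codon is not specified, so the codon count is reported as is.

```python
# Lengths in nucleotides, as stated in the abstract.
leader, orf, trailer = 212, 1986, 242

transcript = leader + orf + trailer   # should equal the stated 2440-nt primary transcript
codons = orf // 3                     # number of complete codons in the reading frame

print(transcript)   # total transcript length
print(orf % 3)      # 0 means the reading frame is a whole number of codons
print(codons)       # codon count (includes a stop codon if the stated length does)
```

The three segments sum to the reported transcript length, and the reading frame divides evenly into codons, so the figures in the abstract are internally consistent.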
Purification, cDNA cloning, and characterization of LysM-containing plant chitinase from horsetail (Equisetum arvense).

PubMed

Inamine, Saki; Onaga, Shoko; Ohnuma, Takayuki; Fukamizo, Tamo; Taira, Toki

2015-01-01

Chitinase-A (EaChiA), molecular mass 36 kDa, was purified from the vegetative stems of a horsetail (Equisetum arvense) using a series of column chromatography steps. The N-terminal amino acid sequence of EaChiA was similar to the lysin motif (LysM). A cDNA encoding EaChiA was cloned by rapid amplification of cDNA ends and polymerase chain reaction. It consisted of 1320 nucleotides and encoded an open reading frame of 361 amino acid residues. The deduced amino acid sequence indicated that EaChiA is composed of an N-terminal LysM domain and a C-terminal plant class IIIb chitinase catalytic domain, belonging to the glycoside hydrolase family 18, linked by proline-rich regions. EaChiA has strong chitin-binding activity but no antifungal activity. This is the first report of a chitinase from Equisetopsida, a class of fern plants, and the second report of a LysM-containing chitinase from a plant.
Molecular Cloning and Sequencing of Channel Catfish, Ictalurus punctatus, Cathepsin H and L cDNA

USDA-ARS's Scientific Manuscript database

Cathepsins H and L, lysosomal cysteine endopeptidases of the papain family, are ubiquitously expressed and involved in antigen processing. In this communication, the channel catfish cathepsin H and L transcripts were sequenced and analyzed. Total RNA from tissues was extracted and cDNA libraries we...
Isolation and characterization of full-length cDNA clones coding for cholinesterase from fetal human tissues.

PubMed Central

Prody, C A; Zevin-Sonkin, D; Gnatt, A; Goldberg, O; Soreq, H

1987-01-01

To study the primary structure and regulation of human cholinesterases, oligodeoxynucleotide probes were prepared according to a consensus peptide sequence present in the active site of both human serum pseudocholinesterase (BtChoEase; EC 3.1.1.8) and Torpedo electric organ "true" acetylcholinesterase (AcChoEase; EC 3.1.1.7). Using these probes, we isolated several cDNA clones from lambda gt10 libraries of fetal brain and liver origins. These include 2.4-kilobase cDNA clones that code for a polypeptide containing a putative signal peptide and the N-terminal, active site, and C-terminal peptides of human BtChoEase, suggesting that they code either for BtChoEase itself or for a very similar but distinct fetal form of cholinesterase. In RNA blots of poly(A)+ RNA from the cholinesterase-producing fetal brain and liver, these cDNAs hybridized with a single 2.5-kilobase band. Blot hybridization to human genomic DNA revealed that these fetal BtChoEase cDNA clones hybridize with DNA fragments of the total length of 17.5 kilobases, and signal intensities indicated that these sequences are not present in many copies. Both the cDNA-encoded protein and its nucleotide sequence display striking homology to parallel sequences published for Torpedo AcChoEase. These findings demonstrate extensive homologies between the fetal BtChoEase encoded by these clones and other cholinesterases of various forms and species. Images PMID:3035536
Normalized cDNA libraries

DOEpatents

Soares, Marcelo B.; Efstratiadis, Argiris

1997-01-01

This invention provides a method to normalize a directional cDNA library constructed in a vector that allows propagation in single-stranded circle form comprising: (a) propagating the directional cDNA library in single-stranded circles; (b) generating fragments complementary to the 3' noncoding sequence of the single-stranded circles in the library to produce partial duplexes; (c) purifying the partial duplexes; (d) melting and reassociating the purified partial duplexes to moderate Cot; and (e) purifying the unassociated single-stranded circles, thereby generating a normalized cDNA library.
Normalized cDNA libraries

DOEpatents

Soares, M.B.; Efstratiadis, A.

1997-06-10

This invention provides a method to normalize a directional cDNA library constructed in a vector that allows propagation in single-stranded circle form comprising: (a) propagating the directional cDNA library in single-stranded circles; (b) generating fragments complementary to the 3' noncoding sequence of the single-stranded circles in the library to produce partial duplexes; (c) purifying the partial duplexes; (d) melting and reassociating the purified partial duplexes to moderate Cot; and (e) purifying the unassociated single-stranded circles, thereby generating a normalized cDNA library. 4 figs.
Statistical analysis of nucleotide sequences of the hemagglutinin gene of human influenza A viruses.

PubMed Central

Ina, Y; Gojobori, T

1994-01-01

To examine whether positive selection operates on the hemagglutinin 1 (HA1) gene of human influenza A viruses (H1 subtype), 21 nucleotide sequences of the HA1 gene were statistically analyzed. The nucleotide sequences were divided into antigenic and nonantigenic sites. The nucleotide diversities for antigenic and nonantigenic sites of the HA1 gene were computed at synonymous and nonsynonymous sites separately. For nonantigenic sites, the nucleotide diversities were larger at synonymous sites than at nonsynonymous sites. This is consistent with the neutral theory of molecular evolution. For antigenic sites, however, the nucleotide diversities at nonsynonymous sites were larger than those at synonymous sites. These results suggest that positive selection operates on antigenic sites of the HA1 gene of human influenza A viruses (H1 subtype). PMID:8078892
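The distinction between synonymous and nonsynonymous differences that underlies this analysis can be sketched as follows. This is a simplified illustration, not the statistical method used in the study: it only classifies codon pairs that differ at exactly one base, using the standard genetic code, and it does not normalize by the number of synonymous or nonsynonymous sites. The function name and toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def count_differences(seq1: str, seq2: str):
    """Return (synonymous, nonsynonymous) counts for codons differing at one base.

    Codon pairs differing at two or three bases are skipped, because counting
    them requires averaging over mutational pathways, which this sketch omits.
    Sequences must be aligned, in frame, and free of gaps or ambiguity codes.
    """
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("need aligned, in-frame sequences of equal length")
    syn = nonsyn = 0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        if sum(x != y for x, y in zip(c1, c2)) != 1:
            continue
        if CODON_TABLE[c1] == CODON_TABLE[c2]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn

# Toy example: GAG>GGG changes Glu to Gly; CTT>CTC leaves Leu unchanged.
print(count_differences("ATGGAGCTT", "ATGGGGCTC"))
```

Applying such counts separately to antigenic and nonantigenic codons, and dividing by the number of sites of each type, gives the kind of comparison the abstract describes.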
Isolation, cDNA cloning and gene expression of an antibacterial protein from larvae of the coconut rhinoceros beetle, Oryctes rhinoceros.

PubMed

Yang, J; Yamamoto, M; Ishibashi, J; Taniai, K; Yamakawa, M

1998-08-01

An antibacterial protein, designated rhinocerosin, was purified to homogeneity from larvae of the coconut rhinoceros beetle, Oryctes rhinoceros, immunized with Escherichia coli. Based on the amino acid sequence of the N-terminal region, a degenerate primer was synthesized and reverse-transcriptase PCR was performed to clone rhinocerosin cDNA. As a result, a 279-bp fragment was obtained. The complete nucleotide sequence was determined by sequencing the extended rhinocerosin cDNA clone by 5' rapid amplification of cDNA ends. The deduced amino acid sequence of the mature portion of rhinocerosin was composed of 72 amino acids without cysteine residues and was shown to be rich in glycine (11.1%) and proline (11.1%) residues. Comparison of the deduced amino acid sequence of rhinocerosin with those of other antibacterial proteins indicated that it has 77.8% and 44.6% identity with holotricin 2 and coleoptrecin, respectively. Rhinocerosin had strong antibacterial activity against E. coli, Streptococcus pyogenes, and Staphylococcus aureus but not against Pseudomonas aeruginosa. Results of reverse-transcriptase PCR analysis of gene expression in different tissues indicated that the rhinocerosin gene is strongly expressed in the fat body and the Malpighian tubule, and weakly expressed in hemocytes and midgut. In addition, gene expression was inducible by bacteria in the fat body, the Malpighian tubule and hemocytes, but constitutive expression was observed in the midgut.
[Cloning and sequencing of KIR2DL1 framework gene cDNA and identification of a novel allele].

PubMed

Sun, Ge; Wang, Chang; Zhen, Jianxin; Zhang, Guobin; Xu, Yunping; Deng, Zhihui

2016-10-01

To develop an assay for cDNA cloning and haplotype sequencing of KIR2DL1 framework gene and determine the genotype of an ethnic Han from southern China. Total RNA was isolated from peripheral blood sample, and complementary DNA (cDNA) transcript was synthesized by RT-PCR. The entire coding sequence of the KIR2DL1 framework gene was amplified with a pair of KIR2DL1-specific PCR primers. The PCR products with a length of approximately 1.2 kb were then subjected to cloning and haplotype sequencing. A specific target fragment of the KIR2DL1 framework gene was obtained. Following allele separation, a wild-type KIR2DL1*00302 allele and a novel variant allele, KIR2DL1*031, were identified. Sequence alignment with KIR2DL1 alleles from the IPD-KIR Database showed that the novel allele KIR2DL1*031 differs from the closest allele KIR2DL1*00302 by a non-synonymous mutation at CDS nt 188A>G (codon 42 GAG>GGG) in exon 4, which causes an amino acid change, Glu42Gly. The sequence of the novel allele KIR2DL1*031 was submitted to GenBank under the accession number KP025960 and to the IPD-KIR Database under the submission number IWS40001982. The name KIR2DL1*031 has been officially assigned by the World Health Organization (WHO) Nomenclature Committee. An assay for cDNA cloning and haplotype sequencing of KIR2DL1 has been established, which has broad applications in KIR studies at the allelic level.
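The relationship between the CDS position and the codon number reported in this record can be worked through in a short sketch. Position 188 falls in the 63rd codon of the coding sequence, so the reported codon 42 is consistent with mature-protein numbering after a 21-residue leader. The leader length is an inference from those two numbers rather than something stated in the abstract, and the helper name `locate_in_cds` is hypothetical.

```python
def locate_in_cds(cds_pos: int, leader_residues: int = 0):
    """Return (precursor codon, mature-protein codon, base within codon).

    cds_pos is 1-based and counted from the first base of the start codon.
    leader_residues is the assumed number of signal-peptide residues.
    """
    codon = (cds_pos - 1) // 3 + 1
    base_in_codon = (cds_pos - 1) % 3 + 1
    return codon, codon - leader_residues, base_in_codon

# Assumption: a 21-residue leader, inferred from 63 - 42.
print(locate_in_cds(188, leader_residues=21))  # (63, 42, 2)
```

The third value, 2, matches the reported codon change GAG>GGG, in which the middle base is the one substituted.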
Nucleotide sequence of a resistance breaking mutant of southern bean mosaic virus.

PubMed

Lee, L; Anderson, E J

1998-01-01

SBMV-S is a resistance-breaking mutant of an Arkansas isolate of the bean strain of southern bean mosaic virus (SBMV-BARK) that is able to move systemically in Phaseolus vulgaris cvs. Pinto and Great Northern, whereas the wild-type SBMV-BARK causes local necrotic lesions and is restricted to the inoculated leaves of these hosts. Sequence analysis of the 4136 nucleotide genomes of SBMV-BARK and SBMV-S revealed seven nucleotide differences, but only four deduced amino acid changes. A single amino acid change occurred in the C-terminal region of the putative RNA-dependent RNA polymerase and three differences were identified in the N-terminal portion of the virus coat protein. SBMV-BARK and SBMV-S were compared with other sobemoviruses and were found to contain a high level of nucleotide sequence identity (91.3%) to SBMV-B. Unlike SBMV-B, however, SBMV-BARK and SBMV-S contained four putative overlapping open reading frames, making them more similar in genome organization to the cowpea strain, SBMV-C. The possibility exists that mutations, or even errors that resulted in mis-identification of open reading frames, occurred in previously published information on nucleotide sequence and genomic organization for SBMV-B.
Brain cDNA clone for human cholinesterase

DOE Office of Scientific and Technical Information (OSTI.GOV)

McTiernan, C.; Adkins, S.; Chatonnet, A.

1987-10-01

A cDNA library from human basal ganglia was screened with oligonucleotide probes corresponding to portions of the amino acid sequence of human serum cholinesterase. Five overlapping clones, representing 2.4 kilobases, were isolated. The sequenced cDNA contained 207 base pairs of coding sequence 5' to the amino terminus of the mature protein in which there were four ATG translation start sites in the same reading frame as the protein. Only the ATG coding for Met-(-28) lay within a favorable consensus sequence for functional initiators. There were 1722 base pairs of coding sequence corresponding to the protein found circulating in human serum. The amino acid sequence deduced from the cDNA exactly matched the 574 amino acid sequence of human serum cholinesterase, as previously determined by Edman degradation. Therefore, our clones represented cholinesterase rather than acetylcholinesterase. It was concluded that the amino acid sequences of cholinesterase from two different tissues, human brain and human serum, were identical. Hybridization of genomic DNA blots suggested that a single gene, or very few genes, coded for cholinesterase.
The delta-subunit of murine guanine nucleotide exchange factor eIF-2B. Characterization of cDNAs predicts isoforms differing at the amino-terminal end.

PubMed

Henderson, R A; Krissansen, G W; Yong, R Y; Leung, E; Watson, J D; Dholakia, J N

1994-12-02

Protein synthesis in mammalian cells is regulated at the level of the guanine nucleotide exchange factor, eIF-2B, which catalyzes the exchange of eukaryotic initiation factor 2-bound GDP for GTP. We have isolated and sequenced cDNA clones encoding the delta-subunit of murine eIF-2B. The cDNA sequence encodes a polypeptide of 544 amino acids with molecular mass of 60 kDa. Antibodies against a synthetic polypeptide of 30 amino acids deduced from the cDNA sequence specifically react with the delta-subunit of mammalian eIF-2B. The cDNA-derived amino acid sequence shows significant homology with the yeast translational regulator Gcd2, supporting the hypothesis that Gcd2 may be the yeast homolog of the delta-subunit of mammalian eIF-2B. Primer extension studies and anchor polymerase chain reaction analysis were performed to determine the 5'-end of the transcript for the delta-subunit of eIF-2B. Results of these experiments demonstrate two different mRNAs for the delta-subunit of eIF-2B in murine cells. The isolation and characterization of two different full-length cDNAs also predicts the presence of two alternate forms of the delta-subunit of eIF-2B in murine cells. These differ at their amino-terminal end but have identical nucleotide sequences coding for amino acids 31-544.
Production of a full-length infectious GFP-tagged cDNA clone of Beet mild yellowing virus for the study of plant-polerovirus interactions.

PubMed

Stevens, Mark; Viganó, Felicita

2007-04-01

The full-length cDNA of Beet mild yellowing virus (Broom's Barn isolate) was sequenced and cloned into the vector pLitmus 29 (pBMYV-BBfl). The sequence of BMYV-BBfl (5721 bases) shared 96% and 98% nucleotide identity with the other complete sequences of BMYV (BMYV-2ITB, France and BMYV-IPP, Germany respectively). Full-length capped RNA transcripts of pBMYV-BBfl were synthesised and found to be biologically active in Arabidopsis thaliana protoplasts following electroporation or PEG inoculation when the protoplasts were subsequently analysed using serological and molecular methods. The BMYV sequence was modified by inserting DNA that encoded the jellyfish green fluorescent protein (GFP) into the P5 gene close to its 3' end. A. thaliana protoplasts electroporated with these RNA transcripts were biologically active and up to 2% of transfected protoplasts showed GFP-specific fluorescence. The exploitation of these cDNA clones for the study of the biology of beet poleroviruses is discussed.

Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy.

PubMed

Mankos, Marian; Persson, Henrik H J; N'Diaye, Alpha T; Shadman, Khashayar; Schmid, Andreas K; Davis, Ronald W

2016-01-01

DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitate the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work using scattering electrons with much lower energy has been shown to suppress beam damage on DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. Both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging.
Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mankos, Marian; Persson, Henrik H. J.; N’Diaye, Alpha T.

DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitate the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work using scattering electrons with much lower energy has been shown to suppress beam damage on DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. In conclusion, both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging.
Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy

DOE PAGES

Mankos, Marian; Persson, Henrik H. J.; N’Diaye, Alpha T.; ...

2016-05-05

DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitate the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work using scattering electrons with much lower energy has been shown to suppress beam damage on DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. In conclusion, both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging.
Development of polymorphic genic-SSR markers by cDNA library sequencing in boxwood, Buxus spp. (Buxaceae)

USDA-ARS's Scientific Manuscript database

Genic microsatellites or simple sequence repeat (genic-SSR) markers were developed in boxwood (Buxus taxa) for genetic diversity analysis, identification of taxa, and to facilitate breeding. cDNA libraries were developed from mRNA extracted from leaves of Buxus sempervirens ‘Vardar Valley’ and seque...
Molecular cloning and sequencing of the cDNA and gene for a novel elastinolytic metalloproteinase from Aspergillus fumigatus and its expression in Escherichia coli.

PubMed Central

Sirakova, T D; Markaryan, A; Kolattukudy, P E

1994-01-01

An extracellular elastinolytic metalloproteinase, purified from Aspergillus fumigatus isolated from an aspergillosis patient, and an internal peptide derived from it were subjected to N-terminal sequencing. Oligonucleotide primers based on these sequences were used to PCR amplify a segment of the metalloproteinase cDNA, which was used as a probe to isolate the cDNA and gene for this enzyme. The gene sequence matched exactly with the cDNA sequence except for the four introns that interrupted the open reading frame. According to the deduced amino acid sequence, the metalloproteinase has a signal sequence and 227 additional amino acids preceding the sequence for the mature protein of 389 amino acids with a calculated molecular mass of 42 kDa, which is close to the size of the purified mature fungal proteinase. This sequence contains segments that matched both the N terminus of the mature protein and the internal peptide. A. fumigatus metalloproteinase contains some of the conserved zinc-binding and active-site motifs characteristic of metalloproteinases but shows no overall homology with known metalloproteinases. The cDNA of the mature protein when introduced into Escherichia coli directed the expression of a protein with a size, N-terminal sequence, and immunological cross-reactivity identical to those of the native fungal enzyme. Although the enzyme in the inclusion bodies could not be renatured, expression at 30 degrees C yielded soluble enzyme that showed chromatographic behavior identical to that of the native fungal enzyme and catalyzed hydrolysis of elastin. The metalloproteinase gene described here was not found in Aspergillus flavus. Images PMID:7927676
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2014 CFR

2014-07-01

...” means those amino acids other than “Xaa” and those nucleotide bases other than “n”defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2013 CFR

2013-07-01

...” means those amino acids other than “Xaa” and those nucleotide bases other than “n”defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2012 CFR

2012-07-01

...” means those amino acids other than “Xaa” and those nucleotide bases other than “n”defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2012-07-01 2012-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...
Mapping RNA Structure In Vitro with SHAPE Chemistry and Next-Generation Sequencing (SHAPE-Seq).

PubMed

Watters, Kyle E; Lucks, Julius B

2016-01-01

Mapping RNA structure with selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) chemistry has proven to be a versatile method for characterizing RNA structure in a variety of contexts. SHAPE reagents covalently modify RNAs in a structure-dependent manner to create adducts at the 2'-OH group of the ribose backbone at nucleotides that are structurally flexible. The positions of these adducts are detected using reverse transcriptase (RT) primer extension, which stops one nucleotide before the modification, to create a pool of cDNAs whose lengths reflect the location of SHAPE modification. Quantification of the cDNA pools is used to estimate the "reactivity" of each nucleotide in an RNA molecule to the SHAPE reagent. High reactivities indicate nucleotides that are structurally flexible, while low reactivities indicate nucleotides that are inflexible. These SHAPE reactivities can then be used to infer RNA structures by restraining RNA structure prediction algorithms. Here, we provide a state-of-the-art protocol describing how to perform in vitro RNA structure probing with SHAPE chemistry using next-generation sequencing to quantify cDNA pools and estimate reactivities (SHAPE-Seq). The use of next-generation sequencing allows for higher throughput, more consistent data analysis, and multiplexing capabilities. The technique described herein, SHAPE-Seq v2.0, uses a universal reverse transcription priming site that is ligated to the RNA after SHAPE modification. The introduced priming site allows for the structural analysis of an RNA independent of its sequence.
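The mapping from reverse-transcription stops to modified nucleotides described above can be sketched in a few lines. This is a deliberately naive illustration: it infers the modified position as one nucleotide 5' of each stop and estimates reactivity by subtracting the untreated-control stop fraction from the treated one. It is not the reactivity estimator used in the SHAPE-Seq protocol, and the function names and toy data are hypothetical.

```python
from typing import List

def modification_counts(stop_positions: List[int], rna_length: int) -> List[int]:
    """Tally inferred modified positions from RT stop positions.

    stop_positions: 1-based RNA coordinates (counted from the 5' end) of the
    last nucleotide copied into each cDNA. Because RT stops one nucleotide
    before the adduct, the modified nucleotide is taken to be stop - 1.
    A stop at position 1 is a full-length cDNA and implies no modification.
    """
    counts = [0] * rna_length
    for stop in stop_positions:
        modified = stop - 1
        if 1 <= modified <= rna_length:
            counts[modified - 1] += 1
    return counts

def simple_reactivity(plus_stops: List[int], minus_stops: List[int],
                      rna_length: int) -> List[float]:
    """Background-subtracted stop fractions, floored at zero (naive estimate)."""
    if not plus_stops or not minus_stops:
        raise ValueError("both channels need at least one read")
    plus = modification_counts(plus_stops, rna_length)
    minus = modification_counts(minus_stops, rna_length)
    return [max(0.0, p / len(plus_stops) - m / len(minus_stops))
            for p, m in zip(plus, minus)]

# Toy data for a 5-nt RNA: reagent-treated (+) and untreated (-) channels.
print(simple_reactivity([3, 3, 4, 1], [3, 1, 1, 1], rna_length=5))
```

In this toy case, positions 2 and 3 come out as the reactive nucleotides; real data additionally require correcting for the fact that an RT stop hides any modifications further 5' on the same molecule.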
The complete nucleotide sequence of the glnALG operon of Escherichia coli K12.

PubMed Central

Miranda-Ríos, J; Sánchez-Pescador, R; Urdea, M; Covarrubias, A A

1987-01-01

The nucleotide sequence of the E. coli glnALG operon has been determined. The glnL (ntrB) and glnG (ntrC) genes present a high homology, at the nucleotide and amino acid levels, with the corresponding genes of Klebsiella pneumoniae. The predicted amino acid sequence for glutamine synthetase allowed us to locate some of the enzyme domains. The structure of this operon is discussed. PMID:2882477
Full-Length Venom Protein cDNA Sequences from Venom-Derived mRNA: Exploring Compositional Variation and Adaptive Multigene Evolution

PubMed Central

Modahl, Cassandra M.; Mackessy, Stephen P.

2016-01-01

Envenomation of humans by snakes is a complex and continuously evolving medical emergency, and treatment is made that much more difficult by the diverse biochemical composition of many venoms. Venomous snakes and their venoms also provide models for the study of molecular evolutionary processes leading to adaptation and genotype-phenotype relationships. To compare venom complexity and protein sequences, venom gland transcriptomes are assembled, which usually requires the sacrifice of snakes for tissue. However, toxin transcripts are also present in venoms, offering the possibility of obtaining cDNA sequences directly from venom. This study provides evidence that unknown full-length venom protein transcripts can be obtained from the venoms of multiple species from all major venomous snake families. These unknown venom protein cDNAs are obtained by the use of primers designed from conserved signal peptide sequences within each venom protein superfamily. This technique was used to assemble a partial venom gland transcriptome for the Middle American Rattlesnake (Crotalus simus tzabcan) by amplifying sequences for phospholipases A2, serine proteases, C-lectins, and metalloproteinases from within venom. Phospholipase A2 sequences were also recovered from the venoms of several rattlesnakes and an elapid snake (Pseudechis porphyriacus), and three-finger toxin sequences were recovered from multiple rear-fanged snake species, demonstrating that the three major clades of advanced snakes (Elapidae, Viperidae, Colubridae) have stable mRNA present in their venoms. These cDNA sequences from venom were then used to explore potential activities derived from protein sequence similarities and evolutionary histories within these large multigene superfamilies. Venom-derived sequences can also be used to aid in characterizing venoms that lack proteomic profiles and identify sequence characteristics indicating specific envenomation profiles. This approach, requiring only venom, provides
The nucleotide sequence of 5S ribosomal RNA from Micrococcus lysodeikticus.

PubMed Central

Hori, H; Osawa, S; Murao, K; Ishikura, H

1980-01-01

The nucleotide sequence of ribosomal 5S RNA from Micrococcus lysodeikticus is pGUUACGGCGGCUAUAGCGUGGGGGAAACGCCCGGCCGUAUAUCGAACCCGGAAGCUAAGCCCCAUAGCGCCGAUGGUUACUGUAACCGGGAGGUUGUGGGAGAGUAGGUCGCCGCCGUGAOH. When compared to other 5S RNAs, the sequence homology is greatest with Thermus aquaticus, and these two 5S RNAs reveal several features intermediate between those of typical gram-positive bacteria and gram-negative bacteria. PMID:6780979
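As a minimal sketch of how the sequence quoted above can be handled programmatically, the following Python snippet strips the 5'-phosphate ("p") and 3'-hydroxyl ("OH") end labels and tallies the base composition. The function names are illustrative only and are not taken from the paper.

```python
from collections import Counter

# Sequence exactly as printed in the abstract, including the end-group labels.
RAW = ("pGUUACGGCGGCUAUAGCGUGGGGGAAACGCCCGGCCGUAUAUCGAACCCGGAAGCUAAGCCCCAUAGCGCC"
       "GAUGGUUACUGUAACCGGGAGGUUGUGGGAGAGUAGGUCGCCGCCGUGAOH")


def strip_termini(raw: str) -> str:
    """Remove the leading 'p' (5'-phosphate) and trailing 'OH' (3'-hydroxyl) labels."""
    seq = raw
    if seq.startswith("p"):
        seq = seq[1:]
    if seq.endswith("OH"):
        seq = seq[:-2]
    return seq


def gc_fraction(seq: str) -> float:
    """Fraction of residues that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


seq = strip_termini(RAW)
composition = Counter(seq)  # counts of A, C, G and U
print(len(seq), dict(composition), round(gc_fraction(seq), 3))
```

The same helpers could be applied to other 5S RNA sequences written in this notation, for instance to compare base composition between species.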
Subtractive cloning of cDNA from Aspergillus oryzae differentially regulated between solid-state culture and liquid (submerged) culture.

PubMed

Akao, Takeshi; Gomi, Katsuya; Goto, Kuniyasu; Okazaki, Naoto; Akita, Osamu

2002-07-01

In solid-state cultures (SC), Aspergillus oryzae shows characteristics such as high-level production and secretion of enzymes and hyphal differentiation with asexual development which are absent in liquid (submerged) culture (LC). It was predicted that many of the genes involved in the characteristics of A. oryzae in SC are differentially expressed between SC and LC. We generated two subtracted cDNA libraries with bi-directional cDNA subtractive hybridizations to isolate and identify such genes. Among them, we identified genes upregulated in or specific to SC, such as the AOS (A. oryzae SC-specific gene) series, and those downregulated or not expressed in SC, such as the AOL (A. oryzae LC-specific) series. Sequencing analyses revealed that the AOS series and the AOL series contain genes encoding extra- and intracellular enzymes and transport proteins. However, half were functionally unclassified by nucleotide sequences. Also, by expression profile, the AOS series comprised two groups. These gene products' molecular functions and physiological roles in SC await further investigation.
PCR amplification and sequences of cDNA clones for the small and large subunits of ADP-glucose pyrophosphorylase from barley tissues.

PubMed

Villand, P; Aalen, R; Olsen, O A; Lüthi, E; Lönneborg, A; Kleczkowski, L A

1992-06-01

Several cDNAs encoding the small and large subunits of ADP-glucose pyrophosphorylase (AGP) were isolated from total RNA of the starchy endosperm, roots and leaves of barley by polymerase chain reaction (PCR). Sets of degenerate oligonucleotide primers, based on previously published conserved amino acid sequences of plant AGP, were used for synthesis and amplification of the cDNAs. For each of the endosperm, roots and leaves, restriction analysis of the PCR products (ca. 550 nucleotides each) revealed heterogeneity, suggesting the presence of three transcripts for AGP in the endosperm and roots, and up to two AGP transcripts in the leaf tissue. Based on the derived amino acid sequences, two clones from the endosperm, beps and bepl, were identified as coding for the small and large subunit of AGP, respectively, while a leaf transcript (blpl) encoded the putative large subunit of AGP. There was about 50% identity between the endosperm clones, and both of them were about 60% identical to the leaf cDNA. Northern blot analysis indicated that beps and bepl are expressed in both the endosperm and roots, while blpl is detectable only in leaves. Application of the PCR technique in studies on gene structure and gene expression of plant AGP is discussed.
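To illustrate what a degenerate oligonucleotide primer represents, the sketch below expands a primer written in standard IUPAC ambiguity codes into every concrete sequence it stands for. The example primer is hypothetical (it covers the codons for Gly-Tyr-Lys) and is not one of the primers used in the study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences present in a degenerate primer pool."""
    count = 1
    for code in primer:
        count *= len(IUPAC[code])
    return count


def expand(primer: str) -> list[str]:
    """List every concrete sequence encoded by the degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[code] for code in primer))]


# Hypothetical primer: GGN (Gly), TAY (Tyr), AAR (Lys).
EXAMPLE_PRIMER = "GGNTAYAAR"
print(degeneracy(EXAMPLE_PRIMER))
print(expand(EXAMPLE_PRIMER)[:4])
```

Primer pools with lower degeneracy are generally preferred, which is one reason conserved regions rich in amino acids with few codons are attractive targets.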
Cloning, sequencing and expression in MEL cells of a cDNA encoding the mouse ribosomal protein S5.

PubMed

Vanegas, N; Castañeda, V; Santamaría, D; Hernández, P; Schvartzman, J B; Krimer, D B

1997-06-05

We describe the isolation and characterization of a cDNA encoding the mouse S5 ribosomal protein. It was isolated from a MEL (murine erythroleukemia) cell cDNA library by differential hybridization as a sequence down-regulated during HMBA-induced differentiation. Northern series analysis showed that S5 mRNA expression is reduced 5-fold throughout the differentiation process. The mouse S5 mRNA is 760 bp long and encodes a 204 amino acid protein with 94% homology with the human and rat S5.
Cloning and sequence analysis of Bufo arenarum oviductin cDNA and detection of its orthologous gene expression in the mouse female reproductive tract.

PubMed

Barrera, Daniel; Valdecantos, Pablo A; García, E Vanesa; Miceli, Dora C

2012-02-01

The glycoprotein envelope surrounding the Bufo arenarum egg exists in different functional forms. Conversion between types involves proteolysis of specific envelope glycoproteins. When the egg is released from the ovary, the envelope cannot be penetrated by sperm. Conversion to a penetrable state occurs during passage through the pars recta portion of the oviduct, where oviductin, a serine protease with trypsin-like substrate specificity, hydrolyzes two kinds of envelope glycoproteins: gp84 and gp55. The nucleotide sequence of a 3203 bp B. arenarum oviductin cDNA was obtained. Deduced amino acid sequence showed a complete open reading frame encoding 980 amino acids. B. arenarum oviductin is a multi-domain protein with a protease domain at the N-terminal region followed by two CUB domains and toward the C-terminal region another protease domain, which lacked an active histidine site, and one CUB domain. Expression of ovochymase 2, the mammalian orthologue of amphibian oviductin, was assayed in mouse female reproductive tract. Ovochymase 2 mRNA was unnoticeable in the mouse oviduct but expression was remarkable in the uterus. The phylogenetic relationship between oviductin and ovochymase 2 opens the possibility of understanding the role of this enzyme in mammalian reproduction.
Control of total GFP expression by alterations to the 3′ region nucleotide sequence

PubMed Central

2013-01-01

Background: Previously, we distinguished the Escherichia coli type II cytoplasmic membrane translocation pathways of Tat, Yid, and Sec for unfolded and folded soluble target proteins. The translocation of folded protein to the periplasm for soluble expression via the Tat pathway was controlled by an N-terminal hydrophilic leader sequence. In this study, we investigated the effect of the hydrophilic C-terminal end and its nucleotide sequence on total and soluble protein expression. Results: The native hydrophilic C-terminal end of GFP was obtained by deleting the C-terminal peptide LeuGlu-6×His, derived from pET22b(+). The corresponding clones induced total and soluble GFP expression that was either slightly increased or dramatically reduced, apparently through reconstruction of the nucleotide sequence around the stop codon in the 3′ region. In the expression-induced clones, the hydrophilic C-terminus showed increased Tat pathway specificity for soluble expression. However, in the expression-reduced clone, after analyzing the role of the 5′ poly(A) coding sequence with a substituted synonymous codon, we proved that the longer 5′ poly(A) coding sequence interacted with the reconstructed 3′ region nucleotide sequence to create a new mRNA tertiary structure between the 5′ and 3′ regions, which resulted in reduced total GFP expression. Further, to recover the reduced expression by changing the 3′ nucleotide sequence, after replacing selected C-terminal 5′ codons and the stop codon in the ORF with synonymous codons, total GFP expression in most of the clones was recovered to the undeleted control level. The insertion of trinucleotides after the stop codon in the 3′-UTR recovered or reduced total GFP expression. RT-PCR revealed that the level of total protein expression was controlled by changes in translational or transcriptional regulation, which were induced or reduced by the substitution or insertion of 3′ region nucleotides. Conclusions: We found
Human ribosomal RNA gene: nucleotide sequence of the transcription initiation region and comparison of three mammalian genes.

PubMed Central

Financsek, I; Mizumoto, K; Mishima, Y; Muramatsu, M

1982-01-01

The transcription initiation site of the human ribosomal RNA gene (rDNA) was located by using the single-strand specific nuclease protection method and by determining the first nucleotide of the in vitro capped 45S preribosomal RNA. The sequence of 1,211 nucleotides surrounding the initiation site was determined. The sequenced region was found to consist of 75% G and C and to contain a number of short direct and inverted repeats and palindromes. By comparison of the corresponding initiation regions of three mammalian species, several conserved sequences were found upstream and downstream from the transcription starting point. Two short A + T-rich sequences are present on human, mouse, and rat ribosomal RNA genes between the initiation site and 40 nucleotides upstream, and a C + T cluster is located at a position around -60. At and downstream from the initiation site, a common sequence, T-AG-C-T-G-A-C-A-C-G-C-T-G-T-C-C-T-CT-T, was found in the three genes from position -1 through +18. The strong conservation of these sequences suggests their functional significance in rDNA. The S1 nuclease protection experiments with cloned rDNA fragments indicated the presence in human 45S RNA of molecules several hundred nucleotides shorter than the supposed primary transcript. The first 19 nucleotides of these molecules appear identical--except for one mismatch--to the nucleotide sequence of the 5' end of a supposed early processing product of the mouse 45S RNA. Images PMID:6954460
Hybridization-based antibody cDNA recovery for the production of recombinant antibodies identified by repertoire sequencing.

PubMed

Valdés-Alemán, Javier; Téllez-Sosa, Juan; Ovilla-Muñoz, Marbella; Godoy-Lozano, Elizabeth; Velázquez-Ramírez, Daniel; Valdovinos-Torres, Humberto; Gómez-Barreto, Rosa E; Martinez-Barnetche, Jesús

2014-01-01

High-throughput sequencing of the antibody repertoire is enabling a thorough analysis of B cell diversity and clonal selection, which may improve the novel antibody discovery process. Theoretically, an adequate bioinformatic analysis could allow identification of candidate antigen-specific antibodies, requiring their recombinant production for experimental validation of their specificity. Gene synthesis is commonly used for the generation of recombinant antibodies identified in silico. Novel strategies that bypass gene synthesis could offer more accessible antibody identification and validation alternatives. We developed a hybridization-based recovery strategy that targets the complementarity-determining region 3 (CDRH3) for the enrichment of cDNA of candidate antigen-specific antibody sequences. Ten clonal groups of interest were identified through bioinformatic analysis of the heavy chain antibody repertoire of mice immunized with hen egg white lysozyme (HEL). cDNA from eight of the targeted clonal groups was recovered efficiently, leading to the generation of recombinant antibodies. One representative heavy chain sequence from each clonal group recovered was paired with previously reported anti-HEL light chains to generate full antibodies, later tested for HEL-binding capacity. The recovery process proposed represents a simple and scalable molecular strategy that could enhance antibody identification and specificity assessment, enabling a more cost-efficient generation of recombinant antibodies.
Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array

PubMed Central

Fuller, Carl W.; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P. Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T.; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J.; Kasianowicz, John J.; Davis, Randy; Roever, Stefan; Church, George M.; Ju, Jingyue

2016-01-01

DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5′-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

Complete nucleotide sequence and genome organization of a novel allexivirus from alfalfa (Medicago sativa)

USDA-ARS's Scientific Manuscript database

A new species of the family Alphaflexiviridae provisionally named Alfalfa virus S (AVS) was diagnosed in alfalfa samples originating from Sudan. A complete nucleotide sequence of the viral genome consisting of 8,349 nucleotides excluding the 3’ poly(A) tail was determined by Illumina NGS technology ...
Molecular cloning of a cDNA encoding the precursor of adenoregulin from frog skin. Relationships with the vertebrate defensive peptides, dermaseptins.

PubMed

Amiche, M; Ducancel, F; Lajeunesse, E; Boulain, J C; Ménez, A; Nicolas, P

1993-03-31

Adenoregulin has recently been isolated from Phyllomedusa skin as a 33-amino-acid peptide which enhanced binding of agonists to the A1 adenosine receptor. In order to study the structure of the precursor of adenoregulin we constructed a cDNA library from mRNAs extracted from the skin of Phyllomedusa bicolor. We detected the complete nucleotide sequence of a cDNA encoding the adenoregulin biosynthetic precursor. The deduced sequence of the precursor is 81 amino acids long, exhibits a putative signal sequence at the NH2 terminus and contains a single copy of the biologically active peptide at the COOH terminus. Structural and conformational homologies that are observed between adenoregulin and the dermaseptins, antimicrobial peptides exhibiting strong membranolytic activities against various pathogenic agents, suggest that adenoregulin is an additional member of the growing family of cytotropic antimicrobial peptides that allow vertebrate animals to defend themselves against microorganisms. As such, the adenosine receptor regulating activity of adenoregulin could be due to its ability to interact with and disrupt membrane lipid bilayers.
Sequence of interleukin-2 isolated from human placental poly A+ RNA: possible role in maintenance of fetal allograft.

PubMed

Chernicky, C L; Tan, H; Burfeind, P; Ilan, J; Ilan, J

1996-02-01

There are several cell types within the placenta that produce cytokines which can contribute to the regulatory mechanisms that ensure normal pregnancy. The immunological milieu at the maternofetal interface is considered to be crucial for survival of the fetus. Interleukin-2 (IL-2) is expressed by the syncytiotrophoblast, the cell layer between the mother and the fetus. IL-2 appears to be a key factor in maintenance of pregnancy. Therefore, it was important to determine the sequence of human placental interleukin-2. Direct sequencing of human placental IL-2 cDNA was determined for the coding region. Subclone sequencing was carried out for the 5'- and 3'-untranslated regions (5'-UTR and 3'-UTR). The 5'-UTR for human placental IL-2 cDNA is 294 bp, which is 247 nucleotides longer than that reported for cDNA IL-2 derived from T cells. The sequence of the coding region is identical to that reported for T cell IL-2, while sequence analysis of the polymerase chain reaction (PCR) product showed that the cDNA from the 3' end was the same as that reported for cDNA from T cells. Human placental IL-2 cDNA is 1,028 base pairs (excluding the poly A tail), which is 247 bp longer at the 5' end than that reported for IL-2 T cell cDNA. Therefore, the extended 5'-UTR of the placental IL-2 cDNA may be a consequence of alternative promoter utilization in the placenta.
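The lengths quoted above imply the size of the T cell-derived counterparts by simple subtraction. The short sketch below makes that arithmetic explicit; the derived T cell values are inferred here from the abstract's own figures and are not stated in it directly.

```python
# Figures quoted in the abstract (base pairs / nucleotides).
PLACENTAL_5UTR = 294    # 5'-UTR of placental IL-2 cDNA
EXTENSION = 247         # extra length at the 5' end relative to T cell cDNA
PLACENTAL_CDNA = 1028   # placental IL-2 cDNA, excluding the poly(A) tail

# Values implied for the T cell cDNA (derived, not quoted in the abstract).
tcell_5utr = PLACENTAL_5UTR - EXTENSION
tcell_cdna = PLACENTAL_CDNA - EXTENSION

print(f"Implied T cell 5'-UTR: {tcell_5utr} nt")
print(f"Implied T cell cDNA (excluding poly(A)): {tcell_cdna} bp")
```

Both differences use the same 247-nucleotide offset, which is consistent with the abstract's statement that the coding region and 3' end match the T cell cDNA.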
Sequence and pattern of expression of a bovine homologue of a human mitochondrial transport protein associated with Grave's disease.

PubMed

Fiermonte, G; Runswick, M J; Walker, J E; Palmieri, F

1992-01-01

A human cDNA has been isolated previously from a thyroid library with the aid of serum from a patient with Grave's disease. It encodes a protein belonging to the mitochondrial metabolite carrier family, referred to as the Grave's disease carrier protein (GDC). Using primers based on this sequence, overlapping cDNAs encoding the bovine homologue of the GDC have been isolated from total bovine heart poly(A)+ cDNA. The bovine protein is 18 amino acids shorter than the published human sequence, but if a frame shift requiring the removal of one nucleotide is introduced into the human cDNA sequence, the human and bovine proteins become identical in their C-terminal regions, and 308 out of 330 amino acids are conserved over their entire sequences. The bovine cDNA has been used to investigate the expression of the GDC in various bovine tissues. In the tissues that were examined, the GDC is most strongly expressed in the thyroid, but substantial amounts of its mRNA were also detected in liver, lung and kidney, and lesser amounts in heart and skeletal muscle.
Nucleotide sequence and proposed secondary structure of Columnea latent viroid: a natural mosaic of viroid sequences.

PubMed Central

Hammond, R; Smith, D R; Diener, T O

1989-01-01

The Columnea latent viroid (CLV) occurs latently in certain Columnea erythrophae plants grown commercially. In potato and tomato, CLV causes potato spindle tuber viroid (PSTV)-like symptoms. Its nucleotide sequence and proposed secondary structure reveal that CLV consists of a single-stranded circular RNA of 370 nucleotides which can assume a rod-like structure with extensive base-pairing characteristic of all known viroids. The electrophoretic mobility of circular CLV under nondenaturing conditions suggests a potential tertiary structure. CLV contains extensive sequence homologies to the PSTV group of viroids but contains a central conserved region identical to that of hop stunt viroid (HSV). CLV also shares some biological properties with each of the two types of viroids. Most probably, CLV is the result of intracellular RNA recombination between an HSV-type and one or more PSTV-type viroids replicating in the same plant. Images PMID:2602114
PMS2 gene mutational analysis: direct cDNA sequencing to circumvent pseudogene interference.

PubMed

Wimmer, Katharina; Wernstedt, Annekatrin

2014-01-01

The presence of highly homologous pseudocopies can compromise the mutation analysis of a gene of interest. In particular, when using PCR-based strategies, pseudogene co-amplification has to be effectively prevented. This is often achieved by using primers designed to be parental gene specific according to the reference sequence and by applying stringent PCR conditions. However, there are cases in which this approach is of limited utility. For example, it has been shown that the PMS2 gene exchanges sequences with one of its pseudogenes, named PMS2CL. This results in functional PMS2 alleles containing pseudogene-derived sequences at their 3'-end and in nonfunctional PMS2CL pseudogene alleles that contain gene-derived sequences. Hence, the paralogues cannot be distinguished according to the reference sequence. This shortcoming can be effectively circumvented by using direct cDNA sequencing. This approach is based on the selective amplification of PMS2 transcripts in two overlapping 1.6-kb RT-PCR products. In addition to avoiding pseudogene co-amplification and allele dropout, this method also has the advantage of effectively identifying deletions, splice mutations, and de novo retrotransposon insertions that escape detection by most DNA-based mutation analysis protocols.
A nucleotide sequence comparison of coxsackievirus B4 isolates from aquatic samples and clinical specimens.

PubMed Central

Hughes, M. S.; Hoey, E. M.; Coyle, P. V.

1993-01-01

Ten coxsackievirus B4 (CVB4) strains isolated from clinical and environmental sources in Northern Ireland in 1985-7 were compared at the nucleotide sequence level. Dideoxynucleotide sequencing of a polymerase chain reaction (PCR) amplified fragment, spanning the VP1/P2A genomic region, classified the isolates into two distinct groups or genotypes as defined by Rico-Hesse and colleagues for poliovirus type 1. Isolates within each group shared approximately 99% sequence identity at the nucleotide level whereas ≤ 86% sequence identity was shared between groups. One isolate derived from a clinical specimen in 1987 was grouped with six CVB4 isolates recovered from the aquatic environment in 1986-7. The second group comprised CVB4 isolates from clinical specimens in 1985-6. Both groups were different at the nucleotide level from the prototype strain isolated in 1950. It was concluded that the method could be used to sub-type CVB4 isolates and would be of value in epidemiological studies of CVB4. Predicted amino acid sequences revealed non-conservation of the tyrosine residue at the VP1/P2A cleavage site but were of little value in distinguishing CVB4 variants. PMID:8386098
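The grouping of isolates by pairwise nucleotide identity can be sketched as follows. This is an illustration only: the toy sequences, the isolate names and the 90% cutoff are hypothetical, and single-linkage grouping is an assumption made here rather than the procedure described in the paper.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def group_by_identity(seqs: dict[str, str], threshold: float) -> list[list[str]]:
    """Single-linkage grouping: an isolate joins every group containing
    at least one member that it matches at or above the threshold."""
    groups: list[list[str]] = []
    for name, seq in seqs.items():
        linked = [g for g in groups
                  if any(percent_identity(seq, seqs[m]) >= threshold for m in g)]
        merged = [name]
        for g in linked:
            merged.extend(g)
            groups.remove(g)
        groups.append(sorted(merged))
    return groups


# Hypothetical 20-nt toy fragments (not real CVB4 sequence data).
TOY_ISOLATES = {
    "A1": "ACGTACGTACGTACGTACGT",
    "A2": "ACGTACGTACGTACGTACGA",
    "B1": "TTGTACCTACGAACGTTCGT",
    "B2": "TTGTACCTACGAACGTTCGA",
}
print(group_by_identity(TOY_ISOLATES, threshold=90.0))
```

With real data the fragments would first need to be aligned, and the cutoff would be chosen to fall between the within-group and between-group identity ranges.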
The human myelin oligodendrocyte glycoprotein (MOG) gene: Complete nucleotide sequence and structural characterization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Paule Roth, M.; Malfroy, L.; Offer, C.

1995-07-20

Human myelin oligodendrocyte glycoprotein (MOG), a myelin component of the central nervous system, is a candidate target antigen for autoimmune-mediated demyelination. We have isolated and sequenced part of a cosmid clone that contains the entire human MOG gene. The primary nuclear transcript, extending from the putative start of transcription to the site of poly(A) addition, is 15,561 nucleotides in length. The human MOG gene contains 8 exons, separated by 7 introns; canonical intron/exon boundary sites are observed at each junction. The introns vary in size from 242 to 6484 bp and contain numerous repetitive DNA elements, including 14 Alu sequences within 3 introns. Another Alu element is located in the 3′-untranslated region of the gene. Alu sequences were classified with respect to subfamily assignment. Seven hundred sixty-three nucleotides 5′ of the transcription start and 1214 nucleotides 3′ of the poly(A) addition sites were also sequenced. The 5′-flanking region revealed the presence of several consensus sequences that could be relevant in the transcription of the MOG gene, in particular binding sites in common with other myelin gene promoters. Two polymorphic intragenic dinucleotide (CA)n and tetranucleotide (TAAA)n repeats were identified and may provide genetic marker tools for association and linkage studies. 50 refs., 3 figs., 3 tabs.
Nucleotide Sequence Diversity and Linkage Disequilibrium of Four Nuclear Loci in Foxtail Millet (Setaria italica).

PubMed

He, Shui-Lian; Yang, Yang; Morrell, Peter L; Yi, Ting-Shuang

2015-01-01

Foxtail millet (Setaria italica (L.) Beauv) is one of the earliest domesticated grains, which has been cultivated in northern China by 8,700 years before present (YBP) and across Eurasia by 4,000 YBP. Owing to a small genome and diploid nature, foxtail millet is a tractable model crop for studying functional genomics of millets and bioenergy grasses. In this study, we examined nucleotide sequence diversity, geographic structure, and levels of linkage disequilibrium at four nuclear loci (ADH1, G3PDH, IGS1 and TPI1) in representative samples of 311 landrace accessions across its cultivated range. Higher levels of nucleotide sequence and haplotype diversity were observed in samples from China relative to other sampled regions. Genetic assignment analysis classified the accessions into seven clusters based on nucleotide sequence polymorphisms. Intralocus LD decayed rapidly to half the initial value within ~1.2 kb or less.
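Nucleotide sequence diversity is commonly summarized as the average number of pairwise differences per site (often written π). The sketch below computes that quantity for a set of aligned sequences; the toy alignment is hypothetical, and the assumption that this matches the estimator used in the study is ours.

```python
from itertools import combinations


def nucleotide_diversity(aligned: list[str]) -> float:
    """Average pairwise differences per site across all sequence pairs."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    total_diffs = sum(
        sum(x != y for x, y in zip(a, b)) for a, b in combinations(aligned, 2)
    )
    n_pairs = len(aligned) * (len(aligned) - 1) // 2
    return total_diffs / n_pairs / length


def haplotype_count(aligned: list[str]) -> int:
    """Number of distinct sequences (haplotypes) in the sample."""
    return len(set(aligned))


# Hypothetical toy alignment of four 10-nt sequences.
TOY_ALIGNMENT = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGTTCGTAA"]
print(nucleotide_diversity(TOY_ALIGNMENT), haplotype_count(TOY_ALIGNMENT))
```

Comparing this value between regional subsets of accessions is one simple way to express a statement such as higher diversity in one region than another.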
New Approaches to Attenuated Hepatitis a Vaccine Development: Cloning and Sequencing of Cell-Culture Adapted Viral cDNA.

DTIC Science & Technology

1987-10-13

after multiple passages in vivo and in vitro. J. Gen. Virol. 67, 1741-1744. Sabin, A.B. (1985). Oral poliovirus vaccine: history of its development... NEW APPROACHES TO ATTENUATED HEPATITIS A VACCINE DEVELOPMENT: CLONING AND SEQUENCING OF CELL-CULTURE ADAPTED VIRAL cDNA. ANNUAL REPORT... TITLE (Include Security Classification): New Approaches to Attenuated Hepatitis A Vaccine Development: Cloning and Sequencing of Cell
Nucleotide sequence analysis of the L gene of Newcastle disease virus: homologies with Sendai and vesicular stomatitis viruses.

PubMed Central

Yusoff, K; Millar, N S; Chambers, P; Emmerson, P T

1987-01-01

The nucleotide sequence of the L gene of the Beaudette C strain of Newcastle disease virus (NDV) has been determined. The L gene is 6704 nucleotides long and encodes a protein of 2204 amino acids with a calculated molecular weight of 248822. Mung bean nuclease mapping of the 5' terminus of the L gene mRNA indicates that the transcription of the L gene is initiated 11 nucleotides upstream of the translational start site. Comparison with the amino acid sequences of the L genes of Sendai virus and vesicular stomatitis virus (VSV) suggests that there are several regions of homology between the sequences. These data provide further evidence for an evolutionary relationship between the Paramyxoviridae and the Rhabdoviridae. A non-coding sequence of 46 nucleotides downstream of the presumed polyadenylation site of the L gene may be part of a negative strand leader RNA. Images PMID:3035486
Proteome-wide Identification of Novel Ceramide-binding Proteins by Yeast Surface cDNA Display and Deep Sequencing.

PubMed

Bidlingmaier, Scott; Ha, Kevin; Lee, Nam-Kyung; Su, Yang; Liu, Bin

2016-04-01

Although the bioactive sphingolipid ceramide is an important cell signaling molecule, relatively few direct ceramide-interacting proteins are known. We used an approach combining yeast surface cDNA display and deep sequencing technology to identify novel proteins binding directly to ceramide. We identified 234 candidate ceramide-binding protein fragments and validated binding for 20. Most (17) bound selectively to ceramide, although a few (3) bound to other lipids as well. Several novel ceramide-binding domains were discovered, including the EF-hand calcium-binding motif, the heat shock chaperonin-binding motif STI1, the SCP2 sterol-binding domain, and the tetratricopeptide repeat region motif. Interestingly, four of the verified ceramide-binding proteins (HPCA, HPCAL1, NCS1, and VSNL1) and an additional three candidate ceramide-binding proteins (NCALD, HPCAL4, and KCNIP3) belong to the neuronal calcium sensor family of EF hand-containing proteins. We used mutagenesis to map the ceramide-binding site in HPCA and to create a mutant HPCA that does not bind to ceramide. We demonstrated selective binding to ceramide by mammalian cell-produced wild type but not mutant HPCA. Intriguingly, we also identified a fragment from prostaglandin D2 synthase that binds preferentially to ceramide 1-phosphate. The wide variety of proteins and domains capable of binding to ceramide suggests that many of the signaling functions of ceramide may be regulated by direct binding to these proteins. Based on the deep sequencing data, we estimate that our yeast surface cDNA display library covers ∼60% of the human proteome and our selection/deep sequencing protocol can identify target-interacting protein fragments that are present at extremely low frequency in the starting library. Thus, the yeast surface cDNA display/deep sequencing approach is a rapid, comprehensive, and flexible method for the analysis of protein-ligand interactions, particularly for the study of non-protein ligands.
Opsin cDNA sequences of a UV and green rhodopsin of the satyrine butterfly Bicyclus anynana.

PubMed

Vanhoutte, K J A; Eggen, B J L; Janssen, J J M; Stavenga, D G

2002-11-01

The cDNAs of an ultraviolet (UV) and long-wavelength (LW) (green) absorbing rhodopsin of the bush brown Bicyclus anynana were partially identified. The UV sequence, encoding 377 amino acids, is 76-79% identical to the UV sequences of the papilionids Papilio glaucus and Papilio xuthus and the moth Manduca sexta. A dendrogram derived from aligning the amino acid sequences reveals an equidistant position of Bicyclus between Papilio and Manduca. The sequence of the green opsin cDNA fragment, which encodes 242 amino acids, represents six of the seven transmembrane regions. At the amino acid level, this fragment is more than 80% identical to the corresponding LW opsin sequences of Dryas, Heliconius, Papilio (rhodopsin 2) and Manduca. Whereas three LW absorbing rhodopsins were identified in the papilionid butterflies, only one green opsin was found in B. anynana.
Molecular cloning of actin genes in Trichomonas vaginalis and phylogeny inferred from actin sequences.

PubMed

Bricheux, G; Brugerolle, G

1997-08-01

The parasitic protozoan Trichomonas vaginalis is known to contain the ubiquitous and highly conserved protein actin. A genomic library and a cDNA library have been screened to identify and clone the actin gene(s) of T. vaginalis. The nucleotide sequence of one gene and its flanking regions have been determined. The open reading frame encodes a protein of 376 amino acids. The sequence is not interrupted by any introns and the promoter could be represented by a 10 bp motif close to a consensus motif also found upstream of most sequenced T. vaginalis genes. The five different clones isolated from the cDNA library have similar sequences and encode three actin proteins differing only by one or two amino acids. A phylogenetic analysis of 31 actin sequences by distance matrix and parsimony methods, using centractin as outgroup, gives congruent trees with Parabasala branching above Diplomonadida.
cDNA cloning and heterologous expression of a wheat proteinase inhibitor of subtilisin and chymotrypsin (WSCI) that interferes with digestive enzymes of insect pests.

PubMed

Di Gennaro, Simone; Ficca, Anna G; Panichi, Daniela; Poerio, Elia

2005-04-01

A cDNA encoding the proteinase inhibitor WSCI (wheat subtilisin/chymotrypsin inhibitor) was isolated by RT-PCR. Degenerate oligonucleotide primers were designed based on the amino acid sequence of WSCI and on the nucleotide sequence of the two homologous inhibitors (CI-2A and CI-2B) isolated from barley. For large-scale production, wsci cDNA was cloned into the E. coli vector pGEX-2T. The fusion protein GST-WSCI was efficiently produced in the bacterial expression system and, as the native inhibitor, was capable of inhibiting bacterial subtilisin, mammalian chymotrypsins and chymotrypsin-like activities present in crude extracts of a number of insect larvae (Helicoverpa armigera, Plodia interpunctella and Tenebrio molitor). The recombinant protein produced was also able to interfere with chymotrypsin-like activity isolated from immature wheat caryopses. These findings support a physiological role for this inhibitor during grain maturation.
[Replication of Streptomyces plasmids: the DNA nucleotide sequence of plasmid pSB 24.2].

PubMed

Bolotin, A P; Sorokin, A V; Aleksandrov, N N; Danilenko, V N; Kozlov, Iu I

1985-11-01

The nucleotide sequence of plasmid pSB 24.2, a natural deletion derivative of plasmid pSB 24.1 isolated from S. cyanogenus, was determined. The plasmid is 3706 nucleotide pairs long, with a G-C content of 73 per cent. Analysis of the sequence revealed a protein-encoding sequence of more than 1300 nucleotide pairs, the continuity of which is essential for replication of the plasmid. The analysis also revealed two A-T-rich regions, with a G-C content of less than 55 per cent, and a region with a branched hairpin structure. The results may be of value in investigating plasmid replication in actinomycetes and in experimental cloning of DNA with this plasmid as a vector.
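The kind of base-composition scan described here (overall G-C content plus regions below a 55 per cent threshold) can be sketched as follows. The window size, step, and toy sequence are assumptions chosen for illustration; the abstract does not say how the authors delimited the A-T-rich regions.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def low_gc_windows(seq, window=100, step=10, threshold=0.55):
    """Return (start, end, gc) for each window whose G+C fraction is below threshold."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        frac = gc_fraction(seq[start:start + window])
        if frac < threshold:
            hits.append((start, start + window, frac))
    return hits

# Toy sequence (not plasmid data): a GC-rich stretch, an AT-rich stretch, a GC-rich stretch.
toy = "GC" * 50 + "AT" * 50 + "GC" * 50
for start, end, frac in low_gc_windows(toy, window=100, step=50):
    print(start, end, round(frac, 2))
```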
Characterization and distribution of a maize cDNA encoding a peptide similar to the catalytic region of second messenger dependent protein kinases

NASA Technical Reports Server (NTRS)

Biermann, B.; Johnson, E. M.; Feldman, L. J.

1990-01-01

Maize (Zea mays) roots respond to a variety of environmental stimuli which are perceived by a specialized group of cells, the root cap. We are studying the transduction of extracellular signals by roots, particularly the role of protein kinases. Protein phosphorylation by kinases is an important step in many eukaryotic signal transduction pathways. As a first phase of this research we have isolated a cDNA encoding a maize protein similar to fungal and animal protein kinases known to be involved in the transduction of extracellular signals. The deduced sequence of this cDNA encodes a polypeptide containing amino acids corresponding to 33 out of 34 invariant or nearly invariant sequence features characteristic of protein kinase catalytic domains. The maize cDNA gene product is more closely related to the branch of serine/threonine protein kinase catalytic domains composed of the cyclic-nucleotide- and calcium-phospholipid-dependent subfamilies than to other protein kinases. Sequence identity is 35% or more between the deduced maize polypeptide and all members of this branch. The high structural similarity strongly suggests that catalytic activity of the encoded maize protein kinase may be regulated by second messengers, like that of all members of this branch whose regulation has been characterized. Northern hybridization with the maize cDNA clone shows a single 2400 base transcript at roughly similar levels in maize coleoptiles, root meristems, and the zone of root elongation, but the transcript is less abundant in mature leaves. In situ hybridization confirms the presence of the transcript in all regions of primary maize root tissue.
Cloning and expression of cDNA coding for bouganin.

PubMed

den Hartog, Marcel T; Lubelli, Chiara; Boon, Louis; Heerkens, Sijmie; Ortiz Buijsse, Antonio P; de Boer, Mark; Stirpe, Fiorenzo

2002-03-01

Bouganin is a ribosome-inactivating protein that recently was isolated from Bougainvillea spectabilis Willd. In this work, the cloning and expression of the cDNA encoding bouganin is described. From the cDNA, the amino-acid sequence was deduced, which correlated with the primary sequence data obtained by amino-acid sequencing of the native protein. Bouganin is synthesized as a pro-peptide consisting of 305 amino acids, the first 26 of which act as a leader signal while the 29 C-terminal amino acids are cleaved during processing of the molecule. The mature protein consists of 250 amino acids. Using the cDNA sequence encoding the mature protein of 250 amino acids, a recombinant protein was expressed, purified and characterized. The recombinant molecule had activity in a cell-free protein synthesis assay similar to, and toxicity on living cells comparable to, that of the isolated native bouganin.
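The processing arithmetic in this abstract (305 residues, minus a 26-residue leader and a 29-residue C-terminal extension, leaving 250) can be checked with a short sketch. The function name and the 1-based residue numbering are illustrative assumptions, not part of the paper.

```python
def mature_region(precursor_len, leader_len, c_terminal_len):
    """Return (first, last, length) of the mature chain, using 1-based residue numbers."""
    first = leader_len + 1
    last = precursor_len - c_terminal_len
    if last < first:
        raise ValueError("cleaved segments are longer than the precursor")
    return first, last, last - first + 1

# Values quoted in the abstract: 305-residue pro-peptide, 26-residue leader,
# 29 residues removed from the C-terminus.
first, last, length = mature_region(305, 26, 29)
print(first, last, length)
```

Under these assumptions the mature protein spans residues 27 to 276 of the precursor, which agrees with the stated length of 250 amino acids.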
Conditional poliovirus mutants made by random deletion mutagenesis of infectious cDNA.

PubMed Central

Kirkegaard, K; Nelsen, B

1990-01-01

Small deletions were introduced into DNA plasmids bearing cDNA copies of Mahoney type 1 poliovirus RNA. The procedure used was similar to that of P. Hearing and T. Shenk (J. Mol. Biol. 167:809-822, 1983), with modifications designed to introduce only one lesion randomly into each DNA molecule. Methods to map small deletions in either large DNA or RNA molecules were employed. Two poliovirus mutants, VP1-101 and VP1-102, were selected from mutagenized populations on the basis of their host range phenotype, showing a large reduction in the relative numbers of plaques on CV1 and HeLa cells compared with wild-type virus. The deletions borne by the mutant genomes were mapped to the region encoding the amino terminus of VP1. That these lesions were responsible for the mutant phenotypes was substantiated by reintroduction of the sequenced lesions into a wild-type poliovirus cDNA by deoxyoligonucleotide-directed mutagenesis. The deletion of nucleotides encoding amino acids 8 and 9 of VP1 was responsible for the VP1-101 phenotype; the VP1-102 defect was caused by the deletion of the sequences encoding the first four amino acids of VP1. The peptide sequence at the VP1-VP3 proteolytic cleavage site was altered from glutamine-glycine to glutamine-methionine in VP1-102; this apparently did not alter the proteolytic cleavage pattern. The biochemical defects resulting from these mutations are discussed in the accompanying report. Images PMID:2152811
Nucleotide sequence and regulatory studies of VGF, a nervous system-specific mRNA that is rapidly and relatively selectively induced by nerve growth factor.

PubMed

Salton, S R

1991-09-01

A nervous system-specific mRNA that is rapidly induced in PC12 cells to a greater extent by nerve growth factor (NGF) than by epidermal growth factor treatment has been cloned. The polypeptide deduced from the nucleic acid sequence of the NGF33.1 cDNA clone contains regions of amino acid sequence identity with that predicted by the cDNA clone VGF, and further analysis suggests that both NGF33.1 and VGF cDNA clones very likely correspond to the same mRNA (VGF). In this report both the nucleic acid sequence that corresponds to VGF mRNA and the polypeptide predicted by the NGF33.1 cDNA clone are presented. Genomic Southern analysis and database comparison did not detect additional sequences with high homology to the VGF gene. Induction of VGF mRNA by depolarization and phorbol 12-myristate 13-acetate treatment was greater than by serum stimulation or protein kinase A pathway activation. These studies suggest that VGF mRNA is induced to the greatest extent by NGF treatment and that VGF is one of the most rapidly regulated neuronal mRNAs identified in PC12 cells.

The complete nucleotide sequence and genome organization of a novel betaflexivirus infecting Citrullus lanatus.

PubMed

Xin, Min; Zhang, Peipei; Liu, Wenwen; Ren, Yingdang; Cao, Mengji; Wang, Xifeng

2017-10-01

The complete nucleotide sequence of a novel positive single-stranded (+ss) RNA virus, tentatively named watermelon virus A (WVA), was determined using a combination of three methods: RNA sequencing, small RNA sequencing, and Sanger sequencing. The full genome of WVA is comprised of 8,372 nucleotides (nt), excluding the poly (A) tail, and contains four open reading frames (ORFs). The largest ORF, ORF1 encodes a putative replication-associated polyprotein (RP) with three conserved domains. ORF2 and ORF4 encode a movement protein (MP) and coat protein (CP), respectively. The putative product encoded by ORF3, of an estimated molecular mass of 25 kDa, has no significant similarity with other proteins. Identity and phylogenetic analysis indicate that WVA is a new virus, closely related to members of the family Betaflexiviridae. However, the final taxonomic allocation of WVA within the family is yet to be determined.
Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

DOEpatents

McCutchen-Maloney, Sandra L.

2002-01-01

DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera which gives a decrease in signal.
Characterization of a tandemly repeated DNA sequence family originally derived by retroposition of tRNA(Glu) in the newt.

PubMed

Nagahashi, S; Endoh, H; Suzuki, Y; Okada, N

1991-11-20

A previous report from this laboratory showed that in vitro transcription of total genomic DNA of the newt Cynops pyrrhogaster resulted in a discrete sized 8 S RNA, which represented highly repetitive and transcribable sequences with a glutamic acid tRNA-like structure in the newt genome. We isolated four independent clones from a newt genomic library and determined the complete sequences of three 2000 to 2400 base-pair PstI fragments spanning the 8 S RNA gene. The glutamic acid tRNA-related segment in the 8 S RNA gene contains the CCA sequence expected as the 3' terminus of a tRNA molecule. Further, the 11 nucleotides located 13 nucleotides upstream from one of the two transcription initiation sites of the 8 S RNA were found to be repeated in the region upstream from the termination site, suggesting that the original unit, which is shorter than the 8 S RNA, was retrotransposed via cDNA intermediates from the PolIII transcript. In the upstream region of the 8 S RNA gene, a 360 nucleotide unit containing the glutamic acid tRNA-related segment was found to be duplicated (clones NE1 and NE10) or triplicated (clone NE3). Except for the difference in the number of the 360 nucleotide unit, the three sequences of the 2000 to 2400 base-pair PstI fragment were essentially the same with only a few mutations and minor deletions. Inverse polymerase chain reaction and sequence determination of the products, together with a Southern hybridization experiment, demonstrated that the family consists of a tandemly repeated unit of 3300, 3700 or 4100 base-pairs. Thus during evolution, this family in the newt was created by retroposition via cDNA intermediates, followed by duplication or triplication of the 360 nucleotide unit and multiplication of the 3300 to 4100 base-pair region at the DNA level.
Nucleotide sequence of the gene determining plasmid-mediated citrate utilization.

PubMed Central

Ishiguro, N; Sato, G

1985-01-01

The citrate utilization determinant from transposon Tn3411 has been cloned and sequenced, and its polypeptide products have been characterized in minicell experiments. The nucleotide sequence was determined for a 2,047-base-pair BglII restriction endonuclease fragment that includes the citrate determinant. This region contains an open reading frame that would encode a 431-amino-acid very hydrophobic polypeptide and which is preceded by a reasonable ribosomal binding site. However, the single polypeptide found in minicell experiments had an apparent molecular weight of 35,000 on sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Images PMID:2999087
Constructing and detecting a cDNA library for mites.

PubMed

Hu, Li; Zhao, YaE; Cheng, Juan; Yang, YuanJun; Li, Chen; Lu, ZhaoHui

2015-10-01

RNA extraction and construction of complementary DNA (cDNA) libraries for mites have been quite challenging because of difficulties in acquiring tiny living mites and breaking their hard chitin. The present study explores a better method of constructing a cDNA library for mites that will lay the foundation for transcriptome and molecular pathogenesis research. We selected Psoroptes cuniculi as an experimental subject and took the following steps to construct and verify the cDNA library. First, we combined liquid nitrogen grinding with TRIzol for total RNA extraction. Then, the switching mechanism at 5' end of the RNA transcript (SMART) technique was used to construct a full-length cDNA library. To evaluate the quality of the cDNA library, the library titer and recombination rate were calculated. The reliability of the cDNA library was assessed by sequencing and analyzing positive clones and genes amplified by specific primers. The results showed that the RNA concentration was 836 ng/μl and the absorbance ratio at 260/280 nm was 1.82. The library titer was 5.31 × 10^5 plaque-forming units (PFU)/ml and the recombination rate was 98.21%, indicating that the library was of good quality. In the 33 expressed sequence tags (ESTs) of P. cuniculi, two clones of 1656 and 1658 bp were almost identical with only three variable sites detected, which had an identity of 99.63% with that of Psoroptes ovis, indicating that the cDNA library was reliable. Further detection by specific primers demonstrated that the 553-bp Pso c II gene sequences of P. cuniculi had an identity of 98.56% with those of P. ovis, confirming that the cDNA library was not only reliable but also feasible.
Comparison of Human and Guinea Pig Acetylcholinesterase Sequences and Rates of Oxime-Assisted Reactivation

DTIC Science & Technology

2010-01-01

of appropriate animal model systems. For OP poisoning, the guinea pig (Cavia porcellus) is a commonly used animal model because guinea pigs more...endogenous bioscavenger in vivo. Although guinea pigs historically have been used to test OP poisoning therapies, it has been found recently that guinea pig AChE...transcribed mRNA encoding guinea pig AChE, amplified the resulting cDNA, and sequenced this product. The nucleotide and deduced amino acid sequences of
The complete nucleotide sequence of RNA beta from the type strain of barley stripe mosaic virus.

PubMed Central

Gustafson, G; Armour, S L

1986-01-01

The complete nucleotide sequence of RNA beta from the type strain of barley stripe mosaic virus (BSMV) has been determined. The sequence is 3289 nucleotides in length and contains four open reading frames (ORFs) which code for proteins of Mr 22,147 (ORF1), Mr 58,098 (ORF2), Mr 17,378 (ORF3), and Mr 14,119 (ORF4). The predicted N-terminal amino acid sequence of the polypeptide encoded by the ORF nearest the 5'-end of the RNA (ORF1) is identical (after the initiator methionine) to the published N-terminal amino acid sequence of BSMV coat protein for 29 of the first 30 amino acids. ORF2 occupies the central portion of the coding region of RNA beta and ORF3 is located at the 3'-end. The ORF4 sequence overlaps the 3'-region of ORF2 and the 5'-region of ORF3 and differs in codon usage from the other three RNA beta ORFs. The coding region of RNA beta is followed by a poly(A) tract and a 238 nucleotide tRNA-like structure which are common to all three BSMV genomic RNAs. Images PMID:3754962
Construction of cDNA library and preliminary analysis of expressed sequence tags from Siberian tiger

PubMed Central

Liu, Chang-Qing; Lu, Tao-Feng; Feng, Bao-Gang; Liu, Dan; Guan, Wei-Jun; Ma, Yue-Hui

2010-01-01

In this study we successfully constructed a full-length cDNA library from the Siberian tiger, Panthera tigris altaica, the most well-known wild animal. Total RNA was extracted from Siberian tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.30×10^6 pfu/ml and 1.62×10^9 pfu/ml, respectively. The proportion of recombinants in the unamplified library was 90.5% and the average length of exogenous inserts was 1.13 kb. A total of 282 individual ESTs with sizes ranging from 328 to 1,142 bp were then analyzed. The BLASTX scores revealed that 53.9% of the sequences were classified as strong matches, 38.6% as nominal and 7.4% as weak matches. Of the ESTs, 28.0% were found to be related to enzyme/catalytic protein, 20.9% to metabolism, 13.1% to transport, 12.1% to signal transducer/cell communication, 9.9% to structure protein, 3.9% to immunity protein/defense metabolism, 3.2% to cell cycle, and 8.9% were classified as novel genes. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provides a useful platform for functional genomic research on Siberian tigers. PMID:20941376
Human retina-specific amine oxidase (RAO): cDNA cloning, tissue expression, and chromosomal mapping

DOE Office of Scientific and Technical Information (OSTI.GOV)

Imamura, Yutaka; Kubota, Ryo; Wang, Yimin

In search of candidate genes for hereditary retinal disease, we have employed a subtractive and differential cDNA cloning strategy and isolated a novel retina-specific cDNA. Nucleotide sequence analysis revealed an open reading frame of 2187 bp, which encodes a 729-amino-acid protein with a calculated molecular mass of 80,644 Da. The putative protein contained a conserved domain of copper amine oxidase, which is found in various species from bacteria to mammals. It showed the highest homology to bovine serum amine oxidase, which is believed to control the level of serum biogenic amines. Northern blot analysis of human adult and fetal tissues revealed that the protein is expressed abundantly and specifically in retina as a 2.7-kb transcript. Thus, we considered this protein a human retina-specific amine oxidase (RAO). The RAO gene (AOC2) was mapped by fluorescence in situ hybridization to human chromosome 17q21. We propose that AOC2 may be a candidate gene for hereditary ocular diseases. 38 refs., 4 figs.
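The relationship between the quoted ORF length, residue count, and molecular mass can be checked with a small sketch. The figure of about 110 Da per residue is a common rule-of-thumb assumption, not the method the authors used to obtain 80,644 Da, and the function names are illustrative.

```python
# Rule-of-thumb average mass of one amino acid residue (an assumption for rough estimates).
AVERAGE_RESIDUE_MASS_DA = 110.0

def codons_in_orf(orf_bp):
    """Number of codons in an ORF whose length is given in base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3

def rough_mass_da(n_residues):
    """Very rough protein mass estimate from residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA

residues = codons_in_orf(2187)      # ORF length quoted in the abstract
estimate = rough_mass_da(residues)  # compare with the reported 80,644 Da
print(residues, estimate)
```

The 2187 bp figure corresponds to exactly 729 codons, matching the stated protein length, and the rough estimate lands within about 1% of the calculated mass reported in the abstract.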
Predicted stem-loop structures and variation in nucleotide sequence of 3' noncoding regions among animal calicivirus genomes.

PubMed

Seal, B S; Neill, J D; Ridpath, J F

1994-07-01

Caliciviruses are nonenveloped with a polyadenylated genome of approximately 7.6 kb and a single capsid protein. The "RNA Fold" computer program was used to analyze 3'-terminal noncoding sequences of five feline calicivirus (FCV), rabbit hemorrhagic disease virus (RHDV), and two San Miguel sea lion virus (SMSV) isolates. The FCV 3'-terminal sequences are 40-46 nucleotides in length and 72-91% similar. The FCV sequences were predicted to contain two possible duplex structures and one stem-loop structure with free energies of -2.1 to -18.2 kcal/mole. The RHDV genomic 3'-terminal RNA sequences are 54 nucleotides in length and share 49% sequence similarity to homologous regions of the FCV genome. The RHDV sequence was predicted to form two duplex structures in the 3'-terminal noncoding region with a single stem-loop structure, resembling that of FCV. In contrast, the SMSV 1 and 4 genomic 3'-terminal noncoding sequences were 185 and 182 nucleotides in length, respectively. Ten possible duplex structures were predicted with an average structural free energy of -35 kcal/mole. Sequence similarity between the two SMSV isolates was 75%. Furthermore, extensive cloverleaflike structures are predicted in the 3' noncoding region of the SMSV genome, in contrast to the predicted single stem-loop structures of FCV or RHDV.
Nucleotide sequence of an exceptionally long 5.8S ribosomal RNA from Crithidia fasciculata.

PubMed Central

Schnare, M N; Gray, M W

1982-01-01

In Crithidia fasciculata, a trypanosomatid protozoan, the large ribosomal subunit contains five small RNA species (e, f, g, i, j) in addition to 5S rRNA [Gray, M.W. (1981) Mol. Cell. Biol. 1, 347-357]. The complete primary sequence of species i is shown here to be pAACGUGUmCGCGAUGGAUGACUUGGCUUCCUAUCUCGUUGA ... AGAmACGCAGUAAAGUGCGAUAAGUGGUApsiCAAUUGmCAGAAUCAUUCAAUUACCGAAUCUUUGAACGAAACGG ... CGCAUGGGAGAAGCUCUUUUGAGUCAUCCCCGUGCAUGCCAUAUUCUCCAmGUGUCGAA(C)OH. This sequence establishes that species i is a 5.8S rRNA, despite its exceptional length (171-172 nucleotides). The extra nucleotides in C. fasciculata 5.8S rRNA are located in a region whose primary sequence and length are highly variable among 5.8S rRNAs, but which is capable of forming a stable hairpin loop structure (the "G+C-rich hairpin"). The sequence of C. fasciculata 5.8S rRNA is no more closely related to that of another protozoan, Acanthamoeba castellanii, than it is to representative 5.8S rRNA sequences from the other eukaryotic kingdoms, emphasizing the deep phylogenetic divisions that seem to exist within the Kingdom Protista. Images PMID:7079176
Nucleotide sequence of the gag gene and gag-pol junction of feline leukemia virus.

PubMed Central

Laprevotte, I; Hampe, A; Sherr, C J; Galibert, F

1984-01-01

The nucleotide sequence of the gag gene of feline leukemia virus and its flanking sequences were determined and compared with the corresponding sequences of two strains of feline sarcoma virus and with that of the Moloney strain of murine leukemia virus. A high degree of nucleotide sequence homology between the feline leukemia virus and murine leukemia virus gag genes was observed, suggesting that retroviruses of domestic cats and laboratory mice have a common, proximal evolutionary progenitor. The predicted structure of the complete feline leukemia virus gag gene precursor suggests that the translation of nonglycosylated and glycosylated gag gene polypeptides is initiated at two different AUG codons. These initiator codons fall in the same reading frame and are separated by a 222-base-pair segment which encodes an amino terminal signal peptide. The nucleotide sequence predicts the order of amino acids in each of the individual gag-coded proteins (p15, p12, p30, p10), all of which derive from the gag gene precursor. Stable stem-and-loop secondary structures are proposed for two regions of viral RNA. The first falls within sequences at the 5' end of the viral genome, together with adjacent palindromic sequences which may play a role in dimer linkage of RNA subunits. The second includes coding sequences at the gag-pol junction and is proposed to be involved in translation of the pol gene product. Sequence analysis of the latter region shows that the gag and pol genes are translated in different reading frames. Classical consensus splice donor and acceptor sequences could not be localized to regions which would permit synthesis of the expected gag-pol precursor protein. Alternatively, we suggest that the pol gene product (RNA-dependent DNA polymerase) could be translated by a frameshift suppressing mechanism which could involve cleavage modification of stems and loops in a manner similar to that observed in tRNA processing. PMID:6328019
Direct sequencing of hepatitis A virus and norovirus RT-PCR products from environmentally contaminated oyster using M13-tailed primers.

PubMed

Williams-Woods, Jacquelina; González-Escalona, Narjol; Burkhardt, William

2011-12-01

Human norovirus (HuNoV) and hepatitis A virus (HAV) are recognized as leading causes of non-bacterial foodborne illness in the United States. DNA sequencing is generally considered the standard for accurate viral genotyping in support of epidemiological investigations. Due to the genetic diversity of noroviruses (NoV), degenerate primer sets are often used in conventional reverse transcription (RT) PCR and real-time RT-quantitative PCR (RT-qPCR) for the detection of these viruses, and cDNA fragments are generally cloned prior to sequencing. HAV detection methods that are sensitive and specific for real-time RT-qPCR yield small fragments of 89-150 bp, which can be difficult to sequence. In order to overcome these obstacles, norovirus and HAV primers were tailed with M13 forward and reverse primer sequences. This modification increases the sequenced product size and allows for direct sequencing of the amplicons utilizing complementary M13 primers. HuNoV and HAV cDNA products from environmentally contaminated oysters were analyzed using this method. Alignments of the sequenced samples revealed ≥95% nucleotide identities. Tailing NoV and HAV primers with M13 sequence increases the cDNA product size, offers an alternative to cloning, and allows for rapid, accurate and direct sequencing of cDNA products produced by conventional or real-time RT-qPCR assays. Published by Elsevier B.V.
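The primer-tailing idea can be sketched in a few lines. The two tail sequences below are the widely used universal M13 forward (-20) and M13 reverse 17-mers; the abstract does not state which M13 sequences the authors used, so treat them as an assumption. The target-specific primer is a hypothetical placeholder.

```python
# Widely used universal M13 primer sequences (assumed; not confirmed by the abstract).
M13_FORWARD = "GTAAAACGACGGCCAGT"
M13_REVERSE = "CAGGAAACAGCTATGAC"

def tail_primer(tail, target_specific):
    """Prepend an M13 tail to the 5' end of a target-specific primer."""
    return tail + target_specific

def tailed_amplicon_length(amplicon_bp, fwd_tail=M13_FORWARD, rev_tail=M13_REVERSE):
    """Length of the PCR product once both primers carry a 5' tail."""
    return amplicon_bp + len(fwd_tail) + len(rev_tail)

# Hypothetical target-specific forward primer, for illustration only.
hypothetical_forward = "ACGTTGCAAGTCCATGGA"
print(tail_primer(M13_FORWARD, hypothetical_forward))

# The 89-150 bp size range quoted in the abstract, before and after tailing.
for size in (89, 150):
    print(size, tailed_amplicon_length(size))
```

With two 17-nucleotide tails, each product grows by 34 bp, and every amplicon gains the same pair of priming sites for sequencing regardless of the target.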
Iterative Correction of Reference Nucleotides (iCORN) using second generation sequencing technology.

PubMed

Otto, Thomas D; Sanders, Mandy; Berriman, Matthew; Newbold, Chris

2010-07-15

The accuracy of reference genomes is important for downstream analysis but a low error rate requires expensive manual interrogation of the sequence. Here, we describe a novel algorithm (Iterative Correction of Reference Nucleotides) that iteratively aligns deep coverage of short sequencing reads to correct errors in reference genome sequences and evaluate their accuracy. Using Plasmodium falciparum (81% A + T content) as an extreme example, we show that the algorithm is highly accurate and corrects over 2000 errors in the reference sequence. We give examples of its application to numerous other eukaryotic and prokaryotic genomes and suggest additional applications. The software is available at http://icorn.sourceforge.net
Inferring epidemiological dynamics of infectious diseases using Tajima's D statistic on nucleotide sequences of pathogens.

PubMed

Kim, Kiyeon; Omori, Ryosuke; Ito, Kimihito

2017-12-01

The estimation of the basic reproduction number is essential to understand epidemic dynamics, and time series data of infected individuals are usually used for the estimation. However, such data are not always available. Methods to estimate the basic reproduction number using genealogy constructed from nucleotide sequences of pathogens have been proposed so far. Here, we propose a new method to estimate epidemiological parameters of outbreaks using the time series change of Tajima's D statistic on the nucleotide sequences of pathogens. To relate the time evolution of Tajima's D to the number of infected individuals, we constructed a parsimonious mathematical model describing both the transmission process of pathogens among hosts and the evolutionary process of the pathogens. As a case study we applied this method to the field data of nucleotide sequences of pandemic influenza A (H1N1) 2009 viruses collected in Argentina. The Tajima's D-based method estimated basic reproduction number to be 1.55 with 95% highest posterior density (HPD) between 1.31 and 2.05, and the date of epidemic peak to be 10th July with 95% HPD between 22nd June and 9th August. The estimated basic reproduction number was consistent with estimation by birth-death skyline plot and estimation using the time series of the number of infected individuals. These results suggested that Tajima's D statistic on nucleotide sequences of pathogens could be useful to estimate epidemiological parameters of outbreaks. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
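For readers unfamiliar with the statistic, the sketch below computes the classical Tajima's D (Tajima, 1989) from a set of aligned sequences. It illustrates only the summary statistic itself, not the authors' model relating its time course to epidemic dynamics; the toy alignment and function name are assumptions.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Return (pi, S, D) for a list of equal-length aligned sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences (variance is zero for n < 4)")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    d = (pi - seg_sites / a1) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))
    return pi, seg_sites, d

# Toy alignment (not real influenza data): every variant is a singleton.
print(tajimas_d(["AAAAA", "AAAAT", "AAATA", "AATAA"]))
```

An excess of rare variants, as in the toy alignment, pushes D below zero, which is the pattern typically associated with a growing population.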
The Complete Nucleotide Sequence of the Mitochondrial Genome of Bactrocera minax (Diptera: Tephritidae)

PubMed Central

Zhang, Bin; Nardi, Francesco; Hull-Sanders, Helen; Wan, Xuanwu; Liu, Yinghong

2014-01-01

The complete 16,043 bp mitochondrial genome (mitogenome) of Bactrocera minax (Diptera: Tephritidae) has been sequenced. The genome encodes 37 genes usually found in insect mitogenomes. The mitogenome information for B. minax was compared to the homologous sequences of Bactrocera oleae, Bactrocera tryoni, Bactrocera philippinensis, Bactrocera carambolae, Bactrocera papayae, Bactrocera dorsalis, Bactrocera correcta, Bactrocera cucurbitae and Ceratitis capitata. The analysis indicated the structure and organization are typical of, and similar to, the nine closely related species mentioned above, although it contains the lowest genome-wide A+T content (67.3%). Four short intergenic spacers with a high degree of conservation among the nine tephritid species mentioned above and B. minax were observed, which also have clear counterparts in the control regions (CRs). Correlation analysis among these ten tephritid species revealed close positive correlations among the A+T content of zero-fold degenerate sites (P0FD), the ratio of nucleotide substitution frequency at P0FD sites to all degenerate sites (zero-fold degenerate sites, two-fold degenerate sites and four-fold degenerate sites) and amino acid sequence distance (ASD). Further, significant positive correlation was observed between the A+T content of four-fold degenerate sites (P4FD) and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites; however, we found significant negative correlation between ASD and the A+T content of P4FD, and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites. A higher nucleotide substitution frequency at non-synonymous sites compared to synonymous sites was observed in nad4, the first time this has been observed in an insect mitogenome. A poly(T) stretch at the 5′ end of the CR followed by a [TA(A)]n-like stretch was also found. In addition, a highly conserved G+A-rich sequence block was observed in front of the
A new single-nucleotide polymorphism database for rainbow trout generated through whole genome re-sequencing

USDA-ARS's Scientific Manuscript database

Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout, SNP discovery has been done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL), RNA sequencing, and whole...
Molecular identification and partial sequence analysis of an aryl hydrocarbon receptor from beluga (Delphinapterus leucas)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jensen, B.A.; Hahn, M.E.

1995-12-31

The aryl hydrocarbon receptor (AhR) mediates the effects of many common and potentially toxic organic hydrocarbons, including some polychlorinated biphenyls and dioxins. Since small cetaceans often inhabit industrially polluted coastal waters, comparison of the molecular structure and function of this protein in cetaceans with other marine and mammalian species is important for evaluating the sensitivity of cetaceans to these pollutants. An AhR protein has been identified in beluga liver by photoaffinity labeling. In the present study, the authors sought to clone and sequence an AhR cDNA from beluga as a prelude to studying its structure and function. Using reverse-transcription polymerase chain reaction (RT-PCR) and degenerate primers, a 515 base pair fragment was amplified, cloned and sequenced, revealing homology to the PAS domain (ligand binding and dimerization region) of AhRs from terrestrial mammals. This portion of the putative beluga AhR has 82% amino acid and 81% nucleotide sequence identity to the mouse AhR, and 63% amino acid and 64% nucleotide sequence identity to an AhR from the marine fish Fundulus heteroclitus. A beluga cDNA library was synthesized and is currently being screened with the PCR-generated fragment to obtain the complete coding sequence. This is the first molecular evidence of AhR presence in cetaceans.
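Percent-identity figures such as those quoted above are computed from a pairwise alignment. The sketch below shows one common convention, in which columns containing a gap are excluded from the comparison; the abstract does not say which convention or alignment program the authors used, and the toy sequences are not AhR data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical positions between two aligned sequences, skipping gapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gapped columns are not counted under this convention
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Toy aligned nucleotide sequences with a single mismatch.
print(percent_identity("ACGTACGTAC", "ACGTACGTTC"))
```

Other conventions count gapped columns in the denominator, which lowers the reported value, so identity percentages from different studies are only comparable when the definition matches.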
Isolation of a cDNA for a Growth Factor of Vascular Endothelial Cells from Human Lung Cancer Cells: Its Identity with Insulin‐like Growth Factor II

PubMed Central

Hagiwara, Koichi; Kobayashi, Tatsuo; Tobita, Masato; Kikyo, Nobuaki; Yazaki, Yoshio

1995-01-01

We have found growth‐promoting activity for vascular endothelial cells in the conditioned medium of a human lung cancer cell line, T3M‐11. Purification and characterization of the growth‐promoting activity have been carried out using ammonium sulfate precipitation and gel‐exclusion chromatography. The activity migrated as a single peak just after ribonuclease. It did not bind to a heparin affinity column. These results suggest that the activity is not a heparin‐binding growth factor (including fibroblast growth factors) or a vascular endothelial growth factor. To identify the molecule exhibiting the growth‐promoting activity, a cDNA encoding the growth factor was isolated through functional expression cloning in COS‐1 cells from a cDNA library prepared from T3M‐11 cells. The nucleotide sequence encoded by the cDNA proved to be identical with that of insulin‐like growth factor II. PMID:7730145
An atypical topoisomerase II sequence from the slime mold Physarum polycephalum.

PubMed

Hugodot, Yannick; Dutertre, Murielle; Duguet, Michel

2004-01-21

We have determined the complete nucleotide sequence of the cDNA encoding DNA topoisomerase II from Physarum polycephalum. Using degenerate primers based on the conserved amino acid sequences of other eukaryotic enzymes, a 250-bp fragment was amplified by polymerase chain reaction (PCR). This fragment was used as a probe to screen a Physarum cDNA library. A partial cDNA clone was isolated that was truncated at the 3' end. Rapid amplification of cDNA ends (RACE)-PCR was employed to isolate the remaining portion of the gene. The complete sequence of 4613 bp contains an open reading frame of 4494 bp that codes for 1498 amino acid residues with a theoretical molecular weight of 167 kDa. The predicted amino acid sequence shares similarity with those of other eukaryotes and shows the highest degree of identity with the enzyme of Dictyostelium discoideum. However, the enzyme of P. polycephalum contains an atypical amino-terminal domain very rich in serine and proline, whose function is unknown. Remarkably, a mitochondrial targeting sequence and a nuclear localization signal were predicted at the amino and carboxy termini of the protein, respectively, as in the case of human topoisomerase III alpha. At the genomic level, the Physarum topoisomerase II gene spans a region of about 16 kbp, suggesting a large proportion of intronic sequences, an unusual situation for a lower eukaryote, whose genes are often free of introns. Finally, expression of topoisomerase II mRNA does not appear to depend significantly on the plasmodium cycle stage, possibly due to the lack of a G1 phase and/or to a mitochondrial localization of the enzyme.
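The lengths quoted in this abstract can be cross-checked with simple arithmetic. The sketch below uses only the figures given above (4613 bp cDNA, 4494 bp open reading frame); the average residue mass of 110 Da is a common rule of thumb introduced here as an assumption, so the resulting mass is only a rough estimate to set beside the reported 167 kDa.

```python
CDNA_BP = 4613          # complete cDNA length from the abstract
ORF_BP = 4494           # open reading frame length from the abstract
AVG_RESIDUE_DA = 110.0  # assumption: rule-of-thumb average residue mass


def codon_count(orf_bp):
    """Number of codons in an ORF; the length must be a multiple of three."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_bp // 3


residues = codon_count(ORF_BP)                   # 4494 / 3
untranslated_bp = CDNA_BP - ORF_BP               # bases outside the ORF
approx_kda = residues * AVG_RESIDUE_DA / 1000.0  # rough mass estimate
```

Here `residues` matches the 1498 residues stated in the abstract, and `untranslated_bp` gives the combined length of sequence outside the reading frame.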

Method for construction of normalized cDNA libraries

DOEpatents

Soares, Marcelo B.; Efstratiadis, Argiris

1998-01-01

This invention provides a method to normalize a directional cDNA library constructed in a vector that allows propagation in single-stranded circle form comprising: (a) propagating the directional cDNA library in single-stranded circles; (b) generating fragments complementary to the 3' noncoding sequence of the single-stranded circles in the library to produce partial duplexes; (c) purifying the partial duplexes; (d) melting and reassociating the purified partial duplexes to appropriate Cot; and (e) purifying the unassociated single-stranded circles, thereby generating a normalized cDNA library. This invention also provides normalized cDNA libraries generated by the above-described method and uses of the generated libraries.
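Step (d) relies on reassociation kinetics: abundant species find their complements faster than rare ones, so the pool left single-stranded is more even. The sketch below illustrates this with the ideal second-order model, in which a species at initial concentration C0 leaves C0 / (1 + k·C0·t) single-stranded after time t. It is an illustration only, not the patented procedure: the rate constant, time, and abundances are hypothetical values, and a single shared rate constant is assumed.

```python
def remaining_single_stranded(c0, k, t):
    """Ideal second-order reassociation: concentration still single-stranded."""
    return c0 / (1.0 + k * c0 * t)


def abundance_ratio(pool):
    """Ratio of the most to the least abundant species in a pool."""
    values = list(pool.values())
    return max(values) / min(values)


# Hypothetical starting abundances (arbitrary units) for three cDNA classes.
library = {"abundant": 1000.0, "intermediate": 10.0, "rare": 0.1}
K = 1.0   # hypothetical rate constant
T = 10.0  # hypothetical reassociation time

normalized = {name: remaining_single_stranded(c0, K, T)
              for name, c0 in library.items()}

ratio_before = abundance_ratio(library)
ratio_after = abundance_ratio(normalized)
```

In this toy model the spread between the most and least abundant classes shrinks from roughly four orders of magnitude to about twofold, which is the qualitative effect normalization aims for.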
Method for construction of normalized cDNA libraries

DOEpatents

Soares, M.B.; Efstratiadis, A.

1998-11-03

This invention provides a method to normalize a directional cDNA library constructed in a vector that allows propagation in single-stranded circle form comprising: (a) propagating the directional cDNA library in single-stranded circles; (b) generating fragments complementary to the 3' noncoding sequence of the single-stranded circles in the library to produce partial duplexes; (c) purifying the partial duplexes; (d) melting and reassociating the purified partial duplexes to appropriate Cot; and (e) purifying the unassociated single-stranded circles, thereby generating a normalized cDNA library. This invention also provides normalized cDNA libraries generated by the above-described method and uses of the generated libraries. 19 figs.
Synthesis and evaluations of an acid-cleavable, fluorescently labeled nucleotide as a reversible terminator for DNA sequencing.

PubMed

Tan, Lianjiang; Liu, Yazhi; Li, Xiaowei; Wu, Xin-Yan; Gong, Bing; Shen, Yu-Mei; Shao, Zhifeng

2016-02-11

An acid-cleavable linker based on a dimethylketal moiety was synthesized and used to connect a nucleotide with a fluorophore to produce a 3'-OH unblocked nucleotide analogue as an excellent reversible terminator for DNA sequencing by synthesis.
Epitopes of human testis-specific lactate dehydrogenase deduced from a cDNA sequence

DOE Office of Scientific and Technical Information (OSTI.GOV)

Millan, J.L.; Driscoll, C.E.; LeVan, K.M.

The sequence and structure of human testis-specific L-lactate dehydrogenase (LDHC4, LDHX; (L)-lactate:NAD+ oxidoreductase, EC 1.1.1.27) have been derived from analysis of a complementary DNA (cDNA) clone comprising the complete protein coding region of the enzyme. From the deduced amino acid sequence, human LDHC4 is as different from rodent LDHC4 (73% homology) as it is from human LDHA4 (76% homology) and porcine LDHB4 (68% homology). Subunit homologies are consistent with the conclusion that the LDHC gene arose by at least two independent duplication events. Furthermore, the lower degree of homology between mouse and human LDHC4 and the appearance of this isozyme late in evolution suggest a higher rate of mutation in the mammalian LDHC genes than in the LDHA and -B genes. Comparison of exposed amino acid residues of discrete antigenic determinants of mouse and human LDHC4 reveals significant differences. Knowledge of the human LDHC4 sequence will help design human-specific peptides useful in the development of a contraceptive vaccine.
Shark (Scyliorhinus torazame) metallothionein: cDNA cloning, genomic sequence, and expression analysis.

PubMed

Cho, Young Sun; Choi, Buyl Nim; Ha, En-Mi; Kim, Ki Hong; Kim, Sung Koo; Kim, Dong Soo; Nam, Yoon Kwon

2005-01-01

Novel metallothionein (MT) complementary DNA and genomic sequences were isolated from a cartilaginous shark species, Scyliorhinus torazame. The full-length open reading frame (ORF) of shark MT cDNA encoded 68 amino acids with a high cysteine content (29%). The genomic ORF sequence (932 bp) of shark MT isolated by polymerase chain reaction (PCR) comprised 3 exons with 2 intervening introns. Shark MT sequence shared many conserved features with other vertebrate MTs: overall amino acid identities of shark MT ranged from 47% to 57% with fish MTs, and 41% to 62% with mammalian MTs. However, in addition to these conserved characteristics, shark MT sequence exhibited some unique characteristics. It contained 4 extra amino acids (Lys-Ala-Gly-Arg) at the end of the beta-domain, which have not been reported in any other vertebrate MTs. The last amino acid residue at the C-terminus was Ser, which also has not been reported in fish and mammalian MTs. The MT messenger RNA levels in shark liver and kidney, assessed by semiquantitative reverse transcriptase PCR and RNA blot hybridization, were significantly affected by experimental exposures to heavy metals (cadmium, copper, and zinc). Generally, the transcriptional activation of shark MT gene was dependent on the dose (0-10 mg/kg body weight for injection and 0-20 microM for immersion) and duration (1-10 days); zinc was a more potent inducer than copper and cadmium.
Rapid and efficient cDNA library screening by self-ligation of inverse PCR products (SLIP).

PubMed

Hoskins, Roger A; Stapleton, Mark; George, Reed A; Yu, Charles; Wan, Kenneth H; Carlson, Joseph W; Celniker, Susan E

2005-12-02

cDNA cloning is a central technology in molecular biology. cDNA sequences are used to determine mRNA transcript structures, including splice junctions, open reading frames (ORFs) and 5'- and 3'-untranslated regions (UTRs). cDNA clones are valuable reagents for functional studies of genes and proteins. Expressed Sequence Tag (EST) sequencing is the method of choice for recovering cDNAs representing many of the transcripts encoded in a eukaryotic genome. However, EST sequencing samples a cDNA library at random, and it recovers transcripts with low expression levels inefficiently. We describe a PCR-based method for directed screening of plasmid cDNA libraries. We demonstrate its utility in a screen of libraries used in our Drosophila EST projects for 153 transcription factor genes that were not represented by full-length cDNA clones in our Drosophila Gene Collection. We recovered high-quality, full-length cDNAs for 72 genes and variously compromised clones for an additional 32 genes. The method can be used at any scale, from the isolation of cDNA clones for a particular gene of interest, to the improvement of large gene collections in model organisms and the human. Finally, we discuss the relative merits of directed cDNA library screening and RT-PCR approaches.
PCV: An Alignment Free Method for Finding Homologous Nucleotide Sequences and its Application in Phylogenetic Study.

PubMed

Kumar, Rajnish; Mishra, Bharat Kumar; Lahiri, Tapobrata; Kumar, Gautam; Kumar, Nilesh; Gupta, Rahul; Pal, Manoj Kumar

2017-06-01

Online retrieval of homologous nucleotide sequences from a given sequence database is commonly performed with existing alignment techniques. These techniques depend on local alignment and scoring matrices, whose reliability is limited by computational complexity and accuracy. To address this, this work offers a novel numerical representation of genes that can further help divide the data space into smaller partitions, supporting the construction of a search tree. In this context, this paper introduces a 36-dimensional Periodicity Count Value (PCV), which is representative of a particular nucleotide sequence and was adapted from the stochastic model of Kolekar et al. (American Institute of Physics 1298:307-312, 2010. doi: 10.1063/1.3516320). The PCV construct uses information on the physicochemical properties of nucleotides and their positional distribution pattern within a gene. The PCV representation of a gene reduces the computational cost of calculating distances between a pair of genes while remaining consistent with existing methods. The validity of the PCV-based method was further tested by using it to construct molecular phylogenies and comparing these with phylogenies built using existing sequence alignment methods.
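The general idea of alignment-free comparison, mapping each sequence to a fixed-length numeric vector and then measuring distances between vectors, can be sketched as follows. This is explicitly not the PCV construct, whose 36 dimensions are not specified in the abstract; it substitutes a plain dinucleotide frequency vector (16 dimensions) as a stand-in, and the function names are hypothetical.

```python
from itertools import product
from math import sqrt


def kmer_vector(seq, k=2):
    """Normalized k-mer frequency vector over the alphabet ACGT."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # windows with ambiguous bases are ignored
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Because every sequence maps to a vector of the same length, a pairwise distance costs a fixed number of operations regardless of sequence length, which is the kind of saving the abstract attributes to vector representations.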
Intercalation of XR5944 with the estrogen response element is modulated by the tri-nucleotide spacer sequence between half-sites

PubMed Central

Sidell, Neil; Mathad, Raveendra I.; Shu, Feng-jue; Zhang, Zhenjiang; Kallen, Caleb B.; Yang, Danzhou

2011-01-01

DNA-intercalating molecules can impair DNA replication, DNA repair, and gene transcription. We previously demonstrated that XR5944, a DNA bis-intercalator, specifically blocks binding of estrogen receptor-α (ERα) to the consensus estrogen response element (ERE). The consensus ERE sequence is AGGTCAnnnTGACCT, where nnn is known as the tri-nucleotide spacer. Recent work has shown that the tri-nucleotide spacer can modulate ERα-ERE binding affinity and ligand-mediated transcriptional responses. To further understand the mechanism by which XR5944 inhibits ERα-ERE binding, we tested its ability to interact with consensus EREs with variable tri-nucleotide spacer sequences and with natural but non-consensus ERE sequences using one dimensional nuclear magnetic resonance (1D 1H NMR) titration studies. We found that the tri-nucleotide spacer sequence significantly modulates the binding of XR5944 to EREs. Of the sequences that were tested, EREs with CGG and AGG spacers showed the best binding specificity with XR5944, while those spaced with TTT demonstrated the least specific binding. The binding stoichiometry of XR5944 with EREs was 2:1, which can explain why the spacer influences the drug-DNA interaction; each XR5944 spans four nucleotides (including portions of the spacer) when intercalating with DNA. To validate our NMR results, we conducted functional studies using reporter constructs containing consensus EREs with tri-nucleotide spacers CGG, CTG, and TTT. Results of reporter assays in MCF-7 cells indicated that XR5944 was significantly more potent in inhibiting the activity of CGG- than TTT-spaced EREs, consistent with our NMR results. Taken together, these findings predict that the anti-estrogenic effects of XR5944 will depend not only on ERE half-site composition but also on the tri-nucleotide spacer sequence of EREs located in the promoters of estrogen-responsive genes. PMID:21333738
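The consensus ERE pattern given above, AGGTCAnnnTGACCT, lends itself to a simple pattern search that also reports the tri-nucleotide spacer. The sketch below is an illustration under stated assumptions: the function name `find_consensus_eres` is hypothetical, only exact matches to the consensus half-sites on the given strand are found, and non-consensus EREs of the kind also studied in the paper are not detected.

```python
import re

# Lookahead allows overlapping matches; group 2 captures the 3-nt spacer.
ERE_PATTERN = re.compile(r"(?=(AGGTCA([ACGT]{3})TGACCT))")


def find_consensus_eres(dna):
    """Return (start_index, spacer) for each consensus ERE in the sequence."""
    dna = dna.upper()
    return [(m.start(), m.group(2)) for m in ERE_PATTERN.finditer(dna)]
```

Sites returned this way could then be grouped by spacer (for example CGG versus TTT) to compare against the binding preferences reported in the abstract.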
Amino acid sequence of a trypsin inhibitor from a Spirometra (Spirometra erinaceieuropaei).

PubMed

Sanda, A; Uchida, A; Itagaki, T; Kobayashi, H; Inokuchi, N; Koyama, T; Iwama, M; Ohgi, K; Irie, M

2001-12-01

A trypsin inhibitor that is highly homologous with bovine pancreatic trypsin inhibitor (BPTI) was co-purified along with RNase from Spirometra (Spirometra erinaceieuropaei). The amino acid sequence of this inhibitor (SETI) and the nucleotide sequence of the cDNA encoding this protein were determined by protein chemistry and gene technology. SETI contains 68 amino acid residues and has a molecular mass of 7,798 Da. SETI has 31 amino acid residues that are identical with BPTI's sequence, including 6 half-cystine and 5 aromatic amino acid residues. The active site Lys residue in BPTI is replaced by an Arg residue in SETI. SETI is an effective inhibitor of trypsin and moderately inhibits α-chymotrypsin, but inhibits elastase and subtilisin less. SETI was expressed by E. coli containing a PelB vector carrying the SETI encoding cDNA; an expression yield of 0.68 mg/l was obtained. The phylogenetic relationship of SETI and the other BPTI-like trypsin inhibitors was analyzed using maximum likelihood inference methods.
Method for construction of normalized cDNA libraries

DOEpatents

Soares, Marcelo B.; Efstratiadis, Argiris

1996-01-01

This invention provides a method to normalize a directional cDNA library constructed in a vector that allows propagation in single-stranded circle form comprising: (a) propagating the directional cDNA library in single-stranded circles; (b) generating fragments complementary to the 3' noncoding sequence of the single-stranded circles in the library to produce partial duplexes; (c) purifying the partial duplexes; (d) melting and reassociating the purified partial duplexes to moderate Cot; and (e) purifying the unassociated single-stranded circles, thereby generating a normalized cDNA library.
Method for construction of normalized cDNA libraries

DOEpatents

Soares, M.B.; Efstratiadis, A.

1996-01-09

This invention provides a method to normalize a directional cDNA library constructed in a vector that allows propagation in single-stranded circle form. The method comprises: (a) propagating the directional cDNA library in single-stranded circles; (b) generating fragments complementary to the 3' noncoding sequence of the single-stranded circles in the library to produce partial duplexes; (c) purifying the partial duplexes; (d) melting and reassociating the purified partial duplexes to moderate Cot; and (e) purifying the unassociated single-stranded circles, thereby generating a normalized cDNA library. 4 figs.
Analyses of chicken immunoglobulin light chain cDNA clones indicate a few germline V lambda genes and allotypes of the C lambda locus.

PubMed

Parvari, R; Ziv, E; Lentner, F; Tel-Or, S; Burstein, Y; Schechter, I

1987-01-01

cDNA libraries of chicken spleen and Harder gland (a gland enriched with immunocytes) constructed in pBR322 were screened by differential hybridization and by mRNA hybrid-selected translation. Eleven L-chain cDNA clones were identified from which VL probes were prepared and each was annealed with kidney DNA restriction digests. All VL probes revealed the same set of bands, corresponding to about 15 germline VL genes of one subgroup. The nucleotide sequences of six VL clones showed ≥85% homology, and the predicted amino acid sequences were identical or nearly identical to the major N-terminal sequence of L-chains in chicken serum. These findings, and the fact that the VL clones were randomly selected from normal lymphoid tissues, strongly indicate that the bulk of chicken L-chains is encoded by a few germline VL genes, probably far fewer than 15 since many of the VL genes are known to be pseudogenes. Therefore, it is likely that somatic mechanisms operating prior to specific triggering by antigen play a major role in the generation of antibody diversity in chicken. Analysis of the constant region locus (sequencing of CL gene and cDNAs) demonstrates a single CL isotype and suggests the presence of CL allotypes.
Quantification of differential gene expression by multiplexed targeted resequencing of cDNA

PubMed Central

Arts, Peer; van der Raadt, Jori; van Gestel, Sebastianus H.C.; Steehouwer, Marloes; Shendure, Jay; Hoischen, Alexander; Albers, Cornelis A.

2017-01-01

Whole-transcriptome or RNA sequencing (RNA-Seq) is a powerful and versatile tool for functional analysis of different types of RNA molecules, but sample reagent and sequencing cost can be prohibitive for hypothesis-driven studies where the aim is to quantify differential expression of a limited number of genes. Here we present an approach for quantification of differential mRNA expression by targeted resequencing of complementary DNA using single-molecule molecular inversion probes (cDNA-smMIPs) that enable highly multiplexed resequencing of cDNA target regions of ∼100 nucleotides and counting of individual molecules. We show that accurate estimates of differential expression can be obtained from molecule counts for hundreds of smMIPs per reaction and that smMIPs are also suitable for quantification of relative gene expression and allele-specific expression. Compared with low-coverage RNA-Seq and a hybridization-based targeted RNA-Seq method, cDNA-smMIPs are a cost-effective high-throughput tool for hypothesis-driven expression analysis in large numbers of genes (10 to 500) and samples (hundreds to thousands). PMID:28474677
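Counting individual molecules, as described above, usually means collapsing reads that carry the same molecular tag so that amplification duplicates are counted once. The sketch below shows that idea and a log2 fold change between two samples. It is not the authors' pipeline: the read representation as `(probe_id, molecular_tag)` pairs, the function names, and the pseudocount are assumptions, and no tag error correction or between-sample normalization is attempted.

```python
from collections import defaultdict
from math import log2


def count_molecules(reads):
    """Count distinct molecular tags per probe from (probe_id, tag) pairs."""
    tags_by_probe = defaultdict(set)
    for probe_id, tag in reads:
        tags_by_probe[probe_id].add(tag)  # duplicate tags collapse to one
    return {probe: len(tags) for probe, tags in tags_by_probe.items()}


def log2_fold_change(counts_a, counts_b, probe_id, pseudocount=1.0):
    """log2 of sample B over sample A for one probe, with a pseudocount."""
    a = counts_a.get(probe_id, 0) + pseudocount
    b = counts_b.get(probe_id, 0) + pseudocount
    return log2(b / a)
```

A real analysis would also account for sequencing depth and would typically combine several probes per gene before testing for differential expression.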
Plastid: nucleotide-resolution analysis of next-generation sequencing and genomics data.

PubMed

Dunn, Joshua G; Weissman, Jonathan S

2016-11-22

Next-generation sequencing (NGS) informs many biological questions with unprecedented depth and nucleotide resolution. These assays have created a need for analytical tools that enable users to manipulate data nucleotide-by-nucleotide robustly and easily. Furthermore, because many NGS assays encode information jointly within multiple properties of read alignments - for example, in ribosome profiling, the locations of ribosomes are jointly encoded in alignment coordinates and length - analytical tools are often required to extract the biological meaning from the alignments before analysis. Many assay-specific pipelines exist for this purpose, but there remains a need for user-friendly, generalized, nucleotide-resolution tools that are not limited to specific experimental regimes or analytical workflows. Plastid is a Python library designed specifically for nucleotide-resolution analysis of genomics and NGS data. As such, Plastid is designed to extract assay-specific information from read alignments while retaining generality and extensibility to novel NGS assays. Plastid represents NGS and other biological data as arrays of values associated with genomic or transcriptomic positions, and contains configurable tools to convert data from a variety of sources to such arrays. Plastid also includes numerous tools to manipulate even discontinuous genomic features, such as spliced transcripts, with nucleotide precision. Plastid automatically handles conversion between genomic and feature-centric coordinates, accounting for splicing and strand, freeing users of burdensome accounting. Finally, Plastid's data models use consistent and familiar biological idioms, enabling even beginners to develop sophisticated analytical workflows with minimal effort. Plastid is a versatile toolkit that has been used to analyze data from multiple NGS assays, including RNA-seq, ribosome profiling, and DMS-seq. It forms the genomic engine of our ORF annotation tool, ORF-RATER, and is readily
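The coordinate bookkeeping that the abstract says Plastid automates, converting between transcript-centric and genomic positions while accounting for splicing and strand, can be illustrated in plain Python. The sketch below does not use or reproduce Plastid's API; the function name `transcript_to_genome` and the exon representation (0-based, half-open intervals sorted by genomic start) are assumptions made for this example.

```python
def transcript_to_genome(exons, strand, pos):
    """Map a 0-based transcript position to a 0-based genomic coordinate.

    exons:  list of (start, end) half-open genomic intervals, sorted by start
    strand: "+" or "-"
    pos:    position along the spliced transcript, counted from its 5' end
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if pos < 0:
        raise IndexError("position must be non-negative")
    # On the minus strand the transcript starts in the last genomic exon.
    ordered = exons if strand == "+" else list(reversed(exons))
    offset = pos
    for start, end in ordered:
        length = end - start
        if offset < length:
            return start + offset if strand == "+" else end - 1 - offset
        offset -= length  # skip this exon and continue into the next one
    raise IndexError("position lies beyond the end of the transcript")
```

For a two-exon transcript, positions that cross the exon boundary jump over the intron in genomic space, which is exactly the accounting that becomes error-prone when done by hand.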
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2011 CFR

2011-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2013 CFR

2013-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2012 CFR

2012-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2010 CFR

2010-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2014 CFR

2014-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
Complete nucleotide sequence of pig (Sus scrofa) mitochondrial genome and dating evolutionary divergence within Artiodactyla.

PubMed

Lin, C S; Sun, Y L; Liu, C Y; Yang, P C; Chang, L C; Cheng, I C; Mao, S J; Huang, M C

1999-08-05

The complete nucleotide sequence of the pig (Sus scrofa) mitochondrial genome, containing 16,613 bp, is presented in this report. The genome does not have a fixed length because of the variable number of tandem repeats of 5'-CGTGCGTACA in the displacement loop (D-loop). Genes responsible for 12S and 16S rRNAs, 22 tRNAs, and 13 protein-coding regions are found. The genome carries very few intergenic nucleotides with several instances of overlap between protein-coding or tRNA genes, except in the D-loop region. For evaluating the possible evolutionary relationships between Artiodactyla and Cetacea, the nucleotide substitutions and amino acid sequences of 13 protein-coding genes were aligned by pairwise comparisons of the pig, cow, and fin whale. By comparing these sequences, we suggest that there is a closer relationship between the pig and cow than that between either of these species and fin whale. In addition, the accumulation of transversions and gaps in pig 12S and 16S rRNA genes was compared with that in other eutherian species, including cow, fin whale, human, horse, and harbor seal. The results also reveal a close phylogenetic relationship between pig and cow, as compared to fin whale and others. Thus, according to the sequence differences of mitochondrial rRNA genes in eutherian species, the evolutionary separation of pig and cow occurred about 53-60 million years ago.
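The comparison of transversions and gaps mentioned above amounts to classifying each differing column of a pairwise alignment. A minimal sketch follows, assuming pre-aligned sequences of equal length with "-" as the gap character; the function name `classify_differences` is hypothetical, and the dating step (converting counts into divergence times) is not covered.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_differences(seq_a, seq_b):
    """Count transitions, transversions, and gap columns in an alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    counts = {"transitions": 0, "transversions": 0, "gaps": 0}
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b:
            continue  # identical columns (including gap/gap) are not counted
        if a == "-" or b == "-":
            counts["gaps"] += 1
        elif {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            counts["transitions"] += 1    # purine<->purine or pyrimidine<->pyrimidine
        elif a in "ACGT" and b in "ACGT":
            counts["transversions"] += 1  # purine<->pyrimidine
    return counts
```

Columns containing ambiguous bases such as N fall through all branches and are left uncounted in this sketch.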

Labeled nucleotide phosphate (NP) probes

DOEpatents

Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

2009-02-03

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
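The deduction step described above, reading the target sequence from the temporal order of complementary base additions, reduces to a reverse complement once the incorporated bases have been identified. The sketch below assumes a DNA template, error-free identification of each incorporated analog, and a growing strand recorded 5' to 3'; the function name is hypothetical and the detection chemistry is not modeled.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def template_from_incorporations(incorporated):
    """Infer the template (5'->3') from bases added to the growing strand.

    incorporated: bases in the temporal order they were added, which is the
    5'->3' order of the newly synthesized strand.
    """
    # The polymerase reads the template 3'->5', so the template written
    # 5'->3' is the reverse complement of the synthesized strand.
    return "".join(COMPLEMENT[base] for base in reversed(incorporated.upper()))
```

Applying the function twice returns the original input, a quick sanity check that the complement table is symmetric.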
Nucleotide Sequence Database Comparison for Routine Dermatophyte Identification by Internal Transcribed Spacer 2 Genetic Region DNA Barcoding.

PubMed

Normand, A C; Packeu, A; Cassagne, C; Hendrickx, M; Ranque, S; Piarroux, R

2018-05-01

Conventional dermatophyte identification is based on morphological features. However, recent studies have proposed to use the nucleotide sequences of the rRNA internal transcribed spacer (ITS) region as an identification barcode of all fungi, including dermatophytes. Several nucleotide databases are available to compare sequences and thus identify isolates; however, these databases often contain mislabeled sequences that impair sequence-based identification. We evaluated five of these databases on a clinical isolate panel. We selected 292 clinical dermatophyte strains that were prospectively subjected to an ITS2 nucleotide sequence analysis. Sequences were analyzed against the databases, and the results were compared to clusters obtained via DNA alignment of sequence segments. The DNA tree served as the identification standard throughout the study. According to the ITS2 sequence identification, the majority of strains (255/292) belonged to the genus Trichophyton, mainly T. rubrum complex (n = 184), T. interdigitale (n = 40), T. tonsurans (n = 26), and T. benhamiae (n = 5). Other genera included Microsporum (e.g., M. canis [n = 21], M. audouinii [n = 10], Nannizzia gypsea [n = 3], and Epidermophyton [n = 3]). Species-level identification of T. rubrum complex isolates was an issue. Overall, ITS DNA sequencing is a reliable tool to identify dermatophyte species provided that a comprehensive and correctly labeled database is consulted. Since many inaccurate identification results exist in the DNA databases used for this study, reference databases must be verified frequently and amended in line with the current revisions of fungal taxonomy. Before describing a new species or adding a new DNA reference to the available databases, its position in the phylogenetic tree must be verified. Copyright © 2018 American Society for Microbiology.
Intron loss from the NADH dehydrogenase subunit 4 gene of lettuce mitochondrial DNA: evidence for homologous recombination of a cDNA intermediate.

PubMed

Geiss, K T; Abbas, G M; Makaroff, C A

1994-04-01

The mitochondrial gene coding for subunit 4 of the NADH dehydrogenase complex I (nad4) has been isolated and characterized from lettuce, Lactuca sativa. Analysis of nad4 genes in a number of plants by Southern hybridization had previously suggested that the intron content varied between species. Characterization of the lettuce gene confirms this observation. Lettuce nad4 contains two exons and one group IIA intron, whereas previously sequenced nad4 genes from turnip and wheat contain three group IIA introns. Northern analysis identified a transcript of 1600 nucleotides, which represents the mature nad4 mRNA and a primary transcript of 3200 nucleotides. Sequence analysis of lettuce and turnip nad4 cDNAs was used to confirm the intron/exon border sequences and to examine RNA editing patterns. Editing is observed at the 5' and 3' ends of the lettuce transcript, but is absent from sequences that correspond to exons two, three and the 5' end of exon four in turnip and wheat. In contrast, turnip transcripts are highly edited in this region, suggesting that homologous recombination of an edited and spliced cDNA intermediate was involved in the loss of introns two and three from an ancestral lettuce nad4 gene.
LISTA, a comprehensive compilation of nucleotide sequences encoding proteins from the yeast Saccharomyces.

PubMed Central

Linder, P; Dölz, R; Mossé, M O; Lazowska, J; Slonimski, P P

1993-01-01

The amount of nucleotide sequence data is increasing exponentially. We therefore made an effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. Each sequence has been attributed a single genetic name and in the case of allelic duplicated sequences, synonyms are given, if necessary. For the nomenclature we have introduced a standard principle for naming gene sequences based on priority rules. We have also applied a simple method to distinguish duplicated sequences of one and the same gene from non-allelic sequences of duplicated genes. By using these principles we have sorted out a lot of confusion in the literature and databanks. Along with the genetic name, the mnemonic from the EMBL databank, the codon bias, reference of the publication of the sequence and the EMBL accession numbers are included in each entry. PMID:8332521
Complete nucleotide sequence of a novel Hibiscus-infecting Cilevirus from Florida and its relationship with closely associated Cileviruses

USDA-ARS's Scientific Manuscript database

The complete nucleotide sequence of a recently discovered Florida (FL) isolate of Hibiscus infecting Cilevirus (HiCV) was determined by Sanger sequencing. The movement- and coat- protein gene sequences of the HiCV-FL isolate are more divergent than other genes of the previously sequenced HiCV-HA (Ha...
A clustering package for nucleotide sequences using Laplacian Eigenmaps and Gaussian Mixture Model.

PubMed

Bruneau, Marine; Mottet, Thierry; Moulin, Serge; Kerbiriou, Maël; Chouly, Franz; Chretien, Stéphane; Guyeux, Christophe

2018-02-01

In this article, a new Python package for nucleotide sequences clustering is proposed. This package, freely available on-line, implements a Laplacian eigenmap embedding and a Gaussian Mixture Model for DNA clustering. It takes nucleotide sequences as input, and produces the optimal number of clusters along with a relevant visualization. Despite the fact that we did not optimise the computational speed, our method still performs reasonably well in practice. Our focus was mainly on data analytics and accuracy and as a result, our approach outperforms the state of the art, even in the case of divergent sequences. Furthermore, an a priori knowledge on the number of clusters is not required here. For the sake of illustration, this method is applied on a set of 100 DNA sequences taken from the mitochondrially encoded NADH dehydrogenase 3 (ND3) gene, extracted from a collection of Platyhelminthes and Nematoda species. The resulting clusters are tightly consistent with the phylogenetic tree computed using a maximum likelihood approach on gene alignment. They are coherent too with the NCBI taxonomy. Further test results based on synthesized data are then provided, showing that the proposed approach is better able to recover the clusters than the most widely used software, namely Cd-hit-est and BLASTClust. Copyright © 2017 Elsevier Ltd. All rights reserved.
An integrated PCR colony hybridization approach to screen cDNA libraries for full-length coding sequences.

PubMed

Pollier, Jacob; González-Guzmán, Miguel; Ardiles-Diaz, Wilson; Geelen, Danny; Goossens, Alain

2011-01-01

cDNA-Amplified Fragment Length Polymorphism (cDNA-AFLP) is a commonly used technique for genome-wide expression analysis that does not require prior sequence knowledge. Typically, quantitative expression data and sequence information are obtained for a large number of differentially expressed gene tags. However, most of the gene tags do not correspond to full-length (FL) coding sequences, which is a prerequisite for subsequent functional analysis. A medium-throughput screening strategy, based on integration of polymerase chain reaction (PCR) and colony hybridization, was developed that allows in parallel screening of a cDNA library for FL clones corresponding to incomplete cDNAs. The method was applied to screen for the FL open reading frames of a selection of 163 cDNA-AFLP tags from three different medicinal plants, leading to the identification of 109 (67%) FL clones. Furthermore, the protocol allows for the use of multiple probes in a single hybridization event, thus significantly increasing the throughput when screening for rare transcripts. The presented strategy offers an efficient method for the conversion of incomplete expressed sequence tags (ESTs), such as cDNA-AFLP tags, to FL-coding sequences.
Highly multiplexed subcellular RNA sequencing in situ

PubMed Central

Lee, Je Hyuk; Daugharthy, Evan R.; Scheiman, Jonathan; Kalhor, Reza; Ferrante, Thomas C.; Yang, Joyce L.; Terry, Richard; Jeanty, Sauveur S. F.; Li, Chao; Amamoto, Ryoji; Peters, Derek T.; Turczyk, Brian M.; Marblestone, Adam H.; Inverso, Samuel A.; Bernard, Amy; Mali, Prashant; Rios, Xavier; Aach, John; Church, George M.

2014-01-01

Understanding the spatial organization of gene expression with single nucleotide resolution requires localizing the sequences of expressed RNA transcripts within a cell in situ. Here we describe fluorescent in situ RNA sequencing (FISSEQ), in which stably cross-linked cDNA amplicons are sequenced within a biological sample. Using 30-base reads from 8,742 genes in situ, we examined RNA expression and localization in human primary fibroblasts using a simulated wound healing assay. FISSEQ is compatible with tissue sections and whole mount embryos, and reduces the limitations of optical resolution and noisy signals on single molecule detection. Our platform enables massively parallel detection of genetic elements, including gene transcripts and molecular barcodes, and can be used to investigate cellular phenotype, gene regulation, and environment in situ. PMID:24578530
The nucleotide sequence of 5S rRNA from a cellular slime mold Dictyostelium discoideum.

PubMed Central

Hori, H; Osawa, S; Iwabuchi, M

1980-01-01

The nucleotide sequence of ribosomal 5S rRNA from a cellular slime mold Dictyostelium discoideum is GUAUACGGCCAUACUAGGUUGGAAACACAUCAUCCCGUUCGAUCUGAUAAGUAAAUCGACCUCAGGCCUUCCAAGUACUCUGGUUGGAGACAACAGGGGAACAUAGGGUGCUGUAUACU. A model for the secondary structure of this 5S rRNA is proposed. The sequence is more similar to those of animals (62% similarity on average) than to those of yeasts (56%). Images PMID:7465421
High-resolution mapping and sequence analysis of 597 cDNA clones transcribed from the 1 Mb region in human chromosome 4p16.3 containing Huntington disease gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hadano, S.; Ishida, Y.; Tomiyasu, H.

1994-09-01

To complete a transcription map of the 1 Mb region in human chromosome 4p16.3 containing the Huntington disease (HD) gene, cDNA clones are being isolated from across the region. Our method relies on direct screening of cDNA libraries probed with single-copy microclones from 3 YAC clones spanning 1 Mbp of the HD gene region. YAC DNAs were isolated by preparative pulsed-field gel electrophoresis and amplified by both single unique primer (SUP)-PCR and linker-ligation PCR, and 6 microclone DNA libraries were generated. Then, 8,640 microclones from these libraries were independently amplified by PCR and arrayed onto membranes. The 800-900 microclones that did not cross-hybridize with total human and yeast genomic DNA, YAC vector DNA, or ribosomal cDNA on dot hybridization (putatively carrying single-copy sequences) were pooled to make 9 probe pools. A total of approximately 1.8 x 10^7 plaques from human brain cDNA libraries were screened with the 9 probe pools, and 672 positive cDNA clones were obtained. So far, 597 cDNA clones have been defined and arrayed onto a map of the 1 Mbp HD gene region by hybridization with HD region-specific cosmid contigs and YAC clones. Further characterization, including DNA sequencing and Northern blot analysis, is currently underway.
Molecular cloning, overexpression, purification, and sequence analysis of the giant panda (Ailuropoda melanoleuca) ferritin light polypeptide.

PubMed

Fu, L; Hou, Y L; Ding, X; Du, Y J; Zhu, H Q; Zhang, N; Hou, W R

2016-08-30

The complementary DNA (cDNA) of the giant panda (Ailuropoda melanoleuca) ferritin light polypeptide (FTL) gene was successfully cloned using reverse transcription-polymerase chain reaction technology. We constructed a recombinant expression vector containing FTL cDNA and overexpressed it in Escherichia coli using pET28a plasmids. The expressed protein was then purified by nickel chelate affinity chromatography. The cloned cDNA fragment was 580 bp long and contained an open reading frame of 525 bp. The deduced protein sequence was composed of 175 amino acids and had an estimated molecular weight of 19.90 kDa, with an isoelectric point of 5.53. Topology prediction revealed one N-glycosylation site, two casein kinase II phosphorylation sites, one N-myristoylation site, two protein kinase C phosphorylation sites, and one cell attachment sequence. Alignment indicated that the nucleotide and deduced amino acid sequences are highly conserved across several mammals, including Homo sapiens, Cavia porcellus, Equus caballus, and Felis catus, among others. The FTL gene was readily expressed in E. coli, which gave rise to the accumulation of a polypeptide of the expected size (25.50 kDa, including an N-terminal polyhistidine tag).
An Ambystoma mexicanum EST sequencing project: analysis of 17,352 expressed sequence tags from embryonic and regenerating blastema cDNA libraries

PubMed Central

Habermann, Bianca; Bebin, Anne-Gaelle; Herklotz, Stephan; Volkmer, Michael; Eckelt, Kay; Pehlke, Kerstin; Epperlein, Hans Henning; Schackert, Hans Konrad; Wiebe, Glenis; Tanaka, Elly M

2004-01-01

Background The ambystomatid salamander, Ambystoma mexicanum (axolotl), is an important model organism in evolutionary and regeneration research but relatively little sequence information has so far been available. This is a major limitation for molecular studies on caudate development, regeneration and evolution. To address this lack of sequence information we have generated an expressed sequence tag (EST) database for A. mexicanum. Results Two cDNA libraries, one made from stage 18-22 embryos and the other from day-6 regenerating tail blastemas, generated 17,352 sequences. From the sequenced ESTs, 6,377 contigs were assembled that probably represent 25% of the expressed genes in this organism. Sequence comparison revealed significant homology to entries in the NCBI non-redundant database. Further examination of this gene set revealed the presence of genes involved in important cell and developmental processes, including cell proliferation, cell differentiation and cell-cell communication. On the basis of these data, we have performed phylogenetic analysis of key cell-cycle regulators. Interestingly, while cell-cycle proteins such as the cyclin B family display expected evolutionary relationships, the cyclin-dependent kinase inhibitor 1 gene family shows an unusual evolutionary behavior among the amphibians. Conclusions Our analysis reveals the importance of a comprehensive sequence set from a representative of the Caudata and illustrates that the EST sequence database is a rich source of molecular, developmental and regeneration studies. To aid in data mining, the ESTs have been organized into an easily searchable database that is freely available online. PMID:15345051
Mining of haplotype-based expressed sequence tag single nucleotide polymorphisms in citrus

PubMed Central

2013-01-01

Background Single nucleotide polymorphisms (SNPs), the most abundant variations in a genome, have been widely used in various studies. Detection and characterization of citrus haplotype-based expressed sequence tag (EST) SNPs will greatly facilitate further utilization of these gene-based resources. Results In this paper, haplotype-based SNPs were mined out of publicly available citrus expressed sequence tags (ESTs) from different citrus cultivars (genotypes) individually and collectively for comparison. There were a total of 567,297 ESTs belonging to 27 cultivars in varying numbers and consequentially yielding different numbers of haplotype-based quality SNPs. Sweet orange (SO) had the most (213,830) ESTs, generating 11,182 quality SNPs in 3,327 out of 4,228 usable contigs. Summed from all the individually mining results, a total of 25,417 quality SNPs were discovered – 15,010 (59.1%) were transitions (AG and CT), 9,114 (35.9%) were transversions (AC, GT, CG, and AT), and 1,293 (5.0%) were insertion/deletions (indels). A vast majority of SNP-containing contigs consisted of only 2 haplotypes, as expected, but the percentages of 2 haplotype contigs varied widely in these citrus cultivars. BLAST of the 25,417 25-mer SNP oligos to the Clementine reference genome scaffolds revealed 2,947 SNPs had “no hits found”, 19,943 had 1 unique hit / alignment, 1,571 had one hit and 2+ alignments per hit, and 956 had 2+ hits and 1+ alignment per hit. Of the total 24,293 scaffold hits, 23,955 (98.6%) were on the main scaffolds 1 to 9, and only 338 were on 87 minor scaffolds. Most alignments had 100% (25/25) or 96% (24/25) nucleotide identities, accounting for 93% of all the alignments. Considering almost all the nucleotide discrepancies in the 24/25 alignments were at the SNP sites, it served well as in silico validation of these SNPs, in addition to and consistent with the rate (81%) validated by sequencing and SNaPshot assay. Conclusions High-quality EST-SNPs from different
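The transition/transversion/indel breakdown reported above can be illustrated with a minimal sketch. The function name `classify_snp` and the use of `-` to mark a gap allele are assumptions made here for illustration; the abstract does not describe how the authors' pipeline encoded alleles. The counts are taken from the abstract and are only summed to check that the three classes add up to the reported total.

```python
# Hypothetical helper, not the authors' pipeline: classify a biallelic
# variant from its two alleles.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_snp(allele_a, allele_b, gap="-"):
    """Return 'transition', 'transversion', or 'indel' for two alleles.

    Assumption: a gap character marks an insertion/deletion.
    """
    a, b = allele_a.upper(), allele_b.upper()
    if a == gap or b == gap:
        return "indel"
    if a not in BASES or b not in BASES:
        raise ValueError("alleles must be A, C, G, T or the gap character")
    if a == b:
        raise ValueError("identical alleles are not a polymorphism")
    # Purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T) is a transition;
    # any purine<->pyrimidine change is a transversion.
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Class counts as reported in the abstract.
reported_counts = {"transition": 15010, "transversion": 9114, "indel": 1293}
reported_total = sum(reported_counts.values())
```

The six substitution pairs listed in the abstract (AG, CT, AC, GT, CG, AT) fall into the two substitution classes exactly as this rule predicts.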
Nucleotide cleaving agents and method

DOEpatents

Que, Jr., Lawrence; Hanson, Richard S.; Schnaith, Leah M. T.

2000-01-01

The present invention provides a unique series of nucleotide cleaving agents and a method for cleaving a nucleotide sequence, whether single-stranded or double-stranded DNA or RNA, using a cationic metal complex having at least one polydentate ligand to cleave the nucleotide sequence phosphate backbone to yield a hydroxyl end and a phosphate end.
ANCAC: amino acid, nucleotide, and codon analysis of COGs--a tool for sequence bias analysis in microbial orthologs.

PubMed

Meiler, Arno; Klinger, Claudia; Kaufmann, Michael

2012-09-08

The COG database is the most popular collection of orthologous proteins from many different completely sequenced microbial genomes. Per definition, a cluster of orthologous groups (COG) within this database exclusively contains proteins that most likely achieve the same cellular function. Recently, the COG database was extended by assigning to every protein both the corresponding amino acid and its encoding nucleotide sequence resulting in the NUCOCOG database. This extended version of the COG database is a valuable resource connecting sequence features with the functionality of the respective proteins. Here we present ANCAC, a web tool and MySQL database for the analysis of amino acid, nucleotide, and codon frequencies in COGs on the basis of freely definable phylogenetic patterns. We demonstrate the usefulness of ANCAC by analyzing amino acid frequencies, codon usage, and GC-content in a species- or function-specific context. With respect to amino acids we, at least in part, confirm the cognate bias hypothesis by using ANCAC's NUCOCOG dataset as the largest one available for that purpose thus far. Using the NUCOCOG datasets, ANCAC connects taxonomic, amino acid, and nucleotide sequence information with the functional classification via COGs and provides a GUI for flexible mining for sequence-bias. Thereby, to our knowledge, it is the only tool for the analysis of sequence composition in the light of physiological roles and phylogenetic context without requirement of substantial programming-skills.
The nucleotide sequences of 5S rRNAs from a rotifer, Brachionus plicatilis, and two nematodes, Rhabditis tokai and Caenorhabditis elegans.

PubMed Central

Kumazaki, T; Hori, H; Osawa, S; Ishii, N; Suzuki, K

1982-01-01

The nucleotide sequences of 5S rRNAs from a rotifer, Brachionus plicatilis, and two nematodes, Rhabditis tokai and Caenorhabditis elegans have been determined. The rotifer has two 5S rRNA species that are composed of 120 and 121 nucleotides, respectively. The sequences of these two 5S rRNAs are the same except that the latter has an additional base at its 3'-terminus. The 5S rRNAs from the two nematode species are both 119 nucleotides long. The sequence similarity percents are 79% (Brachionus/Rhabditis), 80% (Brachionus/Caenorhabditis), and 95% (Rhabditis/Caenorhabditis) among these three species. Brachionus revealed the highest similarity to Lingula (89%), but not to the nematodes (79%). PMID:6891053
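Pairwise similarity percentages like those quoted above are typically computed from aligned sequences. The following is a minimal sketch of one common definition (identical positions divided by compared positions); the function name `percent_identity` and the gap handling are assumptions for illustration, since the abstract does not state how gaps were treated.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumptions (not from the source): columns where both sequences have a
    gap are skipped; a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```

For example, two aligned four-base fragments that differ at one position give 75%. Other conventions (for instance, ignoring all gapped columns) would give different values for the same alignment.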
The nucleotide sequences of 5S rRNAs from a rotifer, Brachionus plicatilis, and two nematodes, Rhabditis tokai and Caenorhabditis elegans.

PubMed

Kumazaki, T; Hori, H; Osawa, S; Ishii, N; Suzuki, K

1982-11-11

The nucleotide sequences of 5S rRNAs from a rotifer, Brachionus plicatilis, and two nematodes, Rhabditis tokai and Caenorhabditis elegans have been determined. The rotifer has two 5S rRNA species that are composed of 120 and 121 nucleotides, respectively. The sequences of these two 5S rRNAs are the same except that the latter has an additional base at its 3'-terminus. The 5S rRNAs from the two nematode species are both 119 nucleotides long. The sequence similarity percents are 79% (Brachionus/Rhabditis), 80% (Brachionus/Caenorhabditis), and 95% (Rhabditis/Caenorhabditis) among these three species. Brachionus revealed the highest similarity to Lingula (89%), but not to the nematodes (79%).
Kinetic Induction of Oat Shoot Pulvinus Invertase mRNA by Gravistimulation and Partial cDNA Cloning by the Polymerase Chain Reaction

NASA Technical Reports Server (NTRS)

Wu, Liu-Lai; Song, Il; Karuppiah, Nadarajah; Kaufman, Peter B.

1993-01-01

An asymmetric (top vs. bottom halves of pulvini) induction of invertase mRNA by gravistimulation was analyzed in oat shoot pulvini. Total RNA and poly(A)(+) RNA, isolated from oat pulvini, and two oligonucleotide primers, corresponding to two conserved amino acid sequences (NDPNG and WECPD) found in invertase from other species, were used for the polymerase chain reaction (PCR). A partial length cDNA (550 bp) was obtained and characterized. A 62% nucleotide sequence homology and 58% deduced amino acid sequence homology, as compared to beta-fructosidase of carrot cell wall, was found. Northern blot analysis showed that there was an obviously transient induction of invertase mRNA by gravistimulation in the oat pulvinus system. The mRNA was rapidly induced to a maximum level at 1 hour after gravistimulation treatment and gradually decreased afterwards. The mRNA level in the bottom half of the oat pulvinus was significantly higher than that in the top half of the pulvinus tissue. The kinetic induction of invertase mRNA was consistent with the transient accumulation of invertase activity during the graviresponse of the pulvinus. This indicates that the expression of the invertase gene(s) could be regulated by gravistimulation at the transcriptional level. Southern blot analysis showed that there were two to three genomic DNA fragments which hybridized with the partial-length invertase cDNA.
Csa-19, a radiation-responsive human gene, identified by an unbiased two-gel cDNA library screening method in human cancer cells

NASA Technical Reports Server (NTRS)

Balcer-Kubiczek, E. K.; Meltzer, S. J.; Han, L. H.; Zhang, X. F.; Shi, Z. M.; Harrison, G. H.; Abraham, J. M.

1997-01-01

A novel polymerase chain reaction (PCR)-based method was used to identify candidate genes whose expression is altered in cancer cells by ionizing radiation. Transcriptional induction of randomly selected genes in control versus irradiated human HL60 cells was compared. Among several complementary DNA (cDNA) clones recovered by this approach, one cDNA clone (CL68-5) was downregulated in X-irradiated HL60 cells but unaffected by 12-O-tetradecanoyl phorbol-13-acetate, forskolin, or cyclosporin-A. DNA sequencing of the CL68-5 cDNA revealed 100% nucleotide sequence homology to the reported human Csa-19 gene. Northern blot analysis of RNA from control and irradiated cells revealed the expression of a single 0.7-kilobase (kb) messenger RNA (mRNA) transcript. This 0.7-kb Csa-19 mRNA transcript was also expressed in a variety of human adult and corresponding fetal normal tissues. Moreover, when the effect of X- or fission neutron-irradiation on Csa-19 mRNA was compared in cultured human cells differing in p53 gene status (p53-/- versus p53+/+), downregulation of Csa-19 by X-rays or fission neutrons was similar in p53-wild type and p53-null cell lines. Our results provide the first known example of a radiation-responsive gene in human cancer cells whose expression is not associated with p53, adenylate cyclase or protein kinase C.
Nucleotide sequences of immunoglobulin eta genes of chimpanzee and orangutan: DNA molecular clock and hominoid evolution

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sakoyama, Y.; Hong, K.J.; Byun, S.M.

To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin eta-chain (C_eta1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human eta-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regions, was introduced for the present study. From the comparison of nucleotide sequences of alpha1-antitrypsin and beta- and delta-globulin genes between humans and Old World monkeys, the silent molecular clock was calibrated: the mean evolutionary rate of silent substitution was determined to be 1.56 x 10^-9 substitutions per site per year. Using the silent molecular clock, the mean divergence dates of chimpanzee and orangutan from the human lineage were estimated as 6.4 +/- 2.6 million years and 17.3 +/- 4.5 million years, respectively. It was also shown that the evolutionary rate of primate genes is considerably slower than those of other mammalian genes.
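The arithmetic behind a molecular-clock date can be sketched as follows. This assumes the standard relation T = K / (2r), where K is the pairwise divergence at silent sites and r is a per-lineage rate, so that two lineages accumulate differences at 2r. The abstract does not state whether its rate is per lineage or pairwise, and the divergence value used in the example is hypothetical, chosen only to show the calculation.

```python
def divergence_time_years(k_silent, rate_per_site_per_year):
    """Estimate time since divergence from silent-site divergence.

    Assumption: the rate applies to each lineage separately, so the
    pairwise divergence K accumulates at twice that rate: T = K / (2r).
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return k_silent / (2.0 * rate_per_site_per_year)


# Rate quoted in the abstract; the K value below is hypothetical.
SILENT_RATE = 1.56e-9  # substitutions per site per year
example_years = divergence_time_years(0.02, SILENT_RATE)
```

If the quoted rate were instead a pairwise rate, the factor of 2 would be dropped and the estimate would double.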

Mapping DNA methylation by transverse current sequencing: Reduction of noise from neighboring nucleotides

NASA Astrophysics Data System (ADS)

Alvarez, Jose; Massey, Steven; Kalitsov, Alan; Velev, Julian

Nanopore sequencing via transverse current has emerged as a competitive candidate for mapping DNA methylation without the need for bisulfite treatment, fluorescent tags, or PCR amplification. By eliminating the error-producing amplification step, long read lengths become feasible, which greatly simplifies the assembly process and reduces the time and cost inherent in current technologies. However, due to the large error rates of nanopore sequencing, single-base resolution has not been reached. A very important source of noise is the intrinsic structural noise in the electric signature of the nucleotide arising from the influence of neighboring nucleotides. In this work we perform calculations of the tunneling current through DNA molecules in nanopores using the non-equilibrium electron transport method within an effective multi-orbital tight-binding model derived from first-principles calculations. We develop a base-calling algorithm accounting for the correlations of the current through neighboring bases, which in principle can reduce the error rate below any desired precision. Using this method we show that we can clearly distinguish DNA methylation and other base modifications based on the reading of the tunneling current.
A large scale analysis of cDNA in Arabidopsis thaliana: generation of 12,028 non-redundant expressed sequence tags from normalized and size-selected cDNA libraries.

PubMed

Asamizu, E; Nakamura, Y; Sato, S; Tabata, S

2000-06-30

For comprehensive analysis of genes expressed in the model dicotyledonous plant, Arabidopsis thaliana, expressed sequence tags (ESTs) were accumulated. Normalized and size-selected cDNA libraries were constructed from aboveground organs, flower buds, roots, green siliques and liquid-cultured seedlings, respectively, and a total of 14,026 5'-end ESTs and 39,207 3'-end ESTs were obtained. The 3'-end ESTs could be clustered into 12,028 non-redundant groups. Similarity search of the non-redundant ESTs against the public non-redundant protein database indicated that 4816 groups show similarity to genes of known function, 1864 to hypothetical genes, and the remaining 5348 are novel sequences. Gene coverage by the non-redundant ESTs was analyzed using the annotated genomic sequences of approximately 10 Mb on chromosomes 3 and 5. A total of 923 regions were hit by at least one EST, among which only 499 regions were hit by the ESTs deposited in the public database. The result indicates that the EST source generated in this project complements the EST data in the public database and facilitates new gene discovery.
Nucleotide sequencing and serological evidence that the recently recognized deer tick virus is a genotype of Powassan virus.

PubMed

Beasley, D W; Suderman, M T; Holbrook, M R; Barrett, A D

2001-11-05

Deer tick virus (DTV) is a recently recognized North American virus isolated from Ixodes dammini ticks. Nucleotide sequencing of fragments of structural and non-structural protein genes suggested that this virus was most closely related to the tick-borne flavivirus Powassan (POW), which causes potentially fatal encephalitis in humans. To determine whether DTV represents a new and distinct member of the Flavivirus genus of the family Flaviviridae, we sequenced the structural protein genes and 5' and 3' non-coding regions of this virus. In addition, we compared the reactivity of DTV and POW in hemagglutination inhibition tests with a panel of polyclonal and monoclonal antisera, and performed cross-neutralization experiments using anti-DTV antisera. Nucleotide sequencing revealed a high degree of homology between DTV and POW at both nucleotide (>80% homology) and amino acid (>90% homology) levels, and the two viruses were indistinguishable in serological assays and mouse neuroinvasiveness. On the basis of these results, we suggest that DTV should be classified as a genotype of POW virus.
Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius

PubMed Central

Al-Swailem, Abdulaziz M.; Shehata, Maher M.; Abu-Duhier, Faisel M.; Al-Yamani, Essam J.; Al-Busadah, Khalid A.; Al-Arawi, Mohammed S.; Al-Khider, Ali Y.; Al-Muhaimeed, Abdullah N.; Al-Qahtani, Fahad H.; Manee, Manee M.; Al-Shomrani, Badr M.; Al-Qhtani, Saad M.; Al-Harthi, Amer S.; Akdemir, Kadir C.; Otu, Hasan H.

2010-01-01

Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ∼40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. Accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism. PMID:20502665
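The ORF length criterion mentioned above (ORF > 300 bp) can be illustrated with a minimal forward-strand ORF scan. This is a sketch under stated assumptions, not the authors' pipeline: an ORF is taken here to run from an ATG to the first in-frame stop codon, only the three forward frames are searched, and the function name `longest_orf` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end, frame) of the longest ATG..stop ORF, or None.

    Assumptions (not from the source): forward strand only, ORFs must begin
    with ATG and end at the first in-frame stop; `end` is exclusive and
    includes the stop codon.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        # Walk the frame codon by codon.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end, frame)
                start = None  # look for the next ORF in this frame
    return best


def has_orf_longer_than(seq, min_bp=300):
    """True if the longest ORF found exceeds min_bp nucleotides."""
    orf = longest_orf(seq)
    return orf is not None and (orf[1] - orf[0]) > min_bp
```

A full analysis would also scan the reverse complement and might allow ORFs that run off the end of a partial contig; both are omitted here to keep the sketch short.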
Nucleotide sequence analysis of the 3' terminal region of a wasabi strain of crucifer tobamovirus genomic RNA: subgrouping of crucifer tobamoviruses.

PubMed

Shimamoto, I; Sonoda, S; Vazquez, P; Minaka, N; Nishiguchi, M

1998-01-01

The 3' terminal 2378 nucleotides of a wasabi strain of crucifer tobamovirus (CTMV-W) infectious to crucifer plants were determined. This includes the 3' non-coding region of 235 nucleotides, coat protein (CP) gene (468 nucleotides), movement protein (MP) gene (798 nucleotides) and C-terminal partial readthrough portion of 180 K protein gene (940 nucleotides). Comparison of the sequence with homologous regions of thirteen other tobamovirus genomes showed that it had much higher identity to those of four other crucifer tobamoviruses, 85.2% to cr-TMV and turnip vein-clearing virus (TVCV), 87.4% to oilseed rape mosaic virus (ORMV) and 87.1% to TMV-Cg, than to those of other tobamoviruses. Thus CTMV-W was most similar to ORMV and TMV-Cg in sequence, but only marginally so, whereas the location and size of its MP gene were the same as in cr-TMV and TVCV. These results, together with other analyses, show that CTMV-W is a new crucifer tobamovirus, that the five crucifer tobamoviruses can be classified into two subgroups based on MP gene organization, and that the rate of sequence change is not the same in all lineages.
Molecular cloning and sequencing analysis of the interferon receptor (IFNAR-1) from Columba livia.

PubMed

Li, Chao; Chang, Wei Shan

2014-01-01

The aim was partial sequence cloning of the interferon receptor (IFNAR-1) of Columba livia. To obtain a gene fragment of a defined length (630 bp), a pair of primers was designed according to the conserved nucleotide sequences of the Gallus (EU477527.1) and Taeniopygia guttata (XM_002189232.1) IFNAR-1 gene fragments published in GenBank. Special primers were designed for the RACE method to amplify the 3' terminal cDNA. The Columba livia IFNAR-1 displayed 88.5%, 80.5% and 73.8% nucleotide identity to Falco peregrinus, Gallus and Taeniopygia guttata, respectively. Phylogenetic analysis of the IFNAR1 gene showed high homology among Columba livia, Falco peregrinus and chicken. We successfully obtained a partial sequence of the Columba livia IFNAR-1 gene. Analysis of the phylogenetic tree showed high homology between Columba livia and Falco peregrinus IFNAR-1. This result can be used as a reference for further research and practical application.
Molecular cloning and sequencing analysis of the interferon receptor (IFNAR-1) from Columba livia

PubMed Central

Chang, Wei Shan

2014-01-01

Objective Partial sequence cloning of the interferon receptor (IFNAR-1) of Columba livia. Material and methods To obtain a gene fragment of a defined length (630 bp), a pair of primers was designed according to the conserved nucleotide sequences of the Gallus (EU477527.1) and Taeniopygia guttata (XM_002189232.1) IFNAR-1 gene fragments published in GenBank. Special primers were designed for the RACE method to amplify the 3' terminal cDNA. Results The Columba livia IFNAR-1 displayed 88.5%, 80.5% and 73.8% nucleotide identity to Falco peregrinus, Gallus and Taeniopygia guttata, respectively. Phylogenetic analysis of the IFNAR1 gene showed high homology among Columba livia, Falco peregrinus and chicken. Conclusions We successfully obtained a partial sequence of the Columba livia IFNAR-1 gene. Analysis of the phylogenetic tree showed high homology between Columba livia and Falco peregrinus IFNAR-1. This result can be used as a reference for further research and practical application. PMID:26155117
Molecular cloning, characterization, and expression of human ADP-ribosylation factors: Two guanine nucleotide-dependent activators of cholera toxin

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bobak, D.A.; Nightingale, M.S.; Murtagh, J.J.

1989-08-01

ADP-ribosylation factors (ARFs) are small guanine nucleotide-binding proteins that enhance the enzymatic activities of cholera toxin. Two ARF cDNAs, ARF1 and ARF3, were cloned from a human cerebellum library. Based on deduced amino acid sequences and patterns of hybridization of cDNA and oligonucleotide probes with mammalian brain poly(A)+ RNA, human ARF1 is the homologue of bovine ARF1. Human ARF3, which differs from bovine ARF1 and bovine ARF2, appears to represent a newly identified third type of ARF. Hybridization patterns of human ARF cDNA and clone-specific oligonucleotides with poly(A)+ RNA are consistent with the presence of at least two, and perhaps four, separate ARF messages in human brain. In vitro translation of ARF1, ARF2, and ARF3 produced proteins that behaved, by SDS/PAGE, similar to a purified soluble brain ARF. Deduced amino acid sequences of human ARF1 and ARF3 contain regions, similar to those in other G proteins, that are believed to be involved in GTP binding and hydrolysis. ARFs also exhibit a modest degree of homology with a bovine phospholipase C. The observations reported here support the conclusion that the ARFs are members of a multigene family of small guanine nucleotide-binding proteins. Definition of the regulation of ARF mRNAs and of function(s) of recombinant ARF proteins will aid in the elucidation of the physiologic role(s) of ARFs.
Complete Nucleotide Sequence of Watermelon Chlorotic Stunt Virus Originating from Oman

PubMed Central

Khan, Akhtar J.; Akhtar, Sohail; Briddon, Rob W.; Ammara, Um; Al-Matrooshi, Abdulrahman M.; Mansoor, Shahid

2012-01-01

Watermelon chlorotic stunt virus (WmCSV) is a bipartite begomovirus (genus Begomovirus, family Geminiviridae) that causes economic losses to cucurbits, particularly watermelon, across the Middle East and North Africa. Recently squash (Cucurbita moschata) grown in an experimental field in Oman was found to display symptoms such as leaf curling, yellowing and stunting, typical of a begomovirus infection. Sequence analysis of the virus isolated from squash showed 97.6–99.9% nucleotide sequence identity to previously described WmCSV isolates for the DNA A component and 93–98% identity for the DNA B component. Agrobacterium-mediated inoculation to Nicotiana benthamiana resulted in the development of symptoms fifteen days post inoculation. This is the first bipartite begomovirus identified in Oman. Overall the Oman isolate showed the highest levels of sequence identity to a WmCSV isolate originating from Iran, which was confirmed by phylogenetic analysis. This suggests that WmCSV present in Oman has been introduced from Iran. The significance of this finding is discussed. PMID:22852046
Complete nucleotide sequence of watermelon chlorotic stunt virus originating from Oman.

PubMed

Khan, Akhtar J; Akhtar, Sohail; Briddon, Rob W; Ammara, Um; Al-Matrooshi, Abdulrahman M; Mansoor, Shahid

2012-07-01

Watermelon chlorotic stunt virus (WmCSV) is a bipartite begomovirus (genus Begomovirus, family Geminiviridae) that causes economic losses to cucurbits, particularly watermelon, across the Middle East and North Africa. Recently squash (Cucurbita moschata) grown in an experimental field in Oman was found to display symptoms such as leaf curling, yellowing and stunting, typical of a begomovirus infection. Sequence analysis of the virus isolated from squash showed 97.6-99.9% nucleotide sequence identity to previously described WmCSV isolates for the DNA A component and 93-98% identity for the DNA B component. Agrobacterium-mediated inoculation to Nicotiana benthamiana resulted in the development of symptoms fifteen days post inoculation. This is the first bipartite begomovirus identified in Oman. Overall the Oman isolate showed the highest levels of sequence identity to a WmCSV isolate originating from Iran, which was confirmed by phylogenetic analysis. This suggests that WmCSV present in Oman has been introduced from Iran. The significance of this finding is discussed.
Heterologous Array Analysis in Pinaceae: Hybridization of Pinus Taeda cDNA Arrays With cDNA From Needles and Embryogenic Cultures of P. Taeda, P. Sylvestris or Picea Abies

PubMed Central

van Zyl, Leonel; von Arnold, Sara; Bozhkov, Peter; Chen, Yongzhong; Egertsdotter, Ulrika; MacKay, John; Sederoff, Ronald R.; Shen, Jing; Zelena, Lyubov

2002-01-01

Hybridization of labelled cDNA from various cell types with high-density arrays of expressed sequence tags is a powerful technique for investigating gene expression. Few conifer cDNA libraries have been sequenced. Because of the high level of sequence conservation between Pinus and Picea we have investigated the use of arrays from one genus for studies of gene expression in the other. The partial cDNAs from 384 identifiable genes expressed in differentiating xylem of Pinus taeda were printed on nylon membranes in randomized replicates. These were hybridized with labelled cDNA from needles or embryogenic cultures of Pinus taeda, P. sylvestris and Picea abies, and with labelled cDNA from leaves of Nicotiana tabacum. The Spearman correlation of gene expression for pairs of conifer species was high for needles (r² = 0.78–0.86), and somewhat lower for embryogenic cultures (r² = 0.68–0.83). The correlation of gene expression for tobacco leaves and needles of each of the three conifer species was lower but sufficiently high (r² = 0.52–0.63) to suggest that many partial gene sequences are conserved in angiosperms and gymnosperms. Heterologous probing was further used to identify tissue-specific gene expression over species boundaries. To evaluate the significance of differences in gene expression, conventional parametric tests were compared with permutation tests after four methods of normalization. Permutation tests after Z-normalization provide the highest degree of discrimination but may enhance the probability of type I errors. It is concluded that arrays of cDNA from loblolly pine are useful for studies of gene expression in other pines or spruces. PMID:18629264
Cloning and High-Level Expression of α-Galactosidase cDNA from Penicillium purpurogenum

PubMed Central

Shibuya, Hajime; Nagasaki, Hiroaki; Kaneko, Satoshi; Yoshida, Shigeki; Park, Gwi Gun; Kusakabe, Isao; Kobayashi, Hideyuki

1998-01-01

The cDNA coding for Penicillium purpurogenum α-galactosidase (αGal) was cloned and sequenced. The deduced amino acid sequence of the α-Gal cDNA showed that the mature enzyme consisted of 419 amino acid residues with a molecular mass of 46,334 Da. The derived amino acid sequence of the enzyme showed similarity to eukaryotic αGals from plants, animals, yeasts, and filamentous fungi. The highest similarity observed (57% identity) was to Trichoderma reesei AGLI. The cDNA was expressed in Saccharomyces cerevisiae under the control of the yeast GAL10 promoter. Almost all of the enzyme produced was secreted into the culture medium, and the expression level reached was approximately 0.2 g/liter. The recombinant enzyme purified to homogeneity was highly glycosylated, showed slightly higher specific activity, and exhibited properties almost identical to those of the native enzyme from P. purpurogenum in terms of the N-terminal amino acid sequence, thermoactivity, pH profile, and mode of action on galacto-oligosaccharides. PMID:9797312
Evaluation of vector-primed cDNA library production from microgram quantities of total RNA.

PubMed

Kuo, Jonathan; Inman, Jason; Brownstein, Michael; Usdin, Ted B

2004-12-15

cDNA sequences are important for defining the coding region of genes, and full-length cDNA clones have proven to be useful for investigation of the function of gene products. We produced cDNA libraries containing 3.5–5 × 10^5 primary transformants, starting with 5 μg of total RNA prepared from mouse pituitary, adrenal, thymus, and pineal tissue, using a vector-primed cDNA synthesis method. Of approximately 1000 clones sequenced, approximately 20% contained the full open reading frames (ORFs) of known transcripts, based on the presence of the initiating methionine residue codon. The libraries were complex, with 94, 91, 83 and 55% of the clones from the thymus, adrenal, pineal and pituitary libraries, respectively, represented only once. Twenty-five full-length clones, not yet represented in the Mammalian Gene Collection, were identified. Thus, we have produced useful cDNA libraries for the isolation of full-length cDNA clones that are not yet available in the public domain, and demonstrated the utility of a simple method for making high-quality libraries from small amounts of starting material.
Nucleotide sequence analysis establishes the role of endogenous murine leukemia virus DNA segments in formation of recombinant mink cell focus-forming murine leukemia viruses.

PubMed Central

Khan, A S

1984-01-01

The sequence of 363 nucleotides near the 3' end of the pol gene and 564 nucleotides from the 5' terminus of the env gene in an endogenous murine leukemia viral (MuLV) DNA segment, cloned from AKR/J mouse DNA and designated as A-12, was obtained. For comparison, the nucleotide sequence in an analogous portion of AKR mink cell focus-forming (MCF) 247 MuLV provirus was also determined. Sequence features unique to MCF247 MuLV DNA in the 3' pol and 5' env regions were identified by comparison with nucleotide sequences in analogous regions of NFS-Th-1 xenotropic and AKR ecotropic MuLV proviruses. These included (i) an insertion of 12 base pairs encoding four amino acids located 60 base pairs from the 3' terminus of the pol gene and immediately preceding the env gene, (ii) the deletion of 12 base pairs (encoding four amino acids) and the insertion of 3 base pairs (encoding one amino acid) in the 5' portion of the env gene, and (iii) single base substitutions resulting in 2 MCF247-specific amino acids in the 3' pol and 23 in the 5' env regions. Nucleotide sequence comparison involving the 3' pol and 5' env regions of AKR MCF247, NFS xenotropic, and AKR ecotropic MuLV proviruses with the cloned endogenous MuLV DNA indicated that MCF247 proviral DNA sequences were conserved in the cloned endogenous MuLV proviral segment. In fact, total nucleotide sequence identity existed between the endogenous MuLV DNA and the MCF247 MuLV provirus in the 3' portion of the pol gene. In the 5' env region, only 4 of 564 nucleotides were different, resulting in three amino acid changes between AKR MCF247 MuLV DNA and the endogenous MuLV DNA present in clone A-12. In addition, nucleotide sequence comparison indicated that Moloney- and Friend-MCF MuLVs were also highly related in the 3' pol and 5' env regions to the cloned endogenous MuLV DNA. These results establish the role of endogenous MuLV DNA segments in generation of recombinant MCF viruses. PMID:6328017
T box transcription antitermination riboswitch: Influence of nucleotide sequence and orientation on tRNA binding by the antiterminator element

PubMed Central

Fauzi, Hamid; Agyeman, Akwasi; Hines, Jennifer V.

2008-01-01

Many bacteria utilize riboswitch transcription regulation to monitor and appropriately respond to cellular levels of important metabolites or effector molecules. The T box transcription antitermination riboswitch responds to cognate uncharged tRNA by specifically stabilizing an antiterminator element in the 5′-untranslated mRNA leader region and precluding formation of a thermodynamically more stable terminator element. Stabilization occurs when the tRNA acceptor end base pairs with the first four nucleotides in the seven nucleotide bulge of the highly conserved antiterminator element. The significance of the conservation of the antiterminator bulge nucleotides that do not base pair with the tRNA is unknown, but they are required for optimal function. In vitro selection was used to determine if the isolated antiterminator bulge context alone dictates the mode in which the tRNA acceptor end binds the bulge nucleotides. No sequence conservation beyond complementarity was observed and the location was not constrained to the first four bases of the bulge. The results indicate that formation of a structure that recognizes the tRNA acceptor end in isolation is not the determinant driving force for the high phylogenetic sequence conservation observed within the antiterminator bulge. Additional factors or T box leader features more likely influenced the phylogenetic sequence conservation. PMID:19152843
The repeating nucleotide sequence in the repetitive mitochondrial DNA from a "low-density" petite mutant of yeast.

PubMed Central

Van Kreijl, C F; Bos, J L

1977-01-01

The repeating nucleotide sequence of 68 base pairs in the mtDNA from an ethidium-induced cytoplasmic petite mutant of yeast has been determined. For sequence analysis, specifically primed and terminated RNA copies, obtained by in vitro transcription of the separated strands, were used. The sequence consists of 66 consecutive AT base pairs flanked by two GC pairs and comprises nearly all of the mutant mitochondrial genome. The sequence, moreover, also represents the first part of wild-type mtDNA sequenced so far. PMID:198740
Unlinking the methylome pattern from nucleotide sequence, revealed by large-scale in vivo genome engineering and methylome editing in medaka fish

PubMed Central

Nakamura, Ryohei; Uno, Ayako; Kumagai, Masahiko; Fukushima, Hiroto S.; Morishita, Shinichi; Takeda, Hiroyuki

2017-01-01

The heavily methylated vertebrate genomes are punctuated by stretches of poorly methylated DNA sequences that usually mark gene regulatory regions. It is known that the methylation state of these regions confers transcriptional control over their associated genes. Given its governance of the transcriptome, cellular functions and identity, the genome-wide DNA methylation pattern is tightly regulated and evidently predefined. However, how the methylation pattern is determined in vivo remains enigmatic. Based on in silico and in vitro evidence, recent studies proposed that the regional hypomethylated state is primarily determined by local DNA sequence, e.g., high CpG density and presence of specific transcription factor binding sites. Nonetheless, the dependency of DNA methylation on nucleotide sequence has not been carefully validated in vertebrates in vivo. Herein, with the use of medaka (Oryzias latipes) as a model, the sequence dependency of DNA methylation was intensively tested in vivo. Our statistical modeling confirmed the strong statistical association between nucleotide sequence pattern and methylation state in the medaka genome. However, by manipulating the methylation state of a number of genomic sequences and reintegrating them into medaka embryos, we demonstrated that artificially conferred DNA methylation states were predominantly and robustly maintained in vivo, regardless of their sequences and endogenous states. This feature was also observed in the medaka transgene that had passed across generations. Thus, despite the observed statistical association, nucleotide sequence was unable to autonomously determine its own methylation state in medaka in vivo. Our results apparently argue against the notion of the governance of DNA methylation by nucleotide sequence, but instead suggest the involvement of other epigenetic factors in defining and maintaining the DNA methylation landscape. Further investigation in other vertebrate models in vivo will be needed
Identification and nucleotide sequence analysis of the repetitive DNA element in the genome of fish lymphocystis disease virus.

PubMed

Schnitzler, P; Delius, H; Scholz, J; Touray, M; Orth, E; Darai, G

1987-12-01

The genome of the fish lymphocystis disease virus (FLDV) was screened for the existence of repetitive DNA sequences using a defined and complete gene library of the viral genome (98 kbp) by DNA-DNA hybridization, heteroduplex analysis, and restriction fine mapping. A repetitive DNA sequence was detected at the coordinates 0.034 to 0.057 and 0.718 to 0.736 map units (m.u.) of the FLDV genome. The first region (0.034 to 0.057 m.u.) corresponds to the 5' terminus of the EcoRI FLDV DNA fragment B (0.034 to 0.165 m.u.) and the second region (0.718 to 0.736 m.u.) is identical to the EcoRI DNA fragment M of the viral genome. The DNA nucleotide sequence of the EcoRI FLDV DNA fragment M was determined. This analysis revealed the presence of many short direct and inverted repetitions, e.g., an 18-mer direct repetition (TTTAAAATTTAATTAA) that started at nucleotide positions 812 and 942 and a 14-mer inverted repeat (TTAAATTTAAATTT) at nucleotide positions 820 and 959. Only short open reading frames were detected within this region. The DNA repetitions are discussed as sequences that may play a regulatory role in virus replication. Furthermore, hybridization experiments revealed that the repetitive DNA sequences are conserved in the genome of different strains of fish lymphocystis disease virus isolated from two species of Pleuronectidae (flounder and dab).
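Direct and inverted repetitions of the kind reported above can be located with a simple k-mer scan. The sketch below is illustrative only: the function names and the toy sequence are assumptions, and it is not the procedure the authors used. A direct repeat is taken here to be a k-mer that occurs more than once on the same strand, and an inverted repeat to be a k-mer followed downstream by its reverse complement.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))

def find_all(seq, motif):
    """0-based start positions of every (possibly overlapping) occurrence."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits

def direct_repeats(seq, k):
    """Map each k-mer occurring more than once to its start positions."""
    seen = {}
    for i in range(len(seq) - k + 1):
        seen.setdefault(seq[i:i + k], []).append(i)
    return {kmer: pos for kmer, pos in seen.items() if len(pos) > 1}

def inverted_repeats(seq, k):
    """Pairs (i, j) with i < j where the k-mer at j is the reverse
    complement of the k-mer at i."""
    pairs = []
    for i in range(len(seq) - k + 1):
        rc = reverse_complement(seq[i:i + k])
        for j in find_all(seq, rc):
            if j > i:
                pairs.append((i, j))
    return pairs

# Toy sequence, not viral data: AAACCC at position 0 and its
# reverse complement GGGTTT at position 10.
toy = "AAACCCTTTTGGGTTT"
toy_inverted = inverted_repeats(toy, 6)
```

Positions returned here are 0-based, whereas nucleotide positions in sequence reports are conventionally 1-based, so one would add 1 before comparing with published coordinates.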
Repeated sequence sets in mitochondrial DNA molecules of root knot nematodes (Meloidogyne): nucleotide sequences, genome location and potential for host-race identification.

PubMed Central

Okimoto, R; Chamberlin, H M; Macfarlane, J L; Wolstenholme, D R

1991-01-01

Within a 7 kb segment of the mtDNA molecule of the root knot nematode, Meloidogyne javanica, that lacks standard mitochondrial genes, are three sets of strictly tandemly arranged, direct repeat sequences: approximately 36 copies of a 102 ntp sequence that contains a TaqI site; 11 copies of a 63 ntp sequence, and 5 copies of an 8 ntp sequence. The 7 kb repeat-containing segment is bounded by putative tRNA(asp) and tRNA(f-met) genes and the arrangement of sequences within this segment is: the tRNA(asp) gene; a unique 1,528 ntp segment that contains two highly stable hairpin-forming sequences; the 102 ntp repeat set; the 8 ntp repeat set; a unique 1,068 ntp segment; the 63 ntp repeat set; and the tRNA(f-met) gene. The nucleotide sequences of the 102 ntp copies and the 63 ntp copies have been conserved among the species examined. Data from Southern hybridization experiments indicate that 102 ntp and 63 ntp repeats occur in the mtDNAs of three, two and two races of M. incognita, M. hapla and M. arenaria, respectively. Nucleotide sequences of the M. incognita Race-3 102 ntp repeat were found to be either identical or highly similar to those of the M. javanica 102 ntp repeat. Differences in migration distance and number of 102 ntp repeat-containing bands seen in Southern hybridization autoradiographs of restriction-digested mtDNAs of M. javanica and the different host races of M. incognita, M. hapla and M. arenaria are sufficient to distinguish the different host races of each species. PMID:2027769
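As a consistency check on the figures quoted above, the repeat sets and unique segments can be summed to see how they account for the roughly 7 kb region. This is a worked-arithmetic sketch: the copy number of the 102 ntp repeat is stated only as approximate in the abstract, and the flanking tRNA genes are left out of the total.

```python
# (copies, unit length in ntp) for each tandem repeat set, from the abstract.
repeat_sets = {
    "102 ntp repeat": (36, 102),  # "approximately 36 copies"
    "63 ntp repeat": (11, 63),
    "8 ntp repeat": (5, 8),
}
# Unique (non-repeated) segments between the repeat sets, in ntp.
unique_segments = [1528, 1068]

repeat_total = sum(copies * unit for copies, unit in repeat_sets.values())
segment_total = repeat_total + sum(unique_segments)

print(f"repeats: {repeat_total} ntp, repeats + unique: {segment_total} ntp")
```

The repeats contribute 3,672 + 693 + 40 = 4,405 ntp and the unique segments 2,596 ntp, for 7,001 ntp in total, which agrees with the stated 7 kb segment.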
Isolating Viral and Host RNA Sequences from Archival Material and Production of cDNA Libraries for High-Throughput DNA Sequencing

PubMed Central

Xiao, Yongli; Sheng, Zong-Mei; Taubenberger, Jeffery K.

2015-01-01

The vast majority of surgical biopsy and post-mortem tissue samples are formalin-fixed and paraffin-embedded (FFPE), but this process leads to RNA degradation that limits gene expression analysis. As an example, the viral RNA genome of the 1918 pandemic influenza A virus was previously determined in a 9-year effort by overlapping RT-PCR from post-mortem samples. Using the protocols described here, the full genome of the 1918 virus at high coverage was determined in one high-throughput sequencing run of a cDNA library derived from total RNA of a 1918 FFPE sample after duplex-specific nuclease treatments. This basic methodological approach should assist in the analysis of FFPE tissue samples isolated over the past century from a variety of infectious diseases. PMID:26344216

Sequencing analysis of 20,000 full-length cDNA clones from cassava reveals lineage specific expansions in gene families related to stress response

PubMed Central

Sakurai, Tetsuya; Plata, Germán; Rodríguez-Zapata, Fausto; Seki, Motoaki; Salcedo, Andrés; Toyoda, Atsushi; Ishiwata, Atsushi; Tohme, Joe; Sakaki, Yoshiyuki; Shinozaki, Kazuo; Ishitani, Manabu

2007-01-01

Background: Cassava, an allotetraploid known for its remarkable tolerance to abiotic stresses, is an important source of energy for humans and animals and a raw material for many industrial processes. A full-length cDNA library of cassava plants under normal, heat, drought, aluminum and post-harvest physiological deterioration conditions was built; 19968 clones were sequence-characterized using expressed sequence tags (ESTs). Results: The ESTs were assembled into 6355 contigs and 9026 singletons that were further grouped into 10577 scaffolds; we found 4621 new cassava sequences and 1521 sequences with no significant similarity to plant protein databases. Transcripts of 7796 distinct genes were captured and we were able to assign a functional classification to 78% of them while finding more than half of the enzymes annotated in metabolic pathways in Arabidopsis. The annotation of sequences that were not paired to transcripts of other species included many stress-related functional categories showing that our library is enriched with stress-induced genes. Finally, we detected 230 putative gene duplications that include key enzymes in reactive oxygen species signaling pathways and could play a role in cassava stress response features. Conclusion: The cassava full-length cDNA library here presented contains transcripts of genes involved in stress response as well as genes important for different areas of cassava research. This library will be an important resource for gene discovery, characterization and cloning; in the near future it will aid the annotation of the cassava genome. PMID:18096061
Complete cDNA sequence of SAP-like pentraxin from Limulus polyphemus: implications for pentraxin evolution.

PubMed

Tharia, Hazel A; Shrive, Annette K; Mills, John D; Arme, Chris; Williams, Gwyn T; Greenhough, Trevor J

2002-02-22

The serum amyloid P component (SAP)-like pentraxin Limulus polyphemus SAP is a recently discovered, distinct pentraxin species, of known structure, which does not bind phosphocholine and whose N-terminal sequence has been shown to differ markedly from the highly conserved N terminus of all other known horseshoe crab pentraxins. The complete cDNA sequence of Limulus SAP, and the derived amino acid sequence, the first invertebrate SAP-like pentraxin sequence, have been determined. Two sequences were identified that differed only in the length of the 3' untranslated region. Limulus SAP is synthesised as a precursor protein of 234 amino acid residues, the first 17 residues encoding a signal peptide that is absent from the mature protein. Phylogenetic analysis clusters Limulus SAP pentraxin with the horseshoe crab C-reactive proteins (CRPs) rather than the mammalian SAPs, which are clustered with mammalian CRPs. The deduced amino acid sequence shares 22% identity with both human SAP and CRP, which are 51% identical, and 31-35% with horseshoe crab CRPs. These analyses indicate that gene duplication of CRP (or SAP), followed by sequence divergence and the evolution of CRP and/or SAP function, occurred independently along the chordate and arthropod evolutionary lines rather than in a common ancestor. They further indicate that the CRP/SAP gene duplication event in Limulus occurred before both the emergence of the Limulus CRP variants and the mammalian CRP/SAP gene duplication. Limulus SAP, which does not exhibit the CRP characteristic of calcium-dependent binding to phosphocholine, is established as a pentraxin species distinct from all other known horseshoe crab pentraxins that exist in many variant forms sharing a high level of sequence homology. Copyright 2002 Elsevier Science Ltd.
αi-3 cDNA encodes the α subunit of Gk, the stimulatory G protein of receptor-regulated K+ channels

DOE Office of Scientific and Technical Information (OSTI.GOV)

Codina, J.; Olate, J.; Abramowitz, J.

1988-05-15

cDNA cloning has identified the presence in the human genome of three genes encoding α subunits of pertussis toxin substrates, generically called Gi. They are named αi-1, αi-2 and αi-3. However, none of these genes has been functionally identified with any of the α subunits of several possible G proteins, including pertussis toxin-sensitive Gp's, stimulatory to phospholipase C or A2, Gi, inhibitory to adenylyl cyclase, or Gk, stimulatory to a type of K+ channels. The authors now report the nucleotide sequence and the complete predicted amino acid sequence of human liver αi-3 and the partial amino acid sequence of proteolytic fragments of the α subunit of human erythrocyte Gk. The amino acid sequence of the proteolytic fragment is uniquely encoded by the cDNA of αi-3, thus identifying it as αk. The probable identity of αi-1 with αp and possible roles for αi-2, as well as additional roles for αi-1 and αi-3 (αk) are discussed.
Nucleotide Sequence and Genetic Structure of a Novel Carbaryl Hydrolase Gene (cehA) from Rhizobium sp. Strain AC100

PubMed Central

Hashimoto, Masayuki; Fukui, Mitsuru; Hayano, Kouichi; Hayatsu, Masahito

2002-01-01

Rhizobium sp. strain AC100, which is capable of degrading carbaryl (1-naphthyl-N-methylcarbamate), was isolated from soil treated with carbaryl. This bacterium hydrolyzed carbaryl to 1-naphthol and methylamine. Carbaryl hydrolase from the strain was purified to homogeneity, and its N-terminal sequence, molecular mass (82 kDa), and enzymatic properties were determined. The purified enzyme hydrolyzed 1-naphthyl acetate and 4-nitrophenyl acetate, indicating that the enzyme is an esterase. We then cloned the carbaryl hydrolase gene (cehA) from the plasmid DNA of the strain and determined the nucleotide sequence of the 10-kb region containing cehA. No homologous sequences were found by a database homology search using the nucleotide and deduced amino acid sequences of the cehA gene. Six open reading frames including the cehA gene were found in the 10-kb region, and sequencing analysis shows that the cehA gene is flanked by two copies of an insertion sequence-like sequence, suggesting that it forms part of a composite transposon. PMID:11872471
ANCAC: amino acid, nucleotide, and codon analysis of COGs – a tool for sequence bias analysis in microbial orthologs

PubMed Central

2012-01-01

Background: The COG database is the most popular collection of orthologous proteins from many different completely sequenced microbial genomes. Per definition, a cluster of orthologous groups (COG) within this database exclusively contains proteins that most likely achieve the same cellular function. Recently, the COG database was extended by assigning to every protein both the corresponding amino acid and its encoding nucleotide sequence resulting in the NUCOCOG database. This extended version of the COG database is a valuable resource connecting sequence features with the functionality of the respective proteins. Results: Here we present ANCAC, a web tool and MySQL database for the analysis of amino acid, nucleotide, and codon frequencies in COGs on the basis of freely definable phylogenetic patterns. We demonstrate the usefulness of ANCAC by analyzing amino acid frequencies, codon usage, and GC-content in a species- or function-specific context. With respect to amino acids we, at least in part, confirm the cognate bias hypothesis by using ANCAC’s NUCOCOG dataset as the largest one available for that purpose thus far. Conclusions: Using the NUCOCOG datasets, ANCAC connects taxonomic, amino acid, and nucleotide sequence information with the functional classification via COGs and provides a GUI for flexible mining for sequence-bias. Thereby, to our knowledge, it is the only tool for the analysis of sequence composition in the light of physiological roles and phylogenetic context without requirement of substantial programming-skills. PMID:22958836
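The quantities ANCAC reports (codon frequencies and GC-content) are straightforward to compute for a single coding sequence. The sketch below shows the general idea only; it is not ANCAC's implementation or interface, and the function names and the example sequence are assumptions made for illustration.

```python
from collections import Counter

def gc_content(seq):
    """Fraction of bases that are G or C (0.0 to 1.0)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def codon_frequencies(cds):
    """Relative frequency of each codon in an in-frame coding sequence.

    Trailing bases that do not fill a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}

# Hypothetical four-codon open reading frame: ATG GCC GCC TAA.
example = "ATGGCCGCCTAA"
freqs = codon_frequencies(example)
gc = gc_content(example)
```

Aggregating such per-sequence counts over all members of an orthologous group, or over all genes of one species, would give the group-level or species-level profiles that the abstract describes.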
The nucleotide sequences of 5S rRNAs from a fern Dryopteris acuminata and a horsetail Equisetum arvense.

PubMed Central

Hori, H; Osawa, S; Takaiwa, F; Sugiura, M

1984-01-01

The nucleotide sequences of 5S rRNAs from two Pteridophyta species, a fern Dryopteris acuminata and a horsetail Equisetum arvense, have been determined. These two sequences are more closely related to those of the Bryophyta species (88% identity on average) than to those of seed plants (84% identity on average). PMID:6538332
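Percent identity figures such as those above are computed from aligned sequences. A minimal sketch follows, assuming the two sequences are already aligned to equal length and that columns containing a gap are excluded from the comparison; conventions for handling gaps differ between studies, and the abstract does not state which one was used. The example strings are invented.

```python
def percent_identity(a, b, gap="-"):
    """Percent of compared alignment columns holding identical residues.

    Columns where either sequence has a gap are skipped (an assumption;
    other conventions count gaps as mismatches).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared

# Invented aligned fragments, not real 5S rRNA data.
identity = percent_identity("GCAUGC", "GCAAGC")
```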
3' rapid amplification of cDNA ends (RACE) walking for rapid structural analysis of large transcripts.

PubMed

Ozawa, Tatsuhiko; Kondo, Masato; Isobe, Masaharu

2004-01-01

The 3' rapid amplification of cDNA ends (3' RACE) is widely used to isolate the cDNA of unknown 3' flanking sequences. However, the conventional 3' RACE often fails to amplify cDNA from a large transcript if there is a long distance between the 5' gene-specific primer and poly(A) stretch, since the conventional 3' RACE utilizes 3' oligo-dT-containing primer complementary to the poly(A) tail of mRNA at the first strand cDNA synthesis. To overcome this problem, we have developed an improved 3' RACE method suitable for the isolation of cDNA derived from very large transcripts. By using the oligonucleotide-containing random 9mer together with the GC-rich sequence for the suppression PCR technology at the first strand of cDNA synthesis, we have been able to amplify the cDNA from a very large transcript, such as the microtubule-actin crosslinking factor 1 (MACF1) gene, which codes a transcript of 20 kb in size. When there is no splicing variant, our highly specific amplification allows us to perform the direct sequencing of 3' RACE products without requiring cloning in bacterial hosts. Thus, this stepwise 3' RACE walking will help rapid characterization of the 3' structure of a gene, even when it encodes a very large transcript.
cDNA cloning of Brassica napus malonyl-CoA:ACP transacylase (MCAT) (fab D) and complementation of an E. coli MCAT mutant.

PubMed

Simon, J W; Slabas, A R

1998-09-18

The GenBank database was searched using the E. coli malonyl CoA:ACP transacylase (MCAT) sequence, for plant protein/cDNA sequences corresponding to MCAT, a component of plant fatty acid synthetase (FAS), for which the plant cDNA has not been isolated. A 272-bp Zea mays EST sequence (GenBank accession number: AA030706) was identified which has strong homology to the E. coli MCAT. A PCR-derived cDNA probe from Zea mays was used to screen a Brassica napus (rape) cDNA library. This resulted in the isolation of a 1200-bp cDNA clone which encodes an open reading frame corresponding to a protein of 351 amino acids. The protein shows 47% homology to the E. coli MCAT amino acid sequence in the coding region for the mature protein. Expression of a plasmid (pMCATrap2) containing the plant cDNA sequence in Fab D89, an E. coli mutant in MCAT activity, restores growth, demonstrating functional complementation and direct function of the cloned cDNA. This is the first functional evidence supporting the identification of a plant cDNA for MCAT.
Nucleotide sequencing and characterization of the genes encoding benzene oxidation enzymes of Pseudomonas putida.

PubMed Central

Irie, S; Doi, S; Yorifuji, T; Takagi, M; Yano, K

1987-01-01

The nucleotide sequence of the genes from Pseudomonas putida encoding oxidation of benzene to catechol was determined. Five open reading frames were found in the sequence. Four corresponding protein molecules were detected by a DNA-directed in vitro translation system. Escherichia coli cells containing the fragment with the four open reading frames transformed benzene to cis-benzene glycol, which is an intermediate of the oxidation of benzene to catechol. The relation between the product of each cistron and the components of the benzene oxidation enzyme system is discussed. PMID:3667527
Large-Scale Concatenation cDNA Sequencing

PubMed Central

Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.

1997-01-01

A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
Cloning and sequence analysis of a cDNA encoding the alpha-subunit of mouse beta-N-acetylhexosaminidase and comparison with the human enzyme.

PubMed Central

Beccari, T; Hoade, J; Orlacchio, A; Stirling, J L

1992-01-01

cDNAs encoding the mouse beta-N-acetylhexosaminidase alpha-subunit were isolated from a mouse testis library. The longest of these (1.7 kb) was sequenced and showed 83% similarity with the human alpha-subunit cDNA sequence. The 5' end of the coding sequence was obtained from a genomic DNA clone. Alignment of the human and mouse sequences showed that all three putative N-glycosylation sites are conserved, but that the mouse alpha-subunit has an additional site towards the C-terminus. All eight cysteines in the human sequence are conserved in the mouse. There are an additional two cysteines in the mouse alpha-subunit signal peptide. All amino acids affected in Tay-Sachs-disease mutations are conserved in the mouse. PMID:1379046
Overproduction and nucleotide sequence of the respiratory D-lactate dehydrogenase of Escherichia coli.

PubMed Central

Rule, G S; Pratt, E A; Chin, C C; Wold, F; Ho, C

1985-01-01

Recombinant DNA plasmids containing the gene for the membrane-bound D-lactate dehydrogenase (D-LDH) of Escherichia coli linked to the promoter PL from lambda were constructed. After induction, the levels of D-LDH were elevated 300-fold over that of the wild type and amounted to 35% of the total cellular protein. The nucleotide sequence of the D-LDH gene was determined and shown to agree with the amino acid composition and the amino-terminal sequence of the purified enzyme. Removal of the amino-terminal formyl-Met from D-LDH was not inhibited in cells which contained these high levels of D-LDH. Images PMID:3882663
Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms

PubMed Central

Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

2015-01-01

Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared the obtained sequence information with the available donkey draft genome (and the Illumina reads from which it originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions, we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated by combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450
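A percentage divergence of this kind can be illustrated with the simplest possible estimator: the fraction of aligned, unambiguous sites at which two sequences differ. The abstract does not say how the divergence was computed, so the function below (a hypothetical name) is a sketch under that assumption, not the authors' pipeline.

```python
def nucleotide_divergence(ref: str, query: str) -> float:
    """Percent of compared sites that differ between two pre-aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch).
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(ref.upper(), query.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * mismatches / compared


# Toy alignment: one mismatch over ten comparable sites.
print(nucleotide_divergence("ACGTACGTAC", "ACGTACGTAT"))
```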
cDNA sequences and organization of IgM heavy chain genes in two holostean fish.

PubMed

Wilson, M R; van Ravenstein, E; Miller, N W; Clem, L W; Middleton, D L; Warr, G W

1995-01-01

Immunoglobulin M heavy chain (mu) sequences of two holostean fish, the bowfin, Amia calva, and the longnose gar, Lepisosteus osseus, were amplified from spleen mRNA by RACE-PCR, cloned, and sequenced. Each mu chain showed the conserved four constant domain structure typical of a secreted mu chain. Southern blot analyses with specific heavy chain variable (VH) and constant (CH) region probes suggest that both fish possess an IgH locus that resembles that of the teleosts, amphibians, and mammals in its organization. The overall sequence similarity of gar and bowfin mu chains was 60% and 48% at the nucleotide and amino acid levels, respectively, while similarity to the mu chains of teleosts and elasmobranchs was lower. The bowfin mu chain possesses a distinctive proline-rich sequence at the C mu 1/C mu 2 boundary; a shorter proline-rich sequence is present at this position in the gar mu chain. Both gar and bowfin show, in their C mu 4 sequences, motifs that could serve as cryptic splice donor sites for the production of mRNA encoding the membrane-bound form of the mu chains, and the bowfin also shows a potential cryptic splice donor site in the C mu 3 exon.
The Status, Quality, and Expansion of the NIH Full-Length cDNA Project: The Mammalian Gene Collection (MGC)

PubMed Central

2004-01-01

The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5′-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline. PMID:15489334
Isolation and characterization of a cDNA encoding a lipid transfer protein expressed in 'Valencia' orange during abscission.

PubMed

Wu, Zhencai; Burns, Jacqueline K

2003-04-01

The genetics and expression of a lipid transfer protein (LTP) gene was examined during abscission of mature fruit of 'Valencia' orange. A cDNA encoding an LTP, CsLTP, was isolated from a cDNA subtraction library constructed from mature fruit abscission zones 48 h after application of a mature fruit-specific abscission agent, 5-chloro-3-methyl-4-nitro-pyrazole (CMN-pyrazole). A full-length cDNA clone of 652 nucleotides was isolated using 5' and 3' RACE followed by cDNA library screening and PCR amplification. The cDNA clone encoded a protein of 155 amino acid residues with a molecular mass and isoelectric point of 9.18 kDa and 9.12, respectively. A partial genomic clone of 505 nucleotides containing one intron of 101 base pairs was amplified from leaf genomic DNA. Southern blot hybridization demonstrated that at least two closely related CsLTP genes are present in 'Valencia' orange. Temporal expression patterns in mature fruit abscission zones were examined by northern hybridization. Increased expression of CsLTP mRNA was detected in RNA of mature fruit abscission zones 6, 24, 48, and 72 h after application of a non-specific abscission agent, ethephon. Low expression of CsLTP transcripts was observed after treatment of CMN-pyrazole until 24 h after application. After this time, expression markedly increased. The results suggest that CsLTP has a role in the abscission process, possibly by assisting transport of cutin monomers to the fracture plane of the abscission zone or through its anti-microbial activity by reducing the potential of microbial attack.
Identifications of SUMO-1 cDNA and Its Expression Patterns in Pacific White Shrimp Litopenaeus vannamei

PubMed Central

Laoong-u-thai, Yanisa; Zhao, Baoping; Phongdara, Amornrat; Ako, Harry; Yang, Jinzeng

2009-01-01

Small ubiquitin-like modifiers (SUMO) work in a similar way to ubiquitin, altering the biological properties of a target protein by conjugation. A shrimp SUMO cDNA named LvSUMO-1 was identified in Litopenaeus vannamei. The LvSUMO-1 cDNA contains a coding sequence of 282 nucleotides with untranslated regions of 37 bp at the 5'-end and 347 bp at the 3'-end. The deduced 93 amino acids exhibit 83% identity with the Western honeybee SUMO-1, and more than 65% homology with human and mouse SUMO-1. LvSUMO-1 mRNA is expressed in most L. vannamei tissues, with the highest level in the hepatopancreas. The mRNA expression of LvSUMO-1 over the developmental stages of L. vannamei is distinguished by a low level in the nauplius stage and a relatively high level in the postlarva stage, with continuous expression until the juvenile stage. The LvSUMO-1 protein and its conjugated proteins are detected in both cytoplasm and nucleus in several tissues. Interestingly, LvSUMO-1 mRNA levels are high in abdominal muscle during the premolt stage, when protein degradation activity is significant, suggesting a possible role in the regulation of shrimp muscle protein degradation. PMID:19240809
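The lengths quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch assumes that the 282-nucleotide coding sequence includes the stop codon and that the poly(A) tail is not counted; neither point is stated in the abstract.

```python
# Figures quoted in the abstract.
utr5_bp = 37    # 5' untranslated region
cds_nt = 282    # coding sequence
utr3_bp = 347   # 3' untranslated region

# A coding sequence must be a whole number of codons.
assert cds_nt % 3 == 0
codons = cds_nt // 3

# Assumption: the last codon is the stop codon and encodes no residue.
residues = codons - 1

# Assumption: total length excludes any poly(A) tail.
total_bp = utr5_bp + cds_nt + utr3_bp

print(codons, residues, total_bp)
```

Under these assumptions, 94 codons give the 93 deduced amino acids reported above.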
Partial bisulfite conversion for unique template sequencing

PubMed Central

Kumar, Vijay; Rosenbaum, Julie; Wang, Zihua; Forcier, Talitha; Ronemus, Michael; Wigler, Michael

2018-01-01

We introduce a new protocol, mutational sequencing or muSeq, which uses sodium bisulfite to randomly deaminate unmethylated cytosines at a fixed and tunable rate. The muSeq protocol marks each initial template molecule with a unique mutation signature that is present in every copy of the template, and in every fragmented copy of a copy. In the sequenced read data, this signature is observed as a unique pattern of C-to-T or G-to-A nucleotide conversions. Clustering reads with the same conversion pattern enables accurate count and long-range assembly of initial template molecules from short-read sequence data. We explore count and low-error sequencing by profiling 135,000 restriction fragments in a PstI representation, demonstrating that muSeq improves copy number inference and significantly reduces sporadic sequencer error. We explore long-range assembly in the context of cDNA, generating contiguous transcript clusters greater than 3,000 bp in length. The muSeq assemblies reveal transcriptional diversity not observable from short-read data alone. PMID:29161423
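The idea of grouping reads by their conversion pattern can be sketched in a few lines. This is a deliberately simplified illustration, not the muSeq software: it assumes every read covers the same reference interval with no indels, considers only C-to-T conversions on one strand, and ignores sequencing error. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def conversion_signature(reference: str, read: str) -> Tuple[int, ...]:
    """Positions where a reference C is observed as T in the read."""
    return tuple(
        i for i, (r, b) in enumerate(zip(reference, read)) if r == "C" and b == "T"
    )


def cluster_by_signature(reference: str, reads: List[str]) -> Dict[Tuple[int, ...], List[str]]:
    """Group reads that share exactly the same C-to-T conversion pattern."""
    clusters: Dict[Tuple[int, ...], List[str]] = defaultdict(list)
    for read in reads:
        clusters[conversion_signature(reference, read)].append(read)
    return dict(clusters)


# Toy data: three reads derived from two differently converted templates.
reference = "ACCGTCAC"
reads = ["ATCGTCAC", "ATCGTCAC", "ACCGTTAT"]
clusters = cluster_by_signature(reference, reads)
print(len(clusters), "inferred templates")
```

In this toy case the number of clusters serves as the template count; handling partially overlapping fragments, as the abstract describes for long-range assembly, would need a more careful comparison of signatures.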
Cloning and characterization of transferrin cDNA and rapid detection of transferrin gene polymorphism in rainbow trout (Oncorhynchus mykiss).

PubMed

Tange, N; Jong-Young, L; Mikawa, N; Hirono, I; Aoki, T

1997-12-01

A cDNA clone of rainbow trout (Oncorhynchus mykiss) transferrin was obtained from a liver cDNA library. The 2537-bp cDNA sequence contained an open reading frame encoding 691 amino acids and the 5' and 3' noncoding regions. The amino acid sequences at the iron-binding sites and the two N-linked glycosylation sites, and the cysteine residues, were consistent with known, conserved vertebrate transferrin cDNA sequences. A single N-linked glycosylation site was present on each of the N- and C-lobes. The deduced amino acid sequence of the rainbow trout transferrin cDNA shared 92.9% identity with transferrin of coho salmon (Oncorhynchus kisutch), 85% with Atlantic salmon (Salmo salar), 67.3% with medaka (Oryzias latipes), 61.3% with Atlantic cod (Gadus morhua), and 59.7% with Japanese flounder (Paralichthys olivaceus). Long and accurate polymerase chain reaction (LA-PCR) was used to amplify approximately 6.5 kb of the transferrin gene from rainbow trout genomic DNA. Restriction fragment length polymorphisms (RFLPs) of the LA-PCR products revealed three digestion patterns in 22 samples.

[Preparation of the cDNA microarray on the differential expressed cDNA of senescence-accelerated mouse's hippocampus].

PubMed

Cheng, Xiao-Rui; Zhou, Wen-Xia; Zhang, Yong-Xiang

2006-05-01

Alzheimer's disease (AD) is the most common form of dementia in the elderly. AD is an invariably fatal neurodegenerative disorder with no effective treatment. Senescence-accelerated mouse prone 8 (SAMP8) is a model for studying age-related cognitive impairments; it is also a good model for studying brain aging and is one mouse model of AD. The cDNA microarray technique can monitor the expression levels of thousands of genes simultaneously and can be used to study AD, which is characterized by multiple mechanisms, targets, and pathways. To elucidate the mechanism of AD and find drug targets for AD, a cDNA microarray containing 3136 cDNAs amplified from the suppression subtracted cDNA library of the hippocampus of SAMP8 and SAMR1 was prepared with 16 blocks and 14 x 14 pins, with the housekeeping genes beta-actin and G3PDH as internal references. The background of this microarray was low and uniform, and the spots were evenly distributed. The conditions of hybridization and washing were optimized during the hybridization of probe and target molecule. After analysis of the hybridization data, the differentially expressed cDNAs were sequenced and analyzed by bioinformatics, some of the genes were quantified by real-time RT-PCR, and the reliability of this cDNA microarray was validated. This cDNA microarray may be a good means to select differentially expressed genes and elucidate the molecular mechanism of SAMP8's brain aging and AD.
cDNA library construction of two human Demodex species.

PubMed

Niu, DongLing; Wang, RuiLing; Zhao, YaE; Yang, Rui; Hu, Li; Lei, YuYang; Dan, WeiChao

2017-06-01

Research on Demodex, a type of pathogen causing various dermatoses in animals and human beings, is lacking at the RNA level. This study aims at extracting RNA and constructing a cDNA library for Demodex. First, P. cuniculi and D. farinae were mixed to establish a homogenization method for RNA extraction. Second, D. folliculorum and D. brevis were collected and preserved in Trizol, and each was mixed with D. farinae to extract RNA. Finally, the cDNA library was constructed and its quality was assessed. The results indicated that for D. folliculorum & D. farinae, the recombination rate of the cDNA library was 90.67% and the library titer was 7.50 × 10^4 pfu/ml; 17 of the 59 positive clones were predicted to be of D. folliculorum. For D. brevis & D. farinae, the recombination rate was 90.96% and the library titer was 7.85 × 10^4 pfu/ml; 40 of the 59 positive clones were predicted to be of D. brevis. Further detection by specific primers demonstrated that mtDNA cox1, cox3 and ATP6 detected from the cDNA libraries had 96.52%-99.73% identities with the corresponding sequences in GenBank. In conclusion, the cDNA libraries constructed for Demodex mixed with D. farinae were successful and could satisfy the requirements for functional gene detection.
Integrating de novo transcriptome assembly and cloning to obtain chicken Ovocleidin-17 full-length cDNA.

PubMed

Zhang, Quan; Liu, Long; Zhu, Feng; Ning, ZhongHua; Hincke, Maxwell T; Yang, Ning; Hou, ZhuoCheng

2014-01-01

Efficiently obtaining full-length cDNA for a target gene is the key step for functional studies and probing genetic variations. However, almost all sequenced domestic animal genomes are not 'finished'. Many functionally important genes are located in these gapped regions. It can be difficult to obtain full-length cDNA for which only partial amino acid/EST sequences exist. In this study we report a general pipeline to obtain full-length cDNA, and illustrate this approach for one important gene (Ovocleidin-17, OC-17) that is associated with chicken eggshell biomineralization. Chicken OC-17 is one of the best candidates to control and regulate the deposition of calcium carbonate in the calcified eggshell layer. OC-17 protein has been purified, sequenced, and has had its three-dimensional structure solved. However, researchers still cannot conduct OC-17 mRNA related studies because the mRNA sequence is unknown and the gene is absent from the current chicken genome. We used RNA-Seq to obtain the entire transcriptome of the adult hen uterus, and then conducted de novo transcriptome assembling with bioinformatics analysis to obtain candidate OC-17 transcripts. Based on this sequence, we used RACE and PCR cloning methods to successfully obtain the full-length OC-17 cDNA. Temporal and spatial OC-17 mRNA expression analyses were also performed to demonstrate that OC-17 is predominantly expressed in the adult hen uterus during the laying cycle and barely at immature developmental stages. Differential uterine expression of OC-17 was observed in hens laying eggs with weak versus strong eggshell, confirming its important role in the regulation of eggshell mineralization and providing a new tool for genetic selection for eggshell quality parameters. This study is the first one to report the full-length OC-17 cDNA sequence, and builds a foundation for OC-17 mRNA related studies. We provide a general method for biologists experiencing difficulty in obtaining candidate gene full
Update on Pneumocystis carinii f. sp. hominis Typing Based on Nucleotide Sequence Variations in Internal Transcribed Spacer Regions of rRNA Genes

PubMed Central

Lee, Chao-Hung; Helweg-Larsen, Jannik; Tang, Xing; Jin, Shaoling; Li, Baozheng; Bartlett, Marilyn S.; Lu, Jang-Jih; Lundgren, Bettina; Lundgren, Jens D.; Olsson, Mats; Lucas, Sebastian B.; Roux, Patricia; Cargnel, Antonietta; Atzori, Chiara; Matos, Olga; Smith, James W.

1998-01-01

Pneumocystis carinii f. sp. hominis isolates from 207 clinical specimens from nine countries were typed based on nucleotide sequence variations in the internal transcribed spacer regions I and II (ITS1 and ITS2, respectively) of rRNA genes. The number of ITS1 nucleotides has been revised from the previously reported 157 bp to 161 bp. Likewise, the number of ITS2 nucleotides has been changed from 177 to 192 bp. The number of ITS1 sequence types has increased from 2 to 15, and that of ITS2 has increased from 3 to 14. The 15 ITS1 sequence types are designated types A through O, and the 14 ITS2 types are named types a through n. A total of 59 types of P. carinii f. sp. hominis were found in this study. PMID:9508304
A weighted sampling algorithm for the design of RNA sequences with targeted secondary structure and nucleotide distribution.

PubMed

Reinharz, Vladimir; Ponty, Yann; Waldispühl, Jérôme

2013-07-01

The design of RNA sequences folding into predefined secondary structures is a milestone for many synthetic biology and gene therapy studies. Most of the current software uses similar local search strategies (i.e. a random seed is progressively adapted to acquire the desired folding properties) and more importantly do not allow the user to control explicitly the nucleotide distribution such as the GC-content in their sequences. However, the latter is an important criterion for large-scale applications as it could presumably be used to design sequences with better transcription rates and/or structural plasticity. In this article, we introduce IncaRNAtion, a novel algorithm to design RNA sequences folding into target secondary structures with a predefined nucleotide distribution. IncaRNAtion uses a global sampling approach and weighted sampling techniques. We show that our approach is fast (i.e. running time comparable or better than local search methods), seedless (we remove the bias of the seed in local search heuristics) and successfully generates high-quality sequences (i.e. thermodynamically stable) for any GC-content. To complete this study, we develop a hybrid method combining our global sampling approach with local search strategies. Remarkably, our glocal methodology overcomes both local and global approaches for sampling sequences with a specific GC-content and target structure. IncaRNAtion is available at csb.cs.mcgill.ca/incarnation/. Supplementary data are available at Bioinformatics online.
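The role of weights in steering GC-content can be illustrated with a toy sampler. This is not IncaRNAtion and it ignores secondary structure and thermodynamic stability entirely: it only shows that giving G and C a relative weight w, against 1 for A and U, yields an expected GC-content of w / (1 + w) when positions are drawn independently. All names are hypothetical.

```python
import random


def weight_for_target(target_gc: float) -> float:
    """Solve w / (1 + w) = target_gc for the G/C weight w (0 < target_gc < 1)."""
    return target_gc / (1.0 - target_gc)


def sample_sequence(length: int, gc_weight: float, rng: random.Random) -> str:
    """Draw each position independently, weighting G and C by gc_weight."""
    bases = "ACGU"
    weights = [1.0, gc_weight, gc_weight, 1.0]
    return "".join(rng.choices(bases, weights=weights, k=length))


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


rng = random.Random(0)  # fixed seed so the example is reproducible
seq = sample_sequence(20000, weight_for_target(0.7), rng)
print(round(gc_content(seq), 2))
```

A structure-aware design method would additionally have to keep paired positions complementary, which this independent-position sketch does not attempt.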
alpha-Amylase gene of Streptomyces limosus: nucleotide sequence, expression motifs, and amino acid sequence homology to mammalian and invertebrate alpha-amylases.

PubMed Central

Long, C M; Virolle, M J; Chang, S Y; Chang, S; Bibb, M J

1987-01-01

The nucleotide sequence of the coding and regulatory regions of the alpha-amylase gene (aml) of Streptomyces limosus was determined. High-resolution S1 mapping was used to locate the 5' end of the transcript and demonstrated that the gene is transcribed from a unique promoter. The predicted amino acid sequence has considerable identity to mammalian and invertebrate alpha-amylases, but not to those of plant, fungal, or eubacterial origin. Consistent with this is the susceptibility of the enzyme to an inhibitor of mammalian alpha-amylases. The amino-terminal sequence of the extracellular enzyme was determined, revealing the presence of a typical signal peptide preceding the mature form of the alpha-amylase. Images PMID:3500166
Isolation of a cDNA Encoding a Granule-Bound 152-Kilodalton Starch-Branching Enzyme in Wheat

PubMed Central

Båga, Monica; Nair, Ramesh B.; Repellin, Anne; Scoles, Graham J.; Chibbar, Ravindra N.

2000-01-01

Screening of a wheat (Triticum aestivum) cDNA library for starch-branching enzyme I (SBEI) genes combined with 5′-rapid amplification of cDNA ends resulted in isolation of a 4,563-bp composite cDNA, Sbe1c. Based on sequence alignment to characterized SBEI cDNA clones isolated from plants, the SBEIc predicted from the cDNA sequence was produced with a transit peptide directing the polypeptide into plastids. Furthermore, the predicted mature form of SBEIc was much larger (152 kD) than previously characterized plant SBEI (80–100 kD) and contained a partial duplication of SBEI sequences. The first SBEI domain showed high amino acid similarity to a 74-kD wheat SBEI-like protein that is inactive as a branching enzyme when expressed in Escherichia coli. The second SBEI domain on SBEIc was identical in sequence to a functional 87-kD SBEI produced in the wheat endosperm. Immunoblot analysis of proteins produced in developing wheat kernels demonstrated that the 152-kD SBEIc was, in contrast to the 87- to 88-kD SBEI, preferentially associated with the starch granules. Proteins similar in size and recognized by wheat SBEI antibodies were also present in Triticum monococcum, Triticum tauschii, and Triticum turgidum subsp. durum. PMID:10982440
Generation and analysis of expressed sequence tags from a cDNA library of the fruiting body of Ganoderma lucidum

PubMed Central

2010-01-01

Background: Little genomic or transcriptomic information on Ganoderma lucidum (Lingzhi) is known. This study aims to discover the transcripts involved in secondary metabolite biosynthesis and developmental regulation of G. lucidum using an expressed sequence tag (EST) library. Methods: A cDNA library was constructed from the G. lucidum fruiting body. Its high-quality ESTs were assembled into unique sequences with contigs and singletons. The unique sequences were annotated according to sequence similarities to genes or proteins available in public databases. The detection of simple sequence repeats (SSRs) was performed by online analysis. Results: A total of 1,023 clones were randomly selected from the G. lucidum library and sequenced, yielding 879 high-quality ESTs. These ESTs showed similarities to a diverse range of genes. The sequences encoding squalene epoxidase (SE) and farnesyl-diphosphate synthase (FPS) were identified in this EST collection. Several candidate genes, such as hydrophobin, MOB2, profilin and PHO84, were detected for the first time in G. lucidum. Thirteen potential SSR-motif microsatellite loci were also identified. Conclusion: The present study demonstrates a successful application of EST analysis in the discovery of transcripts involved in the secondary metabolite biosynthesis and the developmental regulation of G. lucidum. PMID:20230644
HUNT: launch of a full-length cDNA database from the Helix Research Institute.

PubMed

Yudate, H T; Suwa, M; Irie, R; Matsui, H; Nishikawa, T; Nakamura, Y; Yamaguchi, D; Peng, Z Z; Yamamoto, T; Nagai, K; Hayashi, K; Otsuki, T; Sugiyama, T; Ota, T; Suzuki, Y; Sugano, S; Isogai, T; Masuho, Y

2001-01-01

The Helix Research Institute (HRI) in Japan is releasing 4356 HUman Novel Transcripts and related information in the newly established HUNT database. The institute is a joint research project principally funded by the Japanese Ministry of International Trade and Industry, and the clones were sequenced in the governmental New Energy and Industrial Technology Development Organization (NEDO) Human cDNA Sequencing Project. The HUNT database contains an extensive amount of annotation from advanced analysis and represents an essential bioinformatics contribution towards understanding of the gene function. The HRI human cDNA clones were obtained from full-length enriched cDNA libraries constructed with the oligo-capping method and have resulted in novel full-length cDNA sequences. A large fraction has little similarity to any proteins of known function and to obtain clues about possible function we have developed original analysis procedures. Any putative function deduced here can be validated or refuted by complementary analysis results. The user can also extract information from specific categories like PROSITE patterns, PFAM domains, PSORT localization, transmembrane helices and clones with GENIUS structure assignments. The HUNT database can be accessed at http://www.hri.co.jp/HUNT.
Nucleic acid analysis using terminal-phosphate-labeled nucleotides

DOEpatents

Korlach, Jonas [Ithaca, NY; Webb, Watt W [Ithaca, NY; Levene, Michael [Ithaca, NY; Turner, Stephen [Ithaca, NY; Craighead, Harold G [Ithaca, NY; Foquet, Mathieu [Ithaca, NY

2008-04-22

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
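The final deduction step described here, reading the target sequence from the temporal order of incorporated bases, can be sketched as follows. The sketch assumes standard Watson-Crick pairing for a DNA template and that each incorporation event has already been identified; the real-time optical detection of labelled analogs is not modelled, and the function name is hypothetical.

```python
from typing import Iterable

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def template_from_incorporations(events: Iterable[str]) -> str:
    """Deduce the template (written 5'->3') from bases added to the growing strand.

    The growing strand is synthesized 5'->3' while the polymerase reads the
    template 3'->5', so the template is the reverse complement of the
    incorporation order.
    """
    synthesized = "".join(events)
    template_3_to_5 = "".join(COMPLEMENT[base] for base in synthesized)
    return template_3_to_5[::-1]


# Four incorporation events observed in temporal order.
print(template_from_incorporations(["A", "T", "G", "C"]))
```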
PstI repeat: a family of short interspersed nucleotide element (SINE)-like sequences in the genomes of cattle, goat, and buffalo.

PubMed

Sheikh, Faruk G; Mukhopadhyay, Sudit S; Gupta, Prabhakar

2002-02-01

The PstI family of elements are short, highly repetitive DNA sequences interspersed throughout the genome of the Bovidae. We have cloned and sequenced some members of the PstI family from cattle, goat, and buffalo. These elements are approximately 500 bp, have a copy number of 2 × 10^5 to 4 × 10^5, and comprise about 4% of the haploid genome. Studies of nucleotide sequence homology indicate that the buffalo and goat PstI repeats (type II) are similar types of short interspersed nucleotide element (SINE) sequences, but the cattle PstI repeat (type I) is considerably more divergent. Additionally, the goat PstI sequence showed significant sequence homology with bovine serine tRNA, and is therefore likely derived from serine tRNA. Interestingly, Southern hybridization suggests that both types of SINEs (I and II) are present in all the species of Bovidae. Dendrogram analysis indicates that cattle PstI SINE is similar to bovine Alu-like SINEs. Goat and buffalo SINEs formed a separate cluster, suggesting that these two types of SINEs evolved separately in the genome of the Bovidae.
Construction and Evaluation of Normalized cDNA Libraries Enriched with Full-Length Sequences for Rapid Discovery of New Genes from Sisal (Agave sisalana Perr.) Different Developmental Stages

PubMed Central

Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

2012-01-01

To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from Agave sisalana multiple tissues to increase efficiency of normalization and maximize the number of independent genes by SMART™ method and the duplex-specific nuclease (DSN). This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones of libraries revealed 3320 unigenes with an average insert length about 1.2 kb, indicating that the non-redundancy of libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during the leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing. PMID:23202944
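The non-redundancy figure quoted in this abstract appears to be the ratio of unigenes to sequenced clones. A short check under that assumption (the function name is hypothetical and the definition is inferred from the reported numbers, not stated by the authors):

```python
def non_redundancy_percent(unigenes, clones):
    """Assumed definition: share of sequenced clones that are distinct unigenes."""
    return 100.0 * unigenes / clones

# Values reported in the abstract: 3320 unigenes from 3875 sequenced clones.
print(round(non_redundancy_percent(3320, 3875), 1))
```

Rounded to one decimal place, this reproduces the reported value of about 85.7%.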
Genomic resources for Myzus persicae: EST sequencing, SNP identification, and microarray design

PubMed Central

Ramsey, John S; Wilson, Alex CC; de Vos, Martin; Sun, Qi; Tamborindeguy, Cecilia; Winfield, Agnese; Malloch, Gaynor; Smith, Dawn M; Fenton, Brian; Gray, Stewart M; Jander, Georg

2007-01-01

Background The green peach aphid, Myzus persicae (Sulzer), is a world-wide insect pest capable of infesting more than 40 plant families, including many crop species. However, despite the significant damage inflicted by M. persicae in agricultural systems through direct feeding damage and by its ability to transmit plant viruses, limited genomic information is available for this species. Results Sequencing of 16 M. persicae cDNA libraries generated 26,669 expressed sequence tags (ESTs). Aphids for library construction were raised on Arabidopsis thaliana, Nicotiana benthamiana, Brassica oleracea, B. napus, and Physalis floridana (with and without Potato leafroll virus infection). The M. persicae cDNA libraries include ones made from sexual and asexual whole aphids, guts, heads, and salivary glands. In silico comparison of cDNA libraries identified aphid genes with tissue-specific expression patterns, and gene expression that is induced by feeding on Nicotiana benthamiana. Furthermore, 2423 genes that are novel to science and potentially aphid-specific were identified. Comparison of cDNA data from three aphid lineages identified single nucleotide polymorphisms that can be used as genetic markers and, in some cases, may represent functional differences in the protein products. In particular, non-conservative amino acid substitutions in a highly expressed gut protease may be of adaptive significance for M. persicae feeding on different host plants. The Agilent eArray platform was used to design an M. persicae oligonucleotide microarray representing over 10,000 unique genes. Conclusion New genomic resources have been developed for M. persicae, an agriculturally important insect pest. These include previously unknown sequence data, a collection of expressed genes, molecular markers, and a DNA microarray that can be used to study aphid gene expression. These resources will help elucidate the adaptations that allow M. persicae to develop compatible interactions with its
Amino acid and nucleotide recurrence in aligned sequences: synonymous substitution patterns in association with global and local base compositions.

PubMed

Nishizawa, M; Nishizawa, K

2000-10-01

The tendency for repetitiveness of nucleotides in DNA sequences has been reported for a variety of organisms. We show that the tendency for repetitive use of amino acids is widespread and is observed even for segments conserved between human and Drosophila melanogaster at the level of >50% amino acid identity. This indicates that repetitiveness influences not only the weakly constrained segments but also those sequence segments conserved among phyla. Not only glutamine (Q) but also many of the 20 amino acids show a comparable level of repetitiveness. Repetitiveness in bases at codon position 3 is stronger for human than for D.melanogaster, whereas local repetitiveness in intron sequences is similar between the two organisms. While genes for immune system-specific proteins, but not ancient human genes (i.e. human homologs of Escherichia coli genes), have repetitiveness at codon bases 1 and 2, repetitiveness at codon base 3 for these groups is similar, suggesting that the human genome has at least two mechanisms generating local repetitiveness. Neither amino acid nor nucleotide repetitiveness is observed beyond the exon boundary, denying the possibility that such repetitiveness could mainly stem from natural selection on mRNA or protein sequences. Analyses of mammalian sequence alignments show that while the 'between gene' GC content heterogeneity, which is linked to 'isochores', is a principal factor associated with the bias in substitution patterns in human, 'within gene' heterogeneity in nucleotide composition is also associated with such bias on a more local scale. The relationship amongst the various types of repetitiveness is discussed.
Amino acid and nucleotide recurrence in aligned sequences: synonymous substitution patterns in association with global and local base compositions

PubMed Central

Nishizawa, Manami; Nishizawa, Kazuhisa

2000-01-01

The tendency for repetitiveness of nucleotides in DNA sequences has been reported for a variety of organisms. We show that the tendency for repetitive use of amino acids is widespread and is observed even for segments conserved between human and Drosophila melanogaster at the level of >50% amino acid identity. This indicates that repetitiveness influences not only the weakly constrained segments but also those sequence segments conserved among phyla. Not only glutamine (Q) but also many of the 20 amino acids show a comparable level of repetitiveness. Repetitiveness in bases at codon position 3 is stronger for human than for D.melanogaster, whereas local repetitiveness in intron sequences is similar between the two organisms. While genes for immune system-specific proteins, but not ancient human genes (i.e. human homologs of Escherichia coli genes), have repetitiveness at codon bases 1 and 2, repetitiveness at codon base 3 for these groups is similar, suggesting that the human genome has at least two mechanisms generating local repetitiveness. Neither amino acid nor nucleotide repetitiveness is observed beyond the exon boundary, denying the possibility that such repetitiveness could mainly stem from natural selection on mRNA or protein sequences. Analyses of mammalian sequence alignments show that while the ‘between gene’ GC content heterogeneity, which is linked to ‘isochores’, is a principal factor associated with the bias in substitution patterns in human, ‘within gene’ heterogeneity in nucleotide composition is also associated with such bias on a more local scale. The relationship amongst the various types of repetitiveness is discussed. PMID:11000273
37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Form and format for... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... Code for Information Interchange (ASCII) text. No other formats shall be allowed. (3) The computer...
Developing Single Nucleotide Polymorphism (SNP) markers from transcriptome sequences for the identification of longan (Dimocarpus longan) germplasm

USDA-ARS's Scientific Manuscript database

Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in...
Purification, cDNA cloning, and regulation of lysophospholipase from rat liver.

PubMed

Sugimoto, H; Hayashi, H; Yamashita, S

1996-03-29

A lysophospholipase was purified 506-fold from rat liver supernatant. The preparation gave a single 24-kDa protein band on SDS-polyacrylamide gel electrophoresis. The enzyme hydrolyzed lysophosphatidylcholine, lysophosphatidylethanolamine, lysophosphatidylinositol, lysophosphatidylserine, and 1-oleoyl-2-acetyl-sn-glycero-3-phosphocholine at pH 6-8. The purified enzyme was used for the preparation of antibody and peptide sequencing. A cDNA clone was isolated by screening a rat liver lambda gt11 cDNA library with the antibody, followed by the selection of further extended clones from a lambda gt10 library. The isolated cDNA was 2,362 base pairs in length and contained an open reading frame encoding 230 amino acids with a Mr of 24,708. The peptide sequences determined were found in the reading frame. When the cDNA was expressed in Escherichia coli cells as the beta-galactosidase fusion, lysophosphatidylcholine-hydrolyzing activity was markedly increased. The deduced amino acid sequence showed significant similarity to Pseudomonas fluorescens esterase A and Spirulina platensis esterase. The three sequences contained the GXSXG consensus at similar positions. The transcript was found in various tissues with the following order of abundance: spleen, heart, kidney, brain, lung, stomach, and testis = liver. In contrast, the enzyme protein was abundant in the following order: testis, liver, kidney, heart, stomach, lung, brain, and spleen. Thus the mRNA abundance disagreed with the level of the enzyme protein in liver, testis, and spleen. When HL-60 cells were induced to differentiate into granulocytes with dimethyl sulfoxide, the 24-kDa lysophospholipase protein increased significantly, but the mRNA abundance remained essentially unchanged. Thus a posttranscriptional control mechanism is present for the regulation of 24-kDa lysophospholipase.
Biosynthesis of Lipoic Acid in Arabidopsis: Cloning and Characterization of the cDNA for Lipoic Acid Synthase

PubMed Central

Yasuno, Rie; Wada, Hajime

1998-01-01

Lipoic acid is a coenzyme that is essential for the activity of enzyme complexes such as those of pyruvate dehydrogenase and glycine decarboxylase. We report here the isolation and characterization of LIP1 cDNA for lipoic acid synthase of Arabidopsis. The Arabidopsis LIP1 cDNA was isolated using an expressed sequence tag homologous to the lipoic acid synthase of Escherichia coli. This cDNA was shown to code for Arabidopsis lipoic acid synthase by its ability to complement a lipA mutant of E. coli defective in lipoic acid synthase. DNA-sequence analysis of the LIP1 cDNA revealed an open reading frame predicting a protein of 374 amino acids. Comparisons of the deduced amino acid sequence with those of E. coli and yeast lipoic acid synthase homologs showed a high degree of sequence similarity and the presence of a leader sequence presumably required for import into the mitochondria. Southern-hybridization analysis suggested that LIP1 is a single-copy gene in Arabidopsis. Western analysis with an antibody against lipoic acid synthase demonstrated that this enzyme is located in the mitochondrial compartment in Arabidopsis cells as a 43-kD polypeptide. PMID:9808738

Identification of Abundantly Expressed Novel and Conserved Genes from the Infective Larval Stage of Toxocara canis by an Expressed Sequence Tag Strategy

PubMed Central

Tetteh, Kevin K. A.; Loukas, Alex; Tripp, Cindy; Maizels, Rick M.

1999-01-01

Larvae of Toxocara canis, a nematode parasite of dogs, infect humans, causing visceral and ocular larva migrans. In noncanid hosts, larvae neither grow nor differentiate but endure in a state of arrested development. Reasoning that parasite protein production is orientated to immune evasion, we undertook a random sequencing project from a larval cDNA library to characterize the most highly expressed transcripts. In all, 266 clones were sequenced, most from both 3′ and 5′ ends, and similarity searches against GenBank protein and dbEST nucleotide databases were conducted. Cluster analyses showed that 128 distinct gene products had been found, all but 3 of which represented newly identified genes. Ninety-five genes were represented by a single clone, but seven transcripts were present at high frequencies, each composing >2% of all clones sequenced. These high-abundance transcripts include a mucin and a C-type lectin, which are both major excretory-secretory antigens released by parasites. Four highly expressed novel gene transcripts, termed ant (abundant novel transcript) genes, were found. Together, these four genes comprised 18% of all cDNA clones isolated, but no similar sequences occur in the Caenorhabditis elegans genome. While the coding regions of the four genes are dissimilar, their 3′ untranslated tracts have significant homology in nucleotide sequence. The discovery of these abundant, parasite-specific genes of newly identified lectins and mucins, as well as a range of conserved and novel proteins, provides defined candidates for future analysis of the molecular basis of immune evasion by T. canis. PMID:10456930
Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Torella, JP; Lienert, F; Boehm, CR

2014-08-07

Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies (for example, repeated terminator and insulator sequences) that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.
Nucleotide sequence and phylogenetic analysis of Cucurbit yellow stunting disorder virus RNA 2.

PubMed

Livieratos, Ioannis C; Coutts, Robert H A

2002-06-01

The complete nucleotide sequence of Cucurbit yellow stunting disorder virus (CYSDV) RNA 2, a whitefly (Bemisia tabaci)-transmitted closterovirus with a bi-partite genome, is reported. CYSDV RNA 2 is 7,281 nucleotides long and contains the closterovirus hallmark gene array with a similar arrangement to the prototype member of the genus Crinivirus, Lettuce infectious yellows virus (LIYV). CYSDV RNA 2 contains open reading frames (ORFs) potentially encoding in a 5' to 3' direction for proteins of 5 kDa (ORF 1; hydrophobic protein), 62 kDa (ORF 2; heat shock protein 70 homolog, HSP70h), 59 kDa (ORF 3; protein of unknown function), 9 kDa (ORF 4; protein of unknown function), 28.5 kDa (ORF 5; coat protein, CP), 53 kDa (ORF 6; coat protein minor, CPm), and 26.5 kDa (ORF 7; protein of unknown function). Pairwise comparisons of CYSDV RNA 2-encoded proteins (HSP70h, p59 and CPm) among the closteroviruses showed that CYSDV is closely related to LIYV. Phylogenetic analysis based on the amino acid sequence of the HSP70h, indicated that CYSDV clusters with other members of the genus Crinivirus, and it is related to Little cherry virus-1 (LChV-1), but is distinct from the aphid- or mealybug-transmitted closteroviruses.
cDNA cloning, functional expression and cellular localization of rat liver mitochondrial electron-transfer flavoprotein-ubiquinone oxidoreductase protein.

PubMed

Huang, Shengbing; Song, Wei; Lin, Qishui

2005-08-01

A membrane-bound protein was purified from rat liver mitochondria. After being digested with V8 protease, two peptides containing identical 14 amino acid residue sequences were obtained. Using the 14 amino acid peptide derived DNA sequence as gene specific primer, the cDNA of correspondent gene 5'-terminal and 3'-terminal were obtained by RACE technique. The full-length cDNA that encoded a protein of 616 amino acids was thus cloned, which included the above mentioned peptide sequence. The full length cDNA was highly homologous to that of human ETF-QO, indicating that it may be the cDNA of rat ETF-QO. ETF-QO is an iron sulfur protein located in mitochondria inner membrane containing two kinds of redox center: FAD and [4Fe-4S] center. After comparing the sequence from the cDNA of the 616 amino acids protein with that of the mature protein of rat liver mitochondria, it was found that the N terminal 32 amino acid residues did not exist in the mature protein, indicating that the cDNA was that of ETF-QOp. When the cDNA was expressed in Saccharomyces cerevisiae with inducible vectors, the protein product was enriched in mitochondrial fraction and exhibited electron transfer activity (NBT reductase activity) of ETF-QO. Results demonstrated that the 32 amino acid peptide was a mitochondrial targeting peptide, and both FAD and iron-sulfur cluster were inserted properly into the expressed ETF-QO. ETF-QO had a high level expression in rat heart, liver and kidney. The fusion protein of GFP-ETF-QO co-localized with mitochondria in COS-7 cells.
miBLAST: scalable evaluation of a batch of nucleotide sequence queries with BLAST

PubMed Central

Kim, You Jung; Boyd, Andrew; Athey, Brian D.; Patel, Jignesh M.

2005-01-01

A common task in many modern bioinformatics applications is to match a set of nucleotide query sequences against a large sequence dataset. Existing tools, such as BLAST, are designed to evaluate a single query at a time and can be unacceptably slow when the number of sequences in the query set is large. In this paper, we present a new algorithm, called miBLAST, that evaluates such batch workloads efficiently. At the core, miBLAST employs a q-gram filtering and an index join for efficiently detecting similarity between the query sequences and database sequences. This set-oriented technique, which indexes both the query and the database sets, results in substantial performance improvements over existing methods. Our results show that miBLAST is significantly faster than BLAST in many cases. For example, miBLAST aligned 247 965 oligonucleotide sequences in the Affymetrix probe set against the Human UniGene in 1.26 days, compared with 27.27 days with BLAST (an improvement by a factor of 22). The relative performance of miBLAST increases for larger word sizes; however, it decreases for longer queries. miBLAST employs the familiar BLAST statistical model and output format, guaranteeing the same accuracy as BLAST and facilitating a seamless transition for existing BLAST users. PMID:16061938
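The general idea of q-gram filtering with an index join, as named in this abstract, can be sketched as follows. This is a toy illustration and not the miBLAST implementation: all function names, the in-memory dictionaries, and the "share at least one q-gram" threshold are assumptions made for the example. Both the query set and the database are indexed by q-gram, and the two indexes are joined on shared q-grams to produce candidate pairs that a full alignment step would then examine.

```python
from collections import defaultdict

def qgrams(seq, q):
    """Yield every overlapping substring of length q."""
    for i in range(len(seq) - q + 1):
        yield seq[i:i + q]

def build_index(seqs, q):
    """Map each q-gram to the set of sequence ids that contain it."""
    index = defaultdict(set)
    for seq_id, seq in seqs.items():
        for gram in qgrams(seq, q):
            index[gram].add(seq_id)
    return index

def candidate_pairs(queries, database, q):
    """Join the query index and the database index on shared q-grams."""
    query_index = build_index(queries, q)
    db_index = build_index(database, q)
    pairs = set()
    # Only q-grams present in both indexes can produce a candidate pair.
    for gram in query_index.keys() & db_index.keys():
        for query_id in query_index[gram]:
            for db_id in db_index[gram]:
                pairs.add((query_id, db_id))
    return pairs

# Hypothetical toy data: only q1 and d1 share a 4-gram ("ACGT").
queries = {"q1": "ACGTAC", "q2": "TTTTTT"}
database = {"d1": "GGACGTGG", "d2": "CCCCCC"}
print(sorted(candidate_pairs(queries, database, q=4)))
```

Because the whole query batch is indexed once, each shared q-gram is looked up a single time for all queries rather than once per query, which is the set-oriented aspect the abstract emphasizes. As a check on the reported figures, 27.27 / 1.26 is roughly 21.6, consistent with the stated improvement by a factor of about 22.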
Nucleotide sequence analysis of the recA gene and discrimination of the three isolates of urease-positive thermophilic Campylobacter (UPTC) isolated from seagulls (Larus spp.) in Northern Ireland.

PubMed

Matsuda, M; Tai, K; Moore, J E; Millar, B C; Murayama, O

2004-01-01

Nucleotide sequencing after TA cloning of the amplicon of the almost-full length recA gene from three strains of UPTC (A1, A2, and A3) isolated from seagulls in Northern Ireland, the phenotypical and genotypical characteristics of which have been demonstrated to be indistinguishable, clarified nucleotide differences at three nucleotide positions among the three strains. In conclusion, the nucleotide sequences of the recA gene were found to discriminate among the three strains of UPTC, A1, A2, and A3, which are indistinguishable phenotypically and genotypically. Thus, the present study strongly suggests that nucleotide sequence data of the amplicon of a suitable gene or region could aid in discriminating among isolates of the UPTC group, which are indistinguishable phenotypically and genotypically. Copyright 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
3G vector-primer plasmid for constructing full-length-enriched cDNA libraries.

PubMed

Zheng, Dong; Zhou, Yanna; Zhang, Zidong; Li, Zaiyu; Liu, Xuedong

2008-09-01

We designed a 3G vector-primer plasmid for the generation of full-length-enriched complementary DNA (cDNA) libraries. By employing the terminal transferase activity of reverse transcriptase and the modified strand replacement method, this plasmid (assembled with a polydT end and a deoxyguanosine [dG] end) combines priming full-length cDNA strand synthesis and directional cDNA cloning. As a result, the number of steps involved in cDNA library preparation is decreased while simplifying downstream gene manipulation, sequencing, and subcloning. The 3G vector-primer plasmid method yields fully represented plasmid primed libraries that are equivalent to those made by the SMART (switching mechanism at 5' end of RNA transcript) approach.
Complete genome sequence of a novel Plum pox virus strain W isolate determined by 454 pyrosequencing.

PubMed

Sheveleva, Anna; Kudryavtseva, Anna; Speranskaya, Anna; Belenikin, Maxim; Melnikova, Natalia; Chirkov, Sergei

2013-10-01

The near-complete (99.7 %) genome sequence of a novel Russian Plum pox virus (PPV) isolate Pk, belonging to the strain Winona (W), has been determined by 454 pyrosequencing with the exception of the thirty-one 5'-terminal nucleotides. This region was amplified using 5'RACE kit and sequenced by the Sanger method. Genomic RNA released from immunocaptured PPV particles was employed for generation of cDNA library using TransPlex Whole transcriptome amplification kit (WTA2, Sigma-Aldrich). The entire Pk genome has identity level of 92.8-94.5 % when compared to the complete nucleotide sequences of other PPV-W isolates (W3174, LV-141pl, LV-145bt, and UKR 44189), confirming a high degree of variability within the PPV-W strain. The isolates Pk and LV-141pl are most closely related. The Pk has been found in a wild plum (Prunus domestica) in a new region of Russia indicating widespread dissemination of the PPV-W strain in the European part of the former USSR.
Nucleotide sequence of the Kaposi sarcoma-associated herpesvirus (HHV8)

PubMed Central

Russo, James J.; Bohenzky, Roy A.; Chien, Ming-Cheng; Chen, Jing; Yan, Ming; Maddalena, Dawn; Parry, J. Preston; Peruzzi, Daniela; Edelman, Isidore S.; Chang, Yuan; Moore, Patrick S.

1996-01-01

The genome of the Kaposi sarcoma-associated herpesvirus (KSHV or HHV8) was mapped with cosmid and phage genomic libraries from the BC-1 cell line. Its nucleotide sequence was determined except for a 3-kb region at the right end of the genome that was refractory to cloning. The BC-1 KSHV genome consists of a 140.5-kb-long unique coding region flanked by multiple G+C-rich 801-bp terminal repeat sequences. A genomic duplication that apparently arose in the parental tumor is present in this cell culture-derived strain. At least 81 ORFs, including 66 with homology to herpesvirus saimiri ORFs, and 5 internal repeat regions are present in the long unique region. The virus encodes homologs to complement-binding proteins, three cytokines (two macrophage inflammatory proteins and interleukin 6), dihydrofolate reductase, bcl-2, interferon regulatory factors, interleukin 8 receptor, neural cell adhesion molecule-like adhesin, and a D-type cyclin, as well as viral structural and metabolic proteins. Terminal repeat analysis of virus DNA from a KS lesion suggests a monoclonal expansion of KSHV in the KS tumor. PMID:8962146
Construction of a Full-Length Enriched cDNA Library and Preliminary Analysis of Expressed Sequence Tags from Bengal Tiger Panthera tigris tigris

PubMed Central

Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

2013-01-01

In this study, a full-length enriched cDNA library was successfully constructed from Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from cultured Bengal tiger fibroblasts in vitro. The titers of primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL respectively. The percentage of recombinants from unamplified library was 90.2% and average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% ESTs were found to be related to all kinds of metabolisms, 19.3% ESTs to information storage and processing, 11.3% ESTs to posttranslational modification, protein turnover, chaperones, 11.3% ESTs to transport, 9.9% ESTs to signal transducer/cell communication, 9.0% ESTs to structure protein, 3.8% ESTs to cell cycle, and only 6.6% ESTs classified as novel genes. By EST sequencing, a full-length gene coding ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, coded for the TAT-Ferritin fusion protein with two 6× His-tags in N and C-terminal. After BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library attained to the requirements of a standard cDNA library. This library provided a useful platform for the functional genome and transcriptome research of Bengal tigers. PMID:23708105
Construction of a full-length enriched cDNA library and preliminary analysis of expressed sequence tags from Bengal Tiger Panthera tigris tigris.

PubMed

Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

2013-05-24

In this study, a full-length enriched cDNA library was successfully constructed from Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from cultured Bengal tiger fibroblasts in vitro. The titers of primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL respectively. The percentage of recombinants from unamplified library was 90.2% and average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% ESTs were found to be related to all kinds of metabolisms, 19.3% ESTs to information storage and processing, 11.3% ESTs to posttranslational modification, protein turnover, chaperones, 11.3% ESTs to transport, 9.9% ESTs to signal transducer/cell communication, 9.0% ESTs to structure protein, 3.8% ESTs to cell cycle, and only 6.6% ESTs classified as novel genes. By EST sequencing, a full-length gene coding ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, coded for the TAT-Ferritin fusion protein with two 6× His-tags in N and C-terminal. After BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library attained to the requirements of a standard cDNA library. This library provided a useful platform for the functional genome and transcriptome research of Bengal tigers.
Sequencing and characterization of asclepain f: the first cysteine peptidase cDNA cloned and expressed from Asclepias fruticosa latex.

PubMed

Trejo, Sebastián A; López, Laura M I; Caffini, Néstor O; Natalucci, Claudia L; Canals, Francesc; Avilés, Francesc X

2009-07-01

Asclepain f is a papain-like protease previously isolated and characterized from latex of Asclepias fruticosa. This enzyme is a member of the C1 family of cysteine proteases that are synthesized as preproenzymes. The enzyme belongs to the alpha + beta class of proteins, with two disulfide bridges (Cys22-Cys63 and Cys56-Cys95) in the alpha domain, and another one (Cys150-Cys201) in the beta domain, as was determined by molecular modeling. A full-length 1,152 bp cDNA was cloned by RT-RACE-PCR from latex mRNA. The sequence was predicted as an open reading frame of 340 amino acid residues, of which 16 residues belong to the signal peptide, 113 to the propeptide and 211 to the mature enzyme. The full-length cDNA was ligated to pPICZalpha vector and expressed in Pichia pastoris. Recombinant asclepain f showed endopeptidase activity on pGlu-Phe-Leu-p-nitroanilide and was identified by PMF-MALDI-TOF MS. Asclepain f is the first peptidase cloned and expressed from mRNA isolated from plant latex, confirming the presence of the preprocysteine peptidase in the latex.
Partial bisulfite conversion for unique template sequencing.

PubMed

Kumar, Vijay; Rosenbaum, Julie; Wang, Zihua; Forcier, Talitha; Ronemus, Michael; Wigler, Michael; Levy, Dan

2018-01-25

We introduce a new protocol, mutational sequencing or muSeq, which uses sodium bisulfite to randomly deaminate unmethylated cytosines at a fixed and tunable rate. The muSeq protocol marks each initial template molecule with a unique mutation signature that is present in every copy of the template, and in every fragmented copy of a copy. In the sequenced read data, this signature is observed as a unique pattern of C-to-T or G-to-A nucleotide conversions. Clustering reads with the same conversion pattern enables accurate count and long-range assembly of initial template molecules from short-read sequence data. We explore count and low-error sequencing by profiling 135 000 restriction fragments in a PstI representation, demonstrating that muSeq improves copy number inference and significantly reduces sporadic sequencer error. We explore long-range assembly in the context of cDNA, generating contiguous transcript clusters greater than 3,000 bp in length. The muSeq assemblies reveal transcriptional diversity not observable from short-read data alone. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
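The clustering step described in this abstract can be illustrated with a deliberately simplified sketch. It is not the muSeq pipeline: the function names and toy sequences are hypothetical, and the example assumes error-free, full-length reads from the same strand that are already aligned to a known reference, so that reads from the same initial template carry exactly the same set of C-to-T conversions.

```python
from collections import defaultdict

def conversion_signature(reference, read):
    """Positions where a reference C is observed as T in the read."""
    return tuple(
        i for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base == "C" and read_base == "T"
    )

def cluster_by_signature(reference, reads):
    """Group reads that share an identical C-to-T conversion pattern."""
    clusters = defaultdict(list)
    for read in reads:
        clusters[conversion_signature(reference, read)].append(read)
    return dict(clusters)

# Hypothetical toy data: four reads derived from three initial templates.
reference = "ACCGTCAC"
reads = ["ATCGTCAC", "ATCGTCAC", "ACCGTTAC", "ACTGTCAT"]
clusters = cluster_by_signature(reference, reads)
print(len(clusters), "inferred templates")
```

Under these assumptions, the number of distinct signatures estimates the number of initial template molecules. Real data would additionally need tolerance for sequencing errors and handling of partially overlapping fragments, which this sketch omits.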
Nucleotide sequence of a chickpea chlorotic stunt virus relative that infects pea and faba bean in China.

PubMed

Zhou, Cui-Ji; Xiang, Hai-Ying; Zhuo, Tao; Li, Da-Wei; Yu, Jia-Lin; Han, Cheng-Gui

2012-07-01

We determined the genome sequence of a new polerovirus that infects field pea and faba bean in China. Its entire nucleotide sequence (6021 nt) was most closely related (83.3% identity) to that of an Ethiopian isolate of chickpea chlorotic stunt virus (CpCSV-Eth). With the exception of the coat protein (encoded by ORF3), amino acid sequence identities of all gene products of this virus to those of CpCSV-Eth and other poleroviruses were <90%. This suggests that it is a new member of the genus Polerovirus, and the name pea mild chlorosis virus is proposed.
Escaping introns in COI through cDNA barcoding of mushrooms: Pleurotus as a test case.

PubMed

Avin, Farhat A; Subha, Bhassu; Tan, Yee-Shin; Braukmann, Thomas W A; Vikineswary, Sabaratnam; Hebert, Paul D N

2017-09-01

DNA barcoding involves the use of one or more short, standardized DNA fragments for the rapid identification of species. A 648-bp segment near the 5' terminus of the mitochondrial cytochrome c oxidase subunit I (COI) gene has been adopted as the universal DNA barcode for members of the animal kingdom, but its utility in mushrooms is complicated by the frequent occurrence of large introns. As a consequence, ITS has been adopted as the standard DNA barcode marker for mushrooms despite several shortcomings. This study employed newly designed primers coupled with cDNA analysis to examine COI sequence diversity in six species of Pleurotus and compared these results with those for ITS. The ability of the COI gene to discriminate six species of Pleurotus, the commonly cultivated oyster mushroom, was examined by analysis of cDNA. The amplification success, sequence variation within and among species, and the ability to design effective primers were tested. We compared ITS sequences to their COI cDNA counterparts for all isolates. ITS discriminated between all six species, but some sequence results were uninterpretable because of length variation among ITS copies. By comparison, complete COI sequences were recovered from all but three individuals of Pleurotus giganteus, where only the 5' region was obtained. The COI sequences permitted the resolution of all species when partial data were excluded for P. giganteus. Our results suggest that COI can be a useful barcode marker for mushrooms when cDNA analysis is adopted, permitting identifications in cases where ITS cannot be recovered or where it offers higher resolution when fresh tissue is available. The suitability of this approach remains to be confirmed for other mushrooms.
The complete nucleotide sequences of the five genetically distinct plastid genomes of Oenothera, subsection Oenothera: I. sequence evaluation and plastome evolution.

PubMed

Greiner, Stephan; Wang, Xi; Rauwolf, Uwe; Silber, Martina V; Mayer, Klaus; Meurer, Jörg; Haberer, Georg; Herrmann, Reinhold G

2008-04-01

The flowering plant genus Oenothera is uniquely suited for studying molecular mechanisms of speciation. It assembles an intriguing combination of genetic features, including permanent translocation heterozygosity, biparental transmission of plastids, and a general interfertility of well-defined species. This allows an exchange of plastids and nuclei between species often resulting in plastome-genome incompatibility. For evaluation of its molecular determinants we present the complete nucleotide sequences of the five basic, genetically distinguishable plastid chromosomes of subsection Oenothera (=Euoenothera) of the genus, which are associated in distinct combinations with six basic genomes. Sizes of the chromosomes range from 163 365 bp (plastome IV) to 165 728 bp (plastome I), display between 96.3% and 98.6% sequence similarity and encode a total of 113 unique genes. Plastome diversification is caused by an abundance of nucleotide substitutions, small insertions, deletions and repetitions. The five plastomes deviate from the general ancestral design of plastid chromosomes of vascular plants by a subsection-specific 56 kb inversion within the large single-copy segment. This inversion disrupted operon structures and predates the divergence of the subsection presumably 1 My ago. Phylogenetic relationships suggest plastomes I-III in one clade, while plastome IV appears to be closest to the common ancestor.
The complete nucleotide sequences of the five genetically distinct plastid genomes of Oenothera, subsection Oenothera: I. Sequence evaluation and plastome evolution†

PubMed Central

Greiner, Stephan; Wang, Xi; Rauwolf, Uwe; Silber, Martina V.; Mayer, Klaus; Meurer, Jörg; Haberer, Georg; Herrmann, Reinhold G.

2008-01-01

The flowering plant genus Oenothera is uniquely suited for studying molecular mechanisms of speciation. It assembles an intriguing combination of genetic features, including permanent translocation heterozygosity, biparental transmission of plastids, and a general interfertility of well-defined species. This allows an exchange of plastids and nuclei between species often resulting in plastome–genome incompatibility. For evaluation of its molecular determinants we present the complete nucleotide sequences of the five basic, genetically distinguishable plastid chromosomes of subsection Oenothera (=Euoenothera) of the genus, which are associated in distinct combinations with six basic genomes. Sizes of the chromosomes range from 163 365 bp (plastome IV) to 165 728 bp (plastome I), display between 96.3% and 98.6% sequence similarity and encode a total of 113 unique genes. Plastome diversification is caused by an abundance of nucleotide substitutions, small insertions, deletions and repetitions. The five plastomes deviate from the general ancestral design of plastid chromosomes of vascular plants by a subsection-specific 56 kb inversion within the large single-copy segment. This inversion disrupted operon structures and predates the divergence of the subsection presumably 1 My ago. Phylogenetic relationships suggest plastomes I–III in one clade, while plastome IV appears to be closest to the common ancestor. PMID:18299283
Digital RNA sequencing minimizes sequence-dependent bias and amplification noise with optimized single-molecule barcodes

PubMed Central

Shiroguchi, Katsuyuki; Jia, Tony Z.; Sims, Peter A.; Xie, X. Sunney

2012-01-01

RNA sequencing (RNA-Seq) is a powerful tool for transcriptome profiling, but is hampered by sequence-dependent bias and inaccuracy at low copy numbers intrinsic to exponential PCR amplification. We developed a simple strategy for mitigating these complications, allowing truly digital RNA-Seq. Following reverse transcription, a large set of barcode sequences is added in excess, and nearly every cDNA molecule is uniquely labeled by random attachment of barcode sequences to both ends. After PCR, we applied paired-end deep sequencing to read the two barcodes and cDNA sequences. Rather than counting the number of reads, RNA abundance is measured based on the number of unique barcode sequences observed for a given cDNA sequence. We optimized the barcodes to be unambiguously identifiable, even in the presence of multiple sequencing errors. This method allows counting with single-copy resolution despite sequence-dependent bias and PCR-amplification noise, and is analogous to digital PCR but amenable to quantifying a whole transcriptome. We demonstrated transcriptome profiling of Escherichia coli with more accurate and reproducible quantification than conventional RNA-Seq. PMID:22232676
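The counting principle described above (unique barcodes rather than reads) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: barcodes are taken as already error-corrected, each read is reduced to a hypothetical tuple of (cDNA identifier, 5' barcode, 3' barcode), and the gene names and barcode strings are invented.

```python
from collections import defaultdict


def count_molecules(reads):
    """Return {cdna_id: (read_count, unique_barcode_pair_count)}.

    reads: iterable of (cdna_id, barcode_5p, barcode_3p) tuples.
    """
    read_counts = defaultdict(int)
    barcode_pairs = defaultdict(set)
    for cdna_id, barcode_5p, barcode_3p in reads:
        read_counts[cdna_id] += 1
        barcode_pairs[cdna_id].add((barcode_5p, barcode_3p))
    return {
        cdna_id: (read_counts[cdna_id], len(barcode_pairs[cdna_id]))
        for cdna_id in read_counts
    }


# Toy data (hypothetical): geneA has 4 reads from 2 molecules,
# geneB has 2 reads from 1 molecule.
reads = [
    ("geneA", "AAC", "GTT"),
    ("geneA", "AAC", "GTT"),
    ("geneA", "AAC", "GTT"),
    ("geneA", "CCA", "TGG"),
    ("geneB", "AAC", "GTT"),
    ("geneB", "AAC", "GTT"),
]

counts = count_molecules(reads)
```

In this toy case the read counts (4 versus 2) reflect uneven amplification, while the unique barcode-pair counts (2 versus 1) estimate the number of original cDNA molecules.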
Canine adiponectin: cDNA structure, mRNA expression in adipose tissues and reduced plasma levels in obesity.

PubMed

Ishioka, K; Omachi, A; Sagawa, M; Shibata, H; Honjoh, T; Kimura, K; Saito, M

2006-04-01

Adiponectin is a protein synthesized and secreted by adipocytes. Decreased adiponectin is responsible for insulin resistance and atherosclerosis associated with human obesity. We obtained a cDNA clone corresponding to canine adiponectin, whose nucleotide and deduced amino acid sequences were highly identical to those of other species. Adiponectin mRNA was detected in adipose tissues, but not in other tissues, of dogs. When 22 adult beagles were given a high-energy diet for 14 weeks, they became obese, showing heavier body weights, higher plasma leptin concentrations, but lower plasma adiponectin concentrations. The adiponectin concentrations of plasma samples collected from 71 dogs visiting veterinary practices were negatively correlated to plasma leptin concentrations, being lower in obese than non-obese dogs. These results are compatible with those reported in other species, and suggest that adiponectin is an index of adiposity and a target molecule for studies on diseases associated with obesity in dogs.
Single nucleotide polymorphisms from Theobroma cacao expressed sequence tags associated with witches' broom disease in cacao.

PubMed

Lima, L S; Gramacho, K P; Carels, N; Novais, R; Gaiotto, F A; Lopes, U V; Gesteira, A S; Zaidan, H A; Cascardo, J C M; Pires, J L; Micheli, F

2009-07-14

In order to increase the efficiency of cacao tree resistance to witches' broom disease, which is caused by Moniliophthora perniciosa (Tricholomataceae), we looked for molecular markers that could help in the selection of resistant cacao genotypes. Among the different markers useful for developing marker-assisted selection, single nucleotide polymorphisms (SNPs) constitute the most common type of sequence difference between alleles and can be easily detected by in silico analysis from expressed sequence tag libraries. We report the first detection and analysis of SNPs from cacao-M. perniciosa interaction expressed sequence tags, using bioinformatics. Selection based on analysis of these SNPs should be useful for developing cacao varieties resistant to this devastating disease.

Extending Immunological Profiling in the Gilthead Sea Bream, Sparus aurata, by Enriched cDNA Library Analysis, Microarray Design and Initial Studies upon the Inflammatory Response to PAMPs.

PubMed

Boltaña, Sebastian; Castellana, Barbara; Goetz, Giles; Tort, Lluis; Teles, Mariana; Mulero, Victor; Novoa, Beatriz; Figueras, Antonio; Goetz, Frederick W; Gallardo-Escarate, Cristian; Planas, Josep V; Mackenzie, Simon

2017-02-03

This study describes the development and validation of an enriched oligonucleotide-microarray platform for Sparus aurata (SAQ) to provide a platform for transcriptomic studies in this species. A transcriptome database was constructed by assembly of gilthead sea bream sequences derived from public repositories of mRNA together with reads from a large collection of expressed sequence tags (EST) from two extensive targeted cDNA libraries characterizing mRNA transcripts regulated by both bacterial and viral challenge. The developed microarray was further validated by analysing monocyte/macrophage activation profiles after challenge with two Gram-negative bacterial pathogen-associated molecular patterns (PAMPs; lipopolysaccharide (LPS) and peptidoglycan (PGN)). Of the approximately 10,000 EST sequenced, we obtained a total of 6837 EST longer than 100 nt, with 3778 and 3059 EST obtained from the bacterial-primed and from the viral-primed cDNA libraries, respectively. Functional classification of contigs from the bacterial- and viral-primed cDNA libraries by Gene Ontology (GO) showed that the top five represented categories were equally represented in the two libraries: metabolism (approximately 24% of the total number of contigs), carrier proteins/membrane transport (approximately 15%), effectors/modulators and cell communication (approximately 11%), nucleoside, nucleotide and nucleic acid metabolism (approximately 7.5%) and intracellular transducers/signal transduction (approximately 5%). Transcriptome analyses using this enriched oligonucleotide platform identified differential shifts in the response to PGN and LPS in macrophage-like cells, highlighting responsive gene-cassettes tightly related to PAMP host recognition. As observed in other fish species, PGN is a powerful activator of the inflammatory response in S. aurata macrophage-like cells. We have developed and validated an oligonucleotide microarray (SAQ) that provides a platform enriched for the study of gene
Molecular characterization and phylogenetic analysis of a yak (Bos grunniens) κ-casein cDNA from lactating mammary gland.

PubMed

Bai, W L; Yin, R H; Dou, Q L; Jiang, W Q; Zhao, S J; Ma, Z J; Luo, G B; Zhao, Z H

2011-04-01

κ-Casein is one of the major proteins in the milk of mammals. It plays an important role in determining the size and specific function of milk micelles. We have previously identified and characterized a genetic variant of yak κ-casein by evaluating genomic DNA. Here, we isolate and characterize a yak κ-casein cDNA harboring the full-length open reading frame (ORF) from lactating mammary gland. Total RNA was extracted from mammary tissue of lactating female yak, and the κ-casein cDNA was synthesized by RT-PCR, then cloned and sequenced. The obtained cDNA of 660 bp contained an ORF sufficient to encode the entire amino acid sequence of the κ-casein precursor protein, consisting of 190 amino acids with a signal peptide of 21 amino acids. Yak κ-casein has a predicted molecular mass of 19,006.588 Da with a calculated isoelectric point of 7.245. Compared with the corresponding sequences in GenBank of cattle, buffalo, sheep, goat, Arabian camel, horse, and rabbit, the yak κ-casein sequence had identity of 64.76-98.78% in cDNA, and identity of 44.79-98.42% and similarity of 53.65-98.42% in deduced amino acids, revealing a high homology with the other livestock species. Based on κ-casein cDNA sequences, the phylogenetic analysis indicated that yak κ-casein had a close relationship with that of cattle. This work might be useful in genetic engineering research on yak κ-casein.
Systematic and stochastic influences on the performance of the MinION nanopore sequencer across a range of nucleotide bias

DOE Office of Scientific and Technical Information (OSTI.GOV)

Krishnakumar, Raga; Sinha, Anupama; Bird, Sara W.

Emerging sequencing technologies are allowing us to characterize environmental, clinical and laboratory samples with increasing speed and detail, including real-time analysis and interpretation of data. One example of this is being able to rapidly and accurately detect a wide range of pathogenic organisms, both in the clinic and the field. Genomes can have radically different GC content however, such that accurate sequence analysis can be challenging depending upon the technology used. Here, we have characterized the performance of the Oxford MinION nanopore sequencer for detection and evaluation of organisms with a range of genomic nucleotide bias. We have diagnosed the quality of base-calling across individual reads and discovered that the position within the read affects base-calling and quality scores. Finally, we have evaluated the performance of the current state-of-the-art neural network-based MinION basecaller, characterizing its behavior with respect to systemic errors as well as context- and sequence-specific errors. Overall, we present a detailed characterization of the capabilities of the MinION in terms of generating high-accuracy sequence data from genomes with a wide range of nucleotide content. This study provides a framework for designing the appropriate experiments that are likely to lead to accurate and rapid field-forward diagnostics.
Systematic and stochastic influences on the performance of the MinION nanopore sequencer across a range of nucleotide bias

DOE PAGES

Krishnakumar, Raga; Sinha, Anupama; Bird, Sara W.; ...

2018-02-16

Emerging sequencing technologies are allowing us to characterize environmental, clinical and laboratory samples with increasing speed and detail, including real-time analysis and interpretation of data. One example of this is being able to rapidly and accurately detect a wide range of pathogenic organisms, both in the clinic and the field. Genomes can have radically different GC content however, such that accurate sequence analysis can be challenging depending upon the technology used. Here, we have characterized the performance of the Oxford MinION nanopore sequencer for detection and evaluation of organisms with a range of genomic nucleotide bias. We have diagnosed the quality of base-calling across individual reads and discovered that the position within the read affects base-calling and quality scores. Finally, we have evaluated the performance of the current state-of-the-art neural network-based MinION basecaller, characterizing its behavior with respect to systemic errors as well as context- and sequence-specific errors. Overall, we present a detailed characterization of the capabilities of the MinION in terms of generating high-accuracy sequence data from genomes with a wide range of nucleotide content. This study provides a framework for designing the appropriate experiments that are likely to lead to accurate and rapid field-forward diagnostics.
A combined de novo protein sequencing and cDNA library approach to the venomic analysis of Chinese spider Araneus ventricosus.

PubMed

Duan, Zhigui; Cao, Rui; Jiang, Liping; Liang, Songping

2013-01-14

In past years, spider venoms have attracted increasing attention due to their extraordinary chemical and pharmacological diversity. The recently popularized proteomic method has greatly improved our ability to analyze the proteins in the venom. However, the lack of information about isolated venom protein sequences dramatically limits the ability to confidently identify venom proteins. In the present paper, the venom from Araneus ventricosus was analyzed using two complementary approaches: 2-DE/Shotgun-LC-MS/MS coupled to MASCOT search and 2-DE/Shotgun-LC-MS/MS coupled to manual de novo sequencing followed by local venom protein database (LVPD) search. The LVPD was constructed with toxin-like protein sequences obtained from the analysis of a cDNA library from A. ventricosus venom glands. Our results indicate that a total of 130 toxin-like protein sequences were unambiguously identified by manual de novo sequencing coupled to LVPD search, accounting for 86.67% of all toxin-like proteins in LVPD. Thus manual de novo sequencing coupled to LVPD search proved to be an extremely effective approach for the analysis of venom proteins. In addition, the approach offers a clear advantage in validating mutant positions of isoforms from the same toxin-like family. Intriguingly, methyl esterification of glutamic acid was discovered for the first time in animal venom proteins by manual de novo sequencing. Crown Copyright © 2012. Published by Elsevier B.V. All rights reserved.
cDNA cloning and sequence determination of the pheromone biosynthesis activating neuropeptide from the seabuckthorn carpenterworm, Holcocerus hippophaecolus (Lepidoptera: Cossidae).

PubMed

Li, Juan; Zhou, Jiao; Sun, Rongbo; Zhang, Haolin; Zong, Shixiang; Luo, Youqing; Sheng, Xia; Weng, Qiang

2013-04-01

The PBAN (pheromone biosynthesis activating neuropeptide)/pyrokinin peptides comprise a major neuropeptide family characterized by a common FXPRL amide at the C-terminus. These peptides are actively involved in many essential endocrine functions. For the first time, we report the cDNA cloning and sequence determination of the PBAN from the seabuckthorn carpenterworm, Holcocerus hippophaecolus, by using rapid amplification of cDNA ends. The full-length cDNA of Hh-DH-PBAN contained five peptides: diapause hormone (DH) homolog, α-neuropeptide (NP), β-NP, PBAN, and γ-NP. All of the peptides were amidated at their C-terminus and shared a conserved motif, FXPR (or K) L. Moreover, Hh-DH-PBAN had high homology to the other members of the PBAN peptide family: 56% with Manduca sexta, 66% with Bombyx mori, 77% with Helicoverpa zea, and 47% with Plutella xylostella. Phylogenetic analysis revealed that Hh-DH-PBAN was closely related to PBANs from Noctuidae, demonstrated by the relatively higher similarity compared with H. zea. In addition, real-time quantitative PCR (qRT-PCR) analysis showed that Hh-DH-PBAN mRNA expression peaked in the brain-subesophageal ganglion (Br-SOG) complex, and was also detected at high levels during larval and adult stages. The expression decreased significantly after pupation. These results provided information concerning molecular structure characteristics of Hh-DH-PBAN, whose expression profile suggested that the Hh-DH-PBAN gene might be correlated with larval development and sex pheromone biosynthesis in females of H. hippophaecolus. © 2013 Wiley Periodicals, Inc.
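As a small illustration of the conserved C-terminal motif mentioned above, FXPR (or K) L can be written as a regular expression, with X matching any residue. The sketch below is an assumption-laden example, not the authors' method: the peptide strings are invented, and amidation is not modeled.

```python
import re

# F, any residue, P, then R or K, then L, anchored at the C-terminus.
C_TERMINAL_MOTIF = re.compile(r"F.P[RK]L$")


def has_motif(peptide):
    """Return True if the peptide ends with the FXP[R/K]L motif."""
    return C_TERMINAL_MOTIF.search(peptide) is not None


# Hypothetical peptides, for illustration only.
peptides = {
    "pep1": "AAGFSPRL",  # ends with FSPRL
    "pep2": "GGDFTPKL",  # ends with FTPKL (K variant)
    "pep3": "AAGFSPRA",  # last residue is not L
    "pep4": "FSPRLAAG",  # motif present but not at the C-terminus
}

matches = {name: has_motif(seq) for name, seq in peptides.items()}
```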
The construction of cDNA library and the screening of related antigen of ascitic tumor cells of ovarian cancer.

PubMed

Hou, Q; Chen, K; Shan, Z

2015-01-01

To construct a cDNA library of the ascitic tumor cells of ovarian cancer, which can be used to screen for related antigens for the early diagnosis of ovarian cancer and for therapeutic targets of immune treatment. Ascitic tumor cells from four cases of ovarian serous cystadenocarcinoma, two cases of ovarian mucinous cystadenocarcinoma, and two cases of ovarian endometrial carcinoma were used to construct the cDNA library. To screen the ovarian cancer antigen genes, evaluate the enzyme, and analyze nucleotide sequences, serological analysis of recombinant tumor cDNA expression libraries (SEREX) and suppression subtractive hybridization (SSH) techniques were utilized. The detection method of recombinant expression-based serological mini-arrays (SMARTA) was used to detect the ovarian cancer antigens and the corresponding autoantibody reactions in serum from 105 ovarian cancer patients and 105 normal women. After two rounds of serologic screening and glycosides sequencing analysis, 59 candidate ovarian cancer antigen gene fragments were finally identified, which corresponded to 50 genes. They were then divided into six categories: (1) genes homologous to known ovarian cancer genes, such as the BARD1 gene; (2) homologous genes associated with other tumors, such as the TM4SF1 gene; (3) genes expressed in a specific tissue, such as the ILF3 and FXR1 genes; (4) genes identical to genes for proteins of special function, such as the TIZ and C1D genes; (5) homologous genes sharing a source with embryonic genes, such as the PKHD1 gene; (6) the remaining genes, which were unknown genes without a homologous sequence in the gene pool, such as the OV-189 genes. SEREX technology combined with the SSH method is an effective research strategy that can filter tumor antigens with high specificity; the corresponding autoantibodies of TM4SF1, C1D, TIZ, BARD1
Sequencing and analysis of 10,967 full-length cDNA clones from Xenopus laevis and Xenopus tropicalis reveals post-tetraploidization transcriptome remodeling

PubMed Central

Morin, Ryan D.; Chang, Elbert; Petrescu, Anca; Liao, Nancy; Griffith, Malachi; Kirkpatrick, Robert; Butterfield, Yaron S.; Young, Alice C.; Stott, Jeffrey; Barber, Sarah; Babakaiff, Ryan; Dickson, Mark C.; Matsuo, Corey; Wong, David; Yang, George S.; Smailus, Duane E.; Wetherby, Keith D.; Kwong, Peggy N.; Grimwood, Jane; Brinkley, Charles P.; Brown-John, Mabel; Reddix-Dugue, Natalie D.; Mayo, Michael; Schmutz, Jeremy; Beland, Jaclyn; Park, Morgan; Gibson, Susan; Olson, Teika; Bouffard, Gerard G.; Tsai, Miranda; Featherstone, Ruth; Chand, Steve; Siddiqui, Asim S.; Jang, Wonhee; Lee, Ed; Klein, Steven L.; Blakesley, Robert W.; Zeeberg, Barry R.; Narasimhan, Sudarshan; Weinstein, John N.; Pennacchio, Christa Prange; Myers, Richard M.; Green, Eric D.; Wagner, Lukas; Gerhard, Daniela S.; Marra, Marco A.; Jones, Steven J.M.; Holt, Robert A.

2006-01-01

Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection Initiative. Here we present 10,967 full ORF verified cDNA clones (8049 from X. laevis and 2918 from X. tropicalis) as a community resource. Because the genome of X. laevis, but not X. tropicalis, has undergone allotetraploidization, comparison of coding sequences from these two clawed (pipid) frogs provides a unique angle for exploring the molecular evolution of duplicate genes. Within our clone set, we have identified 445 gene trios, each comprised of an allotetraploidization-derived X. laevis gene pair and their shared X. tropicalis ortholog. Pairwise dN/dS comparisons within trios show strong evidence for purifying selection acting on all three members. However, dN/dS ratios between X. laevis gene pairs are elevated relative to their X. tropicalis ortholog. This difference is highly significant and indicates an overall relaxation of selective pressures on duplicated gene pairs. We have found that the paralogs that have been lost since the tetraploidization event are enriched for several molecular functions, but have found no such enrichment in the extant paralogs. Approximately 14% of the paralogous pairs analyzed here also show differential expression indicative of subfunctionalization. PMID:16672307
Molecular cloning and nucleotide sequence of the alpha and beta subunits of allophycocyanin from the cyanelle genome of Cyanophora paradoxa.

PubMed Central

Bryant, D A; de Lorimier, R; Lambert, D H; Dubbs, J M; Stirewalt, V L; Stevens, S E; Porter, R D; Tam, J; Jay, E

1985-01-01

The genes for the alpha- and beta-subunit apoproteins of allophycocyanin (AP) were isolated from the cyanelle genome of Cyanophora paradoxa and subjected to nucleotide sequence analysis. The AP beta-subunit apoprotein gene was localized to a 7.8-kilobase-pair Pst I restriction fragment from cyanelle DNA by hybridization with a tetradecameric oligonucleotide probe. Sequence analysis using that oligonucleotide and its complement as primers for the dideoxy chain-termination sequencing method confirmed the presence of both AP alpha- and beta-subunit genes on this restriction fragment. Additional oligonucleotide primers were synthesized as sequencing progressed and were used to determine rapidly the nucleotide sequence of a 1336-base-pair region of this cloned fragment. This strategy allowed the sequencing to be completed without a detailed restriction map and without extensive and time-consuming subcloning. The sequenced region contains two open reading frames whose deduced amino acid sequences are 81-85% homologous to cyanobacterial and red algal AP subunits whose amino acid sequences have been determined. The two open reading frames are in the same orientation and are separated by 39 base pairs. AP alpha is 5' to AP beta and both coding sequences are preceded by a polypurine, Shine-Dalgarno-type sequence. Sequences upstream from AP alpha closely resemble the Escherichia coli consensus promoter sequences and also show considerable homology to promoter sequences for several chloroplast-encoded psbA genes. A 56-base-pair palindromic sequence downstream from the AP beta gene could play a role in the termination of transcription or translation. The allophycocyanin apoprotein subunit genes are located on the large single-copy region of the cyanelle genome. PMID:2987916
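For readers unfamiliar with the term, a palindromic DNA sequence, such as the 56-base-pair element mentioned above, is one that equals its own reverse complement. The following Python sketch shows that check on short toy sequences; it is only an illustration, the example strings are not taken from the cyanelle genome, and a real palindromic element may be imperfect or contain a spacer, which this strict test does not allow.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(seq):
    """True if the sequence reads the same on both strands (5' to 3')."""
    return seq == reverse_complement(seq)


# Toy examples (hypothetical sequences).
perfect = "GAATTC"      # equals its own reverse complement
not_perfect = "GAATTA"  # reverse complement is TAATTC

perfect_result = is_palindromic(perfect)
not_perfect_result = is_palindromic(not_perfect)
```

Transcribed from such a region, the two halves of the RNA can base-pair into a hairpin, which is one reason palindromes are often discussed in the context of termination.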
Cloning and expression of UDP-glucose: flavonoid 7-O-glucosyltransferase from hairy root cultures of Scutellaria baicalensis.

PubMed

Hirotani, M; Kuroda, R; Suzuki, H; Yoshikawa, T

2000-05-01

A cDNA encoding UDP-glucose: baicalein 7-O-glucosyltransferase (UBGT) was isolated from a cDNA library from hairy root cultures of Scutellaria baicalensis Georgi probed with a partial-length cDNA clone of a UDP-glucose: flavonoid 3-O-glucosyltransferase (UFGT) from grape (Vitis vinifera L.). The heterologous probe contained a glucosyltransferase consensus amino acid sequence which was also present in the Scutellaria cDNA clones. The complete nucleotide sequence of the 1688-bp cDNA insert was determined and the deduced amino acid sequences are presented. The nucleotide sequence analysis of UBGT revealed an open reading frame encoding a polypeptide of 476 amino acids with a calculated molecular mass of 53,094 Da. The reaction product for baicalein and UDP-glucose catalyzed by recombinant UBGT in Escherichia coli was identified as authentic baicalein 7-O-glucoside using high-performance liquid chromatography and proton nuclear magnetic resonance spectroscopy. The enzyme activities of recombinant UBGT expressed in E. coli were also detected towards flavonoids such as baicalein, wogonin, apigenin, scutellarein, 7,4'-dihydroxyflavone and kaempferol, and phenolic compounds. The accumulation of UBGT mRNA in hairy roots was in response to wounding or salicylic acid treatments.
Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags

PubMed Central

Gorodkin, Jan; Cirera, Susanna; Hedegaard, Jakob; Gilchrist, Michael J; Panitz, Frank; Jørgensen, Claus; Scheibye-Knudsen, Karsten; Arvin, Troels; Lumholdt, Steen; Sawera, Milena; Green, Trine; Nielsen, Bente J; Havgaard, Jakob H; Rosenkilde, Carina; Wang, Jun; Li, Heng; Li, Ruiqiang; Liu, Bin; Hu, Songnian; Dong, Wei; Li, Wei; Yu, Jun; Wang, Jian; Stærfeldt, Hans-Henrik; Wernersson, Rasmus; Madsen, Lone B; Thomsen, Bo; Hornshøj, Henrik; Bujie, Zhan; Wang, Xuegang; Wang, Xuefei; Bolund, Lars; Brunak, Søren; Yang, Huanming; Bendixen, Christian; Fredholm, Merete

2007-01-01

Background: Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages. Results: Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. Conclusion: This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies. PMID:17407547
Display of a maize cDNA library on baculovirus infected insect cells.

PubMed

Meller Harel, Helene Y; Fontaine, Veronique; Chen, Hongying; Jones, Ian M; Millner, Paul A

2008-08-12

Maize is a good model system for cereal crop genetics and development because of its rich genetic heritage and well-characterized morphology. The sequencing of its genome is well advanced, and new technologies for efficient proteomic analysis are needed. Baculovirus expression systems have been used for the last twenty years to express in insect cells a wide variety of eukaryotic proteins that require complex folding or extensive posttranslational modification. More recently, baculovirus display technologies based on the expression of foreign sequences on the surface of Autographa californica (AcMNPV) have been developed. We investigated the potential of a display methodology for a cDNA library of young maize seedlings. We constructed a full-length cDNA library of young maize etiolated seedlings in the transfer vector pAcTMVSVG. The library contained a total of 2.5 × 10^5 independent clones. Expression of two known maize proteins, calreticulin and auxin binding protein (ABP1), was shown by western blot analysis of protein extracts from insect cells infected with the cDNA library. Display of the two proteins in infected insect cells was shown by selective biopanning using magnetic cell sorting and demonstrated proof of concept that the baculovirus maize cDNA display library could be used to identify and isolate proteins. The maize cDNA library constructed in this study relies on the novel technology of baculovirus display and is unique in currently published cDNA libraries. Produced to demonstrate proof of principle, it opens the way for the development of a eukaryotic in vivo display tool which would be ideally suited for rapid screening of the maize proteome for binding partners, such as proteins involved in hormone regulation or defence.
Unique nucleotide sequence (UNS)-guided assembly of repetitive DNA parts for synthetic biology applications

PubMed Central

Torella, Joseph P.; Lienert, Florian; Boehm, Christian R.; Chen, Jan-Hung; Way, Jeffrey C.; Silver, Pamela A.

2016-01-01

Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts and hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies — for example repeated terminator and insulator sequences — that complicate recombination-based assembly. We and others have recently developed DNA assembly methods that we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly-assembled constructs, or into high-quality combinatorial libraries in only 2–3 days. If the DNA parts must be generated from scratch, an additional 2–5 days are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques. PMID:25101822
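The ordering principle described in this abstract can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the published protocol: the part names and UNS labels are hypothetical, and the function simply checks that each part's downstream UNS matches the upstream UNS of exactly one other part.

```python
def order_parts(parts):
    """Return part names in assembly order.

    parts maps a (hypothetical) part name to a tuple
    (upstream_uns, downstream_uns). Two parts are neighbours when the
    downstream UNS of one equals the upstream UNS of the other.
    """
    lefts = [left for left, _ in parts.values()]
    rights = [right for _, right in parts.values()]
    # Each UNS may be used only once per side, otherwise the order is ambiguous.
    if len(set(lefts)) != len(lefts) or len(set(rights)) != len(rights):
        raise ValueError("each UNS must be unique on its side")

    by_left = {left: name for name, (left, _) in parts.items()}
    # The first part is the one whose upstream UNS is nobody's downstream UNS.
    starts = [name for name, (left, _) in parts.items() if left not in rights]
    if len(starts) != 1:
        raise ValueError("expected exactly one starting part")

    order = [starts[0]]
    while parts[order[-1]][1] in by_left:
        order.append(by_left[parts[order[-1]][1]])
    if len(order) != len(parts):
        raise ValueError("some parts are not connected to the chain")
    return order


# Hypothetical example: two identical terminators would be hard to order by
# sequence alone, but their flanking UNSs make the order unambiguous.
example = {
    "terminator_copy_2": ("UNS4", "UNS5"),
    "gene_A": ("UNS1", "UNS2"),
    "gene_B": ("UNS3", "UNS4"),
    "terminator_copy_1": ("UNS2", "UNS3"),
}
print(order_parts(example))
```

The point of the sketch is that the repeated parts never need to be told apart by their own sequence; only the flanking labels determine the order.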
Phosducin-like protein: an ethanol-responsive potential modulator of guanine nucleotide-binding protein function.

PubMed

Miles, M F; Barhite, S; Sganga, M; Elliott, M

1993-11-15

Acute and chronic exposure to ethanol produces specific changes in several signal transduction cascades. Such alterations in signaling are thought to be a crucial aspect of the central nervous system's adaptive response, which occurs with chronic exposure to ethanol. We have recently identified and isolated several genes whose expression is specifically induced by ethanol in neural cell cultures. The product of one of these genes has extensive sequence homology to phosducin, a phosphoprotein expressed in retina and pineal gland that modulates trimeric guanine nucleotide-binding protein (G protein) function by binding to G-protein beta gamma subunits. We identified from a rat brain cDNA library an isolate encoding the phosducin-like protein (PhLP), which has 41% identity and 65% amino acid homology to phosducin. PhLP cDNA is expressed in all tissues screened by RNA blot-hybridization analysis and shows marked evolutionary conservation on Southern hybridization. We have identified four forms of PhLP cDNA varying only in their 5' ends, probably due to alternative splicing. This 5'-end variation generates two predicted forms of PhLP protein that differ by 79 aa at the NH2 terminus. Treatment of NG108-15 cells for 24 hr with concentrations of ethanol seen in actively drinking alcoholics (25-100 mM) causes up to a 3-fold increase in PhLP mRNA levels. Induction of PhLP by ethanol could account for at least some of the widespread alterations in signal transduction and G-protein function that are known to occur with chronic exposure to ethanol.
A statistical model for investigating binding probabilities of DNA nucleotide sequences using microarrays.

PubMed

Lee, Mei-Ling Ting; Bulyk, Martha L; Whitmore, G A; Church, George M

2002-12-01

There is considerable scientific interest in knowing the probability that a site-specific transcription factor will bind to a given DNA sequence. Microarray methods provide an effective means for assessing the binding affinities of a large number of DNA sequences as demonstrated by Bulyk et al. (2001, Proceedings of the National Academy of Sciences, USA 98, 7158-7163) in their study of the DNA-binding specificities of Zif268 zinc fingers using microarray technology. In a follow-up investigation, Bulyk, Johnson, and Church (2002, Nucleic Acids Research 30, 1255-1261) studied the interdependence of nucleotides on the binding affinities of transcription proteins. Our article is motivated by this pair of studies. We present a general statistical methodology for analyzing microarray intensity measurements reflecting DNA-protein interactions. The log probability of a protein binding to a DNA sequence on an array is modeled using a linear ANOVA model. This model is convenient because it employs familiar statistical concepts and procedures and also because it is effective for investigating the probability structure of the binding mechanism.
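The abstract does not spell out the terms of the model, so the following is only a generic sketch of what a linear ANOVA model for log binding probability could look like. All symbols are assumptions introduced here: s is a DNA sequence of length L, s_j is the nucleotide at position j, the alpha terms are position-specific main effects, and the gamma terms are pairwise interactions that would capture interdependence between positions.

```latex
\log P(\text{bind} \mid s)
  = \mu
  + \sum_{j=1}^{L} \alpha_{j}(s_j)
  + \sum_{1 \le j < k \le L} \gamma_{jk}(s_j, s_k)
  + \varepsilon
```

Under this reading, a model with all gamma terms equal to zero corresponds to positions contributing independently to binding, and testing the interaction terms is one way to ask whether nucleotides are interdependent.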
Two RNAs or DNAs May Artificially Fuse Together at a Short Homologous Sequence (SHS) during Reverse Transcription or Polymerase Chain Reactions, and Thus Reporting an SHS-Containing Chimeric RNA Requires Extra Caution

PubMed Central

Xie, Bingkun; Yang, Wei; Ouyang, Yongchang; Chen, Lichan; Jiang, Hesheng; Liao, Yuying; Liao, D. Joshua

2016-01-01

Tens of thousands of chimeric RNAs have been reported. Most of them contain a short homologous sequence (SHS) at the joining site of the two partner genes but are not associated with a fusion gene. We hypothesize that many of these chimeras may be technical artifacts derived from SHS-caused mis-priming in reverse transcription (RT) or polymerase chain reactions (PCR). We cloned six chimeric complementary DNAs (cDNAs) formed by human mitochondrial (mt) 16S rRNA sequences at an SHS, which were similar to several expressed sequence tags (ESTs). These chimeras, which could not be detected with cDNA protection assay, were likely formed because some regions of the 16S rRNA are reverse-complementary to another region to form an SHS, which allows the downstream sequence to loop back and anneal at the SHS to prime the synthesis of its complementary strand, yielding a palindromic sequence that can form a hairpin-like structure. We found that a 16S rRNA ending at the 4th nucleotide (nt) of the mt-tRNA-leu was dominant and thus should be the wild type. We also cloned a mouse Bcl2-Nek9 chimeric cDNA that contained a 5-nt unmatchable sequence between the two partners and two copies of the reverse primer in the same direction but did not contain the forward primer, making it unclear how this Bcl2-Nek9 was formed and amplified. Moreover, a cDNA was amplified because one primer has 4 nts matched to the template, suggesting that there may be many more artificial cDNAs than we have realized, because the nuclear and mt genomes have many more 4-nt than 5-nt or longer homologues. Altogether, the chimeric cDNAs we cloned are good examples suggesting that many cDNAs may be artifacts due to SHS-caused mis-priming and thus greater caution should be taken when new sequence is obtained from a technique involving DNA polymerization. PMID:27148738
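The remark that 4-nt matches are far more common than 5-nt or longer ones can be illustrated with a small Python sketch that lists the short sequences two molecules share. The two sequences below are made up for illustration and the function is not the authors' method; it only shows how candidate SHSs of a given length could be enumerated.

```python
def shared_kmers(seq_a, seq_b, k):
    """Return the sorted k-nt substrings present in both sequences."""
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return sorted(kmers_a & kmers_b)


# Hypothetical toy sequences, not taken from the study.
partner_1 = "ACGTTGCA"
partner_2 = "GGTTGCC"

print(shared_kmers(partner_1, partner_2, 4))  # candidate 4-nt SHSs
print(shared_kmers(partner_1, partner_2, 5))  # candidate 5-nt SHSs
```

Even in this toy case the number of shared substrings drops as k grows, which is the intuition behind the abstract's warning about 4-nt matches.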
Deep sequencing is an appropriate tool for the selection of unique Hepatitis C virus (HCV) variants after single genomic amplification.

PubMed

Guinoiseau, Thibault; Moreau, Alain; Hohnadel, Guillaume; Ngo-Giang-Huong, Nicole; Brulard, Celine; Vourc'h, Patrick; Goudeau, Alain; Gaudy-Graffin, Catherine

2017-01-01

Hepatitis C virus (HCV) evolves rapidly in a single host and circulates as a quasispecies, which is a complex mixture of genetically distinct but closely related viruses, namely variants. To identify intra-individual diversity and investigate the functional properties of the variants in vitro, it is necessary to define the quasispecies composition and isolate the HCV variants. This is possible using single genome amplification (SGA). This technique, based on serially diluted cDNA to amplify a single cDNA molecule (clonal amplicon), has already been used to determine individual HCV diversity. In these studies, positive PCR reactions from SGA were directly sequenced using Sanger technology. Non-clonal amplicons must be detected so that they can be excluded, which facilitates further functional analysis. Here, we compared Next Generation Sequencing (NGS) with de novo assembly and Sanger sequencing for their ability to distinguish clonal and non-clonal amplicons after SGA on one plasma specimen. All amplicons (n = 42) classified as clonal by NGS were also classified as clonal by Sanger sequencing. No double peaks were seen on electropherograms for non-clonal amplicons with position-specific nucleotide variation below 15% by NGS. Altogether, NGS circumvented many of the difficulties encountered when using Sanger sequencing after SGA and is an appropriate tool to reliably select clonal amplicons for further functional studies.
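One way to picture the NGS-based check is to compute, for each position of an amplicon, the fraction of reads that do not carry the majority base, and to flag the amplicon as non-clonal when that fraction exceeds a cutoff anywhere. The sketch below assumes per-position base counts are already available; the cutoff is a hypothetical parameter, since the abstract does not state the authors' classification rule.

```python
def max_minor_fraction(position_counts):
    """Largest fraction of non-majority reads over all positions.

    position_counts is a list of dicts mapping base -> read count,
    one dict per position of the amplicon.
    """
    worst = 0.0
    for counts in position_counts:
        total = sum(counts.values())
        if total == 0:
            continue  # no coverage at this position
        minor = total - max(counts.values())
        worst = max(worst, minor / total)
    return worst


def is_clonal(position_counts, cutoff):
    """True when no position exceeds the (assumed) minor-variant cutoff."""
    return max_minor_fraction(position_counts) <= cutoff


# Hypothetical read counts for two amplicons.
clean_amplicon = [{"A": 1000}, {"C": 998, "T": 2}]
mixed_amplicon = [{"A": 900, "G": 100}, {"C": 1000}]

print(is_clonal(clean_amplicon, cutoff=0.01))
print(is_clonal(mixed_amplicon, cutoff=0.01))
```

The mixed amplicon here carries a 10% minor variant, which by the abstract's observation would fall below the level at which Sanger electropherograms showed double peaks.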
Direct bisulfite sequencing for examination of DNA methylation with gene and nucleotide resolution from brain tissues.

PubMed

Parrish, R Ryley; Day, Jeremy J; Lubin, Farah D

2012-07-01

DNA methylation is an epigenetic modification that is essential for the development and mature function of the central nervous system. Due to the relevance of this modification to the transcriptional control of gene expression, it is often necessary to examine changes in DNA methylation patterns with both gene and single-nucleotide resolution. Here, we describe an in-depth basic protocol for direct bisulfite sequencing of DNA isolated from brain tissue, which will permit direct assessment of methylation status at individual genes as well as individual cytosine molecules/nucleotides within a genomic region. This method yields analysis of DNA methylation patterns that is robust, accurate, and reproducible, thereby allowing insights into the role of alterations in DNA methylation in brain tissue.
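Bisulfite treatment converts unmethylated cytosines to uracil, which is read as T after amplification, while methylated cytosines remain C. The Python sketch below shows how per-cytosine methylation could be estimated from that signal. It assumes a set of already aligned, equal-length, same-strand sequences, which is a simplification introduced here and not the protocol's own analysis step; the reference and reads are invented.

```python
def methylation_levels(reference, reads):
    """Estimate methylation at each reference cytosine.

    Returns {position: fraction of informative reads that kept a C}.
    Reads showing anything other than C or T at a cytosine position
    are ignored at that position.
    """
    levels = {}
    for pos, base in enumerate(reference):
        if base != "C":
            continue
        kept_c = sum(1 for read in reads if read[pos] == "C")
        converted = sum(1 for read in reads if read[pos] == "T")
        if kept_c + converted:
            levels[pos] = kept_c / (kept_c + converted)
    return levels


# Hypothetical reference and bisulfite-converted reads.
reference = "ACGTCG"
reads = ["ACGTTG", "ACGTTG", "ATGTCG", "ACGTCG"]

print(methylation_levels(reference, reads))
```

Each key in the result is a 0-based position of a cytosine in the reference, so the output gives the single-nucleotide resolution the abstract refers to.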
A comparative genomics strategy for targeted discovery of single-nucleotide polymorphisms and conserved-noncoding sequences in orphan crops.

PubMed

Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H

2006-04-01

Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species.
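The proportions quoted in this abstract are easy to verify. The short Python check below uses only the numbers given above; the variable names are introduced here for readability.

```python
# Figures quoted in the abstract.
cisps_evaluated = 124
size_constrained = 23
intron_snps_per_kb = 12.1
exon_snps_per_kb = 3.6

# Share of evaluated CISPs that appear subject to intron size constraint.
constrained_pct = 100 * size_constrained / cisps_evaluated

# How much denser polymorphism is in introns than in exons (rice genotypes).
intron_to_exon_ratio = intron_snps_per_kb / exon_snps_per_kb

print(f"{constrained_pct:.1f}% of CISPs show intron size constraint")
print(f"intron/exon polymorphism ratio: {intron_to_exon_ratio:.1f}")
```

The first value reproduces the 18.5% stated in the abstract, and the second makes explicit that introns carried roughly three times the polymorphism density of exons in the rice comparison.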
In silico Analysis of 2085 Clones from a Normalized Rat Vestibular Periphery 3′ cDNA Library

PubMed Central

Roche, Joseph P.; Cioffi, Joseph A.; Kwitek, Anne E.; Erbe, Christy B.; Popper, Paul

2005-01-01

The inserts from 2400 cDNA clones isolated from a normalized Rattus norvegicus vestibular periphery cDNA library were sequenced and characterized. The Wackym-Soares vestibular 3′ cDNA library was constructed from the saccular and utricular maculae, the ampullae of all three semicircular canals and Scarpa's ganglia containing the somata of the primary afferent neurons, microdissected from 104 male and female rats. The inserts from 2400 randomly selected clones were sequenced from the 5′ end. Each sequence was compared, using the BLAST algorithm, with the GenBank nonredundant, rat genome, mouse genome and human genome databases to search for high-homology alignments. Of the initial 2400 clones, 315 (13%) were found to be of poor quality and did not yield useful information, and therefore were eliminated from the analysis. Of the remaining 2085 sequences, 918 (44%) were found to represent 758 unique genes having useful annotations that were identified in databases within the public domain or in the published literature; these sequences were designated as known characterized sequences. A further 1141 sequences (55%) aligned with 1011 unique sequences that had no useful annotations and were designated as known but uncharacterized sequences. Of the remaining 26 sequences (1%), 24 aligned with rat genomic sequences, but none matched previously described rat expressed sequence tags or mRNAs. No significant alignment to the rat or human genomic sequences could be found for the remaining 2 sequences. Of the 2085 sequences analyzed, 86% were singletons. The known, characterized sequences were analyzed with the FatiGO online data-mining tool (http://fatigo.bioinfo.cnio.es/) to identify level 5 biological process gene ontology (GO) terms for each alignment and to group alignments with similar or identical GO terms. Numerous genes were identified that have not been previously shown to be expressed in the vestibular system. Further characterization of the novel cDNA sequences may lead
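The clone counts in this abstract can be cross-checked with a few lines of Python. Only numbers stated in the abstract are used; the category labels are shortened here for convenience.

```python
total_clones = 2400
poor_quality = 315
analysed = total_clones - poor_quality  # sequences kept for analysis

categories = {
    "known, characterized": 918,
    "known, uncharacterized": 1141,
    "remaining": 26,  # 24 matching rat genomic sequence only + 2 unmatched
}


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


print("analysed:", analysed)
print("poor quality:", pct(poor_quality, total_clones), "%")
for label, count in categories.items():
    print(label, pct(count, analysed), "%")
print("categories sum to analysed:", sum(categories.values()) == analysed)
```

The three categories add up to the 2085 analysed sequences, and the rounded percentages match those reported (13%, 44%, 55% and 1%).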

Molecular characterisation and nucleotide sequence analysis of canine parvovirus strains in vaccines in India.

PubMed

Nandi, Sukdeb; Anbazhagan, Rajendra; Kumar, Manoj

2010-01-01

Canine parvovirus 2 (CPV-2) is one of the most important viruses that causes haemorrhagic gastroenteritis and myocarditis of dogs worldwide. The picture has been complicated further due to the emergence of new mutants of CPV, namely: CPV-2a, CPV-2b and CPV-2c. In this study, the molecular characterisation of strains present in the CPV vaccines available on the Indian market was performed using polymerase chain reaction and DNA sequencing. The VP1/VP2 genes of two vaccine strains and a field strain (Bhopal) were sequenced and the nucleotide and the deduced amino acid sequences were compared. The results indicated that the isolate belonged to CPV type 2b and the strains in the vaccines belonged to type CPV-2. From the study, it is inferred that the CPV strain used in commercially available vaccine preparation differed from the strains present in CPV infection in dogs in India.
Identification of cDNAs encoding viper venom hyaluronidases: cross-generic sequence conservation of full-length and unusually short variant transcripts.

PubMed

Harrison, Robert A; Ibison, Frances; Wilbraham, Davina; Wagstaff, Simon C

2007-05-01

The immobilisation of prey by snakes is most efficiently achieved by the rapid dissemination of venom from its site of injection into the blood stream. Hyaluronidase is a common component of snake venoms and has been termed the "venom spreading factor". In the absence of nucleotide or protein sequence data to confirm the functional identity of this venom component, we interrogated a venom gland EST database for the saw-scaled viper, Echis ocellatus (Nigeria), using the gene ontology (GO) term "carbohydrate metabolism". A single hyalurononglucosaminidase-activity matching sequence (EOC00242) was found and used to design PCR primers to acquire the full-length cDNA sequence. Although very different from the bee venom and mammalian hyaluronidase sequences, the E. ocellatus sequence retained all the catalytic, positional and structural residues that characterise this class of carbohydrate metabolising hydrolases. An extraordinarily high level of sequence identity (>95%) was observed in analogous venom gland cDNA sequences isolated (by PCR) from another saw-scaled viper species, E. pyramidum leakeyi (Kenya), and from the Sahara horned viper, Cerastes cerastes cerastes (Egypt) and the puff adder, Bitis arietans (Nigeria). Smaller amplicons, lacking hyaluronidase catalytic residues because of 768 bp or 855 bp central deletions, appear either to encode truncated peptides without hyaluronidase activity or to be non-translated transcripts, because they lack consensus translation initiating motifs.
Construction of a cDNA library for miniature pig mandibular deciduous molars

PubMed Central

2014-01-01

Background The miniature pig provides an excellent experimental model for tooth morphogenesis because its diphyodont and heterodont dentition resembles that of humans. However, little information is available on the process of tooth development or the exact molecular mechanisms controlling tooth development in miniature pigs or humans. Thus, the analysis of gene expression related to each stage of tooth development is very important. Results In our study, after serial sections were made, the development of the crown of the miniature pigs’ mandibular deciduous molar could be divided into five main phases: dental lamina stage (E33-E35), bud stage (E35-E40), cap stage (E40-E50), early bell stage (E50-E60), and late bell stage (E60-E65). Total RNA was isolated from the tooth germ of miniature pig embryos at E35, E45, E50, and E60, and a cDNA library was constructed. Then, we identified cDNA sequences on a large scale screen for cDNA profiles in the developing mandibular deciduous molars (E35, E45, E50, and E60) of miniature pigs using Illumina Solexa deep sequencing. Microarray assay was used to detect the expression of genes. Lastly, through Unigene sequence analysis and cDNA expression pattern analysis at E45 and E60, we found that 12 up-regulated and 15 down-regulated genes during the four periods are highly conserved genes homologous with known Homo sapiens genes. Furthermore, there were 6 down-regulated and 2 up-regulated genes in the miniature pig that were highly homologous to Homo sapiens genes compared with those in the mouse. Conclusion Our results not only identify the specific transcriptome and cDNA profile in developing mandibular deciduous molars of the miniature pig, but also provide useful information for investigating the molecular mechanism of tooth development in the miniature pig. PMID:24750690
Design and screening of M13 phage display cDNA libraries.

PubMed

Georgieva, Yuliya; Konthur, Zoltán

2011-02-17

The last decade has seen a steady increase in screening of cDNA expression product libraries displayed on the surface of filamentous bacteriophage. At the same time, the range of applications has extended from the identification of novel allergens and disease markers to protein-protein interaction studies. However, the generation and selection of cDNA phage display libraries is subject to intrinsic biological limitations due to their complex nature and heterogeneity, as well as technical difficulties regarding protein presentation on the phage surface. Here, we review the latest developments in this field and discuss a number of strategies and improvements anticipated to overcome these challenges, making cDNA and open reading frame (ORF) libraries more readily accessible for phage display. Furthermore, future trends combining phage display with next generation sequencing (NGS) will be presented.
cDNA, deduced polypeptide structure and chromosomal assignment of human pulmonary surfactant proteolipid, SPL(pVal)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Glasser, S.W.; Korfhagen, T.R.; Weaver, T.E.

1988-01-05

In hyaline membrane disease of premature infants, lack of surfactant leads to pulmonary atelectasis and respiratory distress. Hydrophobic surfactant proteins of Mr = 5000-14,000 have been isolated from mammalian surfactants which enhance the rate of spreading and the surface tension lowering properties of phospholipids during dynamic compression. The authors have characterized the amino-terminal amino acid sequence of pulmonary proteolipids from ether/ethanol extracts of bovine, canine, and human surfactant. Two distinct peptides were identified and termed SPL(pVal) and SPL(Phe). An oligonucleotide probe based on the valine-rich amino-terminal amino acid sequence of SPL(pVal) was utilized to isolate cDNA and genomic DNA encoding the human protein, termed surfactant proteolipid SPL(pVal) on the basis of its unique polyvaline domain. The primary structure of a precursor protein of 20,870 daltons, containing the SPL(pVal) peptide, was deduced from the nucleotide sequence of the cDNAs. Hybrid-arrested translation and immunoprecipitation of labeled translation products of human mRNA demonstrated a precursor protein, the active hydrophobic peptide being produced by proteolytic processing. Two classes of cDNAs encoding SPL(pVal) were identified. Human SPL(pVal) mRNA was more abundant in the adult than in fetal lung. The SPL(pVal) gene locus was assigned to chromosome 8.
RNA circularization reveals terminal sequence heterogeneity in a double-stranded RNA virus.

PubMed

Widmer, G

1993-03-01

Double-stranded RNA viruses (dsRNA), termed LRV1, have been found in several strains of the protozoan parasite Leishmania. With the aim of constructing a full-length cDNA copy of the viral genome, including its terminal sequences, a protocol based on PCR amplification across the 3'-5' junction of circularized RNA was developed. This method proved to be applicable to dsRNA. It provided a relatively simple alternative to one-sided PCR, without loss of specificity inherent in the use of generic primers. LRV1 terminal nucleotide sequences obtained by this method showed a considerable variation in length, particularly at the 5' end of the positive strand, as well as the potential for forming 3' overhangs. The opposite genomic end terminates in 0, 1, or 2 TCA trinucleotide repeats. These results are compared with terminal sequences derived from one-sided PCR experiments.
[Identification and phylogenetic application of unique nucleotide sequence of nad7 intron2 in Rhodiola (Crassulaceae) species].

PubMed

Deng, Ke-Jun; Yang, Zu-Jun; Liu, Cheng; Zhao, Wei; Liu, Chang; Feng, Juan; Ren, Zheng-Long

2007-03-01

The genetic characteristics of 9 populations of Rhodiola crenulata, R. fastigiata and R. sachalinensis (Crassulaceae) from Sichuan and Jilin Provinces of China were investigated using the conserved primer of nad7 intron 2. All PCR products were about 800 bp long, shorter than those of other Crassulaceae plants, and could be used as molecular markers to identify the Rhodiola species. Sequencing of the products indicated that the 53 bp of exon and 738 bp of intron exhibit only 9 nucleotide variations. BLAST searches of the nad7 sequences against GenBank and phylogenetic analysis showed that the sequences of the Rhodiola species clustered independently and were shorter than all the registered sequences of higher plants. The results suggest that the Rhodiola species have a unique sequence in this gene region, which might be related to their special growth conditions.
Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.

PubMed

Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C

2018-01-10

Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status, and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each gene of 42 TS genes were selected from the COSMIC database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and the methyltransferase enzyme DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies depicted that, upon binding of DNMT1 methylates to this consensus DNA motif (C residues of CpG islands), mutation was likely to occur. Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing
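A consensus written as "T/AGC/GAGGA/TG" can be scanned for with a regular expression. The Python sketch below assumes one particular reading of that notation, namely that each slash separates two alternative bases at a single position, giving the 8-nt pattern [TA]G[CG]AGG[AT]G. That reading is an interpretation made here and is not confirmed by the abstract; the test sequence is invented.

```python
import re

# Assumed reading of "T/AGC/GAGGA/TG": T or A, G, C or G, A, G, G, A or T, G.
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=([TA]G[CG]AGG[AT]G))")


def find_motif(sequence):
    """Return (0-based start, matched text) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(sequence.upper())]


# Hypothetical sequence containing one occurrence.
print(find_motif("ccTGCAGGAGtt"))
```

If the notation is meant differently, only the pattern string needs to change; the scanning logic stays the same.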
Molecular cloning and nucleotide sequence of a transforming gene detected by transfection of chicken B-cell lymphoma DNA

NASA Astrophysics Data System (ADS)

Goubin, Gerard; Goldman, Debra S.; Luce, Judith; Neiman, Paul E.; Cooper, Geoffrey M.

1983-03-01

A transforming gene detected by transfection of chicken B-cell lymphoma DNA has been isolated by molecular cloning. It is homologous to a conserved family of sequences present in normal chicken and human DNAs but is not related to transforming genes of acutely transforming retroviruses. The nucleotide sequence of the cloned transforming gene suggests that it encodes a protein that is partially homologous to the amino terminus of transferrin and related proteins although only about one tenth the size of transferrin.
Nucleotide Sequence of the blaRTG-2 (CARB-5) Gene and Phylogeny of a New Group of Carbenicillinases

PubMed Central

Choury, Daniele; Szajnert, Marie-France; Joly-Guillou, Marie-Laure; Azibi, Kemal; Delpech, Marc; Paul, Gérard

2000-01-01

We determined the nucleotide sequence of the bla gene for the Acinetobacter calcoaceticus β-lactamase previously described as CARB-5. Alignment of the deduced amino acid sequence with those of known β-lactamases revealed that CARB-5 possesses an RTG triad in box VII, as described for the Proteus mirabilis GN79 enzyme, instead of the RSG consensus characteristic of the other carbenicillinases. Phylogenetic studies showed that these RTG enzymes constitute a new, separate group, possibly ancestors of the carbenicillinase family. PMID:10722515
Nucleotide sequence analysis of the gene encoding the Deinococcus radiodurans surface protein, derived amino acid sequence, and complementary protein chemical studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Peters, J.; Peters, M.; Lottspeich, F.

1987-11-01

The complete nucleotide sequence of the gene encoding the surface (hexagonally packed intermediate (HPI))-layer polypeptide of Deinococcus radiodurans Sark was determined and found to encode a polypeptide of 1036 amino acids. Amino acid sequence analysis of about 30% of the residues revealed that the mature polypeptide consists of at least 978 amino acids. The N terminus was blocked to Edman degradation. The results of proteolytic modification of the HPI layer in situ and Mr estimations of the HPI polypeptide expressed in Escherichia coli indicated that there is a leader sequence. The N-terminal region contained a very high percentage (29%) of threonine and serine, including a cluster of nine consecutive serine or threonine residues, whereas a stretch near the C terminus was extremely rich in aromatic amino acids (29%). The protein contained at least two disulfide bridges, as well as tightly bound reducing sugars and fatty acids.
NADH:ubiquinone oxidoreductase from bovine heart mitochondria. cDNA sequences of the import precursors of the nuclear-encoded 39 kDa and 42 kDa subunits.

PubMed Central

Fearnley, I M; Finel, M; Skehel, J M; Walker, J E

1991-01-01

The 39 kDa and 42 kDa subunits of NADH:ubiquinone oxidoreductase from bovine heart mitochondria are nuclear-coded components of the hydrophobic protein fraction of the enzyme. Their amino acid sequences have been deduced from the sequences of overlapping cDNA clones. These clones were amplified from total bovine heart cDNA by means of the polymerase chain reaction, with the use of complex mixtures of oligonucleotide primers based upon fragments of protein sequence determined at the N-terminals of the proteins and at internal sites. The protein sequences of the 39 kDa and 42 kDa subunits are 345 and 320 amino acid residues long respectively, and their calculated molecular masses are 39,115 Da and 36,693 Da. Both proteins are predominantly hydrophilic, but each contains one or two hydrophobic segments that could possibly be folded into transmembrane alpha-helices. The bovine 39 kDa protein sequence is related to that of a 40 kDa subunit from complex I from Neurospora crassa mitochondria; otherwise, it is not related significantly to any known sequence, including redox proteins and two polypeptides involved in import of proteins into mitochondria, known as the mitochondrial processing peptidase and the processing-enhancing protein. Therefore the functions of the 39 kDa and 42 kDa subunits of complex I are unknown. The mitochondrial gene product, ND4, a hydrophobic component of complex I with an apparent molecular mass of about 39 kDa, has been identified in preparations of the enzyme. This subunit stains faintly with Coomassie Blue dye, and in many gel systems it is not resolved from the nuclear-coded 36 kDa subunit. Images Fig. 1. PMID:1832859
A resource of single-nucleotide polymorphisms for rainbow trout generated by restriction-site associated DNA sequencing of doubled haploids

USDA-ARS's Scientific Manuscript database

Salmonid genomes are considered to be in a pseudo-tetraploid state as a result of an evolutionarily recent genome duplication event. This situation complicates single nucleotide polymorphism (SNP) discovery in rainbow trout as many putative SNPs are actually paralogous sequence variants (PSVs) and ...
Thermostable group II intron reverse transcriptase fusion proteins and their use in cDNA synthesis and next-generation RNA sequencing.

PubMed

Mohr, Sabine; Ghanem, Eman; Smith, Whitney; Sheeter, Dennis; Qin, Yidan; King, Olga; Polioudakis, Damon; Iyer, Vishwanath R; Hunicke-Smith, Scott; Swamy, Sajani; Kuersten, Scott; Lambowitz, Alan M

2013-07-01

Mobile group II introns encode reverse transcriptases (RTs) that function in intron mobility ("retrohoming") by a process that requires reverse transcription of a highly structured, 2-2.5-kb intron RNA with high processivity and fidelity. Although the latter properties are potentially useful for applications in cDNA synthesis and next-generation RNA sequencing (RNA-seq), group II intron RTs have been difficult to purify free of the intron RNA, and their utility as research tools has not been investigated systematically. Here, we developed general methods for the high-level expression and purification of group II intron-encoded RTs as fusion proteins with a rigidly linked, noncleavable solubility tag, and we applied them to group II intron RTs from bacterial thermophiles. We thus obtained thermostable group II intron RT fusion proteins that have higher processivity, fidelity, and thermostability than retroviral RTs, synthesize cDNAs at temperatures up to 81°C, and have significant advantages for qRT-PCR, capillary electrophoresis for RNA-structure mapping, and next-generation RNA sequencing. Further, we find that group II intron RTs differ from the retroviral enzymes in template switching with minimal base-pairing to the 3' ends of new RNA templates, making it possible to efficiently and seamlessly link adaptors containing PCR-primer binding sites to cDNA ends without an RNA ligase step. This novel template-switching activity enables facile and less biased cloning of nonpolyadenylated RNAs, such as miRNAs or protein-bound RNA fragments. Our findings demonstrate novel biochemical activities and inherent advantages of group II intron RTs for research, biotechnological, and diagnostic methods, with potentially wide applications.
Improved coverage of cDNA-AFLP by sequential digestion of immobilized cDNA.

PubMed

Weiberg, Arne; Pöhler, Dirk; Morgenstern, Burkhard; Karlovsky, Petr

2008-10-13

cDNA-AFLP is a transcriptomics technique which does not require prior sequence information and can therefore be used as a gene discovery tool. The method is based on selective amplification of cDNA fragments generated by restriction endonucleases, electrophoretic separation of the products and comparison of the band patterns between treated samples and controls. Unequal distribution of restriction sites used to generate cDNA fragments negatively affects the performance of cDNA-AFLP. Some transcripts are represented by more than one fragment while others escape detection, causing redundancy and reducing the coverage of the analysis, respectively. With the goal of improving the coverage of cDNA-AFLP without increasing its redundancy, we designed a modified cDNA-AFLP protocol. Immobilized cDNA is sequentially digested with several restriction endonucleases and the released DNA fragments are collected in mutually exclusive pools. To investigate the performance of the protocol, the software tool MECS (Multiple Enzyme cDNA-AFLP Simulation) was written in Perl. cDNA-AFLP protocols described in the literature and the new sequential digestion protocol were simulated on sets of cDNA sequences from mouse, human and Arabidopsis thaliana. The redundancy and coverage, the total number of PCR reactions, and the average fragment length were calculated for each protocol and cDNA set. Simulation revealed that sequential digestion of immobilized cDNA followed by the partitioning of released fragments into mutually exclusive pools outperformed other cDNA-AFLP protocols in terms of coverage, redundancy, fragment length, and the total number of PCRs. Primers generating 30 to 70 amplicons per PCR provided the highest fraction of electrophoretically distinguishable fragments suitable for normalization. For the A. thaliana, human and mouse transcriptomes, the use of two marking enzymes and three sequentially applied releasing enzymes for each of the marking enzymes is recommended.
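The coverage and redundancy measures described in this record can be illustrated with a small in silico digest. The sketch below is not MECS (which was written in Perl) and does not model the sequential-digestion protocol; it assumes a deliberately simplified rule in which a transcript is "covered" if it yields at least one fragment bounded by two recognition sites within a length window, and "redundant" if it yields more than one. The function names, the length window, and the toy sequences are all hypothetical.

```python
import re

def cut_positions(seq, sites):
    """Return the sorted start positions of every recognition site in seq."""
    positions = set()
    for site in sites:
        # A lookahead lets overlapping occurrences be found as well.
        for match in re.finditer(f"(?={re.escape(site)})", seq):
            positions.add(match.start())
    return sorted(positions)

def detectable_fragments(seq, sites, min_len=50, max_len=500):
    """Lengths of fragments bounded by two sites that fall in the size window."""
    cuts = cut_positions(seq, sites)
    lengths = [b - a for a, b in zip(cuts, cuts[1:])]
    return [n for n in lengths if min_len <= n <= max_len]

def coverage_and_redundancy(transcripts, sites, **window):
    """Fraction of transcripts with >= 1 fragment, and fraction with > 1."""
    counts = [len(detectable_fragments(s, sites, **window)) for s in transcripts]
    covered = sum(1 for c in counts if c >= 1)
    redundant = sum(1 for c in counts if c > 1)
    return covered / len(transcripts), redundant / len(transcripts)

# Toy example (hypothetical sequences, EcoRI site GAATTC, tiny size window).
toy = [
    "AAGAATTCAAAAGAATTCAA",    # two sites -> one fragment
    "AAAA",                    # no site   -> escapes detection
    "GAATTCAAGAATTCAAGAATTC",  # three sites -> two fragments (redundant)
]
coverage, redundancy = coverage_and_redundancy(
    toy, ["GAATTC"], min_len=3, max_len=100
)
```

A real comparison of protocols would also have to account for selective primer extensions and for which enzyme produced each fragment end, neither of which is modelled here.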
Identification of single nucleotide polymorphism in ginger using expressed sequence tags

PubMed Central

Chandrasekar, Arumugam; Riju, Aikkal; Sithara, Kandiyl; Anoop, Sahadevan; Eapen, Santhosh J

2009-01-01

Ginger (Zingiber officinale Rosc) (Family: Zingiberaceae) is a herbaceous perennial, the rhizomes of which are used as a spice. Ginger is also well known for its medicinal applications. EST-derived SNPs are a free by-product of the currently expanding EST (Expressed Sequence Tag) databases. The development of high-throughput methods for the detection of SNPs (Single Nucleotide Polymorphisms) and small indels (insertions/deletions) has led to a revolution in their use as molecular markers. The available 38,139 ginger EST sequences were mined from dbEST of NCBI. The CAP3 program was used to assemble the EST sequences into contigs. Candidate SNPs and indel polymorphisms were detected using the Perl script AutoSNP version 1.0, which used 31,905 ESTs for detecting SNP and indel sites. We found 64,026 SNP sites and 7,034 indel polymorphisms, with a frequency of 0.84 SNPs per 100 bp. Among the three tissues from which the EST libraries had been generated, rhizomes had the highest frequency, 1.08 SNPs/indels per 100 bp, whereas leaves had the lowest, 0.63 per 100 bp, and roots showed an intermediate frequency of 0.82 per 100 bp. The transition-to-transversion ratio is 0.90; that is, among the detected SNPs, transversions outnumber transitions. The detected SNPs can be used as markers for genetic studies. Availability: The results of the present study are hosted on our web server, www.spices.res.in/spicesnip. PMID:20198184
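The transition-to-transversion ratio reported in this record follows from classifying each SNP by base class: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. The following is a minimal sketch of that calculation; it is not AutoSNP, and the function names and the toy SNP list are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_snp(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid SNP: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

def ts_tv_ratio(snps):
    """Return transitions / transversions for an iterable of (ref, alt) pairs."""
    kinds = [classify_snp(ref, alt) for ref, alt in snps]
    transversions = kinds.count("transversion")
    if transversions == 0:
        raise ValueError("ratio undefined: no transversions observed")
    return kinds.count("transition") / transversions

# Toy example: two transitions (A>G, C>T) and two transversions (A>C, G>T).
ratio = ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")])
```

A ratio below 1, such as the 0.90 stated above, means that transversions are the more frequent class.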
Viral to metazoan marine plankton nucleotide sequences from the Tara Oceans expedition

PubMed Central

Alberti, Adriana; Poulain, Julie; Engelen, Stefan; Labadie, Karine; Romac, Sarah; Ferrera, Isabel; Albini, Guillaume; Aury, Jean-Marc; Belser, Caroline; Bertrand, Alexis; Cruaud, Corinne; Da Silva, Corinne; Dossat, Carole; Gavory, Frédérick; Gas, Shahinaz; Guy, Julie; Haquelle, Maud; Jacoby, E'krame; Jaillon, Olivier; Lemainque, Arnaud; Pelletier, Eric; Samson, Gaëlle; Wessner, Mark; Bazire, Pascal; Beluche, Odette; Bertrand, Laurie; Besnard-Gonnet, Marielle; Bordelais, Isabelle; Boutard, Magali; Dubois, Maria; Dumont, Corinne; Ettedgui, Evelyne; Fernandez, Patricia; Garcia, Espérance; Aiach, Nathalie Giordanenco; Guerin, Thomas; Hamon, Chadia; Brun, Elodie; Lebled, Sandrine; Lenoble, Patricia; Louesse, Claudine; Mahieu, Eric; Mairey, Barbara; Martins, Nathalie; Megret, Catherine; Milani, Claire; Muanga, Jacqueline; Orvain, Céline; Payen, Emilie; Perroud, Peggy; Petit, Emmanuelle; Robert, Dominique; Ronsin, Murielle; Vacherie, Benoit; Acinas, Silvia G.; Royo-Llonch, Marta; Cornejo-Castillo, Francisco M.; Logares, Ramiro; Fernández-Gómez, Beatriz; Bowler, Chris; Cochrane, Guy; Amid, Clara; Hoopen, Petra Ten; De Vargas, Colomban; Grimsley, Nigel; Desgranges, Elodie; Kandels-Lewis, Stefanie; Ogata, Hiroyuki; Poulton, Nicole; Sieracki, Michael E.; Stepanauskas, Ramunas; Sullivan, Matthew B.; Brum, Jennifer R.; Duhaime, Melissa B.; Poulos, Bonnie T.; Hurwitz, Bonnie L.; Acinas, Silvia G.; Bork, Peer; Boss, Emmanuel; Bowler, Chris; De Vargas, Colomban; Follows, Michael; Gorsky, Gabriel; Grimsley, Nigel; Hingamp, Pascal; Iudicone, Daniele; Jaillon, Olivier; Kandels-Lewis, Stefanie; Karp-Boss, Lee; Karsenti, Eric; Not, Fabrice; Ogata, Hiroyuki; Pesant, Stéphane; Raes, Jeroen; Sardet, Christian; Sieracki, Michael E.; Speich, Sabrina; Stemmann, Lars; Sullivan, Matthew B.; Sunagawa, Shinichi; Wincker, Patrick; Pesant, Stéphane; Karsenti, Eric; Wincker, Patrick

2017-01-01

A unique collection of oceanic samples was gathered by the Tara Oceans expeditions (2009–2013), targeting plankton organisms ranging from viruses to metazoans, and providing rich environmental context measurements. Thanks to recent advances in the field of genomics, extensive sequencing has been performed for a deep genomic analysis of this huge collection of samples. A strategy based on different approaches, such as metabarcoding, metagenomics, single-cell genomics and metatranscriptomics, has been chosen for analysis of size-fractionated plankton communities. Here, we provide detailed procedures applied for genomic data generation, from nucleic acids extraction to sequence production, and we describe registries of genomics datasets available at the European Nucleotide Archive (ENA, www.ebi.ac.uk/ena). The association of these metadata to the experimental procedures applied for their generation will help the scientific community to access these data and facilitate their analysis. This paper complements other efforts to provide a full description of experiments and open science resources generated from the Tara Oceans project, further extending their value for the study of the world’s planktonic ecosystems. PMID:28763055
Viral to metazoan marine plankton nucleotide sequences from the Tara Oceans expedition.

PubMed

Alberti, Adriana; Poulain, Julie; Engelen, Stefan; Labadie, Karine; Romac, Sarah; Ferrera, Isabel; Albini, Guillaume; Aury, Jean-Marc; Belser, Caroline; Bertrand, Alexis; Cruaud, Corinne; Da Silva, Corinne; Dossat, Carole; Gavory, Frédérick; Gas, Shahinaz; Guy, Julie; Haquelle, Maud; Jacoby, E'krame; Jaillon, Olivier; Lemainque, Arnaud; Pelletier, Eric; Samson, Gaëlle; Wessner, Mark; Acinas, Silvia G; Royo-Llonch, Marta; Cornejo-Castillo, Francisco M; Logares, Ramiro; Fernández-Gómez, Beatriz; Bowler, Chris; Cochrane, Guy; Amid, Clara; Hoopen, Petra Ten; De Vargas, Colomban; Grimsley, Nigel; Desgranges, Elodie; Kandels-Lewis, Stefanie; Ogata, Hiroyuki; Poulton, Nicole; Sieracki, Michael E; Stepanauskas, Ramunas; Sullivan, Matthew B; Brum, Jennifer R; Duhaime, Melissa B; Poulos, Bonnie T; Hurwitz, Bonnie L; Pesant, Stéphane; Karsenti, Eric; Wincker, Patrick

2017-08-01

A unique collection of oceanic samples was gathered by the Tara Oceans expeditions (2009-2013), targeting plankton organisms ranging from viruses to metazoans, and providing rich environmental context measurements. Thanks to recent advances in the field of genomics, extensive sequencing has been performed for a deep genomic analysis of this huge collection of samples. A strategy based on different approaches, such as metabarcoding, metagenomics, single-cell genomics and metatranscriptomics, has been chosen for analysis of size-fractionated plankton communities. Here, we provide detailed procedures applied for genomic data generation, from nucleic acids extraction to sequence production, and we describe registries of genomics datasets available at the European Nucleotide Archive (ENA, www.ebi.ac.uk/ena). The association of these metadata to the experimental procedures applied for their generation will help the scientific community to access these data and facilitate their analysis. This paper complements other efforts to provide a full description of experiments and open science resources generated from the Tara Oceans project, further extending their value for the study of the world's planktonic ecosystems.
Characterization of infectious Murray Valley encephalitis virus derived from a stably cloned genome-length cDNA.

PubMed

Hurrelbrink, R J; Nestorowicz, A; McMinn, P C

1999-12-01

An infectious cDNA clone of Murray Valley encephalitis virus prototype strain 1-51 (MVE-1-51) was constructed by stably inserting genome-length cDNA into the low-copy-number plasmid vector pMC18. Designated pMVE-1-51, the clone consisted of genome-length cDNA of MVE-1-51 under the control of a T7 RNA polymerase promoter. The clone was constructed by using existing components of a cDNA library, in addition to cDNA of the 3' terminus derived by RT-PCR of poly(A)-tailed viral RNA. Upon comparison with other flavivirus sequences, the previously undetermined sequence of the 3' UTR was found to contain elements conserved throughout the genus Flavivirus. RNA transcribed from pMVE-1-51 and subsequently transfected into BHK-21 cells generated infectious virus. The plaque morphology, replication kinetics and antigenic profile of clone-derived virus (CDV-1-51) were similar to those of the parental virus in vitro. Furthermore, the virulence properties of CDV-1-51 and MVE-1-51 (LD(50) values and mortality profiles) were found to be identical in vivo in the mouse model. Through site-directed mutagenesis, the infectious clone should serve as a valuable tool for investigating the molecular determinants of virulence in MVE virus.
Subtraction of cap-trapped full-length cDNA libraries to select rare transcripts.

PubMed

Hirozane-Kishikawa, Tomoko; Shiraki, Toshiyuki; Waki, Kazunori; Nakamura, Mari; Arakawa, Takahiro; Kawai, Jun; Fagiolini, Michela; Hensch, Takao K; Hayashizaki, Yoshihide; Carninci, Piero

2003-09-01

The normalization and subtraction of highly expressed cDNAs from relatively large tissues before cloning dramatically enhanced the gene discovery by sequencing for the mouse full-length cDNA encyclopedia, but these methods have not been suitable for limited RNA materials. To normalize and subtract full-length cDNA libraries derived from limited quantities of total RNA, here we report a method to subtract plasmid libraries excised from size-unbiased amplified lambda phage cDNA libraries that avoids heavily biasing steps such as PCR and plasmid library amplification. The proportion of full-length cDNAs and the gene discovery rate are high, and library diversity can be validated by in silico randomization.

Development and Application of a Salmonid EST Database and cDNA Microarray: Data Mining and Interspecific Hybridization Characteristics

PubMed Central

Rise, Matthew L.; von Schalburg, Kristian R.; Brown, Gordon D.; Mawer, Melanie A.; Devlin, Robert H.; Kuipers, Nathanael; Busby, Maura; Beetz-Sargent, Marianne; Alberto, Roberto; Gibbs, A. Ross; Hunt, Peter; Shukin, Robert; Zeznik, Jeffrey A.; Nelson, Colleen; Jones, Simon R.M.; Smailus, Duane E.; Jones, Steven J.M.; Schein, Jacqueline E.; Marra, Marco A.; Butterfield, Yaron S.N.; Stott, Jeff M.; Ng, Siemon H.S.; Davidson, William S.; Koop, Ben F.

2004-01-01

We report 80,388 ESTs from 23 Atlantic salmon (Salmo salar) cDNA libraries (61,819 ESTs), 6 rainbow trout (Oncorhynchus mykiss) cDNA libraries (14,544 ESTs), 2 chinook salmon (Oncorhynchus tshawytscha) cDNA libraries (1317 ESTs), 2 sockeye salmon (Oncorhynchus nerka) cDNA libraries (1243 ESTs), and 2 lake whitefish (Coregonus clupeaformis) cDNA libraries (1465 ESTs). The majority of these are 3′ sequences, allowing discrimination between paralogs arising from a recent genome duplication in the salmonid lineage. Sequence assembly reveals 28,710 different S. salar, 8981 O. mykiss, 1085 O. tshawytscha, 520 O. nerka, and 1176 C. clupeaformis putative transcripts. We annotate the submitted portion of our EST database by molecular function. Higher- and lower-molecular-weight fractions of libraries are shown to contain distinct gene sets, and higher rates of gene discovery are associated with higher-molecular weight libraries. Pyloric caecum library group annotations indicate this organ may function in redox control and as a barrier against systemic uptake of xenobiotics. A microarray is described, containing 7356 salmonid elements representing 3557 different cDNAs. Analyses of cross-species hybridizations to this cDNA microarray indicate that this resource may be used for studies involving all salmonids. PMID:14962987
Single nucleotide polymorphism analysis of Korean native chickens using next generation sequencing data.

PubMed

Seo, Dong-Won; Oh, Jae-Don; Jin, Shil; Song, Ki-Duk; Park, Hee-Bok; Heo, Kang-Nyeong; Shin, Younhee; Jung, Myunghee; Park, Junhyung; Jo, Cheorun; Lee, Hak-Kyo; Lee, Jun-Heon

2015-02-01

There are five native chicken lines in Korea, which are mainly classified by plumage colors (black, white, red, yellow, gray). These five lines are very important genetic resources in the Korean poultry industry. Based on a next generation sequencing technology, whole genome sequence and reference assemblies were performed using Gallus_gallus_4.0 (NCBI) with whole genome sequences from these lines to identify common and novel single nucleotide polymorphisms (SNPs). We obtained 36,660,731,136 ± 1,257,159,120 bp of raw sequence and average 26.6-fold of 25-29 billion reference assembly sequences representing 97.288 % coverage. Also, 4,006,068 ± 97,534 SNPs were observed from 29 autosomes and the Z chromosome and, of these, 752,309 SNPs are the common SNPs across lines. Among the identified SNPs, the number of novel- and known-location assigned SNPs was 1,047,951 ± 14,956 and 2,948,648 ± 81,414, respectively. The number of unassigned known SNPs was 1,181 ± 150 and unassigned novel SNPs was 8,238 ± 1,019. Synonymous SNPs, non-synonymous SNPs, and SNPs having character changes were 26,266 ± 1,456, 11,467 ± 604, 8,180 ± 458, respectively. Overall, 443,048 ± 26,389 SNPs in each bird were identified by comparing with dbSNP in NCBI. The presently obtained genome sequence and SNP information in Korean native chickens have wide applications for further genome studies such as genetic diversity studies to detect causative mutations for economic and disease related traits.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Leong, JoAnn Ching

The nucleotide sequence of the IHNV glycoprotein gene has been determined from a cDNA clone containing the entire coding region. The glycoprotein cDNA clone contained a leader sequence of 48 bases, a coding region of 1524 nucleotides, and 39 bases at the 3' end. The entire cDNA clone contains 1609 nucleotides and encodes a protein of 508 amino acids. The deduced amino acid sequence gave a translated molecular weight of 56,795 daltons. A hydropathicity profile of the deduced amino acid sequence indicated that there were two major hydrophobic domains: one, at the N-terminus, delineating a signal peptide of 18 amino acids and the other, at the C-terminus, delineating the transmembrane region. Five possible sites of N-linked glycosylation were identified. Although no nucleic acid homology existed between the IHNV glycoprotein gene and the glycoprotein genes of rabies and VSV, there was significant homology at the amino acid level between all three rhabdovirus glycoproteins.
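A hydropathicity profile of the kind mentioned in this record is commonly computed as a sliding-window average of per-residue hydropathy values. The record does not say which scale or window was used, so the sketch below assumes the Kyte-Doolittle scale and a 9-residue window purely for illustration; the function name and the toy peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the record names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(protein, window=9):
    """Return the mean hydropathy of every window-length stretch of protein."""
    protein = protein.upper()
    if not 1 <= window <= len(protein):
        raise ValueError("window must be between 1 and the protein length")
    scores = [KYTE_DOOLITTLE[residue] for residue in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]

# Toy peptide: a hydrophobic stretch (III) followed by a charged one (KKK).
profile = hydropathy_profile("IIIKKK", window=3)
```

Sustained positive stretches in such a profile are the usual indication of candidate signal-peptide or transmembrane segments, which is how the two hydrophobic domains above would show up.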
Empirical Bayes Estimation of Coalescence Times from Nucleotide Sequence Data.

PubMed

King, Leandra; Wakeley, John

2016-09-01

We demonstrate the advantages of using information at many unlinked loci to better calibrate estimates of the time to the most recent common ancestor (TMRCA) at a given locus. To this end, we apply a simple empirical Bayes method to estimate the TMRCA. This method is both asymptotically optimal, in the sense that the estimator converges to the true value when the number of unlinked loci for which we have information is large, and has the advantage of not making any assumptions about demographic history. The algorithm works as follows: we first split the sample at each locus into inferred left and right clades to obtain many estimates of the TMRCA, which we can average to obtain an initial estimate of the TMRCA. We then use nucleotide sequence data from other unlinked loci to form an empirical distribution that we can use to improve this initial estimate. Copyright © 2016 by the Genetics Society of America.
Characterization of the venom from the Australian scorpion Urodacus yaschenkoi: Molecular mass analysis of components, cDNA sequences and peptides with antimicrobial activity.

PubMed

Luna-Ramírez, Karen; Quintero-Hernández, Veronica; Vargas-Jaimes, Leonel; Batista, Cesar V F; Winkel, Kenneth D; Possani, Lourival D

2013-03-01

The Urodacidae scorpions are the most widely distributed of the four families in Australia and represent half of the species in the continent, yet their venoms remain largely unstudied. This communication reports the first results of a proteome analysis of the venom of the scorpion Urodacus yaschenkoi performed by mass fingerprinting, after high performance liquid chromatography (HPLC) separation. A total of 74 fractions were obtained by HPLC separation allowing the identification of approximately 274 different molecular masses with molecular weights varying from 287 to 43,437 Da. The most abundant peptides were those from 1 kDa and 4-5 kDa representing antimicrobial peptides and putative potassium channel toxins, respectively. Three such peptides were chemically synthesized and tested against Gram-positive and Gram-negative bacteria showing minimum inhibitory concentration in the low micromolar range, but with moderate hemolytic activity. It also reports a transcriptome analysis of the venom glands of the same scorpion species, undertaken by constructing a cDNA library and conducting random sequencing screening of the transcripts. From the resultant cDNA library 172 expressed sequence tags (ESTs) were analyzed. These transcripts were further clustered into 120 unique sequences (23 contigs and 97 singlets). The identified putative proteins can be sorted into several groups, such as those implicated in common cellular processes, putative neurotoxins and antimicrobial peptides. The scorpion U. yaschenkoi is not known to be dangerous to humans and its venom contains peptides similar to those of Opisthacanthus cayaporum (antibacterial), Scorpio maurus palmatus (maurocalcin), Opistophthalmus carinatus (opistoporines) and Hadrurus gertschi (scorpine-like molecules), amongst others. Copyright © 2012 Elsevier Ltd. All rights reserved.
Evaluation and Adaptation of a Laboratory-Based cDNA Library Preparation Protocol for Retrospective Sequencing of Archived MicroRNAs from up to 35-Year-Old Clinical FFPE Specimens

PubMed Central

Loudig, Olivier; Wang, Tao; Ye, Kenny; Lin, Juan; Wang, Yihong; Ramnauth, Andrew; Liu, Christina; Stark, Azadeh; Chitale, Dhananjay; Greenlee, Robert; Multerer, Deborah; Honda, Stacey; Daida, Yihe; Spencer Feigelson, Heather; Glass, Andrew; Couch, Fergus J.; Rohan, Thomas; Ben-Dov, Iddo Z.

2017-01-01

Formalin-fixed paraffin-embedded (FFPE) specimens, when used in conjunction with patient clinical data history, represent an invaluable resource for molecular studies of cancer. Even though nucleic acids extracted from archived FFPE tissues are degraded, their molecular analysis has become possible. In this study, we optimized a laboratory-based next-generation sequencing barcoded cDNA library preparation protocol for analysis of small RNAs recovered from archived FFPE tissues. Using matched fresh and FFPE specimens, we evaluated the robustness and reproducibility of our optimized approach, as well as its applicability to archived clinical specimens stored for up to 35 years. We then evaluated this cDNA library preparation protocol by performing a miRNA expression analysis of archived breast ductal carcinoma in situ (DCIS) specimens, selected for their relation to the risk of subsequent breast cancer development and obtained from six different institutions. Our analyses identified six miRNAs (miR-29a, miR-221, miR-375, miR-184, miR-363, miR-455-5p) differentially expressed between DCIS lesions from women who subsequently developed an invasive breast cancer (cases) and women who did not develop invasive breast cancer within the same time interval (control). Our thorough evaluation and application of this laboratory-based miRNA sequencing analysis indicates that the preparation of small RNA cDNA libraries can reliably be performed on older, archived, clinically-classified specimens. PMID:28335433
Nucleotide sequence and transcriptional start site of the Methylobacterium organophilum XX methanol dehydrogenase structural gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Machlin, S.M.; Hanson, R.S.

The nucleotide sequence of a cloned 2.5-kilobase-pair SmaI fragment containing the methanol dehydrogenase (MDH) structural gene from Methylobacterium organophilum XX was determined. A single open reading frame with a coding capacity of 626 amino acids (molecular weight, 66,000) was identified on one strand, and N-terminal sequencing of purified MDH revealed that 27 of these residues constituted a putative signal peptide. Primer extension mapping of in vivo transcripts indicated that the start of mRNA synthesis was 160 to 170 base pairs upstream of the ATG codon. Northern (RNA) blot analysis further demonstrated that the transcript was 2.1 kilobase pairs in length and therefore appeared to encode only MDH.
A Bayesian hierarchical model to detect differentially methylated loci from single nucleotide resolution sequencing data

PubMed Central

Feng, Hao; Conneely, Karen N.; Wu, Hao

2014-01-01

DNA methylation is an important epigenetic modification that has essential roles in cellular processes including gene regulation, development and disease and is widely dysregulated in most types of cancer. Recent advances in sequencing technology have enabled the measurement of DNA methylation at single nucleotide resolution through methods such as whole-genome bisulfite sequencing and reduced representation bisulfite sequencing. In DNA methylation studies, a key task is to identify differences under distinct biological contexts, for example, between tumor and normal tissue. A challenge in sequencing studies is that the number of biological replicates is often limited by the costs of sequencing. The small number of replicates leads to unstable variance estimation, which can reduce accuracy to detect differentially methylated loci (DML). Here we propose a novel statistical method to detect DML when comparing two treatment groups. The sequencing counts are described by a lognormal-beta-binomial hierarchical model, which provides a basis for information sharing across different CpG sites. A Wald test is developed for hypothesis testing at each CpG site. Simulation results show that the proposed method yields improved DML detection compared to existing methods, particularly when the number of replicates is low. The proposed method is implemented in the Bioconductor package DSS. PMID:24561809
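The per-site Wald test mentioned in this record can be illustrated with a much simpler calculation than the one in DSS. The sketch below is not the DSS implementation: it assumes the beta-binomial dispersion `phi` is supplied by the caller (DSS estimates it per site with shrinkage under a lognormal prior), pools replicates into one methylation proportion per group, and uses a normal approximation for the p-value. All names and the toy counts are hypothetical.

```python
import math

def pooled_mean_and_variance(replicates, phi):
    """Pooled methylation proportion and its beta-binomial variance.

    replicates: list of (methylated_reads, total_reads), one per replicate.
    phi: dispersion, assumed known here (an assumption of this sketch).
    """
    total_reads = sum(n for _, n in replicates)
    if total_reads == 0:
        raise ValueError("no reads at this site")
    mu = sum(x for x, _ in replicates) / total_reads
    # Var(X_j) = n_j * mu * (1 - mu) * (1 + (n_j - 1) * phi) for each replicate.
    variance = sum(
        n * mu * (1 - mu) * (1 + (n - 1) * phi) for _, n in replicates
    ) / total_reads ** 2
    return mu, variance

def wald_test(group1, group2, phi=0.05):
    """Return (z, two-sided p-value) for a difference in mean methylation."""
    mu1, var1 = pooled_mean_and_variance(group1, phi)
    mu2, var2 = pooled_mean_and_variance(group2, phi)
    standard_error = math.sqrt(var1 + var2)
    if standard_error == 0:
        raise ValueError("variance estimate is zero; test undefined")
    z = (mu1 - mu2) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Toy CpG site: two replicates per group, 20 reads each.
tumor = [(18, 20), (19, 20)]
normal = [(2, 20), (3, 20)]
z, p_value = wald_test(tumor, normal)
```

With few replicates the dispersion estimate dominates the result, which is the situation in which the abstract describes sharing information across CpG sites as being most useful.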
Effects of transcriptional start site sequence and position on nucleotide-sensitive selection of alternative start sites at the pyrC promoter in Escherichia coli.

PubMed Central

Liu, J; Turnbough, C L

1994-01-01

In Escherichia coli, expression of the pyrC gene is regulated primarily by a translational control mechanism based on nucleotide-sensitive selection of transcriptional start sites at the pyrC promoter. When intracellular levels of CTP are high, pyrC transcripts are initiated predominantly with CTP at a site 7 bases downstream of the Pribnow box. These transcripts form a stable hairpin at their 5' ends that blocks ribosome binding. When the CTP level is low and the GTP level is high, conditions found in pyrimidine-limited cells, transcripts are initiated primarily with GTP at a site 9 bases downstream of the Pribnow box. These shorter transcripts are unable to form a hairpin at their 5' ends and are readily translated. In this study, we examined the effects of nucleotide sequence and position on the selection of transcriptional start sites at the pyrC promoter. We characterized promoter mutations that systematically alter the sequence at position 7 or 9 downstream of the Pribnow box or vary the spacing between the Pribnow box and wild-type transcriptional initiation region. The results reveal preferences for particular initiating nucleotides (ATP ≥ GTP > UTP >> CTP) and for starting positions downstream of the Pribnow box (7 >> 6 and 8 > 9 > 10). The results indicate that optimal nucleotide-sensitive start site switching at the wild-type pyrC promoter is the result of competition between the preferred start site (position 7) that uses the poorest initiating nucleotide (CTP) and a weak start site (position 9) that uses a good initiating nucleotide (GTP). The sequence of the pyrC promoter also minimizes the synthesis of untranslatable transcripts and provides for maximum stability of the regulatory transcript hairpin. In addition, the results show that the effects of the mutations on pyrC expression and regulation are consistent with the current model for translational control. Possible effects of preferences for initiating nucleotides and start sites on the
Nucleotide sequence of a cluster of early and late genes in a conserved segment of the vaccinia virus genome.

PubMed Central

Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E

1985-01-01

The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopoxviruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames, most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A and T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815
Identification of three wheat globulin genes by screening a Triticum aestivum BAC genomic library with cDNA from a diabetes-associated globulin

PubMed Central

Loit, Evelin; Melnyk, Charles W; MacFarlane, Amanda J; Scott, Fraser W; Altosaar, Illimar

2009-01-01

Background: Exposure to dietary wheat proteins in genetically susceptible individuals has been associated with increased risk for the development of Type 1 diabetes (T1D). Recently, a wheat protein encoded by cDNA WP5212 has been shown to be antigenic in mice, rats and humans with autoimmune T1D. To investigate the genomic origin of the identified wheat protein cDNA, a hexaploid wheat genomic library from Glenlea cultivar was screened. Results: Three unique wheat globulin genes, Glo-3A, Glo-3B and Glo-3C, were identified. We describe the genomic structure of these genes and their expression pattern in wheat seeds. The Glo-3A gene shared 99% identity with the cDNA of WP5212 at the nucleotide and deduced amino acid level, indicating that we have identified the gene(s) encoding wheat protein WP5212. Southern analysis revealed the presence of multiple copies of Glo-3-like sequences in all wheat samples, including hexaploid, tetraploid and diploid species wheat seed. Aleurone and embryo tissue specificity of WP5212 gene expression, suggested by promoter region analysis, which demonstrated an absence of endosperm specific cis elements, was confirmed by immunofluorescence microscopy using anti-WP5212 antibodies. Conclusion: Taken together, the results indicate that a diverse group of globulins exists in wheat, some of which could be associated with the pathogenesis of T1D in some susceptible individuals. These data expand our knowledge of specific wheat globulins and will enable further elucidation of their role in wheat biology and human health. PMID:19615078
Triazole-linked DNA as a primer surrogate in the synthesis of first-strand cDNA.

PubMed

Fujino, Tomoko; Yasumoto, Ken-ichi; Yamazaki, Naomi; Hasome, Ai; Sogawa, Kazuhiro; Isobe, Hiroyuki

2011-11-04

A phosphate-eliminated nonnatural oligonucleotide serves as a primer surrogate in the reverse transcription of mRNA. Despite the nonnatural triazole linkages in the surrogate, the reverse transcriptase effectively elongated cDNA sequences on the 3'-downstream of the primer by transcription of the complementary sequence of mRNA. A structure-activity comparison with the reference natural oligonucleotides shows the superior priming activity of the surrogate containing triazole linkages. The nonnatural linkages also protect the transcribed cDNA from digestion reactions with 5'-exonuclease and enable us to remove noise transcripts of unknown origins. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Comparison of the nucleotide and amino acid sequences of the RsrI and EcoRI restriction endonucleases.

PubMed

Stephenson, F H; Ballard, B T; Boyer, H W; Rosenberg, J M; Greene, P J

1989-12-21

The RsrI endonuclease, a type-II restriction endonuclease (ENase) found in Rhodobacter sphaeroides, is an isoschizomer of the EcoRI ENase. A clone containing an 11-kb BamHI fragment was isolated from an R. sphaeroides genomic DNA library by hybridization with synthetic oligodeoxyribonucleotide probes based on the N-terminal amino acid (aa) sequence of RsrI. Extracts of E. coli containing a subclone of the 11-kb fragment display RsrI activity. Nucleotide sequence analysis reveals an 831-bp open reading frame encoding a polypeptide of 277 aa. A 50% identity exists within a 266-aa overlap between the deduced aa sequences of RsrI and EcoRI. Regions of 75-100% aa sequence identity correspond to key structural and functional regions of EcoRI. The type-II ENases have many common properties, and a common origin might have been expected. Nevertheless, this is the first demonstration of aa sequence similarity between ENases produced by different organisms.
Sequence and RT-PCR expression analysis of two peroxidases from Arabidopsis thaliana belonging to a novel evolutionary branch of plant peroxidases.

PubMed

Kjaersgård, I V; Jespersen, H M; Rasmussen, S K; Welinder, K G

1997-03-01

cDNA clones encoding two new Arabidopsis thaliana peroxidases, ATP 1a and ATP 2a, have been identified by searching the Arabidopsis database of expressed sequence tags (dbEST). They represent a novel branch of hitherto uncharacterized plant peroxidases which is only 35% identical in amino acid sequence to the well characterized group of basic plant peroxidases represented by the horseradish (Armoracia rusticana) isoperoxidases HRP C, HRP E5 and the similar Arabidopsis isoperoxidases ATP Ca, ATP Cb, and ATP Ea. However, ATP 1a is 87% identical in amino acid sequence to a peroxidase encoded by an mRNA isolated from cotton (Gossypium hirsutum). As cotton and Arabidopsis belong to rather diverse families (Malvaceae and Cruciferae, respectively), in contrast with Arabidopsis and horseradish (both Cruciferae), the high degree of sequence identity indicates that this novel type of peroxidase, albeit of unknown function, is likely to be widespread in plant species. The atp 1 and atp 2 types of cDNA sequences were the most redundant among the 28 different isoperoxidases identified among about 200 peroxidase encoding ESTs. Interestingly, 8 out of a total of 38 EST sequences coding for ATP 1 showed three identical nucleotide substitutions. This variant form is designated ATP 1b. Similarly, six out of a total of 16 EST sequences coding for ATP 2 showed a number of deletions and nucleotide changes. This variant form is designated ATP 2b. The selected EST clones are full-length and contain coding regions of 993 nucleotides for atp 1a, and 984 nucleotides for atp 2a. These regions show 61% DNA sequence identity. The predicted mature proteins ATP 1a and ATP 2a are 57% identical in sequence and contain the structurally and functionally important residues characteristic of the plant peroxidase superfamily. However, they do show two differences of importance to peroxidase catalysis: (1) the asparagine residue linked with the active site distal histidine via hydrogen bonding is absent
The TGA codons are present in the open reading frame of selenoprotein P cDNA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hill, K.E.; Lloyd, R.S.; Read, R.

1991-03-11

The TGA codon in DNA has been shown to direct incorporation of selenocysteine into protein. Several proteins from bacteria and animals contain selenocysteine in their primary structures. Each of the cDNA clones of these selenoproteins contains one TGA codon in the open reading frame which corresponds to the selenocysteine in the protein. A cDNA clone for selenoprotein P (SeP), obtained from a γZAP rat liver library, was sequenced by the dideoxy termination method. The correct reading frame was determined by comparison of the deduced amino acid sequence with the amino acid sequence of several peptides from SeP. Using SeP labelled with ⁷⁵Se in vivo, the selenocysteine content of the peptides was verified by the collection of carboxymethylated ⁷⁷Se-selenocysteine as it eluted from the amino acid analyzer and determination of the radioactivity contained in the collected samples. Ten TGA codons are present in the open reading frame of the cDNA. Peptide fragmentation studies and the deduced sequence indicate that selenium-rich regions are located close to the carboxy terminus. Nine of the 10 selenocysteines are located in the terminal 26% of the sequence with four in the terminal 15 amino acids. The deduced sequence codes for a protein of 385 amino acids. Cleavage of the signal peptide gives the mature protein with 366 amino acids and a calculated mol wt of 41,052 Da. Searches of PIR and SWISSPROT protein databases revealed no similarity with glutathione peroxidase or other selenoproteins.
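As a rough illustration of how in-frame TGA codons might be located in an open reading frame such as the one described above, the following Python sketch walks a sequence codon by codon. The function name and the toy sequence are hypothetical and are not taken from the record; the sketch also assumes the input starts at the first codon and that the final codon is the terminator.

```python
def in_frame_tga_positions(orf: str) -> list[int]:
    """Return 1-based codon indices of in-frame TGA codons.

    Assumes `orf` begins at the first codon of the reading frame and that
    its last complete codon is the terminator, which is not counted.
    """
    seq = orf.upper()
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return [idx for idx, codon in enumerate(codons[:-1], start=1)
            if codon == "TGA"]


# Toy sequence (hypothetical): ATG GCT TGA AAA TGA CCC TGA TAA
toy_orf = "ATGGCTTGAAAATGACCCTGATAA"
print(in_frame_tga_positions(toy_orf))
```

Only codons in the chosen frame are examined, so a TGA that spans two codons is ignored; this mirrors the distinction between in-frame and out-of-frame occurrences.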
The complete nucleotide sequence of the domestic dog (Canis familiaris) mitochondrial genome.

PubMed

Kim, K S; Lee, S E; Jeong, H W; Ha, J H

1998-10-01

The complete nucleotide sequence of the mitochondrial genome of the domestic dog, Canis familiaris, was determined. The length of the sequence was 16,728 bp; however, the length was not absolute due to the variation (heteroplasmy) caused by differing numbers of the repetitive motif, 5'-GTACACGT(A/G)C-3', in the control region. The genome organization, gene contents, and codon usage conformed to those of other mammalian mitochondrial genomes. Although its features were unknown, the "CTAGA" duplication event which followed the translational stop codon of the COII gene was not observed in other mammalian mitochondrial genomes. In order to determine the possible differences between mtDNAs in carnivores, two rRNA and 13 protein-coding genes from the cat, dog, and seal were compared. The combined molecular differences, in two rRNA genes as well as in the inferred amino acid sequences of the mitochondrial 13 protein-coding genes, suggested that there is a closer relationship between the dog and the seal than there is between either of these species and the cat. Based on the molecular differences of the mtDNA, the evolutionary divergence between the cat, the dog, and the seal was dated to approximately 50 +/- 4 million years ago. The degree of difference between carnivore mtDNAs varied according to the individual protein-coding gene applied, showing that the evolutionary relationships of distantly related species should be presented in an extended study based on ample sequence data like complete mtDNA molecules. Copyright 1998 Academic Press.
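The length variation described above comes from differing copy numbers of the motif 5'-GTACACGT(A/G)C-3'. A minimal sketch of how tandem copies of that motif could be counted with a regular expression follows; the function name and the toy sequence are hypothetical, and real control-region data would need care with strand and alignment.

```python
import re

# The 10-nt repeat unit from the abstract; (A/G) becomes the class [AG].
MOTIF_LENGTH = 10
TANDEM_ARRAY = re.compile(r"(?:GTACACGT[AG]C)+")


def longest_tandem_run(seq: str) -> int:
    """Number of motif copies in the longest uninterrupted tandem array."""
    runs = [len(m.group(0)) // MOTIF_LENGTH
            for m in TANDEM_ARRAY.finditer(seq.upper())]
    return max(runs, default=0)


# Toy sequence (hypothetical): two adjacent copies, a spacer, one more copy.
toy = "TT" + "GTACACGTAC" + "GTACACGTGC" + "AA" + "GTACACGTAC"
print(longest_tandem_run(toy))
```

Comparing this count between sequenced clones would be one simple way to express the heteroplasmy as a copy-number difference.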
A cDNA from a mouse pancreatic beta cell encoding a putative transcription factor of the insulin gene.

PubMed Central

Walker, M D; Park, C W; Rosen, A; Aronheim, A

1990-01-01

Cell specific expression of the insulin gene is achieved through transcriptional mechanisms operating on multiple DNA sequence elements located in the 5' flanking region of the gene. Of particular importance in the rat insulin I gene are two closely similar 9 bp sequences (IEB1 and IEB2): mutation of either of these leads to 5-10 fold reduction in transcriptional activity. We have screened an expression cDNA library derived from mouse pancreatic endocrine beta cells with a radioactive DNA probe containing multiple copies of the IEB1 sequence. A cDNA clone (A1) isolated by this procedure encodes a protein which shows efficient binding to the IEB1 probe, but much weaker binding to either an unrelated DNA probe or to a probe bearing a single base pair insertion within the recognition sequence. DNA sequence analysis indicates a protein belonging to the helix-loop-helix family of DNA-binding proteins. The ability of the protein encoded by clone A1 to recognize a number of wild type and mutant DNA sequences correlates closely with the ability of each sequence element to support transcription in vivo in the context of the insulin 5' flanking DNA. We conclude that the isolated cDNA may encode a transcription factor that participates in control of insulin gene expression. PMID:2181401
Deep sequencing is an appropriate tool for the selection of unique Hepatitis C virus (HCV) variants after single genomic amplification

PubMed Central

Guinoiseau, Thibault; Moreau, Alain; Hohnadel, Guillaume; Ngo-Giang-Huong, Nicole; Brulard, Celine; Vourc’h, Patrick; Goudeau, Alain; Gaudy-Graffin, Catherine

2017-01-01

Hepatitis C virus (HCV) evolves rapidly in a single host and circulates as a quasispecies, which is a complex mixture of genetically distinct but closely related viruses known as variants. To identify intra-individual diversity and investigate the functional properties of these variants in vitro, it is necessary to define the quasispecies composition and isolate the HCV variants. This is possible using single genome amplification (SGA). This technique, based on serially diluted cDNA to amplify a single cDNA molecule (clonal amplicon), has already been used to determine individual HCV diversity. In these studies, positive PCR reactions from SGA were directly sequenced using Sanger technology. Non-clonal amplicons must be detected and excluded to facilitate further functional analysis. Here, we compared Next Generation Sequencing (NGS) with de novo assembly and Sanger sequencing for their ability to distinguish clonal and non-clonal amplicons after SGA on one plasma specimen. All amplicons (n = 42) classified as clonal by NGS were also classified as clonal by Sanger sequencing. No double peaks were seen on electropherograms for non-clonal amplicons with position-specific nucleotide variation below 15% by NGS. Altogether, NGS circumvented many of the difficulties encountered when using Sanger sequencing after SGA and is an appropriate tool to reliably select clonal amplicons for further functional studies. PMID:28362878
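To make the notion of position-specific nucleotide variation concrete, here is a small Python sketch that computes, for each alignment column of NGS base calls, the fraction of reads that differ from the majority base. The function names, the toy columns and the use of a cutoff parameter are assumptions for illustration; the abstract reports only that variation below 15% produced no Sanger double peaks and does not state the classification rule the authors applied.

```python
from collections import Counter


def position_variation(columns: list[str]) -> list[float]:
    """Fraction of base calls per column that differ from the majority base."""
    fractions = []
    for column in columns:
        majority_count = Counter(column).most_common(1)[0][1]
        fractions.append((len(column) - majority_count) / len(column))
    return fractions


def positions_at_or_above(columns: list[str], threshold: float) -> list[int]:
    """1-based positions whose variation reaches a caller-chosen threshold."""
    return [i for i, frac in enumerate(position_variation(columns), start=1)
            if frac >= threshold]


# Toy pileup (hypothetical): ten reads at each of three positions.
toy_columns = ["AAAAAAAAAA", "AAAAAAAAAG", "CCCCCCCTTT"]
print(position_variation(toy_columns))
print(positions_at_or_above(toy_columns, threshold=0.15))
```

The threshold is left as an explicit argument so that any cutoff, including the 15% figure mentioned above, stays a deliberate choice of the caller rather than a built-in rule.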
Molecular cloning and sequence analysis of stearoyl-CoA desaturase in milkfish, Chanos chanos.

PubMed

Hsieh, S L; Liao, W L; Kuo, C M

2001-12-01

Stearoyl-CoA desaturase (EC 1.14.99.5) is a key enzyme in the biosynthesis of polyunsaturated fatty acids and the maintenance of the homeoviscous fluidity of biological membranes. The stearoyl-CoA desaturase cDNA in milkfish (Chanos chanos) was cloned by RT-PCR and RACE, and it was compared with the stearoyl-CoA desaturase in cold-tolerant teleosts, common carp and grass carp. Nucleotide sequence analysis revealed that the cDNA clone has a 972-bp open reading frame encoding 323 amino acid residues. Alignments of the deduced amino acid sequence showed that the milkfish stearoyl-CoA desaturase shares 79% and 75% identity with common carp and grass carp, and 63%-64% with other vertebrates such as sheep, hamsters, rats, mice, and humans. Like common carp and grass carp, the deduced amino acid sequence in milkfish conserves three histidine cluster motifs (one HXXXXH and two HXXHH) that are essential for catalysis of stearoyl-CoA desaturase activity. However, RT-PCR analysis showed that stearoyl-CoA desaturase expression in milkfish is detected in the tissues of liver, muscle, kidney, brain, and gill, and more expression sites were found in milkfish than in common carp and grass carp. Phylogenetic relationships among the deduced stearoyl-CoA desaturase amino acid sequence in milkfish and those in other vertebrates showed that the milkfish stearoyl-CoA desaturase amino acid sequence is phylogenetically closer to those of common carp and grass carp than to other higher vertebrates.
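Two details in this record lend themselves to a quick check: the ORF arithmetic (972 bp is 324 codons, which gives 323 residues if the stop codon is counted within the ORF, an assumption made here) and the histidine cluster motifs HXXXXH and HXXHH. The Python sketch below is illustrative only; the function names and the toy peptide are hypothetical and do not come from the milkfish sequence.

```python
import re


def orf_residues(orf_bp: int) -> int:
    """Residues encoded by an ORF, assuming its length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1


# Lookaheads let overlapping motif occurrences be reported as well.
HIS_MOTIFS = {
    "HXXXXH": re.compile(r"(?=H.{4}H)"),
    "HXXHH": re.compile(r"(?=H.{2}HH)"),
}


def find_his_motifs(protein: str) -> dict[str, list[int]]:
    """1-based start positions of each histidine cluster motif."""
    seq = protein.upper()
    return {name: [m.start() + 1 for m in pattern.finditer(seq)]
            for name, pattern in HIS_MOTIFS.items()}


# Toy peptide (hypothetical) with one HXXXXH and two HXXHH motifs.
toy_protein = "M" + "HAAAAH" + "GGG" + "HAAHH" + "GGGGG" + "HAAHH" + "K"
print(orf_residues(972))
print(find_his_motifs(toy_protein))
```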
Molecular cloning of a cDNA encoding the glycoprotein of hen oviduct microsomal signal peptidase.

PubMed Central

Newsome, A L; McLean, J W; Lively, M O

1992-01-01

Detergent-solubilized hen oviduct signal peptidase has been characterized previously as an apparent complex of a 19 kDa protein and a 23 kDa glycoprotein (GP23) [Baker & Lively (1987) Biochemistry 26, 8561-8567]. A cDNA clone encoding GP23 from a chicken oviduct lambda gt11 cDNA library has now been characterized. The cDNA encodes a protein of 180 amino acid residues with a single site for asparagine-linked glycosylation that has been directly identified by amino acid sequence analysis of a tryptic-digest peptide containing the glycosylated site. Immunoblot analysis reveals cross-reactivity with a dog pancreas protein. Comparison of the deduced amino acid sequence of GP23 with the 22/23 kDa glycoprotein of dog microsomal signal peptidase [Shelness, Kanwar & Blobel (1988) J. Biol. Chem. 263, 17063-17070], one of five proteins associated with this enzyme, reveals that the amino acid sequences are 90% identical. Thus the signal peptidase glycoprotein is as highly conserved as the sequences of cytochromes c and b from these same species and is likely to be found in a similar form in many, if not all, vertebrate species. The data also show conclusively that the dog and avian signal peptidases have at least one protein subunit in common. PMID:1546959

cDNA cloning of carrot extracellular beta-fructosidase and its expression in response to wounding and bacterial infection.

PubMed

Sturm, A; Chrispeels, M J

1990-11-01

We isolated a full-length cDNA for apoplastic (extracellular or cell wall-bound) beta-fructosidase (invertase), determined its nucleotide sequence, and used it as a probe to measure changes in mRNA as a result of wounding of carrot storage roots and infection of carrot plants with the bacterial pathogen Erwinia carotovora. The derived amino acid sequence of extracellular beta-fructosidase shows that it is a basic protein (pI 9.9) with a signal sequence for entry into the endoplasmic reticulum and a propeptide at the N terminus that is not present in the mature protein. Amino acid sequence comparison with yeast and bacterial invertases shows that the overall homology is only about 28%, but that there are short conserved motifs, one of which is at the active site. Maturing carrot storage roots contain barely detectable levels of mRNA for extracellular beta-fructosidase and these levels rise slowly but dramatically after wounding with maximal expression after 12 hours. Infection of roots and leaves of carrot plants with E. carotovora results in a very fast increase in the mRNA levels with maximal expression after 1 hour. These results indicate that apoplastic beta-fructosidase is probably a new and hitherto unrecognized pathogenesis-related protein [Van Loon, L.C. (1985). Plant Mol. Biol. 4, 111-116]. Suspension-cultured carrot cells contain high levels of mRNA for extracellular beta-fructosidase and these levels remain the same whether the cells are grown on sucrose, glucose, or fructose.
Correlations of nucleotide substitution rates and base composition of mammalian coding sequences with protein structure.

PubMed

Chiusano, M L; D'Onofrio, G; Alvarez-Valin, F; Jabbari, K; Colonna, G; Bernardi, G

1999-09-30

We investigated the relationships between the nucleotide substitution rates and the predicted secondary structures in the three-state representation (alpha-helix, beta-sheet, and coil). The analysis was carried out on 34 alignments, each of which comprised sequences belonging to at least four different mammalian orders. The rates of synonymous substitution were found to be significantly different in regions predicted to be alpha-helix, beta-sheet, or coil. Likewise, the nonsynonymous rates also differ, although, as expected, to a lesser extent, in the three types of secondary structure, suggesting that different selective constraints associated with the different structures affect the synonymous and nonsynonymous rates in a similar way. Moreover, the base composition of the third codon positions is different in coding sequence regions corresponding to different secondary structures of proteins.
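The base composition of third codon positions mentioned above can be computed directly from an in-frame coding sequence. The Python sketch below is a minimal illustration, not the authors' pipeline; the function names and the toy sequence are hypothetical, and the input is assumed to start at the first position of a codon.

```python
from collections import Counter


def third_position_composition(cds: str) -> dict[str, float]:
    """Relative frequency of A, C, G and T at third codon positions."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3      # drop a trailing partial codon
    thirds = seq[2:usable:3]              # every third base, starting at index 2
    if not thirds:
        return {base: 0.0 for base in "ACGT"}
    counts = Counter(thirds)
    return {base: counts.get(base, 0) / len(thirds) for base in "ACGT"}


def gc3(cds: str) -> float:
    """Combined G+C fraction at third codon positions."""
    composition = third_position_composition(cds)
    return composition["G"] + composition["C"]


# Toy coding sequence (hypothetical): ATG GCC AAA GGT
toy_cds = "ATGGCCAAAGGT"
print(third_position_composition(toy_cds))
print(gc3(toy_cds))
```

Running the same function on the sub-sequences that correspond to predicted helix, sheet and coil regions would give the per-structure comparison that the abstract describes.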
Identification and functional analysis of a new glyphosate resistance gene from a fungus cDNA library.

PubMed

Tao, Bo; Shao, Bai-Hui; Qiao, Yu-Xin; Wang, Xiao-Qin; Chang, Shu-Jun; Qiu, Li-Juan

2017-08-01

Glyphosate is a widely used broad spectrum herbicide; however, this limits its use once crops are planted. If glyphosate-resistant crops are grown, glyphosate can be used for weed control in crops. While several glyphosate resistance genes are used in commercial glyphosate tolerant crops, there is interest in identifying additional genes for glyphosate tolerance. This research constructed a high-quality cDNA library from the glyphosate-resistant fungus Aspergillus oryzae RIB40 to identify genes that may confer resistance to glyphosate. Using a medium containing glyphosate (120 mM), we screened several clones from the library. Based on a nucleotide sequence analysis, we identified a gene of unknown function (GenBank accession number: XM_001826835.2) that encoded a hypothetical 344-amino acid protein. The gene was named MFS40. Its ORF was amplified to construct an expression vector, pGEX-4T-1-MFS40, to express the protein in Escherichia coli BL21. The gene conferred glyphosate tolerance to E. coli ER2799 cells. Copyright © 2017 Elsevier B.V. All rights reserved.
The complete nucleotide sequence of the barley yellow dwarf GPV isolate from China shows that it is a new member of the genus Polerovirus.

PubMed

Zhang, Wenwei; Cheng, Zhuomin; Xu, Lei; Wu, Maosen; Waterhouse, Peter; Zhou, Guanghe; Li, Shifang

2009-01-01

The complete nucleotide sequence of the ssRNA genome of a Chinese GPV isolate of barley yellow dwarf virus (BYDV) was determined. It comprised 5673 nucleotides, and the deduced genome organization resembled that of members of the genus Polerovirus. It was most closely related to cereal yellow dwarf virus-RPV (77% nt identity over the entire genome; coat protein amino acid identity 79%). The GPV isolate also differs in vector specificity from other BYDV strains. Biological properties, phylogenetic analyses and detailed sequence comparisons suggest that GPV should be considered a member of a new species within the genus, and the name Wheat yellow dwarf virus-GPV is proposed.
A Comparative Genomics Strategy for Targeted Discovery of Single-Nucleotide Polymorphisms and Conserved-Noncoding Sequences in Orphan Crops

PubMed Central

Feltus, F.A.; Singh, H.P.; Lohithaswa, H.C.; Schulze, S.R.; Silva, T.D.; Paterson, A.H.

2006-01-01

Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031
Erwinia carotovora subsp. carotovora extracellular protease: characterization and nucleotide sequence of the gene.

PubMed Central

Kyöstiö, S R; Cramer, C L; Lacy, G H

1991-01-01

The prt1 gene encoding extracellular protease from Erwinia carotovora subsp. carotovora EC14 in cosmid pCA7 was subcloned to create plasmid pSK1. The partial nucleotide sequence of the insert in pSK1 (1,878 bp) revealed a 1,041-bp open reading frame (ORF1) that correlated with protease activity in deletion mutants. ORF1 encodes a polypeptide of 347 amino acids with a calculated molecular mass of 38,826 Da. Escherichia coli transformed with pSK1 or pSK23, a subclone of pSK1, produces a protease (Prt1) intracellularly with a molecular mass of 38 kDa and a pI of 4.8. Prt1 activity was inhibited by phenanthroline, suggesting that it is a metalloprotease. The prt1 promoter was localized between 173 and 1,173 bp upstream of ORF1 by constructing transcriptional lacZ fusions. Primer extension identified the prt1 transcription start site 205 bp upstream of ORF1. The deduced amino acid sequence of ORF1 showed significant sequence identity to metalloproteases from Bacillus thermoproteolyticus (thermolysin), B. subtilis (neutral protease), Legionella pneumophila (metalloprotease), and Pseudomonas aeruginosa (elastase). It has less sequence similarity to metalloproteases from Serratia marcescens and Erwinia chrysanthemi. Locations for three zinc ligands and the active site for E. carotovora subsp. carotovora protease were predicted from thermolysin. PMID:1917878
Isolation and characterization of adrenoleukodystrophy protein (ALDP) related sequences in the human genome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Geraghty, M.T.; Stetten, G.; Kearns, W.

1994-09-01

X-linked adrenoleukodystrophy (ALD) is a disorder of peroxisomal β-oxidation of very long chain fatty acids. It presents either as progressive dementia in childhood or as progressive paraparesis in later years. Adrenal insufficiency occurs in both phenotypes. The gene of the ALD protein has been mapped to Xq28 and has recently been cloned and characterized. The ALD protein has significant homology to the peroxisomal membrane protein, PMP70 and belongs to the ATP binding cassette superfamily of transporters. We screened a human genomic library with an ALDP cDNA and isolated 5 different but highly similar clones containing sequences corresponding to the 3' end of the ALDP gene. Comparison of the sequences over the region corresponding to exon 9 through the 3' end of the ALDP gene reveals approximately 96% nucleotide identity in both exonic and intronic regions. Splice sites and open reading frames are maintained. Using both FISH and human-rodent DNA mapping panels, we positively assign these ALDP-related sequences to chromosomes 2, 16 and 22, and provisionally to 1 and 20. Southern blot of primate DNA probed with a partial ALDP cDNA (exon 2-10) shows that expansion of ALDP-related sequences occurred in higher primates (chimp, gorilla and human). Although Northern blots show multiple ALDP-hybridizing transcripts in certain tissues, we have no evidence to date for expression of these ALDP-related sequences. In conclusion, our data show there has been an unusual and recent dispersal to multiple chromosomes of structural gene sequences related to the ALDP gene. The functional significance of these sequences remains to be determined but their existence complicates PCR and mutation analysis of the ALDP gene.
De novo sequencing and characterization of floral transcriptome in two species of buckwheat (Fagopyrum)

PubMed Central

2011-01-01

Background: Transcriptome sequencing data has become an integral component of modern genetics, genomics and evolutionary biology. However, despite advances in the technologies of DNA sequencing, such data are lacking for many groups of living organisms, in particular, many plant taxa. We present here the results of transcriptome sequencing for two closely related plant species. These species, Fagopyrum esculentum and F. tataricum, belong to the order Caryophyllales, a large group of flowering plants with uncertain evolutionary relationships. F. esculentum (common buckwheat) is also an important food crop. Despite these practical and evolutionary considerations, Fagopyrum species have not been the subject of large-scale sequencing projects. Results: Normalized cDNA corresponding to genes expressed in flowers and inflorescences of F. esculentum and F. tataricum was sequenced using the 454 pyrosequencing technology. This resulted in 267 thousand (F. esculentum) and 229 thousand (F. tataricum) reads with an average length of 341-349 nucleotides. De novo assembly of the reads produced about 25 thousand contigs for each species, with 7.5-8.2× coverage. Comparative analysis of the two transcriptomes demonstrated their overall similarity but also revealed genes that are presumably differentially expressed. Among them are retrotransposon genes and genes involved in sugar biosynthesis and metabolism. Thirteen single-copy genes were used for phylogenetic analysis; the resulting trees are largely consistent with those inferred from multigenic plastid datasets. The sister relationship of the Caryophyllales and asterids has now gained high support from nuclear gene sequences. Conclusions: 454 transcriptome sequencing and de novo assembly were performed for two congeneric flowering plant species, F. esculentum and F. tataricum. As a result, a large set of cDNA sequences that represent orthologs of known plant genes as well as potential new genes was generated. PMID:21232141
The Coding of Biological Information: From Nucleotide Sequence to Protein Recognition

NASA Astrophysics Data System (ADS)

Štambuk, Nikola

The paper reviews the classic results of Swanson, Dayhoff, Grantham, Blalock and Root-Bernstein, which link genetic code nucleotide patterns to the protein structure, evolution and molecular recognition. Symbolic representation of the binary addresses defining particular nucleotide and amino acid properties is discussed, with consideration of: structure and metric of the code, direct correspondence between amino acid and nucleotide information, and molecular recognition of the interacting protein motifs coded by the complementary DNA and RNA strands.
Identification of protein-interacting nucleotides in a RNA sequence using composition profile of tri-nucleotides.

PubMed

Panwar, Bharat; Raghava, Gajendra P S

2015-04-01

RNA-protein interactions play diverse roles in cells, so identifying the RNA-protein interface is essential for understanding their function. In the past, several methods have been developed for predicting RNA-interacting residues in proteins, but limited efforts have been made to identify protein-interacting nucleotides in RNAs. In order to discriminate protein-interacting and non-interacting nucleotides, we used various classifiers (NaiveBayes, NaiveBayesMultinomial, BayesNet, ComplementNaiveBayes, MultilayerPerceptron, J48, SMO, RandomForest, SMO and SVM(light)) for prediction model development using various features and achieved the highest performance (83.92% sensitivity, 84.82% specificity, 84.62% accuracy and a Matthews correlation coefficient of 0.62) with SVM(light)-based models. We observed that certain tri-nucleotides such as ACA, ACC, AGA, CAC, CCA, GAG, UGA, and UUU are preferred in protein interaction. All the models have been developed using a non-redundant dataset and are evaluated using a five-fold cross-validation technique. A web server called RNApin has been developed for the scientific community (http://crdd.osdd.net/raghava/rnapin/). Copyright © 2015 Elsevier Inc. All rights reserved.
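For readers unfamiliar with the quantities reported above, the following Python sketch shows one way to build a tri-nucleotide composition profile from an RNA sequence and to compute sensitivity, specificity, accuracy and the Matthews correlation coefficient from confusion-matrix counts. It is a generic illustration, not the RNApin implementation; the function names, the toy sequence and the toy counts are hypothetical.

```python
import math
from collections import Counter


def trinucleotide_profile(rna: str) -> dict[str, float]:
    """Relative frequency of each overlapping tri-nucleotide in the sequence."""
    seq = rna.upper().replace("T", "U")
    windows = [seq[i:i + 3] for i in range(len(seq) - 2)]
    if not windows:
        return {}
    counts = Counter(windows)
    return {tri: n / len(windows) for tri, n in counts.items()}


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Toy inputs (hypothetical), not data from the study.
print(trinucleotide_profile("ACACA"))
print(binary_metrics(tp=8, fn=2, tn=9, fp=1))
```

In a cross-validated setting, the confusion-matrix counts would be accumulated over the held-out folds before the metrics are computed.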
Discovery, genotyping and characterization of structural variation and novel sequence at single nucleotide resolution from de novo genome assemblies on a population scale.

PubMed

Liu, Siyang; Huang, Shujia; Rao, Junhua; Ye, Weijian; Krogh, Anders; Wang, Jun

2015-01-01

Comprehensive recognition of genomic variation in one individual is important for understanding disease and developing personalized medication and treatment. Many tools based on DNA re-sequencing exist for identification of single nucleotide polymorphisms, small insertions and deletions (indels) as well as large deletions. However, these approaches consistently display a substantial bias against the recovery of complex structural variants and novel sequence in individual genomes and do not provide interpretation information such as the annotation of ancestral state and formation mechanism. We present a novel approach implemented in a single software package, AsmVar, to discover, genotype and characterize different forms of structural variation and novel sequence from population-scale de novo genome assemblies up to nucleotide resolution. Application of AsmVar to several human de novo genome assemblies captures a wide spectrum of structural variants and novel sequences present in the human population with high sensitivity and specificity. Our method provides a direct solution for investigating structural variants and novel sequences from de novo genome assemblies, facilitating the construction of population-scale pan-genomes. Our study also highlights the usefulness of the de novo assembly strategy for definition of genome structure.
Evaluation of atpB nucleotide sequences for phylogenetic studies of ferns and other pteridophytes.

PubMed

Wolf, P

1997-10-01

Inferring basal relationships among vascular plants poses a major challenge to plant systematists. The divergence events that describe these relationships occurred long ago and considerable homoplasy has since accrued for both molecular and morphological characters. A potential solution is to examine phylogenetic analyses from multiple data sets. Here I present a new source of phylogenetic data for ferns and other pteridophytes. I sequenced the chloroplast gene atpB from 23 pteridophyte taxa and used maximum parsimony to infer relationships. A 588-bp region of the gene appeared to contain a statistically significant amount of phylogenetic signal and the resulting trees were largely congruent with similar analyses of nucleotide sequences from rbcL. However, a combined analysis of atpB plus rbcL produced a better resolved tree than did either data set alone. In the shortest trees, leptosporangiate ferns formed a monophyletic group. Also, I detected a well-supported clade of Psilotaceae (Psilotum and Tmesipteris) plus Ophioglossaceae (Ophioglossum and Botrychium). The demonstrated utility of atpB suggests that sequences from this gene should play a role in phylogenetic analyses that incorporate data from chloroplast genes, nuclear genes, morphology, and fossil data.
Identification of Delta5-fatty acid desaturase from the cellular slime mold Dictyostelium discoideum.

PubMed

Saito, T; Ochiai, H

1999-10-01

cDNA fragments putatively encoding amino acid sequences characteristic of the fatty acid desaturase were obtained using expressed sequence tag (EST) information of the Dictyostelium cDNA project. Using this sequence, we have determined the cDNA sequence and genomic sequence of a desaturase. The cloned cDNA is 1489 nucleotides long and the deduced amino acid sequence comprises 464 amino acid residues containing an N-terminal cytochrome b5 domain. The whole sequence was 38.6% identical to the initially identified Delta5-desaturase of Mortierella alpina. We have confirmed its function as Delta5-desaturase by overexpression mutation in D. discoideum and also by gain-of-function mutation in the yeast Saccharomyces cerevisiae. Analysis of the lipids from transformed D. discoideum and yeast demonstrated the accumulation of Delta5-desaturated products. This is the first report concerning fatty acid desaturase in cellular slime molds.
Tissue Gene Expression Analysis Using Arrayed Normalized cDNA Libraries

PubMed Central

Eickhoff, Holger; Schuchhardt, Johannes; Ivanov, Igor; Meier-Ewert, Sebastian; O'Brien, John; Malik, Arif; Tandon, Neeraj; Wolski, Eryk-Witold; Rohlfs, Elke; Nyarsik, Lajos; Reinhardt, Richard; Nietfeld, Wilfried; Lehrach, Hans

2000-01-01

We have used oligonucleotide-fingerprinting data on 60,000 cDNA clones from two different mouse embryonic stages to establish a normalized cDNA clone set. The normalized set of 5,376 clones represents different clusters and therefore, in almost all cases, different genes. The inserts of the cDNA clones were amplified by PCR and spotted on glass slides. The resulting arrays were hybridized with mRNA probes prepared from six different adult mouse tissues. Expression profiles were analyzed by hierarchical clustering techniques. We have chosen radioactive detection because it combines robustness with sensitivity and allows the comparison of multiple normalized experiments. Sensitive detection combined with highly effective clustering algorithms allowed the identification of tissue-specific expression profiles and the detection of genes specifically expressed in the tissues investigated. The obtained results are publicly available (http://www.rzpd.de) and can be used by other researchers as a digital expression reference. [The sequence data described in this paper have been submitted to the EMBL data library under accession nos. AL360374–AL36537.] PMID:10958641
Complete nucleotide sequence of the gene for human heparin cofactor II and mapping to chromosomal band 22q11

DOE Office of Scientific and Technical Information (OSTI.GOV)

Herzog, R.; Lutz, S.; Blin, N.

1991-02-05

Heparin cofactor II (HCII) is a 66-kDa plasma glycoprotein that inhibits thrombin rapidly in the presence of dermatan sulfate or heparin. Clones comprising the entire HCII gene were isolated from a human leukocyte genomic library in EMBL-3 λ phage. The sequence of the gene was determined on both strands of DNA (15,849 bp) and included 1,749 bp of 5'-flanking sequence, five exons, four introns, and 476 bp of DNA 3' to the polyadenylation site. Ten complete and one partial Alu repeats were identified in the introns and 5'-flanking region. The HCII gene was regionally mapped on chromosome 22 using rodent-human somatic cell hybrids, carrying only parts of human chromosome 22, and the chronic myelogenous leukemia cell line K562. With the cDNA probe HCII7.2, containing the entire coding region of the gene, the HCII gene was shown to be amplified 10-20-fold in K562 cells by Southern analysis and in situ hybridization. From these data, the authors concluded that the HCII gene is localized on the chromosomal band 22q11 proximal to the breakpoint cluster region (BCR). Analysis by pulsed-field gel electrophoresis indicated that the amplified HCII gene in K562 cells maps at least 2 Mbp proximal to BCR-1. Furthermore, the HCII7.2 cDNA probe detected two frequent restriction fragment length polymorphisms with the restriction enzymes BamHI and Hind III.
Analysis of expressed sequence tags from the four main developmental stages of Trypanosoma congolense

PubMed Central

Helm, Jared R.; Hertz-Fowler, Christiane; Aslett, Martin; Berriman, Matthew; Sanders, Mandy; Quail, Michael A.; Soares, Marcelo B.; Bonaldo, Maria F.; Sakurai, Tatsuya; Inoue, Noboru; Donelson, John E.

2009-01-01

Trypanosoma congolense is one of the most economically important pathogens of livestock in Africa. Culture-derived parasites of each of the three main insect stages of the T. congolense life cycle, i.e., the procyclic, epimastigote and metacyclic stages, and bloodstream stage parasites isolated from infected mice, were used to construct stage-specific cDNA libraries and expressed sequence tags (ESTs or cDNA clones) in each library were sequenced. Thirteen EST clusters encoding different variant surface glycoproteins (VSGs) were detected in the metacyclic library and twenty-six VSG EST clusters were found in the bloodstream library, six of which are shared by the metacyclic library. Rare VSG ESTs are present in the epimastigote library, and none were detected in the procyclic library. ESTs encoding enzymes that catalyze oxidative phosphorylation and amino acid metabolism are about twice as abundant in the procyclic and epimastigote stages as in the metacyclic and bloodstream stages. In contrast, ESTs encoding enzymes involved in glycolysis, the citric acid cycle and nucleotide metabolism are about the same in all four developmental stages. Cysteine proteases, kinases and phosphatases are the most abundant enzyme groups represented by the ESTs. All four libraries contain T. congolense-specific expressed sequences not present in the T. brucei and T. cruzi genomes. Normalized cDNA libraries were constructed from the metacyclic and bloodstream stages, and found to be further enriched for T. congolense-specific ESTs. Given that cultured T. congolense offers an experimental advantage over other African trypanosome species, these ESTs provide a basis for further investigation of the molecular properties of these four developmental stages, especially the epimastigote and metacyclic stages for which it is difficult to obtain large quantities of organisms. The T. congolense EST databases are available at: http://www.sanger.ac.uk/Projects/T_congolense/EST_index.shtml. PMID
Identification of differentially-expressed genes potentially implicated in drought response in pitaya (Hylocereus undatus) by suppression subtractive hybridization and cDNA microarray analysis.

PubMed

Fan, Qing-Jie; Yan, Feng-Xia; Qiao, Guang; Zhang, Bing-Xue; Wen, Xiao-Peng

2014-01-01

Drought is one of the most severe threats to the growth, development and yield of plants. In order to unravel the molecular basis underlying the high tolerance of pitaya (Hylocereus undatus) to drought stress, suppression subtractive hybridization (SSH) and cDNA microarray approaches were first combined to identify potentially important or novel genes involved in the plant's responses to drought stress. The forward (drought over drought-free) and reverse (drought-free over drought) suppression subtractive cDNA libraries were constructed using in vitro shoots of cultivar 'Zihonglong' exposed to drought stress and drought-free (control) conditions. A total of 2112 clones, half from the forward and half from the reverse SSH library, were randomly picked to construct a pitaya cDNA microarray. Microarray analysis was carried out to verify the expression fluctuations of this set of clones upon drought treatment compared with the controls. A total of 309 expressed sequence tags (ESTs), 153 from the forward library and 156 from the reverse library, were obtained, and 138 unique ESTs were identified after sequencing by clustering and blast analyses, which included genes that had been previously reported as responsive to water stress as well as some functionally unknown genes. Thirty-six genes were mapped to 47 KEGG pathways, including carbohydrate metabolism, lipid metabolism, energy metabolism, nucleotide metabolism, and amino acid metabolism of pitaya. Expression analysis of the selected ESTs by reverse transcriptase polymerase chain reaction (RT-PCR) corroborated the results of differential screening. Moreover, time-course expression patterns of these selected ESTs further confirmed that they were closely responsive to drought treatment. Among the differentially expressed genes (DEGs), many are related to stress tolerances including drought tolerance. Thereby, the mechanism of drought tolerance of this pitaya genotype is a very complex physiological and biochemical process, in
Variant translocation partners of the anaplastic lymphoma kinase (ALK) gene in two cases of anaplastic large cell lymphoma, identified by inverse cDNA polymerase chain reaction.

PubMed

Takeoka, Kayo; Okumura, Atsuko; Honjo, Gen; Ohno, Hitoshi

2014-01-01

In anaplastic large cell lymphoma (ALCL), the anaplastic lymphoma kinase (ALK) gene is rearranged with diverse partners due to variant translocations/inversions. Case 1 was a 39-year-old man who developed multiple tumors in the mediastinum, psoas muscle, lung, and lymph nodes. A biopsy specimen of the inguinal node was effaced by large tumor cells expressing CD30, epithelial membrane antigen, and cytoplasmic ALK, which led to a diagnosis of ALK(+) ALCL. Case 2 was a 51-year-old man who was initially diagnosed with undifferentiated carcinoma. He developed multiple skin tumors eight years after his initial presentation, and was finally diagnosed with ALK(+) ALCL. He died of therapy-related acute myeloid leukemia. G-banding and fluorescence in situ hybridization using an ALK break-apart probe revealed the rearrangement of ALK and suggested variant translocation in both cases. We applied an inverse cDNA polymerase chain reaction (PCR) strategy to identify the partner of ALK. Nucleotide sequencing of the PCR products and a database search revealed that the sequences of ATIC in case 1 and TRAF1 in case 2 appeared to follow those of ALK. We subsequently confirmed ATIC-ALK and TRAF1-ALK fusions by reverse transcriptase PCR and nucleotide sequencing. We successfully determined the partner gene of ALK in two cases of ALK(+) ALCL. ATIC is the second most common partner of variant ALK rearrangements, while the TRAF1-ALK fusion gene was first reported in 2013, and this is the second reported case of ALK(+) ALCL carrying TRAF1-ALK.
Characterization of a gene family abundantly expressed in Oenothera organensis pollen that shows sequence similarity to polygalacturonase.

PubMed Central

Brown, S M; Crouch, M L

1990-01-01

We have isolated and characterized cDNA clones of a gene family (P2) expressed in Oenothera organensis pollen. This family contains approximately six to eight family members and is expressed at high levels only in pollen. The predicted protein sequence from a near full-length cDNA clone shows that the protein products of these genes are at least 38,000 daltons. We identified the protein encoded by one of the cDNAs in this family by using antibodies to beta-galactosidase/pollen cDNA fusion proteins. Immunoblot analysis using these antibodies identifies a family of proteins of approximately 40 kilodaltons that is present in mature pollen, indicating that these mRNAs are not stored solely for translation after pollen germination. These proteins accumulate late in pollen development and are not detectable in other parts of the plant. Although not present in unpollinated or self-pollinated styles, the 40-kilodalton to 45-kilodalton antigens are detectable in extracts from cross-pollinated styles, suggesting that the proteins are present in pollen tubes growing through the style during pollination. The proteins are also present in pollen tubes growing in vitro. Both nucleotide and amino acid sequences are similar to the published sequences for cDNAs encoding the enzyme polygalacturonase, which suggests that the P2 gene family may function in depolymerizing pectin during pollen development, germination, and tube growth. Cross-hybridizing RNAs and immunoreactive proteins were detected in pollen from a wide variety of plant species, which indicates that the P2 family of polygalacturonase-like genes are conserved and may be expressed in the pollen from many angiosperms. PMID:2152116
Energy efficiency trade-offs drive nucleotide usage in transcribed regions

PubMed Central

Chen, Wei-Hua; Lu, Guanting; Bork, Peer; Hu, Songnian; Lercher, Martin J.

2016-01-01

Efficient nutrient usage is a trait under universal selection. A substantial part of cellular resources is spent on making nucleotides. We thus expect preferential use of cheaper nucleotides especially in transcribed sequences, which are often amplified thousand-fold compared with genomic sequences. To test this hypothesis, we derive a mutation-selection-drift equilibrium model for nucleotide skews (strand-specific usage of 'A' versus 'T' and 'G' versus 'C'), which explains nucleotide skews across 1,550 prokaryotic genomes as a consequence of selection on efficient resource usage. Transcription-related selection generally favours the cheaper nucleotides 'U' and 'C' at synonymous sites. However, the information encoded in mRNA is further amplified through translation. Due to unexpected trade-offs in the codon table, cheaper nucleotides encode on average energetically more expensive amino acids. These trade-offs apply to both strand-specific nucleotide usage and GC content, causing a universal bias towards the more expensive nucleotides 'A' and 'G' at non-synonymous coding sites. PMID:27098217
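
The nucleotide skews mentioned in the abstract above can be illustrated with a short calculation. The sketch below assumes the conventional normalized-difference definition, (A - T)/(A + T) and (G - C)/(G + C), computed on a single strand; the abstract does not state the exact formula used by the authors, and the function name and example sequence are hypothetical.

```python
from collections import Counter


def nucleotide_skews(seq):
    """Return (AT skew, GC skew) for one strand of a DNA sequence.

    Assumed definition (not taken from the abstract):
    AT skew = (A - T) / (A + T), GC skew = (G - C) / (G + C).
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return at_skew, gc_skew


# Hypothetical example sequence: A=4, T=2, G=5, C=1
at_skew, gc_skew = nucleotide_skews("ATGGCGAAAGGT")
print(f"AT skew = {at_skew:.3f}, GC skew = {gc_skew:.3f}")
```

A positive value means the first nucleotide of the pair is over-represented on the strand analysed; for transcribed regions the skew would be read on the coding strand.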

Construction and characterization of a normalized cDNA library of Nannochloropsis oculata (Eustigmatophyceae)

NASA Astrophysics Data System (ADS)

Yu, Jianzhong; Ma, Xiaolei; Pan, Kehou; Yang, Guanpin; Yu, Wengong

2010-07-01

We constructed and characterized a normalized cDNA library of Nannochloropsis oculata CS-179, and obtained 905 nonredundant sequences (NRSs) ranging from 431 to 1,756 bp in length. Among them, 496 were very similar to nonredundant ones in the GenBank (E ≤ 1.0e-05), and 349 ESTs had significant hits with the clusters of eukaryotic orthologous groups (KOG). Bases G and/or C at the third position of codons of 14 amino acid residues suggested a strong bias in the conserved domain of 362 NRSs (>60%). We also identified the unigenes encoding phosphorus and nitrogen transporters, suggesting that N. oculata could efficiently transport and metabolize phosphorus and nitrogen, and recognized the unigenes involved in biosynthesis and storage of both fatty acids and polyunsaturated fatty acids (PUFAs), which will facilitate the demonstration of the eicosapentaenoic acid (EPA) biosynthesis pathway of N. oculata. In comparison with the original cDNA library, the normalized library significantly increased the efficiency of random sequencing and of discovering rarely expressed genes, and decreased the frequency of abundant gene sequences.
Reading biological processes from nucleotide sequences

NASA Astrophysics Data System (ADS)

Murugan, Anand

Cellular processes have traditionally been investigated by techniques of imaging and biochemical analysis of the molecules involved. The recent rapid progress in our ability to manipulate and read nucleic acid sequences gives us direct access to the genetic information that directs and constrains biological processes. While sequence data is being used widely to investigate genotype-phenotype relationships and population structure, here we use sequencing to understand biophysical mechanisms. We present work on two different systems. First, in chapter 2, we characterize the stochastic genetic editing mechanism that produces diverse T-cell receptors in the human immune system. We do this by inferring statistical distributions of the underlying biochemical events that generate T-cell receptor coding sequences from the statistics of the observed sequences. This inferred model quantitatively describes the potential repertoire of T-cell receptors that can be produced by an individual, providing insight into its potential diversity and the probability of generation of any specific T-cell receptor. Then in chapter 3, we present work on understanding the functioning of regulatory DNA sequences in both prokaryotes and eukaryotes. Here we use experiments that measure the transcriptional activity of large libraries of mutagenized promoters and enhancers and infer models of the sequence-function relationship from this data. For the bacterial promoter, we infer a physically motivated 'thermodynamic' model of the interaction of DNA-binding proteins and RNA polymerase determining the transcription rate of the downstream gene. For the eukaryotic enhancers, we infer heuristic models of the sequence-function relationship and use these models to find synthetic enhancer sequences that optimize inducibility of expression. Both projects demonstrate the utility of sequence information in conjunction with sophisticated statistical inference techniques for dissecting underlying biophysical
Identification, Characterization and Full-Length Sequence Analysis of a Novel Polerovirus Associated with Wheat Leaf Yellowing Disease

PubMed Central

Zhang, Peipei; Liu, Yan; Liu, Wenwen; Cao, Mengji; Massart, Sebastien; Wang, Xifeng

2017-01-01

To identify the pathogens responsible for leaf yellowing symptoms on wheat samples collected from Jinan, China, we tested for the presence of three known barley/wheat yellow dwarf viruses (BYDV-GAV, -PAV, WYDV-GPV) (most likely pathogens) using RT-PCR. A sample that tested negative for the three viruses was selected for small RNA sequencing. Twenty-five million sequences were generated, among which 5% were of viral origin. A novel polerovirus was discovered and temporarily named wheat leaf yellowing-associated virus (WLYaV). The full genome of WLYaV corresponds to 5,772 nucleotides (nt), with six AUG-initiated open reading frames, one non-AUG-initiated open reading frame, and three untranslated regions, showing typical features of the family Luteoviridae. Sequence comparison and phylogenetic analyses suggested that WLYaV had the closest relationship with sugarcane yellow leaf virus (ScYLV), but the identities of full genomic nucleotides and deduced amino acid sequence of coat protein (CP) were 64.9 and 86.2%, respectively, below the species demarcation thresholds (90%) in the family Luteoviridae. Furthermore, agroinoculation of Nicotiana benthamiana leaves with a cDNA clone of WLYaV caused yellowing symptoms on the plant. Our study adds a new polerovirus that is associated with wheat leaf yellowing disease, which would help to identify and control pathogens of wheat. PMID:28932215
Identification, Characterization and Full-Length Sequence Analysis of a Novel Polerovirus Associated with Wheat Leaf Yellowing Disease.

PubMed

Zhang, Peipei; Liu, Yan; Liu, Wenwen; Cao, Mengji; Massart, Sebastien; Wang, Xifeng

2017-01-01

To identify the pathogens responsible for leaf yellowing symptoms on wheat samples collected from Jinan, China, we tested for the presence of three known barley/wheat yellow dwarf viruses (BYDV-GAV, -PAV, WYDV-GPV) (most likely pathogens) using RT-PCR. A sample that tested negative for the three viruses was selected for small RNA sequencing. Twenty-five million sequences were generated, among which 5% were of viral origin. A novel polerovirus was discovered and temporarily named wheat leaf yellowing-associated virus (WLYaV). The full genome of WLYaV corresponds to 5,772 nucleotides (nt), with six AUG-initiated open reading frames, one non-AUG-initiated open reading frame, and three untranslated regions, showing typical features of the family Luteoviridae. Sequence comparison and phylogenetic analyses suggested that WLYaV had the closest relationship with sugarcane yellow leaf virus (ScYLV), but the identities of full genomic nucleotides and deduced amino acid sequence of coat protein (CP) were 64.9 and 86.2%, respectively, below the species demarcation thresholds (90%) in the family Luteoviridae. Furthermore, agroinoculation of Nicotiana benthamiana leaves with a cDNA clone of WLYaV caused yellowing symptoms on the plant. Our study adds a new polerovirus that is associated with wheat leaf yellowing disease, which would help to identify and control pathogens of wheat.
Amino acid sequence of bovine muzzle epithelial desmocollin derived from cloned cDNA: a novel subtype of desmosomal cadherins.

PubMed

Koch, P J; Goldschmidt, M D; Walsh, M J; Zimbelmann, R; Schmelz, M; Franke, W W

1991-05-01

Desmosomes are cell-type-specific intercellular junctions found in epithelium, myocardium and certain other tissues. They consist of assemblies of molecules involved in the adhesion of specific cell types and in the anchorage of cell-type-specific cytoskeletal elements, the intermediate-size filaments, to the plasma membrane. To explore the individual desmosomal components and their functions we have isolated DNA clones encoding the desmosomal glycoprotein, desmocollin, using antibodies and a cDNA expression library from bovine muzzle epithelium. The cDNA-deduced amino-acid sequence of desmocollin (presently we cannot decide to which of the two desmocollins, DC I or DC II, this clone relates) defines a polypeptide with a calculated molecular weight of 85,000, with a single candidate sequence of 24 amino acids sufficiently long for a transmembrane arrangement, and an extracellular aminoterminal portion of 561 amino acid residues, compared to a cytoplasmic part of only 176 amino acids. Amino acid sequence comparisons have revealed that desmocollin is highly homologous to members of the cadherin family of cell adhesion molecules, including the previously sequenced desmoglein, another desmosome-specific cadherin. Using riboprobes derived from cDNAs for Northern-blot analyses, we have identified an mRNA of approximately 6 kb in stratified epithelia such as muzzle epithelium and tongue mucosa but not in two epithelial cell culture lines containing desmosomes and desmoplakins. The difference may indicate drastic differences in mRNA concentration or the existence of cell-type-specific desmocollin subforms. The molecular topology of desmocollin(s) is discussed in relation to possible functions of the individual molecular domains.
Genome-Wide Profiling of RNA–Protein Interactions Using CLIP-Seq

PubMed Central

Stork, Cheryl; Zheng, Sika

2017-01-01

UV crosslinking immunoprecipitation (CLIP) is an increasingly popular technique to study protein–RNA interactions in tissues and cells. Whole cells or tissues are ultraviolet irradiated to generate a covalent bond between RNA and proteins that are in close contact. After partial RNase digestion, antibodies specific to an RNA binding protein (RBP) or a protein–epitope tag are then used to immunoprecipitate the protein–RNA complexes. After stringent washing and gel separation the RBP–RNA complex is excised. The RBP is protease digested to allow purification of the bound RNA. Reverse transcription of the RNA followed by high-throughput sequencing of the cDNA library is now often used to identify protein-bound RNA on a genome-wide scale. UV irradiation can result in cDNA truncations and/or mutations at the crosslink sites, which complicates the alignment of the sequencing library to the reference genome and the identification of the crosslinking sites. Meanwhile, one or more amino acids of a crosslinked RBP can remain attached to its bound RNA due to incomplete digestion of the protein. As a result, reverse transcriptase may not read through the crosslink sites, and may produce cDNA ending at the crosslinked nucleotide. This is harnessed by one variant of CLIP methods to identify crosslinking sites at nucleotide resolution. This method, individual nucleotide resolution CLIP (iCLIP), circularizes cDNA to capture the truncated cDNA and also increases the efficiency of ligating sequencing adapters to the library. Here, we describe the detailed procedure of iCLIP. PMID:26965263
An alternative method for cDNA cloning from surrogate eukaryotic cells transfected with the corresponding genomic DNA.

PubMed

Hu, Lin-Yong; Cui, Chen-Chen; Song, Yu-Jie; Wang, Xiang-Guo; Jin, Ya-Ping; Wang, Ai-Hua; Zhang, Yong

2012-07-01

cDNA is widely used in gene function elucidation and/or transgenics research but often suitable tissues or cells from which to isolate mRNA for reverse transcription are unavailable. Here, an alternative method for cDNA cloning is described and tested by cloning the cDNA of human LALBA (human alpha-lactalbumin) from genomic DNA. First, genomic DNA containing all of the coding exons was cloned from human peripheral blood and inserted into a eukaryotic expression vector. Next, by delivering the plasmids into either 293T or fibroblast cells, surrogate cells were constructed. Finally, the total RNA was extracted from the surrogate cells and cDNA was obtained by RT-PCR. The human LALBA cDNA that was obtained was compared with the corresponding mRNA published in GenBank. The comparison showed that the two sequences were identical. The novel method for cDNA cloning from surrogate eukaryotic cells described here uses well-established techniques that are feasible and simple to use. We anticipate that this alternative method will have widespread applications.
Nucleotide sequence of the phosphoglycerate kinase gene from the extreme thermophile Thermus thermophilus. Comparison of the deduced amino acid sequence with that of the mesophilic yeast phosphoglycerate kinase.

PubMed Central

Bowen, D; Littlechild, J A; Fothergill, J E; Watson, H C; Hall, L

1988-01-01

Using oligonucleotide probes derived from amino acid sequencing information, the structural gene for phosphoglycerate kinase from the extreme thermophile, Thermus thermophilus, was cloned in Escherichia coli and its complete nucleotide sequence determined. The gene consists of an open reading frame corresponding to a protein of 390 amino acid residues (calculated Mr 41,791) with an extreme bias for G or C (93.1%) in the codon third base position. Comparison of the deduced amino acid sequence with that of the corresponding mesophilic yeast enzyme indicated a number of significant differences. These are discussed in terms of the unusual codon bias and their possible role in enhanced protein thermal stability. Images Fig. 1. PMID:3052437
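
The third-position codon bias reported above (93.1% G or C) is the kind of figure usually called GC3. As a minimal sketch of how such a value can be computed from an in-frame coding sequence, the function below counts G or C at every third base; the function name and the toy sequence are hypothetical, and the original authors' exact counting rules (for example, whether the stop codon is included) are not given in the abstract.

```python
def gc3_fraction(cds):
    """Fraction of codons with G or C at the third position.

    Assumes `cds` is an in-frame coding sequence; any trailing
    bases that do not form a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    third_bases = [cds[i + 2] for i in range(0, usable, 3)]
    if not third_bases:
        return 0.0
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Hypothetical toy sequence: codons ATG GCC CTG AAA, third bases G, C, G, A
print(gc3_fraction("ATGGCCCTGAAA"))
```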
Screening for single nucleotide variants, small indels and exon deletions with a next-generation sequencing based gene panel approach for Usher syndrome

PubMed Central

Krawitz, Peter M; Schiska, Daniela; Krüger, Ulrike; Appelt, Sandra; Heinrich, Verena; Parkhomchuk, Dmitri; Timmermann, Bernd; Millan, Jose M; Robinson, Peter N; Mundlos, Stefan; Hecht, Jochen; Gross, Manfred

2014-01-01

Usher syndrome is an autosomal recessive disorder characterized both by deafness and blindness. For the three clinical subtypes of Usher syndrome causal mutations in altogether 12 genes and a modifier gene have been identified. Due to the genetic heterogeneity of Usher syndrome, the molecular analysis is predestined for a comprehensive and parallelized analysis of all known genes by next-generation sequencing (NGS) approaches. We describe here the targeted enrichment and deep sequencing for exons of Usher genes and compare the costs and workload of this approach compared to Sanger sequencing. We also present a bioinformatics analysis pipeline that allows us to detect single-nucleotide variants, short insertions and deletions, as well as copy number variations of one or more exons on the same sequence data. Additionally, we present a flexible in silico gene panel for the analysis of sequence variants, in which newly identified genes can easily be included. We applied this approach to a cohort of 44 Usher patients and detected biallelic pathogenic mutations in 35 individuals and monoallelic mutations in eight individuals of our cohort. Thirty-nine of the sequence variants, including two heterozygous deletions comprising several exons of USH2A, have not been reported so far. Our NGS-based approach allowed us to assess single-nucleotide variants, small indels, and whole exon deletions in a single test. The described diagnostic approach is fast and cost-effective with a high molecular diagnostic yield. PMID:25333064
Screening for single nucleotide variants, small indels and exon deletions with a next-generation sequencing based gene panel approach for Usher syndrome.

PubMed

Krawitz, Peter M; Schiska, Daniela; Krüger, Ulrike; Appelt, Sandra; Heinrich, Verena; Parkhomchuk, Dmitri; Timmermann, Bernd; Millan, Jose M; Robinson, Peter N; Mundlos, Stefan; Hecht, Jochen; Gross, Manfred

2014-09-01

Usher syndrome is an autosomal recessive disorder characterized both by deafness and blindness. For the three clinical subtypes of Usher syndrome causal mutations in altogether 12 genes and a modifier gene have been identified. Due to the genetic heterogeneity of Usher syndrome, the molecular analysis is predestined for a comprehensive and parallelized analysis of all known genes by next-generation sequencing (NGS) approaches. We describe here the targeted enrichment and deep sequencing for exons of Usher genes and compare the costs and workload of this approach compared to Sanger sequencing. We also present a bioinformatics analysis pipeline that allows us to detect single-nucleotide variants, short insertions and deletions, as well as copy number variations of one or more exons on the same sequence data. Additionally, we present a flexible in silico gene panel for the analysis of sequence variants, in which newly identified genes can easily be included. We applied this approach to a cohort of 44 Usher patients and detected biallelic pathogenic mutations in 35 individuals and monoallelic mutations in eight individuals of our cohort. Thirty-nine of the sequence variants, including two heterozygous deletions comprising several exons of USH2A, have not been reported so far. Our NGS-based approach allowed us to assess single-nucleotide variants, small indels, and whole exon deletions in a single test. The described diagnostic approach is fast and cost-effective with a high molecular diagnostic yield.
cDNA cloning of carrot extracellular beta-fructosidase and its expression in response to wounding and bacterial infection.

PubMed Central

Sturm, A; Chrispeels, M J

1990-01-01

We isolated a full-length cDNA for apoplastic (extracellular or cell wall-bound) beta-fructosidase (invertase), determined its nucleotide sequence, and used it as a probe to measure changes in mRNA as a result of wounding of carrot storage roots and infection of carrot plants with the bacterial pathogen Erwinia carotovora. The derived amino acid sequence of extracellular beta-fructosidase shows that it is a basic protein (pI 9.9) with a signal sequence for entry into the endoplasmic reticulum and a propeptide at the N terminus that is not present in the mature protein. Amino acid sequence comparison with yeast and bacterial invertases shows that the overall homology is only about 28%, but that there are short conserved motifs, one of which is at the active site. Maturing carrot storage roots contain barely detectable levels of mRNA for extracellular beta-fructosidase and these levels rise slowly but dramatically after wounding with maximal expression after 12 hours. Infection of roots and leaves of carrot plants with E. carotovora results in a very fast increase in the mRNA levels with maximal expression after 1 hour. These results indicate that apoplastic beta-fructosidase is probably a new and hitherto unrecognized pathogenesis-related protein [Van Loon, L.C. (1985). Plant Mol. Biol. 4, 111-116]. Suspension-cultured carrot cells contain high levels of mRNA for extracellular beta-fructosidase and these levels remain the same whether the cells are grown on sucrose, glucose, or fructose. PMID:2152110
Cloning and molecular characterization of the salt-regulated jojoba ScRab cDNA encoding a small GTP-binding protein.

PubMed

Mizrahi-Aviv, Ela; Mills, David; Benzioni, Aliza; Bar-Zvi, Dudy

2002-10-01

Salt stress results in a massive change in gene expression. An 837 bp cDNA designated ScRab was cloned from shoot cultures of the salt-tolerant jojoba (Simmondsia chinensis). The cloned cDNA encodes a full-length 200 amino acid long polypeptide that bears high homology to the Rab subfamily of small GTP binding proteins, particularly the Rab5 subfamily. ScRab expression is reduced in shoots grown in the presence of salt compared to shoots from non-stressed cultures. His6-tagged ScRAB protein was expressed in E. coli, and purified to homogeneity. The purified protein bound radiolabelled GTP. The unlabelled guanine nucleotides GTP, GTPγS and GDP, but not ATP, CTP or UTP, competed with GTP binding.
Species composition of the genus Saprolegnia in fin fish aquaculture environments, as determined by nucleotide sequence analysis of the nuclear rDNA ITS regions.

PubMed

de la Bastide, Paul Y; Leung, Wai Lam; Hintz, William E

2015-01-01

The ITS region of the rDNA gene was compared for Saprolegnia spp. in order to improve our understanding of nucleotide sequence variability within and between species of this genus, determine species composition in Canadian fin fish aquaculture facilities, and to assess the utility of ITS sequence variability in genetic marker development. From a collection of more than 400 field isolates, ITS region nucleotide sequences were studied and it was determined that there was sufficient consistent inter-specific variation to support the designation of species identity based on ITS sequence data. This non-subjective approach to species identification does not rely upon transient morphological features. Phylogenetic analyses comparing our ITS sequences and species designations with data from previous studies generally supported the clade scheme of Diéguez-Uribeondo et al. (2007) and found agreement with the molecular taxonomic cluster system of Sandoval-Sierra et al. (2014). Our Canadian ITS sequence collection will thus contribute to the public database and assist the clarification of Saprolegnia spp. taxonomy. The analysis of ITS region sequence variability facilitated genus- and species-level identification of unknown samples from aquaculture facilities and provided useful information on species composition. A unique ITS-RFLP for the identification of S. parasitica was also described. Copyright © 2014 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Analysing grouping of nucleotides in DNA sequences using lumped processes constructed from Markov chains.

PubMed

Guédon, Yann; d'Aubenton-Carafa, Yves; Thermes, Claude

2006-03-01

The most commonly used models for analysing local dependencies in DNA sequences are (high-order) Markov chains. Incorporating knowledge relative to the possible grouping of the nucleotides enables to define dedicated sub-classes of Markov chains. The problem of formulating lumpability hypotheses for a Markov chain is therefore addressed. In the classical approach to lumpability, this problem can be formulated as the determination of an appropriate state space (smaller than the original state space) such that the lumped chain defined on this state space retains the Markov property. We propose a different perspective on lumpability where the state space is fixed and the partitioning of this state space is represented by a one-to-many probabilistic function within a two-level stochastic process. Three nested classes of lumped processes can be defined in this way as sub-classes of first-order Markov chains. These lumped processes enable parsimonious reparameterizations of Markov chains that help to reveal relevant partitions of the state space. Characterizations of the lumped processes on the original transition probability matrix are derived. Different model selection methods relying either on hypothesis testing or on penalized log-likelihood criteria are presented as well as extensions to lumped processes constructed from high-order Markov chains. The relevance of the proposed approach to lumpability is illustrated by the analysis of DNA sequences. In particular, the use of lumped processes enables to highlight differences between intronic sequences and gene untranslated region sequences.
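
To make the idea of grouping nucleotides concrete, the sketch below estimates a first-order Markov transition matrix from a DNA sequence and then does the same after mapping each nucleotide to a purine (R) or pyrimidine (Y) class. This is the classical reduced-state-space view of lumping that the abstract contrasts with its own two-level process, not the authors' method; the grouping, the function names and the example sequence are all assumptions chosen for illustration.

```python
def transition_matrix(seq, states):
    """Maximum-likelihood first-order transition probabilities.

    Returns a nested dict probs[s][t] = P(next = t | current = s).
    Rows for states with no outgoing transition are all zeros.
    """
    counts = {s: {t: 0 for t in states} for s in states}
    for current, following in zip(seq, seq[1:]):
        counts[current][following] += 1
    probs = {}
    for s in states:
        total = sum(counts[s].values())
        probs[s] = {t: (counts[s][t] / total if total else 0.0) for t in states}
    return probs


# Assumed grouping for illustration: purines (R) versus pyrimidines (Y)
GROUP = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def lump(seq):
    """Map each nucleotide to its group label."""
    return "".join(GROUP[base] for base in seq)


dna = "AGCTAGGCTA"  # hypothetical example sequence
p4 = transition_matrix(dna, "ACGT")       # 4-state chain
p2 = transition_matrix(lump(dna), "RY")   # 2-state chain on the lumped sequence
print(p2)
```

Note that a sequence lumped this way does not in general keep the Markov property, which is the lumpability question the abstract addresses.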
An integrated genetic linkage map of watermelon and genetic diversity based on single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers

USDA-ARS's Scientific Manuscript database

Watermelon (Citrullus lanatus var. lanatus) is an important vegetable fruit throughout the world. A high number of single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers should provide large coverage of the watermelon genome and high phylogenetic resolution of germplasm acces...
Fabrication of high quality cDNA microarray using a small amount of cDNA.

PubMed

Park, Chan Hee; Jeong, Ha Jin; Jung, Jae Jun; Lee, Gui Yeon; Kim, Sang-Chul; Kim, Tae Soo; Yang, Sang Hwa; Chung, Hyun Cheol; Rha, Sun Young

2004-05-01

DNA microarray technology has become an essential part of biological research. It enables the genome-scale analysis of gene expression in various types of model systems. Manufacturing high quality cDNA microarrays of microdeposition type depends on some key factors including a printing device, spotting pins, glass slides, spotting solution, and humidity during spotting. Using the Microgrid II TAS model printing device, this study defined the optimal conditions for producing high density, high quality cDNA microarrays with the least amount of cDNA product. It was observed that aminosilane-modified slides were superior to other types of surface-modified slides. A humidity of 30+/-3% in a closed environment and the overnight drying of the spotted slides gave the best conditions for arraying. In addition, the cDNA dissolved in 30% DMSO gave the optimal conditions for spotting compared to the 1X ArrayIt, 3X SSC and 50% DMSO. Lastly, cDNA in the concentration range of 100-300 ng/µl was determined to be best for arraying and post-processing. Currently, the printing system in this study yields reproducible 9000 spots with a spot size 150 mm diameter, and a 200 nm spot spacing.
Horse cDNA clones encoding two MHC class I genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barbis, D.P.; Maher, J.K.; Stanek, J.

1994-12-31

Two full-length clones encoding MHC class I genes were isolated by screening a horse cDNA library, using a probe encoding a human HLA-A2.2Y allele. The library was made in the pcDNA1 vector (Invitrogen, San Diego, CA), using mRNA from peripheral blood lymphocytes obtained from a Thoroughbred stallion (No. 0834) homozygous for a common horse MHC haplotype (ELA-A2, -B2, -D2; Antczak et al. 1984; Donaldson et al. 1988). The clones were sequenced, using SP6 and T7 universal primers and horse-specific oligonucleotides designed to extend previously determined sequences.
Cloning of the cDNA for U1 small nuclear ribonucleoprotein particle 70K protein from Arabidopsis thaliana

NASA Technical Reports Server (NTRS)

Reddy, A. S.; Czernik, A. J.; An, G.; Poovaiah, B. W.

1992-01-01

We cloned and sequenced a plant cDNA that encodes U1 small nuclear ribonucleoprotein (snRNP) 70K protein. The plant U1 snRNP 70K protein cDNA is not full length and lacks the coding region for 68 amino acids in the amino-terminal region as compared to human U1 snRNP 70K protein. Comparison of the deduced amino acid sequence of the plant U1 snRNP 70K protein with the amino acid sequence of animal and yeast U1 snRNP 70K protein showed a high degree of homology. The plant U1 snRNP 70K protein is more closely related to the human counterpart than to the yeast 70K protein. The carboxy-terminal half is less well conserved but, like the vertebrate 70K proteins, is rich in charged amino acids. Northern analysis with the RNA isolated from different parts of the plant indicates that the snRNP 70K gene is expressed in all of the parts tested. Southern blotting of genomic DNA using the cDNA indicates that the U1 snRNP 70K protein is coded by a single gene.
Sequence-Based Prioritization of Nonsynonymous Single-Nucleotide Polymorphisms for the Study of Disease Mutations

PubMed Central

Jiang, Rui; Yang, Hua; Zhou, Linqi; Kuo, C.-C. Jay; Sun, Fengzhu; Chen, Ting

2007-01-01

The increasing demand for the identification of genetic variation responsible for common diseases has translated into a need for sophisticated methods for effectively prioritizing mutations occurring in disease-associated genetic regions. In this article, we prioritize candidate nonsynonymous single-nucleotide polymorphisms (nsSNPs) through a bioinformatics approach that takes advantage of a set of improved numeric features derived from protein-sequence information and a new statistical learning model called “multiple selection rule voting” (MSRV). The sequence-based features can maximize the scope of applications of our approach, and the MSRV model can capture subtle characteristics of individual mutations. Systematic validation of the approach demonstrates that this approach is capable of prioritizing causal mutations for both simple monogenic diseases and complex polygenic diseases. Further studies of familial Alzheimer diseases and diabetes show that the approach can enrich mutations underlying these polygenic diseases among the top of candidate mutations. Application of this approach to unclassified mutations suggests that there are 10 suspicious mutations likely to cause diseases, and there is strong support for this in the literature. PMID:17668383
Ab initio electron propagator calculations of transverse conduction through DNA nucleotide bases in 1-nm nanopore corroborate third generation sequencing.

PubMed

Kletsov, Aleksey A; Glukhovskoy, Evgeny G; Chumakov, Aleksey S; Ortiz, Joseph V

2016-01-01

The conduction properties of the DNA molecule, particularly its transverse conductance (electron transfer through nucleotide bridges), represent a point of interest for the DNA chemistry community, especially for DNA sequencing. However, there is no fully developed first-principles theory for molecular conductance and current that allows one to analyze the transverse flow of electrical charge through a nucleotide base. We theoretically investigate the transverse electron transport through all four DNA nucleotide bases by implementing an unbiased ab initio theoretical approach, namely, the electron propagator theory. The electrical conductance and current through DNA nucleobases (guanine [G], cytosine [C], adenine [A] and thymine [T]) inserted into a model 1-nm Ag-Ag nanogap are calculated. The magnitudes of the calculated conductance and current are ordered in the following hierarchies: gA>gG>gC>gT and IG>IA>IT>IC, respectively. A new distinguishing parameter for nucleobase identification is proposed, namely, the onset bias magnitude. Nucleobases exhibit the following hierarchy with respect to this parameter: Vonset(A) [...] sequencing techniques as well as in the field of DNA chemistry. Copyright © 2015 Elsevier B.V. All rights reserved.

Nucleotide sequences, genetic organization, and distribution of pEU30 and pEL60 from Erwinia amylovora.

PubMed

Foster, Gayle C; McGhee, Gayle C; Jones, Alan L; Sundin, George W

2004-12-01

The nucleotide sequences, genetic organization, and distribution of plasmids pEU30 (30,314 bp) and pEL60 (60,145 bp) from the plant pathogen Erwinia amylovora are described. The newly characterized pEU30 and pEL60 plasmids inhabited strains isolated in the western United States and Lebanon, respectively. The gene content of pEU30 resembled plasmids found in plant-associated bacteria, while that of pEL60 was most similar to IncL/M plasmids inhabiting enteric bacteria.
Composition for nucleic acid sequencing

DOEpatents

Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

2008-08-26

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
The complete sequence and structural analysis of human apolipoprotein B-100: relationship between apoB-100 and apoB-48 forms.

PubMed Central

Cladaras, C; Hadzopoulou-Cladaras, M; Nolte, R T; Atkinson, D; Zannis, V I

1986-01-01

We have isolated and sequenced overlapping cDNA clones covering the entire sequence of human apolipoprotein B-100 (apoB-100). DNA sequence analysis and determination of the mRNA transcription initiation site by S1 nuclease mapping showed that the apoB mRNA consists of 14,112 nucleotides including the 5' and 3' untranslated regions which are 128 and 301 nucleotides respectively. The DNA-derived protein sequence shows that apoB-100 is 513,000 daltons and contains 4560 amino acids including a 24-amino-acid-long signal peptide. The mol. wt of apoB-100 implies that there is one apoB molecule per LDL particle. Computer analysis of the predicted secondary structure of the protein showed that some of the potential alpha helical and beta sheet structures are amphipathic, whereas others have non-amphipathic neutral to apolar character. These latter regions may contribute to the formation of the lipid-binding domains of apoB-100. The protein contains 25 cysteines and 20 potential N-glycosylation sites. The majority of cysteines are distributed in the amino terminal portion of the protein. Four of the potential glycosylation sites are in predicted beta turn structures and may represent true glycosylation positions. ApoB lacks the tandem repeats which are characteristic of other apolipoproteins. The mean hydrophobicity (the mean value of H1) and helical hydrophobic moment (the mean value of µH) profiles of apoB showed the presence of several potential helical regions with strong polar character and high hydrophobic moment. The region with the highest hydrophobic moment, between amino acid residues 3352 and 3369, contains five closely spaced, positively charged residues, and has sequence homology to the LDL receptor binding site of apoE. This region is flanked by three neighbouring regions with positively charged amino acids and high hydrophobic moment that are located between residues 3174 and 3681. One or more of these closely spaced apoB sequences may be involved in the
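The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that the coding region is everything between the two untranslated regions and that it ends with a single stop codon; the variable names are illustrative only.

```python
# Consistency check of the apoB-100 numbers quoted in the abstract.
mrna_length = 14112        # total mRNA length in nucleotides
utr5, utr3 = 128, 301      # 5' and 3' untranslated regions

coding_nt = mrna_length - utr5 - utr3   # nucleotides left for the coding region
codons, remainder = divmod(coding_nt, 3)
residues = codons - 1                   # assumption: one codon is the stop codon

# Average mass per residue if the quoted 513,000 Da covers all residues.
mean_residue_mass = 513000 / residues

print(coding_nt, codons, remainder, residues, mean_residue_mass)
```

Under these assumptions the coding region is an exact multiple of three and yields the 4560 amino acids reported, with an average of roughly 112 Da per residue, which is in the usual range for proteins.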
Microaspiration of esophageal gland cells and cDNA library construction for identifying parasitism genes of plant-parasitic nematodes.

PubMed

Hussey, Richard S; Huang, Guozhong; Allen, Rex

2011-01-01

Identifying parasitism genes encoding proteins secreted from a plant-parasitic nematode's esophageal gland cells and injected through its stylet into plant tissue is the key to understanding the molecular basis of nematode parasitism of plants. Parasitism genes have been cloned by directly microaspirating the cytoplasm from the esophageal gland cells of different parasitic stages of cyst or root-knot nematodes to provide mRNA to create a gland cell-specific cDNA library by long-distance reverse-transcriptase polymerase chain reaction. cDNA clones are sequenced and deduced protein sequences with a signal peptide for secretion are identified for high-throughput in situ hybridization to confirm gland-specific expression.
Molecular Properties of Poliovirus Isolates: Nucleotide Sequence Analysis, Typing by PCR and Real-Time RT-PCR.

PubMed

Burns, Cara C; Kilpatrick, David R; Iber, Jane C; Chen, Qi; Kew, Olen M

2016-01-01

Virologic surveillance is essential to the success of the World Health Organization initiative to eradicate poliomyelitis. Molecular methods have been used to detect polioviruses in tissue culture isolates derived from stool samples obtained through surveillance for acute flaccid paralysis. This chapter describes the use of real-time PCR assays to identify and serotype polioviruses. In particular, a degenerate, inosine-containing, panpoliovirus (panPV) PCR primer set is used to distinguish polioviruses from non-polio enteroviruses (NPEVs). The high degree of nucleotide sequence diversity among polioviruses presents a challenge to the systematic design of nucleic acid-based reagents. To accommodate the wide variability and rapid evolution of poliovirus genomes, degenerate codon positions on the template were matched to mixed-base or deoxyinosine residues on both the primers and the TaqMan™ probes. Additional assays distinguish between Sabin vaccine strains and non-Sabin strains. This chapter also describes the use of generic poliovirus specific primers, along with degenerate and inosine-containing primers, for routine VP1 sequencing of poliovirus isolates. These primers, along with nondegenerate serotype-specific Sabin primers, can also be used to sequence individual polioviruses in mixtures.
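How mixed-base and inosine positions widen the set of templates a primer can match may be easier to see in code. The sketch below uses the standard IUPAC ambiguity codes; the primer sequences are invented for illustration and are not the panPV primers from the chapter, and treating inosine as matching any base is a simplifying assumption.

```python
import re
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct oligonucleotides in the primer mix.
    Inosine (I) is a single residue, so it does not multiply the mix."""
    return prod(len(IUPAC[b]) for b in primer.upper() if b != "I")

def primer_regex(primer):
    """Regular expression for the template-sense sequences the primer could match.
    Assumption: inosine is treated as pairing with any base."""
    parts = []
    for b in primer.upper():
        bases = "ACGT" if b == "I" else IUPAC[b]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)

# Hypothetical primers: one with a fully mixed position, one with inosine instead.
mixed = "GGNTAYAT"
with_inosine = "GGITAYAT"

print(degeneracy(mixed), degeneracy(with_inosine))
print(primer_regex(with_inosine))
print(bool(re.fullmatch(primer_regex(with_inosine), "GGCTACAT")))
```

Replacing a fourfold mixed position with inosine keeps the same matching range in this simplified model while reducing the number of oligonucleotide species in the mix, which is one motivation for inosine-containing designs.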
De Novo Transcriptome Sequencing Analysis of cDNA Library and Large-Scale Unigene Assembly in Japanese Red Pine (Pinus densiflora)

PubMed Central

Liu, Le; Zhang, Shijie; Lian, Chunlan

2015-01-01

Japanese red pine (Pinus densiflora) is extensively cultivated in Japan, Korea, China, and Russia and is harvested for timber, pulpwood, garden, and paper markets. However, genetic information and molecular markers were very scarce for this species. In this study, over 51 million clean sequencing reads from P. densiflora mRNA were produced using Illumina paired-end sequencing technology. It yielded 83,913 unigenes with a mean length of 751 bp, of which 54,530 (64.98%) unigenes showed similarity to sequences in the NCBI database. Among these, the best matches in the NCBI Nr database were Picea sitchensis (41.60%), Amborella trichopoda (9.83%), and Pinus taeda (4.15%). A total of 1953 putative microsatellites were identified in 1784 unigenes using MISA (MicroSAtellite) software, of which the tri-nucleotide repeats were most abundant (50.18%), and 629 EST-SSR (expressed sequence tag-simple sequence repeat) primer pairs were successfully designed. Among 20 EST-SSR primer pairs randomly chosen, 17 markers yielded amplification products of the expected size in P. densiflora. Our results will provide a valuable resource for gene-function analysis, germplasm identification, molecular marker-assisted breeding and resistance-related gene(s) mapping in P. densiflora. PMID:26690126
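The microsatellite search described in this abstract can be illustrated with a short regular-expression scan. This is not MISA itself and makes no claim about its internals; the minimum repeat counts, the function name `find_ssrs`, and the test sequence are assumptions chosen for the example.

```python
import re

# Assumed thresholds: unit length -> minimum number of tandem repeats.
MIN_REPEATS = {2: 6, 3: 5}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect di- and tri-nucleotide repeats."""
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)

# Invented test sequence: an (AGC)6 repeat and a (CT)7 repeat with short flanks.
example = "TTGA" + "AGC" * 6 + "TTAC" + "CT" * 7 + "GGA"
print(find_ssrs(example))
```

Positions are zero-based offsets into the sequence. A real pipeline would also handle compound and imperfect repeats and would typically pass the flanking sequence to a primer-design tool.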
Nucleotide sequence of the beta-lactamase gene from Enterococcus faecalis HH22 and its similarity to staphylococcal beta-lactamase genes.

PubMed Central

Zscheck, K K; Murray, B E

1991-01-01

The nucleotide sequence of the constitutively produced beta-lactamase (Bla) gene from Enterococcus faecalis HH22 was shown to be identical to the published sequences of three of four staphylococcal type A beta-lactamase genes; more differences were seen with the genes for staphylococcal type C and D enzymes. One hundred forty nucleotides upstream of the beta-lactamase start codon were determined for an inducible staphylococcal beta-lactamase and were identical to those of the constitutively expressed enterococcal gene, indicating that the changes resulting in constitutive expression are not due to changes in the promoter or operator region. Moreover, complementation studies indicated that production of the enterococcal enzyme could be repressed. The genes for the enterococcal Bla and an inducible staphylococcal Bla were each cloned into a shuttle vector and transformed into enterococcal and staphylococcal recipients. The major difference between the backgrounds of the two hosts was that more enzyme was produced by the staphylococcal host, regardless of the source of the gene. The location of the enzyme was found to be host dependent, since each cloned gene generated extracellular (free) enzyme in the staphylococcus and cell-bound enzyme in the enterococcus. On the basis of the identities of the enterococcal Bla and several staphylococcal Bla sequences, these data suggest the recent spread of beta-lactamase to enterococci and also suggest the loss of a functional repressor. PMID:1952840
Molecular cloning and nucleotide sequences of the genes for two essential proteins constituting a novel enzyme system for heptaprenyl diphosphate synthesis.

PubMed

Koike-Takeshita, A; Koyama, T; Obata, S; Ogura, K

1995-08-04

The genes encoding two dissociable components essential for Bacillus stearothermophilus heptaprenyl diphosphate synthase (all-trans-hexaprenyl-diphosphate:isopentenyl-diphosphate hexaprenyl-trans-transferase, EC 2.5.1.30) were cloned, and their nucleotide sequences were determined. Sequence analyses revealed the presence of three open reading frames within 2,350 base pairs, designated as ORF-1, ORF-2, and ORF-3 in order of nucleotide sequence, which encode proteins of 220, 234, and 323 amino acids, respectively. Deletion experiments have shown that expression of the enzymatic activity requires the presence of ORF-1 and ORF-3, but ORF-2 is not essential. As a result, this enzyme was proved genetically to consist of two different protein components with molecular masses of 25 kDa (Component I) and 36 kDa (Component II), encoded by two of the three tandem genes. The protein encoded by ORF-1 has no similarity to any protein so far registered. However, the protein encoded by ORF-3 shows a 32% similarity to the farnesyl diphosphate synthase of the same bacterium and has seven highly conserved regions that have been shown to be typical of prenyltransferases (Koyama, T., Obata, S., Osabe, M., Takeshita, A., Yokoyama, K., Uchida, M., Nishino, T., and Ogura, K. (1993) J. Biochem. (Tokyo) 113, 355-363).
Isolation and expression of three gibberellin 20-oxidase cDNA clones from Arabidopsis.

PubMed

Phillips, A L; Ward, D A; Uknes, S; Appleford, N E; Lange, T; Huttly, A K; Gaskin, P; Graebe, J E; Hedden, P

1995-07-01

Using degenerate oligonucleotide primers based on a pumpkin (Cucurbita maxima) gibberellin (GA) 20-oxidase sequence, six different fragments of dioxygenase genes were amplified by polymerase chain reaction from Arabidopsis thaliana genomic DNA. One of these was used to isolate two different full-length cDNA clones, At2301 and At2353, from shoots of the GA-deficient Arabidopsis mutant ga1-2. A third, related clone, YAP169, was identified in the Database of Expressed Sequence Tags. The cDNA clones were expressed in Escherichia coli as fusion proteins, each of which oxidized GA12 at C-20 to GA15, GA24, and the C19 compound GA9, a precursor of bioactive GAs; the C20 tricarboxylic acid compound GA25 was formed as a minor product. The expression products also oxidized the 13-hydroxylated substrate GA53, but less effectively than GA12. The three cDNAs hybridized to mRNA species with tissue-specific patterns of accumulation, with At2301 being expressed in stems and inflorescences, At2353 in inflorescences and developing siliques, and YAP169 in siliques only. In the floral shoots of the ga1-2 mutant, transcript levels corresponding to each cDNA decreased dramatically after GA3 application, suggesting that GA biosynthesis may be controlled, at least in part, through down-regulation of the expression of the 20-oxidase genes.
Cloning and characterization of a cDNA encoding topoisomerase II in pea and analysis of its expression in relation to cell proliferation.

PubMed

Reddy, M K; Nair, S; Tewari, K K; Mudgil, Y; Yadav, B S; Sopory, S K

1999-09-01

We have isolated and sequenced four overlapping cDNA clones to identify the full-length cDNA for topoisomerase II (PsTopII) from pea. Using degenerate primers, based on the conserved amino acid sequences of other eukaryotic type II topoisomerases, a 680 bp fragment was PCR-amplified with pea cDNA as template. This fragment was used as a probe to screen an oligo-dT-primed pea cDNA library. A partial cDNA clone was isolated that was truncated at the 3' end. RACE-PCR was employed to isolate the remaining portion of the gene. The total size of PsTopII is 4639 bp with an open reading frame of 4392 bp. The deduced amino acid sequence shows a strong homology to other eukaryotic topoisomerase II (topo II) at the N-terminal end. The topo II transcript was abundant in proliferative tissues. We also show that the level of topo II transcripts could be stimulated by exogenous application of growth factors that induced proliferation in in vitro cultures. Light irradiation of etiolated tissue strongly stimulated the expression of topo II. These results suggest that topo II gene expression is up-regulated in response to light and hormones and correlates with cell proliferation. In addition, we have isolated and analysed the 5'-flanking region of the pea TopII gene. This is the first report on the isolation of a putative promoter for topoisomerase II from plants.
The nucleotide sequence of a major glycine transfer RNA from the posterior silk gland of Bombyx mori L.

PubMed Central

Zúñiga, M C; Steitz, J A

1977-01-01

The nucleotide sequence of tRNA1Gly isolated from the posterior silk gland of Bombyx mori has been determined. This transfer RNA is present in high amounts in the posterior silk gland during the fifth larval instar. It has a GCC anticodon, capable of decoding a major glycine codon in the fibroin messenger RNA, GGU. Structural features of Bombyx tRNA1Gly and its homology to other eukaryotic glycine tRNAs are discussed. Images PMID:414206
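The statement that a GCC anticodon can decode GGU follows from wobble pairing at the third codon position. The sketch below applies Crick's wobble rules for unmodified bases; it is a simplified model (modified nucleosides other than inosine are ignored), and the function name `decodable_codons` is illustrative.

```python
# Watson-Crick complements for RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Simplified Crick wobble rules: 5' base of the anticodon -> 3' codon bases it can read.
WOBBLE = {"G": "CU", "C": "G", "A": "U", "U": "AG", "I": "ACU"}

def decodable_codons(anticodon):
    """Codons (5'->3') readable by an anticodon written 5'->3'."""
    a1, a2, a3 = anticodon.upper()
    # Codon positions 1 and 2 pair strictly with anticodon positions 3 and 2.
    first, second = COMPLEMENT[a3], COMPLEMENT[a2]
    # Codon position 3 pairs with anticodon position 1, where wobble is allowed.
    return sorted(first + second + third for third in WOBBLE[a1])

print(decodable_codons("GCC"))
```

For the GCC anticodon this gives GGC by standard pairing and GGU by G-U wobble, both of which are glycine codons.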
Molecular cloning and characterization of a new basic peroxidase cDNA from soybean hypocotyls infected with Phytophthora sojae f.sp. glycines.

PubMed

Yi, S Y; Hwang, B K

1998-10-31

Differential display techniques were used to isolate cDNA clones corresponding to genes which were expressed in soybean hypocotyls upon Phytophthora sojae f.sp. glycines infection. With a partial cDNA clone C20CI4 from the differential display PCR as a probe, a new basic peroxidase cDNA clone, designated GMIPER1, was isolated from a cDNA library of soybean hypocotyls infected with P. sojae f.sp. glycines. Sequence analysis revealed that the peroxidase clone encodes a mature protein of 35,813 Da with a putative signal peptide of 27 amino acids at its N-terminus. The amino acid sequence of the soybean peroxidase GMIPER1 is 54-75% identical to other plant peroxidases including a soybean seed coat peroxidase. Southern blot analysis indicated that multiple copies of sequences related to GMIPER1 exist in the soybean genome. The mRNAs corresponding to the GMIPER1 cDNA accumulated predominantly in the soybean hypocotyls infected with the incompatible race of P. sojae f.sp. glycines, but were expressed at low levels in the compatible interaction. Soybean GMIPER1 mRNAs were not expressed in hypocotyls, leaves, stems, and roots of soybean seedlings. However, treatments with ethephon, salicylic acid or methyl jasmonate induced the accumulation of the GMIPER1 mRNAs in the different organs of soybean. These results suggest that the GMIPER1 gene encoding a putative pathogen-induced peroxidase may play an important role in induced resistance of soybean to P. sojae f.sp. glycines and in response to various external stresses.
Cloning of a cDNA encoding bovine mitochondrial NADP(+)-specific isocitrate dehydrogenase and structural comparison with its isoenzymes from different species.

PubMed Central

Huh, T L; Ryu, J H; Huh, J W; Sung, H C; Oh, I U; Song, B J; Veech, R L

1993-01-01

Mitochondrial NADP(+)-specific isocitrate dehydrogenase (IDP) was co-purified with the pyruvate dehydrogenase complex from bovine kidney mitochondria. The determination of its N-terminal 16-amino-acid sequence revealed that it is highly similar to the IDP from yeast. A cDNA clone (1.8 kb long) encoding this protein was isolated from a bovine kidney lambda gt11 cDNA library using a synthetic oligodeoxynucleotide. The deduced protein sequence of this cDNA clone rendered a precursor protein of 452 amino-acid residues (50,830 Da) and a mature protein of 413 amino-acid residues (46,519 Da). It is 100% identical to the internal tryptic peptide sequences of the autologous form from pig heart and 62% similar to that from yeast. However, it shares little similarity with the mitochondrial NAD(+)-specific isoenzyme from yeast. Structural analyses of the deduced proteins of IDP isoenzymes from different species indicated that similarity exists in certain regions, which may represent the common domains for the active sites or coenzyme-binding sites. In Northern-blot analysis, one species of mRNA (about 2.2 kb for both bovine and human) was hybridized with a 32P-labelled cDNA probe. Southern-blot analysis of genomic DNAs verified simple patterns of hybridization with this cDNA. These results strongly indicate that the mitochondrial IDP may be derived from a single gene family which does not appear to be closely related to that of the NAD(+)-specific isoenzyme. Images Figure 1 Figure 3 Figure 4 Figure 5 PMID:8318002
cDNA cloning, characterization and expression analysis of a novel antimicrobial peptide gene penaeidin-3 (Fi-Pen3) from the haemocytes of Indian white shrimp Fenneropenaeus indicus.

PubMed

Shanthi, S; Vaseeharan, B

2012-03-20

A new member of the antimicrobial peptide genes of the penaeidin family, penaeidin 3, was cloned from the haemocytes of Indian white shrimp Fenneropenaeus indicus (F. indicus), by reverse transcription PCR (RT-PCR) and rapid amplification of cDNA ends (RACE-PCR) methods. The complete nucleotide sequence of the cDNA clone of Indian white shrimp F. indicus Penaeidin 3 (Fi-Pen3) was 243bp long and has an open reading frame which encodes an 80 amino acid peptide. The homology analysis of the Fi-Pen3 sequence with other Penaeidins 3 shows the highest similarity with Penaeus monodon (92%). The theoretical 3D structure generated through ab initio modelling indicated the presence of two disulphide bridges in the alpha-helix. The signal peptide sequence of Fi-Pen3 is almost entirely homologous to that of other Penaeidin 3 of crustaceans, while differing relatively in the N-terminal domain of the mature peptide. The mature peptide has a predicted molecular weight of 84.9kDa, and a theoretical pI of 9.38. Phylogenetic analysis of Fi-Pen3 shows high resemblance with other Pen-3 from P. monodon, Litopenaeus stylirostris, Litopenaeus vannamei and Litopenaeus setiferus. Fi-Pen3 was found to be expressed in haemocytes, heart, hepatopancreas, muscles, gills, intestine, and eyestalk with higher expression in haemocytes. Microbial challenge resulted in mRNA up-regulation, up to 6h post injection of Vibrio parahaemolyticus. The Fi-Pen3 mRNA expression of F. indicus in the premolt stage (D(01) and D(02)) was significantly up-regulated relative to the postmolt (A and B) and intermolt (C) stages. The findings of the present paper underline the involvement of Fi-Pen3 in the innate immune system of F. indicus. Copyright © 2011 Elsevier GmbH. All rights reserved.
JNSViewer—A JavaScript-based Nucleotide Sequence Viewer for DNA/RNA secondary structures

PubMed Central

Dong, Min; Graham, Mitchell; Yadav, Nehul

2017-01-01

Many tools are available for visualizing RNA or DNA secondary structures, but few JavaScript implementations integrate seamlessly with the increasingly popular web computational platforms. We have developed JNSViewer, a highly interactive web service, which is bundled with several popular tools for DNA/RNA secondary structure prediction and can provide precise and interactive correspondence among nucleotides, dot-bracket data, secondary structure graphs, and genic annotations. In JNSViewer, users can perform RNA secondary structure predictions with different programs and settings, add customized genic annotations in GFF format to structure graphs, search for specific linear motifs, and extract relevant structure graphs of sub-sequences. JNSViewer also allows users to choose a transcript or specific segment of Arabidopsis thaliana genome sequences and predict the corresponding secondary structure. Popular genome browsers (i.e., JBrowse and BrowserGenome) were integrated into JNSViewer to provide powerful visualizations of chromosomal locations, genic annotations, and secondary structures. In addition, we used StructureFold with default settings to predict some RNA structures for Arabidopsis by incorporating in vivo high-throughput RNA structure profiling data and stored the results in our web server, which might be a useful resource for RNA secondary structure studies in plants. JNSViewer is available at http://bioinfolab.miamioh.edu/jnsviewer/index.html. PMID:28582416
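The correspondence between nucleotides and dot-bracket data that the abstract mentions rests on a simple stack-based parse. The sketch below shows that parse for plain dot-bracket strings without pseudoknots; it is written in Python for illustration, is not JNSViewer code, and the function name `parse_dot_bracket` is hypothetical.

```python
def parse_dot_bracket(structure):
    """Return sorted (i, j) zero-based index pairs for a dot-bracket string.
    '(' opens a base pair, ')' closes the most recent open one, '.' is unpaired."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unmatched ')' at position %d" % i)
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError("unexpected character %r at position %d" % (ch, i))
    if stack:
        raise ValueError("unmatched '(' at position %d" % stack[-1])
    return sorted(pairs)

# A small hairpin: three stacked pairs closing a four-nucleotide loop.
print(parse_dot_bracket("(((....)))"))
```

The resulting pair list is what a viewer needs in order to highlight the partner of a selected nucleotide or to draw arcs between paired positions.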
The Complete Nucleotide Sequence of the Human Immunoglobulin Heavy Chain Variable Region Locus

PubMed Central

Matsuda, Fumihiko; Ishii, Kazuo; Bourvagnet, Patrice; Kuma, Kei-ichi; Hayashida, Hidenori; Miyata, Takashi; Honjo, Tasuku

1998-01-01

The complete nucleotide sequence of the 957-kb DNA of the human immunoglobulin heavy chain variable (VH) region locus was determined and 43 novel VH segments were identified. The region contains 123 VH segments classifiable into seven different families, of which 79 are pseudogenes. Of the 44 VH segments with an open reading frame, 39 are expressed as heavy chain proteins and 1 as mRNA, while the remaining 4 are not found in immunoglobulin cDNAs. Combinatorial diversity of VH region was calculated to be ∼6,000. Conservation of the promoter and recombination signal sequences was observed to be higher in functional VH segments than in pseudogenes. Phylogenetic analysis of 114 VH segments clearly showed clustering of the VH segments of each family. However, an independent branch in the tree contained a single VH, V4-44.1P, sharing similar levels of homology to human VH families and to those of other vertebrates. Comparison between different copies of homologous units that appear repeatedly across the locus clearly demonstrates that dynamic DNA reorganization of the locus took place at least eight times between 133 and 10 million years ago. One nonimmunoglobulin gene of unknown function was identified in the intergenic region. PMID:9841928
Isolation and characterization of cDNA clones for carrot extensin and a proline-rich 33-kDa protein

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, J.; Varner, J.E.

1985-07-01

Extensins are hydroxyproline-rich glycoproteins associated with most dicotyledonous plant cell walls. To isolate cDNA clones encoding extensin, the authors started by isolating poly(A) RNA from carrot root tissue, and then translating the RNA in vitro, in the presence of tritiated leucine or proline. A 33-kDa peptide was identified in the translation products as a putative extensin precursor. From a cDNA library constructed with poly(A) RNA from wounded carrots, one cDNA clone (pDC5) was identified that specifically hybridized to poly(A) RNA encoding this 33-kDa peptide. They isolated three cDNA clones (pDC11, pDC12, and pDC16) from another cDNA library using pDC5 as a probe. DNA sequence data, RNA hybridization analysis, and hybrid released in vitro translation indicate that the cDNA clone pDC11 encodes extensin and that cDNA clones pDC12 and pDC16 encode the 33-kDa peptide, which as yet has an unknown identity and function. The assumption that the 33-kDa peptide was an extensin precursor was invalid. RNA hybridization analysis showed that RNA encoded by both clone types is accumulated upon wounding.
Open reading frames in a 4556 nucleotide sequence within MDV-1 BamHI-D DNA fragment: evidence for splicing of mRNA from a new viral glycoprotein gene.

PubMed

Becker, Y; Asher, Y; Tabor, E; Davidson, I; Malkinson, M

1994-01-01

A DNA segment of the MDV-1 BamHI-D fragment was sequenced, and the open reading frames (ORFs) present in the 4556 nucleotide fragment were analyzed by computer programs. Computer analysis identified 19 putative ORFs in the sequence ranging from a coding capacity of 37 amino acids (aa) (ORF-1a) to 684 aa (ORF-1). The special properties of four ORFs (1a, 1, 2, and 3) were investigated. Two adjacent ORFs, ORF-1a and ORF-1, were found by computer analysis to have the properties of two exons encoding a glycoprotein: ORF-1a encodes an aa sequence with the properties of a signal peptide, and ORF-1 encodes a polypeptide with a membrane anchor domain and putative N-glycosylation sites in the aa sequence. ORF-1a and ORF-1 were found to be transcribed in MDV-1-infected cells. Two RNA transcripts were detected: a precursor RNA and its spliced form. Both are transcribed from a promoter located 5' to ORF-1a, and splice donor and acceptor sites are used to splice the mRNA after cleavage of a 71-nucleotide sequence. This finding suggests that ORF-1a and ORF-1 are two exons of a new MDV-1 glycoprotein gene. The DNA sequence containing ORF-1 was transiently expressed in COS-1 cells, and the viral protein produced in these cells was found to react with anti-MDV serotype-1 Antigen B-specific monoclonal antibodies. These studies indicate that the protein encoded by ORF-1 has antigenic properties resembling Antigen B of MDV-1. A gene homologous to ORF-1 was detected in the genome of both MDV-2(SB1) and MDV-3(HVT), which serve as commercial vaccine strains. Two additional ORFs were noted in the 4556 nucleotide sequence: ORF-2, which encodes a 333 aa polypeptide initiating in the UL and terminating in the TRL prior to the putative origin of replication, and ORF-3, which encodes a 155 aa polypeptide that is partly homologous to the phosphoprotein pp38 encoded by the BamHI-H sequence. The 65 N-terminal aa of the two gene products are identical, both being derived from the nucleotide
Characterization of a dam Mutant of Serratia marcescens and Nucleotide Sequence of the dam Region

PubMed Central

Ostendorf, Tammo; Cherepanov, Peter; de Vries, Johann; Wackernagel, Wilfried

1999-01-01

The DNA of Serratia marcescens has N6-adenine methylation in GATC sequences. Among 2-aminopurine-sensitive mutants isolated from S. marcescens Sr41, one was identified which lacked GATC methylation. The mutant showed up to 30-fold increased spontaneous mutability and enhanced mutability after treatment with 2-aminopurine, ethyl methanesulfonate, or UV light. The gene (dam) coding for the adenine methyltransferase (Dam enzyme) of S. marcescens was identified on a gene bank plasmid which alleviated the 2-aminopurine sensitivity and the higher mutability of a dam-13::Tn9 mutant of Escherichia coli. Nucleotide sequencing revealed that the deduced amino acid sequence of Dam (270 amino acids; molecular mass, 31.3 kDa) has 72% identity to the Dam enzyme of E. coli. The dam gene is located between flanking genes which are similar to those found to the sides of the E. coli dam gene. The results of complementation studies indicated that like Dam of E. coli and unlike Dam of Vibrio cholerae, the Dam enzyme of S. marcescens plays an important role in mutation avoidance by allowing the mismatch repair enzymes to discriminate between the parental and newly synthesized strands during correction of replication errors. PMID:10383952
High-throughput nucleotide sequence analysis of diverse bacterial communities in leachates of decomposing pig carcasses

PubMed Central

Yang, Seung Hak; Lim, Joung Soo; Khan, Modabber Ahmed; Kim, Bong Soo; Choi, Dong Yoon; Lee, Eun Young; Ahn, Hee Kwon

2015-01-01

The leachate generated by the decomposition of animal carcass has been implicated as an environmental contaminant surrounding the burial site. High-throughput nucleotide sequencing was conducted to investigate the bacterial communities in leachates from the decomposition of pig carcasses. We acquired 51,230 reads from six different samples (1, 2, 3, 4, 6 and 14 week-old carcasses) and found that sequences representing the phylum Firmicutes predominated. The diversity of bacterial 16S rRNA gene sequences in the leachate was the highest at 6 weeks, in contrast to those at 2 and 14 weeks. The relative abundance of Firmicutes was reduced, while the proportion of Bacteroidetes and Proteobacteria increased from 3–6 weeks. The representation of phyla was restored after 14 weeks. However, the community structures between the samples taken at 1–2 and 14 weeks differed at the bacterial classification level. The trend in pH was similar to the changes seen in bacterial communities, indicating that the pH of the leachate could be related to the shift in the microbial community. The results indicate that the composition of bacterial communities in leachates of decomposing pig carcasses shifted continuously during the study period and might be influenced by the burial site. PMID:26500442

Analysis of the genome sequence of the pathogenic Muscovy duck parvovirus strain YY reveals a 14-nucleotide-pair deletion in the inverted terminal repeats.

PubMed

Wang, Jianye; Huang, Yu; Zhou, Mingxu; Zhu, Guoqiang

2016-09-01

Genomic information about Muscovy duck parvovirus (MDPV) is still limited. In this study, the genome of the pathogenic MDPV strain YY was sequenced. The full-length genome of YY is 5075 nucleotides (nt) long, 57 nt shorter than that of strain FM. Sequence alignment indicates that the 5' and 3' inverted terminal repeats (ITRs) of strain YY contain a 14-nucleotide-pair deletion in the stem of the palindromic hairpin structure in comparison to strains FM and FZ91-30. The deleted region contains one "E-box" site and one repeated motif with the sequence "TTCCGGT" or "ACCGGAA". Phylogenetic trees constructed based on the protein-coding genes concordantly showed that YY, together with nine other MDPV isolates from various places, clustered in a separate branch, distinct from the branch formed by goose parvovirus (GPV) strains. These results demonstrate that, despite the distinctive deletion, the YY strain still belongs to the classical MDPV group. Moreover, the deletion in the ITRs may contribute to the genome evolution of MDPV under immunization pressure.
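
The two forms of the repeated motif quoted in this abstract, "TTCCGGT" and "ACCGGAA", are reverse complements of each other, which is what one would expect for a motif read from opposite strands of a hairpin stem. A minimal Python sketch of that check follows; the function name is illustrative and not taken from the study.

```python
# Watson-Crick complement for each DNA base.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    motif = "TTCCGGT"
    # Reading the opposite strand 5'->3' gives the second quoted form.
    print(motif, "->", reverse_complement(motif))  # TTCCGGT -> ACCGGAA
```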
Evolutionary relationships in the ilarviruses: nucleotide sequence of prunus necrotic ringspot virus RNA 3.

PubMed

Sánchez-Navarro, J A; Pallás, V

1997-01-01

The complete nucleotide sequence of an isolate of prunus necrotic ringspot virus (PNRSV) RNA 3 has been determined. Elucidation of the amino acid sequence of the proteins encoded by the two large open reading frames (ORFs) allowed us to carry out comparative and phylogenetic studies on the movement (MP) and coat (CP) proteins in the ilarvirus group. Amino acid sequence comparison of the MP revealed a highly conserved basic sequence motif with an amphipathic alpha-helical structure preceding the conserved motif of the '30K superfamily' proposed by Mushegian and Koonin [26] for MPs. Within this '30K' motif a strictly conserved transmembrane domain is present in all ilarviruses sequenced so far. At the amino-terminal end, prune dwarf virus (PDV) has an extension not present in other ilarviruses but which is observed in all bromo- and cucumoviruses, suggesting a common ancestor or a recombinational event in the Bromoviridae family. Examination of the N-terminus of the CPs of all ilarviruses revealed a highly basic region, part of which resembles the Arg-rich motif that has been characterized in the RNA-binding protein family. This motif has also been found in the other members of the Bromoviridae family, suggesting its involvement in a structural function. Furthermore, this region is required for infectivity in ilarviruses. The similarities found in this Arg-rich motif are discussed in terms of this process, known as genome activation. Finally, phylogenetic analysis of both the MP and CP proteins revealed a closer relationship of AlMV to PNRSV, apple mosaic virus (ApMV) and PDV than to any other member of the ilarvirus group. In that sense, AlMV should be considered a true ilarvirus instead of forming a distinct group of viruses.
Cloning of a cDNA encoding rat aldehyde dehydrogenase with high activity for retinal oxidation.

PubMed

Bhat, P V; Labrecque, J; Boutin, J M; Lacroix, A; Yoshida, A

1995-12-12

Retinoic acid (RA), an important regulator of cell differentiation, is biosynthesized from retinol via retinal by a two-step oxidation process. We previously reported the purification and partial amino acid (aa) sequence of a rat kidney aldehyde dehydrogenase (ALDH) isozyme that catalyzed the oxidation of 9-cis and all-trans retinal to corresponding RA with high efficiency [Labrecque et al. Biochem. J. 305 (1995) 681-684]. A rat kidney cDNA library was screened using a 291-bp PCR product generated from total kidney RNA using a pair of oligodeoxyribonucleotide primers matched with the aa sequence. The full-length rat kidney ALDH cDNA contains a 2315-bp (501 aa) open reading frame (ORF). The aa sequence of rat kidney ALDH is 89, 96 and 87% identical to that of the rat cytosolic ALDH, the mouse cytosolic ALDH and human cytosolic ALDH, respectively. Northern blot and RT-PCR-mediated analysis demonstrated that rat kidney ALDH is strongly expressed in kidney, lung, testis, intestine, stomach and trachea, but weakly in the liver.
Method for sequencing nucleic acid molecules

DOEpatents

Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

2006-06-06

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Method for sequencing nucleic acid molecules

DOEpatents

Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

2006-05-30

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Cloning of a coconut endosperm cDNA encoding a 1-acyl-sn-glycerol-3-phosphate acyltransferase that accepts medium-chain-length substrates.

PubMed Central

Knutzon, D S; Lardizabal, K D; Nelsen, J S; Bleibaum, J L; Davies, H M; Metz, J G

1995-01-01

Immature coconut (Cocos nucifera) endosperm contains a 1-acyl-sn-glycerol-3-phosphate acyltransferase (LPAAT) activity that shows a preference for medium-chain-length fatty acyl-coenzyme A substrates (H.M. Davies, D.J. Hawkins, J.S. Nelsen [1995] Phytochemistry 39:989-996). Beginning with solubilized membrane preparations, we have used chromatographic separations to identify a polypeptide with an apparent molecular mass of 29 kD, whose presence in various column fractions correlates with the acyltransferase activity detected in those same fractions. Amino acid sequence data obtained from several peptides generated from this protein were used to isolate a full-length clone from a coconut endosperm cDNA library. Clone pCGN5503 contains a 1325-bp cDNA insert with an open reading frame encoding a 308-amino acid protein with a calculated molecular mass of 34.8 kD. Comparison of the deduced amino acid sequence of pCGN5503 to sequences in the data banks revealed significant homology to other putative LPAAT sequences. Expression of the coconut cDNA in Escherichia coli conferred upon those cells a novel LPAAT activity whose substrate activity profile matched that of the coconut enzyme. PMID:8552723
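
The figures in this abstract (a 308-amino acid open reading frame inside a 1325-bp insert, calculated mass 34.8 kD) can be sanity-checked with simple codon arithmetic. The sketch below is only illustrative: the 110 Da average residue mass is a common rule of thumb assumed here, not a value from the abstract, and the function names are hypothetical.

```python
CODON_LEN = 3                  # nucleotides per codon
AVG_RESIDUE_MASS_DA = 110.0    # assumed rule-of-thumb average; not from the abstract


def coding_length_nt(n_residues: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode n_residues, optionally counting the stop codon."""
    return CODON_LEN * (n_residues + (1 if include_stop else 0))


def approx_mass_kd(n_residues: int) -> float:
    """Rough protein mass in kD from residue count alone."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    orf_nt = coding_length_nt(308)        # 927 nt including the stop codon
    non_coding_nt = 1325 - orf_nt         # 398 nt left for untranslated regions
    print(orf_nt, non_coding_nt, round(approx_mass_kd(308), 1))
```

The rough estimate (about 33.9 kD) lands close to the 34.8 kD calculated from the actual sequence, and the remaining 398 bp of the insert would be untranslated sequence.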
Genomic organization, sequence characterization and expression analysis of Tenebrio molitor apolipophorin-III in response to an intracellular pathogen, Listeria monocytogenes.

PubMed

Noh, Ju Young; Patnaik, Bharat Bhusan; Tindwa, Hamisi; Seo, Gi Won; Kim, Dong Hyun; Patnaik, Hongray Howrelia; Jo, Yong Hun; Lee, Yong Seok; Lee, Bok Luel; Kim, Nam Jung; Han, Yeon Soo

2014-01-25

Apolipophorin III (apoLp-III) is a well-known hemolymph protein having a functional role in lipid transport and immune response of insects. We cloned full-length cDNA encoding putative apoLp-III from larvae of the coleopteran beetle, Tenebrio molitor (TmapoLp-III), by identification of clones corresponding to the partial sequence of TmapoLp-III, followed by full-length sequencing using a clone-by-clone primer walking method. The complete cDNA consists of 890 nucleotides, including an ORF encoding 196 amino acid residues. Excluding a putative signal peptide of the first 20 amino acid residues, the 176-residue mature apoLp-III has a calculated molecular mass of 19,146 Da. Genomic sequence analysis with respect to its cDNA showed that TmapoLp-III was organized into four exons interrupted by three introns. Several immune-related transcription factor binding sites were discovered in the putative 5'-flanking region. BLAST and phylogenetic analyses reveal that TmapoLp-III has high sequence identity (88%) with Tribolium castaneum apoLp-III but shares little sequence homology (<26%) with other apoLp-IIIs. Homology modeling of TmapoLp-III shows a bundle of five amphipathic alpha helices, including a short helix 3'. The 'helix-short helix-helix' motif was predicted to be implicated in lipid binding interactions, through reversible conformational changes and accommodating the hydrophobic residues to the exterior for stability. The highest level of TmapoLp-III mRNA was detected at late pupal stages, although it is also expressed in the larval and adult stages at lower levels. Tissue-specific expression analysis showed significantly higher transcript levels in larval fat body and adult integument. In addition, TmapoLp-III mRNA was found to be highly upregulated in late stages of L. monocytogenes or E. coli challenge. These results indicate that TmapoLp-III may play an important role in innate immune responses against bacterial pathogens in T. molitor.
Molecular cloning and characterization of ADP-glucose pyrophosphorylase cDNA clones isolated from pea cotyledons.

PubMed

Burgess, D; Penton, A; Dunsmuir, P; Dooner, H

1997-02-01

Three ADP-glucose pyrophosphorylase (ADPG-PPase) cDNA clones have been isolated and characterized from a pea cotyledon cDNA library. Two of these clones (Psagps1 and Psagps2) encode the small subunit of ADPG-PPase. The deduced amino acid sequences for these two clones are 95% identical. Expression of these two genes differs in that the Psagps2 gene shows comparatively higher expression in seeds relative to its expression in other tissues. Psagps2 expression also peaks midway through seed development at a time in which Psagps1 transcripts are still accumulating. The third cDNA isolated (Psagp11) encodes the large subunit of ADPG-PPase. It shows greater selectivity in expression than either of the small subunit clones. It is highly expressed in sink organs (seed, pod, and seed coat) and undetectable in leaves.
Human ribosomal protein L37 has motifs predicting serine/threonine phosphorylation and a zinc-finger domain.

PubMed

Barnard, G F; Staniunas, R J; Puder, M; Steele, G D; Chen, L B

1994-08-02

Ribosomal protein L37 mRNA is overexpressed in colon cancer. The nucleotide sequences of human L37 from several tumor and normal, colon and liver cDNA sources were determined to be identical. L37 mRNA was approximately 375 nucleotides long encoding 97 amino acids with M(r) = 11,070, pI = 12.6, multiple potential serine/threonine phosphorylation sites and a zinc-finger domain. The human sequence is compared to other species.
[Construction and characterization of a cDNA library from human liver tissue of cirrhosis].

PubMed

Chen, Xiao-hong; Chen, Zhi; Chen, Feng; Zhu, Hai-hong; Zhou, Hong-juan; Yao, Hang-ping

2005-03-01

The aim was to construct a cDNA library from human liver tissue of cirrhosis. Total RNA from cirrhotic human liver tissue was extracted using the Trizol method, and the mRNA was purified using an mRNA purification kit. The SMART technique and CDSIII/3' primer were used for first-strand cDNA synthesis. Long-distance PCR was then used to synthesize the double-stranded cDNA, which was digested by proteinase K and Sfi I and fractionated on a CHROMA SPIN-400 column. cDNA fragments longer than 0.4 kb were collected and ligated to the lambda TriplEx2 vector. Lambda-phage packaging and library amplification were then performed. The quality of both the unamplified and amplified cDNA libraries was checked by conventional titer determination. Eleven plaques were randomly picked and tested by PCR with universal primers derived from the vector sequence flanking the insert. The titers of the unamplified and amplified libraries were 1.03 × 10^6 pfu/ml and 1.36 × 10^9 pfu/ml, respectively. The percentages of recombinants were 97.24% in the unamplified library and 99.02% in the amplified library. The average insert length was 1.02 kb (36.36% of inserts were 1-2 kb and 63.64% were 0.5-1.0 kb). A high-quality cDNA library from human liver tissue of cirrhosis was constructed successfully, which can be used for screening and cloning new genes specifically associated with the occurrence of cirrhosis.
O-acetylserine(thiol)lyase from spinach (Spinacia oleracea L.) leaf: cDNA cloning, characterization, and overexpression in Escherichia coli of the chloroplast isoform.

PubMed

Rolland, N; Droux, M; Lebrun, M; Douce, R

1993-01-01

The last enzymatic step for L-cysteine biosynthesis is catalyzed by O-acetylserine(thiol)lyase (OASTL, EC 4.2.99.8) which synthesizes L-cysteine from O-acetylserine and "sulfide." We have isolated and characterized a full-length cDNA (1432 bp) from a lambda gt11 library of spinach leaf encoding the complete precursor of the chloroplast isoform. The 1149-nucleotide open reading frame coding for O-acetylserine(thiol)lyase was in the direction opposite that of the lambda gt11 beta-galactosidase gene. The derived amino acid sequence indicates that the protein precursor consists of 383 amino acid residues including an N-terminal presequence peptide of 52 residues. The amino acid sequence of mature spinach chloroplast O-acetylserine(thiol)lyase shows 40 and 57% homology with its bacterial counterparts. Sequence comparison with several pyridoxal 5'-phosphate-containing proteins reveals the presence of a lysine residue assumed to be involved in cofactor binding. A synthetic cDNA was constructed, coding for the entire 331-amino-acid mature O-acetylserine(thiol)lyase and for an initiating methionine. A high level of expression of the active mature chloroplast isoform was achieved in an Escherichia coli strain carrying the T7 RNA polymerase system (F. W. Studier, A. H. Rosenberg, J. J. Dunn, and J. W. Dubendorff, 1990, in Methods in Enzymology, D. V. Goeddel, Ed., Vol. 185, pp. 60-89, Academic Press, San Diego, CA). Addition of pyridoxine to the bacterial growth medium enhanced the enzyme activity due to the recombinant protein. The extent of production is 25-fold higher than in chloroplasts from spinach leaves, and the recombinant protein has the relative molecular mass and immunological properties of the natural enzyme from spinach leaf chloroplast. This work, together with our previous biochemical studies, is in accordance with a prokaryotic-type enzyme for L-cysteine biosynthesis in higher plant chloroplasts. Southern blot analysis indicated that O
Primer ID Validates Template Sampling Depth and Greatly Reduces the Error Rate of Next-Generation Sequencing of HIV-1 Genomic RNA Populations

PubMed Central

Zhou, Shuntai; Jones, Corbin; Mieczkowski, Piotr

2015-01-01

ABSTRACT Validating the sampling depth and reducing sequencing errors are critical for studies of viral populations using next-generation sequencing (NGS). We previously described the use of Primer ID to tag each viral RNA template with a block of degenerate nucleotides in the cDNA primer. We now show that low-abundance Primer IDs (offspring Primer IDs) are generated due to PCR/sequencing errors. These artifactual Primer IDs can be removed using a cutoff model for the number of reads required to make a template consensus sequence. We have modeled the fraction of sequences lost due to Primer ID resampling. For a typical sequencing run, less than 10% of the raw reads are lost to offspring Primer ID filtering and resampling. The remaining raw reads are used to correct for PCR resampling and sequencing errors. We also demonstrate that Primer ID reveals bias intrinsic to PCR, especially at low template input or utilization. cDNA synthesis and PCR convert ca. 20% of RNA templates into recoverable sequences, and 30-fold sequence coverage recovers most of these template sequences. We have directly measured the residual error rate to be around 1 in 10,000 nucleotides. We use this error rate and the Poisson distribution to define the cutoff to identify preexisting drug resistance mutations at low abundance in an HIV-infected subject. Collectively, these studies show that >90% of the raw sequence reads can be used to validate template sampling depth and to dramatically reduce the error rate in assessing a genetically diverse viral population using NGS. IMPORTANCE Although next-generation sequencing (NGS) has revolutionized sequencing strategies, it suffers from serious limitations in defining sequence heterogeneity in a genetically diverse population, such as HIV-1 due to PCR resampling and PCR/sequencing errors. The Primer ID approach reveals the true sampling depth and greatly reduces errors. Knowing the sampling depth allows the construction of a model of how to maximize
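
The core idea described in this abstract, grouping reads by their Primer ID tag, discarding low-abundance tags, and collapsing each remaining group into a template consensus sequence, can be sketched in a few lines of Python. This is a simplified illustration and not the authors' pipeline: the fixed `min_reads` threshold stands in for their cutoff model (whose form is not given here), reads are assumed to be pre-aligned and of equal length, and all names are hypothetical.

```python
from collections import Counter, defaultdict


def template_consensus(tagged_reads, min_reads=3):
    """Collapse (primer_id, read) pairs into one consensus sequence per Primer ID.

    Primer IDs supported by fewer than min_reads reads are dropped, which is
    how low-abundance (possibly artifactual) tags are filtered in this sketch.
    Reads within a group are assumed to be aligned and of equal length.
    """
    groups = defaultdict(list)
    for primer_id, read in tagged_reads:
        groups[primer_id].append(read)

    consensus = {}
    for primer_id, reads in groups.items():
        if len(reads) < min_reads:
            continue  # too few reads to trust this tag
        # Majority vote at each aligned position; ties go to the first base seen.
        consensus[primer_id] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*reads)
        )
    return consensus


if __name__ == "__main__":
    reads = [
        ("AAGT", "ACGTAC"),
        ("AAGT", "ACGTAC"),
        ("AAGT", "ACTTAC"),  # one read carrying a sequencing error
        ("AAGA", "ACGTAC"),  # single-read tag, filtered out
    ]
    print(template_consensus(reads))  # {'AAGT': 'ACGTAC'}
```

The majority vote removes the isolated error in the third read, and the single-read tag is discarded, mirroring in miniature how consensus building reduces the error rate.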
Integrating multiple genomic data to predict disease-causing nonsynonymous single nucleotide variants in exome sequencing studies.

PubMed

Wu, Jiaxin; Li, Yanda; Jiang, Rui

2014-03-01

Exome sequencing has been widely used in detecting pathogenic nonsynonymous single nucleotide variants (SNVs) for human inherited diseases. However, traditional statistical genetics methods are ineffective in analyzing exome sequencing data, due to such facts as the large number of sequenced variants, the presence of non-negligible fraction of pathogenic rare variants or de novo mutations, and the limited size of affected and normal populations. Indeed, prevalent applications of exome sequencing have been appealing for an effective computational method for identifying causative nonsynonymous SNVs from a large number of sequenced variants. Here, we propose a bioinformatics approach called SPRING (Snv PRioritization via the INtegration of Genomic data) for identifying pathogenic nonsynonymous SNVs for a given query disease. Based on six functional effect scores calculated by existing methods (SIFT, PolyPhen2, LRT, MutationTaster, GERP and PhyloP) and five association scores derived from a variety of genomic data sources (gene ontology, protein-protein interactions, protein sequences, protein domain annotations and gene pathway annotations), SPRING calculates the statistical significance that an SNV is causative for a query disease and hence provides a means of prioritizing candidate SNVs. With a series of comprehensive validation experiments, we demonstrate that SPRING is valid for diseases whose genetic bases are either partly known or completely unknown and effective for diseases with a variety of inheritance styles. In applications of our method to real exome sequencing data sets, we show the capability of SPRING in detecting causative de novo mutations for autism, epileptic encephalopathies and intellectual disability. We further provide an online service, the standalone software and genome-wide predictions of causative SNVs for 5,080 diseases at http://bioinfo.au.tsinghua.edu.cn/spring.
Complete nucleotide sequence and genome structure of a Japanese isolate of hibiscus latent Fort Pierce virus, a unique tobamovirus that contains an internal poly(A) region in its 3' end.

PubMed

Yoshida, Tetsuya; Kitazawa, Yugo; Komatsu, Ken; Neriya, Yutaro; Ishikawa, Kazuya; Fujita, Naoko; Hashimoto, Masayoshi; Maejima, Kensaku; Yamaji, Yasuyuki; Namba, Shigetou

2014-11-01

In this study, we detected a Japanese isolate of hibiscus latent Fort Pierce virus (HLFPV-J), a member of the genus Tobamovirus, in a hibiscus plant in Japan and determined the complete sequence and organization of its genome. HLFPV-J has four open reading frames (ORFs), each of which shares more than 98 % nucleotide sequence identity with those of other HLFPV isolates. Moreover, HLFPV-J contains a unique internal poly(A) region of variable length, ranging from 44 to 78 nucleotides, in its 3'-untranslated region (UTR), as is the case with hibiscus latent Singapore virus (HLSV), another hibiscus-infecting tobamovirus. The length of the HLFPV-J genome was 6431 nucleotides, including the shortest internal poly(A) region. The sequence identities of ORFs 1, 2, 3 and 4 of HLFPV-J to other tobamoviruses were 46.6-68.7, 49.9-70.8, 31.0-70.8 and 39.4-70.1 %, respectively, at the nucleotide level and 39.8-75.0, 43.6-77.8, 19.2-70.4 and 31.2-74.2 %, respectively, at the amino acid level. The 5'- and 3'-UTRs of HLFPV-J showed 24.3-58.6 and 13.0-79.8 % identity, respectively, to other tobamoviruses. In particular, when compared to other tobamoviruses, each ORF and UTR of HLFPV-J showed the highest sequence identity to those of HLSV. Phylogenetic analysis showed that HLFPV-J, other HLFPV isolates and HLSV constitute a malvaceous-plant-infecting tobamovirus cluster. These results indicate that the genomic structure of HLFPV-J has unique features similar to those of HLSV. To our knowledge, this is the first report of the complete genome sequence of HLFPV.
Quantum-Sequencing: Fast electronic single DNA molecule sequencing

NASA Astrophysics Data System (ADS)

Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

2014-03-01

A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective single-molecule sequencing method. Here, we present the first demonstration of a unique "electronic fingerprint" for each nucleotide (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic states of the nucleobases shift depending on the pH, with the most distinct states identified at acidic pH. We also demonstrate identification of single nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of the beta lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.
Mosaic organization of DNA nucleotides

NASA Technical Reports Server (NTRS)

Peng, C. K.; Buldyrev, S. V.; Havlin, S.; Simons, M.; Stanley, H. E.; Goldberger, A. L.

1994-01-01

Long-range power-law correlations have been reported recently for DNA sequences containing noncoding regions. We address the question of whether such correlations may be a trivial consequence of the known mosaic structure ("patchiness") of DNA. We analyze two classes of controls consisting of patchy nucleotide sequences generated by different algorithms, one without and one with long-range power-law correlations. Although both types of sequences are highly heterogeneous, they are quantitatively distinguishable by an alternative fluctuation analysis method that differentiates local patchiness from long-range correlations. Application of this analysis to selected DNA sequences demonstrates that patchiness is not sufficient to account for long-range correlation properties.
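
Fluctuation analyses of this kind typically start from a "DNA walk", in which a sequence is turned into a one-dimensional trajectory and the spread of displacements is measured over windows of increasing length. The Python sketch below shows only that basic building block, under stated assumptions: the purine/pyrimidine step rule and the plain root-mean-square measure are choices made here for illustration, and this is not the alternative method the abstract refers to, whose details are not given in this record.

```python
import math


def dna_walk(seq: str) -> list:
    """Cumulative walk: step +1 for a pyrimidine (C, T), -1 for a purine (A, G).

    The step rule is an assumption made for this illustration.
    """
    positions = [0]
    for base in seq.upper():
        positions.append(positions[-1] + (1 if base in "CT" else -1))
    return positions


def rms_fluctuation(walk: list, window: int) -> float:
    """Root-mean-square fluctuation of walk displacements over a given window."""
    deltas = [walk[i + window] - walk[i] for i in range(len(walk) - window)]
    mean = sum(deltas) / len(deltas)
    variance = sum((d - mean) ** 2 for d in deltas) / len(deltas)
    return math.sqrt(variance)


if __name__ == "__main__":
    walk = dna_walk("CACACACA")  # strictly alternating pyrimidine/purine
    for window in (1, 2):
        print(window, rms_fluctuation(walk, window))
```

For a long sequence, plotting the fluctuation against window length on log-log axes gives a slope near 0.5 for an uncorrelated walk; deviations from that slope are what analyses of long-range correlation examine.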
Nucleotide sequences of Dictyostelium discoideum developmentally regulated cDNAs rich in (AAC) imply proteins that contain clusters of asparagine, glutamine, or threonine.

PubMed

Shaw, D R; Richter, H; Giorda, R; Ohmachi, T; Ennis, H L

1989-09-01

A Dictyostelium discoideum repetitive element composed of long repeats of the codon (AAC) is found in developmentally regulated transcripts. The concentration of (AAC) sequences is low in mRNA from dormant spores and growing cells and increases markedly during spore germination and multicellular development. The sequence hybridizes to many different-sized Dictyostelium DNA restriction fragments, indicating that it is scattered throughout the genome. Four cDNA clones isolated contain (AAC) sequences in the deduced coding region. Interestingly, the (AAC)-rich sequences are present in all three reading frames in the deduced proteins, i.e., AAC (asparagine), ACA (threonine) and CAA (glutamine). Three of the clones contain only one of these in-frame so that the individual proteins carry either asparagine, threonine, or glutamine clusters, not mixtures. However, one clone is both glutamine- and asparagine-rich. The (AAC) portion of the transcripts is reiterated 300 times in the haploid genome, while the other portions of the cDNAs represent single-copy genes, whose sequences show no similarity other than the (AAC) repeats. The repeated sequence is similar to the opa or M sequence found in Drosophila melanogaster notch and homeo box genes and in fly developmentally regulated transcripts. The transcripts are present on polysomes, suggesting that they are translated. Although the function of these repeats is unknown, long amino acid repeats are a characteristic feature of extracellular proteins of lower eukaryotes.
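
The statement that one repeat yields asparagine, threonine, or glutamine clusters depending on the reading frame can be verified directly by splitting an (AAC) run into codons at each of the three offsets. A minimal Python sketch follows; only the three codons named in the abstract are included in the lookup table, and the function name is illustrative.

```python
# The three codons produced by reading an (AAC)n repeat in each frame,
# with their standard genetic code assignments.
REPEAT_CODONS = {"AAC": "Asn", "ACA": "Thr", "CAA": "Gln"}


def translate_frames(unit: str = "AAC", copies: int = 4) -> dict:
    """Translate a tandem repeat in reading frames 0, 1 and 2."""
    seq = unit * copies
    frames = {}
    for frame in range(3):
        # Take complete codons only, starting at the frame offset.
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        frames[frame] = [REPEAT_CODONS[codon] for codon in codons]
    return frames


if __name__ == "__main__":
    for frame, residues in translate_frames().items():
        print(frame, "-".join(residues))
    # 0 Asn-Asn-Asn-Asn
    # 1 Thr-Thr-Thr
    # 2 Gln-Gln-Gln
```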
iCLIP: Protein–RNA interactions at nucleotide resolution

PubMed Central

Huppertz, Ina; Attig, Jan; D’Ambrogio, Andrea; Easton, Laura E.; Sibley, Christopher R.; Sugimoto, Yoichiro; Tajnik, Mojca; König, Julian; Ule, Jernej

2014-01-01

RNA-binding proteins (RBPs) are key players in the post-transcriptional regulation of gene expression. Precise knowledge about their binding sites is therefore critical to unravel their molecular function and to understand their role in development and disease. Individual-nucleotide resolution UV crosslinking and immunoprecipitation (iCLIP) identifies protein–RNA crosslink sites on a genome-wide scale. The high resolution and specificity of this method are achieved by an intramolecular cDNA circularization step that enables analysis of cDNAs that truncated at the protein–RNA crosslink sites. Here, we describe the improved iCLIP protocol and discuss critical optimization and control experiments that are required when applying the method to new RBPs. PMID:24184352
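Because the cDNAs truncate at the crosslink, the crosslinked nucleotide can be inferred from where each mapped cDNA begins. The sketch below assumes the common convention of assigning the site to the position immediately upstream of the cDNA start. The tuple layout, the 1-based inclusive coordinates, and the function names are assumptions made for this example, not part of the published protocol.

```python
from collections import Counter

def crosslink_site(start, end, strand):
    """Return the position just upstream of the cDNA's 5' end.

    `start` and `end` are 1-based inclusive genomic coordinates of the
    aligned cDNA (assumed convention for this sketch).
    """
    if strand == "+":
        return start - 1
    if strand == "-":
        return end + 1
    raise ValueError("strand must be '+' or '-'")

def crosslink_counts(alignments):
    """Count cDNAs per inferred crosslink site.

    `alignments` is an iterable of (chrom, start, end, strand) tuples.
    """
    return Counter((chrom, crosslink_site(start, end, strand), strand)
                   for chrom, start, end, strand in alignments)
```

For example, two plus-strand cDNAs that both start at position 101 support one crosslink site at position 100, whatever their lengths.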
Nucleotide sequence variation at two genes of the phenylpropanoid pathway, the FAH1 and F3H genes, in Arabidopsis thaliana.

PubMed

Aguadé, M

2001-01-01

The FAH1 and F3H genes encode ferulate-5-hydroxylase and flavanone-3-hydroxylase, which are enzymes in the pathways leading to the synthesis of sinapic acid esters and flavonoids, respectively. Nucleotide variation at these genes was surveyed by sequencing a sample of 20 worldwide Arabidopsis thaliana ecotypes and one Arabidopsis lyrata spp. petraea stock. In contrast with most previously studied genes, the percentage of singletons was rather low in both the FAH1 and the F3H gene regions. There was, therefore, no footprint of a recent species expansion in the pattern of nucleotide variation in these regions. In both FAH1 and F3H, nucleotide variation was structured into two major highly differentiated haplotypes. In both genes, there was a peak of silent polymorphism in the 5' part of the coding region without a parallel increase in silent divergence. In FAH1, the peak was centered at the beginning of the second exon. In F3H, nucleotide diversity was highest at the beginning of the gene. The observed pattern of variation in both FAH1 and F3H, although suggestive of balancing selection, was compatible with a neutral model with no recombination.
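Two of the summary statistics mentioned above, nucleotide diversity and the number of singletons, can be computed directly from an alignment. The sketch below is illustrative only: it assumes equal-length, gap-free aligned sequences, and it treats a site as a singleton when its rarest variant occurs in exactly one sequence. The function names are invented for this example.

```python
from collections import Counter
from itertools import combinations

def nucleotide_diversity(alignment):
    """Average number of pairwise differences per site."""
    length = len(alignment[0])
    pairs = list(combinations(alignment, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)

def site_classes(alignment):
    """Return (segregating sites, singleton sites)."""
    segregating = singletons = 0
    for column in zip(*alignment):
        counts = Counter(column)
        if len(counts) > 1:
            segregating += 1
            if min(counts.values()) == 1:
                singletons += 1
    return segregating, singletons
```

A low singleton share among the segregating sites, as reported for FAH1 and F3H, is the pattern the abstract contrasts with the excess of rare variants expected after a recent expansion.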
Analysis of xylem formation in pine by cDNA sequencing

NASA Technical Reports Server (NTRS)

Allona, I.; Quinn, M.; Shoop, E.; Swope, K.; St Cyr, S.; Carlis, J.; Riedl, J.; Retzel, E.; Campbell, M. M.; Sederoff, R.;

1998-01-01

Secondary xylem (wood) formation is likely to involve some genes expressed rarely or not at all in herbaceous plants. Moreover, environmental and developmental stimuli influence secondary xylem differentiation, producing morphological and chemical changes in wood. To increase our understanding of xylem formation, and to provide material for comparative analysis of gymnosperm and angiosperm sequences, ESTs were obtained from immature xylem of loblolly pine (Pinus taeda L.). A total of 1,097 single-pass sequences were obtained from 5' ends of cDNAs made from gravistimulated tissue from bent trees. Cluster analysis detected 107 groups of similar sequences, ranging in size from 2 to 20 sequences. A total of 361 sequences fell into these groups, whereas 736 sequences were unique. About 55% of the pine EST sequences show similarity to previously described sequences in public databases. About 10% of the recognized genes encode factors involved in cell wall formation. Sequences similar to cell wall proteins, most known lignin biosynthetic enzymes, and several enzymes of carbohydrate metabolism were found. A number of putative regulatory proteins also are represented. Expression patterns of several of these genes were studied in various tissues and organs of pine. Sequencing novel genes expressed during xylem formation will provide a powerful means of identifying mechanisms controlling this important differentiation pathway.

Construction of a cDNA library and preliminary analysis of expressed sequence tags in Piper hainanense.

PubMed

Fan, R; Ling, P; Hao, C Y; Li, F P; Huang, L F; Wu, B D; Wu, H S

2015-10-19

Black pepper is a perennial climbing vine. It is widely cultivated because its berries can be utilized not only as a spice in food but also for medicinal use. This study aimed to construct a standardized, high-quality cDNA library to facilitate identification of new Piper hainanense transcripts. For this, 262 unigenes were used to generate raw reads. The average length of these 262 unigenes was 774.8 bp. Of these, 94 genes (35.9%) were newly identified, according to the NCBI protein database. Thus, identification of new genes may broaden the molecular knowledge of P. hainanense on the basis of Clusters of Orthologous Groups and Gene Ontology categories. In addition, certain basic genes linked to physiological processes were identified, which can contribute to disease resistance and thereby to the breeding of black pepper. A total of 26 unigenes were found to be SSR markers. Dinucleotide SSR was the main repeat motif, accounting for 61.54%, followed by trinucleotide SSR (23.07%). Eight primer pairs successfully amplified DNA fragments and detected significant amounts of polymorphism among twenty-one Piper germplasms. These results present novel sequence information for P. hainanense, which can serve as the foundation for further genetic research on this species.
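Simple sequence repeats (SSRs) like those counted above are usually found by scanning sequences for tandemly repeated short units. The sketch below uses a regular expression with a back-reference. The minimum repeat counts (six for dinucleotides, five for trinucleotides) and the function name are assumptions for illustration, since the abstract does not state the criteria the authors used.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return (start, unit, repeat_count) for each tandem repeat found.

    `min_repeats` maps unit length to the minimum number of copies;
    the defaults are illustrative thresholds, not the study's settings.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # A unit followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if len(set(unit)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits
```

Applied to each unigene, the share of hits per unit length gives motif proportions of the kind reported above.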
Structure of the coding region and mRNA variants of the apyrase gene from pea (Pisum sativum)

NASA Technical Reports Server (NTRS)

Shibata, K.; Abe, S.; Davies, E.

2001-01-01

Partial amino acid sequences of a 49 kDa apyrase (ATP diphosphohydrolase, EC 3.6.1.5) from the cytoskeletal fraction of etiolated pea stems were used to derive oligonucleotide DNA primers to generate a cDNA fragment of pea apyrase mRNA by RT-PCR and these primers were used to screen a pea stem cDNA library. Two almost identical cDNAs differing in just 6 nucleotides within the coding regions were found, and these cDNA sequences were used to clone genomic fragments by PCR. Two nearly identical gene fragments containing 8 exons and 7 introns were obtained. One of them (H-type) encoded the mRNA sequence described by Hsieh et al. (1996) (DDBJ/EMBL/GenBank Z32743), while the other (S-type) differed by the same 6 nucleotides as the mRNAs, suggesting that these genes may be alleles. The six nucleotide differences between these two alleles were found solely in the first exon, and these mutation sites had two types of consensus sequences. These mRNAs were found with varying lengths of 3' untranslated regions (3'-UTR). There are some similarities between the 3'-UTR of these mRNAs and those of actin and actin binding proteins in plants. The putative roles of the 3'-UTR and alternative polyadenylation sites are discussed in relation to their possible role in targeting the mRNAs to different subcellular compartments.
An efficient and sensitive method for preparing cDNA libraries from scarce biological samples

PubMed Central

Sterling, Catherine H.; Veksler-Lublinsky, Isana; Ambros, Victor

2015-01-01

The preparation and high-throughput sequencing of cDNA libraries from samples of small RNA is a powerful tool to quantify known small RNAs (such as microRNAs) and to discover novel RNA species. Interest in identifying the small RNA repertoire present in tissues and in biofluids has grown substantially with the findings that small RNAs can serve as indicators of biological conditions and disease states. Here we describe a novel and straightforward method to clone cDNA libraries from small quantities of input RNA. This method permits the generation of cDNA libraries from sub-picogram quantities of RNA robustly, efficiently and reproducibly. We demonstrate that the method provides a significant improvement in sensitivity compared to previous cloning methods while maintaining reproducible identification of diverse small RNA species. This method should have widespread applications in a variety of contexts, including biomarker discovery from scarce samples of human tissue or body fluids. PMID:25056322
Identification of mitochondrial DNA sequence variation and development of single nucleotide polymorphic markers for CMS-D8 in cotton.

PubMed

Suzuki, Hideaki; Yu, Jiwen; Wang, Fei; Zhang, Jinfa

2013-06-01

Cytoplasmic male sterility (CMS), which is a maternally inherited trait and controlled by novel chimeric genes in the mitochondrial genome, plays a pivotal role in the production of hybrid seed. In cotton, no PCR-based marker has been developed to discriminate CMS-D8 (from Gossypium trilobum) from its normal Upland cotton (AD1, Gossypium hirsutum) cytoplasm. The objective of the current study was to develop PCR-based single nucleotide polymorphic (SNP) markers from mitochondrial genes for the CMS-D8 cytoplasm. DNA sequence variation in mitochondrial genes involved in the oxidative phosphorylation chain including ATP synthase subunit 1, 4, 6, 8 and 9, and cytochrome c oxidase 1, 2 and 3 subunits were identified by comparing CMS-D8, its isogenic maintainer and restorer lines on the same nuclear genetic background. An allele-specific PCR (AS-PCR) was utilized for SNP typing by incorporating artificial mismatched nucleotides into the third or fourth base from the 3' terminus in both the specific and nonspecific primers. The result indicated that the method modifying allele-specific primers was successful in obtaining eight SNP markers out of eight SNPs using eight primer pairs to discriminate two alleles between AD1 and CMS-D8 cytoplasms. Two of the SNPs for atp1 and cox1 could also be used in combination to discriminate between CMS-D8 and CMS-D2 cytoplasms. Additionally, a PCR-based marker from a nine nucleotide insertion-deletion (InDel) sequence (AATTGTTTT) at the 59-67 bp positions from the start codon of atp6, which is present in the CMS and restorer lines with the D8 cytoplasm but absent in the maintainer line with the AD1 cytoplasm, was also developed. A SNP marker for two nucleotide substitutions (AA in AD1 cytoplasm to CT in CMS-D8 cytoplasm) in the intron (1,506 bp) of cox2 gene was also developed. These PCR-based SNP markers should be useful in discriminating CMS-D8 and AD1 cytoplasms, or those with CMS-D2 cytoplasm as a rapid, simple, inexpensive, and
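The primer modification described above (an allele-specific 3' base plus an artificial mismatch a few bases upstream) can be sketched as a small string operation. Everything below is hypothetical: the substitution table, the default primer length, and the function name are illustrative choices and do not reproduce the authors' primer design.

```python
# Arbitrary illustrative substitutions used to create the deliberate mismatch.
MISMATCH = {"A": "C", "C": "A", "G": "T", "T": "G"}

def allele_specific_primer(template, snp_index, allele, length=20, mismatch_from_3p=3):
    """Build a forward primer whose 3' terminal base sits on the SNP.

    `snp_index` is the 0-based position of the SNP in `template`;
    `mismatch_from_3p` is 3 or 4 for the third or fourth base from the 3' end.
    """
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    primer = list(template[start:snp_index] + allele)
    position = length - mismatch_from_3p  # index of the Nth base from the 3' end
    primer[position] = MISMATCH[primer[position]]
    return "".join(primer)
```

Calling the function once per allele gives a primer pair that differs only at the 3' terminal base, while both carry the same deliberate internal mismatch.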
The nucleotide sequence of RNA1 of Lettuce big-vein virus, genus Varicosavirus, reveals its relation to nonsegmented negative-strand RNA viruses.

PubMed

Sasaya, Takahide; Ishikawa, Koichi; Koganezawa, Hiroki

2002-06-05

The complete nucleotide sequence of RNA1 from Lettuce big-vein virus (LBVV), the type member of the genus Varicosavirus, was determined. LBVV RNA1 consists of 6797 nucleotides and contains one large ORF that encodes a large (L) protein of 2040 amino acids with a predicted M(r) of 232,092. Northern blot hybridization analysis indicated that the LBVV RNA1 is a negative-sense RNA. Database searches showed that the amino acid sequence of L protein is homologous to those of L polymerases of nonsegmented negative-strand RNA viruses. A cluster dendrogram derived from alignments of the LBVV L protein and the L polymerases indicated that the L protein is most closely related to the L polymerases of plant rhabdoviruses. Transcription termination/polyadenylation signal-like poly(U) tracts that resemble those in rhabdovirus and paramyxovirus RNAs were present upstream and downstream of the coding region. Although LBVV is related to rhabdoviruses, a key distinguishing feature is that the genome of LBVV is segmented. The results reemphasize the need to reconsider the taxonomic position of varicosaviruses.
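For a negative-sense genome such as the one described above, the coding frame lies on the strand complementary to the genomic RNA, so an ORF search has to cover both strands. The following is a minimal sketch of such a search on a DNA-alphabet sequence. The function names are illustrative, and the search only reports ORFs that begin with ATG and end at an in-frame stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def longest_orf(seq):
    """Longest ATG-to-stop ORF over all six reading frames (stop included)."""
    best = ""
    for strand in (seq, revcomp(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start > len(best):
                        best = strand[start:i + 3]
                    start = None
    return best
```

As a consistency check on the numbers in the abstract, an ORF encoding 2040 amino acids occupies 2040 x 3 = 6120 nucleotides plus a 3-nucleotide stop codon, which fits within the 6797-nucleotide RNA1.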
Nucleotide Interdependency in Transcription Factor Binding Sites in the Drosophila Genome.

PubMed

Dresch, Jacqueline M; Zellers, Rowan G; Bork, Daniel K; Drewell, Robert A

2016-01-01

A long-standing objective in modern biology is to characterize the molecular components that drive the development of an organism. At the heart of eukaryotic development lies gene regulation. On the molecular level, much of the research in this field has focused on the binding of transcription factors (TFs) to regulatory regions in the genome known as cis-regulatory modules (CRMs). However, relatively little is known about the sequence-specific binding preferences of many TFs, especially with respect to the possible interdependencies between the nucleotides that make up binding sites. A particular limitation of many existing algorithms that aim to predict binding site sequences is that they do not allow for dependencies between nonadjacent nucleotides. In this study, we use a recently developed computational algorithm, MARZ, to compare binding site sequences using 32 distinct models in a systematic and unbiased approach to explore nucleotide dependencies within binding sites for 15 distinct TFs known to be critical to Drosophila development. Our results indicate that many of these proteins have varying levels of nucleotide interdependencies within their DNA recognition sequences, and that, in some cases, models that account for these dependencies greatly outperform traditional models that are used to predict binding sites. We also directly compare the ability of different models to identify the known KRUPPEL TF binding sites in CRMs and demonstrate that a more complex model that accounts for nucleotide interdependencies performs better when compared with simple models. This ability to identify TFs with critical nucleotide interdependencies in their binding sites will lead to a deeper understanding of how these molecular characteristics contribute to the architecture of CRMs and the precise regulation of transcription during organismal development.
Nucleotide Interdependency in Transcription Factor Binding Sites in the Drosophila Genome

PubMed Central

Dresch, Jacqueline M.; Zellers, Rowan G.; Bork, Daniel K.; Drewell, Robert A.

2016-01-01

A long-standing objective in modern biology is to characterize the molecular components that drive the development of an organism. At the heart of eukaryotic development lies gene regulation. On the molecular level, much of the research in this field has focused on the binding of transcription factors (TFs) to regulatory regions in the genome known as cis-regulatory modules (CRMs). However, relatively little is known about the sequence-specific binding preferences of many TFs, especially with respect to the possible interdependencies between the nucleotides that make up binding sites. A particular limitation of many existing algorithms that aim to predict binding site sequences is that they do not allow for dependencies between nonadjacent nucleotides. In this study, we use a recently developed computational algorithm, MARZ, to compare binding site sequences using 32 distinct models in a systematic and unbiased approach to explore nucleotide dependencies within binding sites for 15 distinct TFs known to be critical to Drosophila development. Our results indicate that many of these proteins have varying levels of nucleotide interdependencies within their DNA recognition sequences, and that, in some cases, models that account for these dependencies greatly outperform traditional models that are used to predict binding sites. We also directly compare the ability of different models to identify the known KRUPPEL TF binding sites in CRMs and demonstrate that a more complex model that accounts for nucleotide interdependencies performs better when compared with simple models. This ability to identify TFs with critical nucleotide interdependencies in their binding sites will lead to a deeper understanding of how these molecular characteristics contribute to the architecture of CRMs and the precise regulation of transcription during organismal development. PMID:27330274
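One simple way to see whether two positions in a set of aligned binding sites depend on each other is to compute their mutual information. This is not the MARZ algorithm used in the study. It is only a small illustration of what a dependency between nonadjacent nucleotides means, and the function name and the toy site list are invented for this example.

```python
from collections import Counter
from math import log2

def mutual_information(sites, i, j):
    """Mutual information (in bits) between columns i and j of aligned sites."""
    n = len(sites)
    col_i = Counter(site[i] for site in sites)
    col_j = Counter(site[j] for site in sites)
    joint = Counter((site[i], site[j]) for site in sites)
    return sum((count / n) * log2((count / n) / ((col_i[a] / n) * (col_j[b] / n)))
               for (a, b), count in joint.items())

# Toy alignment: positions 0 and 3 always change together.
sites = ["ACGT", "ACGT", "TCGA", "TCGA"]
print(mutual_information(sites, 0, 3))  # two fully linked, evenly split columns
print(mutual_information(sites, 0, 1))  # column 1 never varies
```

A model that scores each position independently cannot capture the first case, which is the limitation of traditional models that the abstract points to.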
Cloning and expression of the cDNA encoding human fumarylacetoacetate hydrolase, the enzyme deficient in hereditary tyrosinemia: assignment of the gene to chromosome 15.

PubMed Central

Phaneuf, D; Labelle, Y; Bérubé, D; Arden, K; Cavenee, W; Gagné, R; Tanguay, R M

1991-01-01

Type 1 hereditary tyrosinemia (HT) is an autosomal recessive disease characterized by a deficiency of the enzyme fumarylacetoacetate hydrolase (FAH; E.C.3.7.1.2). We have isolated human FAH cDNA clones by screening a liver cDNA expression library using specific antibodies and plaque hybridization with a rat FAH cDNA probe. A 1,477-bp cDNA was sequenced and shown to code for FAH by an in vitro transcription-translation assay and sequence homology with tryptic fragments of purified FAH. Transient expression of this FAH cDNA in transfected CV-1 mammalian cells resulted in the synthesis of an immunoreactive protein comigrating with purified human liver FAH on SDS-PAGE and having enzymatic activity as shown by the hydrolysis of the natural substrate fumarylacetoacetate. This indicates that the single polypeptide chain encoded by the FAH gene contains all the genetic information required for functional activity, suggesting that the dimer found in vivo is a homodimer. The human FAH cDNA was used as a probe to determine the gene's chromosomal localization using somatic cell hybrids and in situ hybridization. The human FAH gene maps to the long arm of chromosome 15 in the region q23-q25. PMID:1998338
MIG-seq: an effective PCR-based method for genome-wide single-nucleotide polymorphism genotyping using the next-generation sequencing platform

PubMed Central

Suyama, Yoshihisa; Matsuki, Yu

2015-01-01

Restriction-enzyme (RE)-based next-generation sequencing methods have revolutionized marker-assisted genetic studies; however, the use of REs has limited their widespread adoption, especially in field samples with low-quality DNA and/or small quantities of DNA. Here, we developed a PCR-based procedure to construct reduced representation libraries without RE digestion steps, representing de novo single-nucleotide polymorphism discovery, and its genotyping using next-generation sequencing. Using multiplexed inter-simple sequence repeat (ISSR) primers, thousands of genome-wide regions were amplified effectively from a wide variety of genomes, without prior genetic information. We demonstrated: 1) Mendelian gametic segregation of the discovered variants; 2) reproducibility of genotyping by checking its applicability for individual identification; and 3) applicability in a wide variety of species by checking standard population genetic analysis. This approach, called multiplexed ISSR genotyping by sequencing, should be applicable to many marker-assisted genetic studies with a wide range of DNA qualities and quantities. PMID:26593239
Nucleotide sequence of the ribosomal RNA gene of Physarum polycephalum: intron 2 and its flanking regions of the 26S rRNA gene.

PubMed Central

Nomiyama, H; Kuhara, S; Kukita, T; Otsuka, T; Sakaki, Y

1981-01-01

The 26S ribosomal RNA gene of Physarum polycephalum is interrupted by two introns, and we have previously determined the sequence of one of them (intron 1) (Nomiyama et al. Proc.Natl.Acad.Sci.USA 78, 1376-1380, 1981). In this study we sequenced the second intron (intron 2) of about 0.5 kb length and its flanking regions, and found that one nucleotide at each junction is identical in intron 1 and intron 2, though the junction regions share no other sequence homology. Comparison of the flanking exon sequences to E. coli 23S rRNA sequences shows that conserved sequences are interspersed with tracts having little homology. In particular, the region encompassing the intron 2 interruption site is highly conserved. The E. coli ribosomal protein L1 binding region is also conserved. PMID:6171776
Cu,Zn superoxide dismutase: cloning and analysis of the Taenia solium gene and Taenia crassiceps cDNA.

PubMed

Parra-Unda, Ricardo; Vaca-Paniagua, Felipe; Jiménez, Lucia; Landa, Abraham

2012-01-01

Cytosolic Cu,Zn superoxide dismutase (Cu,Zn-SOD) catalyzes the dismutation of superoxide (O(2)(-)) to oxygen and hydrogen peroxide (H(2)O(2)) and plays an important role in the establishment and survival of helminths in their hosts. In this work, we describe the Taenia solium Cu,Zn-SOD gene (TsCu,Zn-SOD) and a Taenia crassiceps (TcCu,Zn-SOD) cDNA. The TsCu,Zn-SOD gene spans 2.841 kb and has three exons and two introns; the splicing junctions follow the GT-AG rule. In silico analysis of the gene revealed that the 5'-flanking region has three putative TATA and CCAAT boxes, and transcription factor binding sites for NF1 and AP1. The transcription start site was a C, located 22 nucleotides upstream of the translation start codon (ATG). Southern blot analysis showed that TcCu,Zn-SOD and TsCu,Zn-SOD genes are encoded by a single copy. The deduced amino acid sequences of the TsCu,Zn-SOD gene and TcCu,Zn-SOD cDNA reveal 98.47% identity and the characteristic motifs, including the catalytic site and β-barrel structure of the Cu,Zn-SOD. Proteomic and immunohistochemical analysis indicated that Cu,Zn-SOD does not have isoforms, is distributed throughout the bladder wall and is concentrated in the tegument of T. solium and T. crassiceps cysticerci. Expression analysis revealed that TcCu,Zn-SOD mRNA and protein expression levels do not change in cysticerci, even upon exposure to O(2)(-) (0-3.8 nmol/min) and H(2)O(2) (0-2mM), suggesting that this gene is constitutively expressed in these parasites. Published by Elsevier Inc.
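The GT-AG rule mentioned above states that an intron begins with GT and ends with AG on the coding strand. Given a gene sequence and its exon coordinates, the rule is easy to check. The sketch below is illustrative: the 0-based half-open coordinate convention, the function names, and the toy gene are all assumptions, not data from the study.

```python
def introns_from_exons(exons):
    """Intron (start, end) intervals lying between consecutive exons.

    `exons` is a sorted list of 0-based, half-open (start, end) pairs.
    """
    return [(exons[k][1], exons[k + 1][0]) for k in range(len(exons) - 1)]

def follows_gt_ag(gene, exons):
    """True if every intron starts with GT and ends with AG."""
    return all(gene[start:start + 2] == "GT" and gene[end - 2:end] == "AG"
               for start, end in introns_from_exons(exons))

# Toy gene with three exons and two introns (made-up sequence).
gene = "ATGGCC" + "GTAAGTTTTCAG" + "GGATTT" + "GTCCCCAG" + "TAA"
exons = [(0, 6), (18, 24), (32, 35)]
print(follows_gt_ag(gene, exons))
```

Three exons always yield two introns, which matches the gene structure reported in the abstract.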
Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones

PubMed Central

Imanishi, Tadashi; Itoh, Takeshi; Suzuki, Yutaka; O'Donovan, Claire; Fukuchi, Satoshi; Koyanagi, Kanako O; Barrero, Roberto A; Tamura, Takuro; Yamaguchi-Kabata, Yumi; Tanino, Motohiko; Yura, Kei; Miyazaki, Satoru; Ikeo, Kazuho; Homma, Keiichi; Kasprzyk, Arek; Nishikawa, Tetsuo; Hirakawa, Mika; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Ashurst, Jennifer; Jia, Libin; Nakao, Mitsuteru; Thomas, Michael A; Mulder, Nicola; Karavidopoulou, Youla; Jin, Lihua; Kim, Sangsoo; Yasuda, Tomohiro; Lenhard, Boris; Eveno, Eric; Suzuki, Yoshiyuki; Yamasaki, Chisato; Takeda, Jun-ichi; Gough, Craig; Hilton, Phillip; Fujii, Yasuyuki; Sakai, Hiroaki; Tanaka, Susumu; Amid, Clara; Bellgard, Matthew; Bonaldo, Maria de Fatima; Bono, Hidemasa; Bromberg, Susan K; Brookes, Anthony J; Bruford, Elspeth; Carninci, Piero; Chelala, Claude; Couillault, Christine; de Souza, Sandro J.; Debily, Marie-Anne; Devignes, Marie-Dominique; Dubchak, Inna; Endo, Toshinori; Estreicher, Anne; Eyras, Eduardo; Fukami-Kobayashi, Kaoru; R. Gopinath, Gopal; Graudens, Esther; Hahn, Yoonsoo; Han, Michael; Han, Ze-Guang; Hanada, Kousuke; Hanaoka, Hideki; Harada, Erimi; Hashimoto, Katsuyuki; Hinz, Ursula; Hirai, Momoki; Hishiki, Teruyoshi; Hopkinson, Ian; Imbeaud, Sandrine; Inoko, Hidetoshi; Kanapin, Alexander; Kaneko, Yayoi; Kasukawa, Takeya; Kelso, Janet; Kersey, Paul; Kikuno, Reiko; Kimura, Kouichi; Korn, Bernhard; Kuryshev, Vladimir; Makalowska, Izabela; Makino, Takashi; Mano, Shuhei; Mariage-Samson, Regine; Mashima, Jun; Matsuda, Hideo; Mewes, Hans-Werner; Minoshima, Shinsei; Nagai, Keiichi; Nagasaki, Hideki; Nagata, Naoki; Nigam, Rajni; Ogasawara, Osamu; Ohara, Osamu; Ohtsubo, Masafumi; Okada, Norihiro; Okido, Toshihisa; Oota, Satoshi; Ota, Motonori; Ota, Toshio; Otsuki, Tetsuji; Piatier-Tonneau, Dominique; Poustka, Annemarie; Ren, Shuang-Xi; Saitou, Naruya; Sakai, Katsunaga; Sakamoto, Shigetaka; Sakate, Ryuichi; Schupp, Ingo; Servant, Florence; Sherry, Stephen; Shiba, Rie; Shimizu, Nobuyoshi; Shimoyama, Mary; Simpson, Andrew J; Soares, Bento; Steward, Charles; Suwa, Makiko; Suzuki, Mami; Takahashi, Aiko; Tamiya, Gen; Tanaka, Hiroshi; Taylor, Todd; Terwilliger, Joseph D; Unneberg, Per; Veeramachaneni, Vamsi; Watanabe, Shinya; Wilming, Laurens; Yasuda, Norikazu; Yoo, Hyang-Sook; Stodolsky, Marvin; Makalowski, Wojciech; Go, Mitiko; Nakai, Kenta; Takagi, Toshihisa; Kanehisa, Minoru; Sakaki, Yoshiyuki; Quackenbush, John; Okazaki, Yasushi; Hayashizaki, Yoshihide; Hide, Winston; Chakraborty, Ranajit; Nishikawa, Ken; Sugawara, Hideaki; Tateno, Yoshio; Chen, Zhu; Oishi, Michio; Tonellato, Peter; Apweiler, Rolf; Okubo, Kousaku; Wagner, Lukas; Wiemann, Stefan; Strausberg, Robert L; Isogai, Takao; Auffray, Charles; Nomura, Nobuo; Sugano, Sumio

2004-01-01

The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. In
The primary structure of L37--a rat ribosomal protein with a zinc finger-like motif.

PubMed

Chan, Y L; Paz, V; Olvera, J; Wool, I G

1993-04-30

The amino acid sequence of the rat 60S ribosomal subunit protein L37 was deduced from the sequence of nucleotides in a recombinant cDNA. Ribosomal protein L37 has 96 amino acids (the NH2-terminal methionine is removed after translation of the mRNA) and a molecular weight of 10,939. Ribosomal protein L37 has a single zinc finger-like motif of the C2-C2 type. Hybridization of the cDNA to digests of nuclear DNA suggests that there are 13 or 14 copies of the L37 gene. The mRNA for the protein is about 500 nucleotides in length. Rat L37 is related to Saccharomyces cerevisiae ribosomal protein YL35 and to Caenorhabditis elegans L37. We have identified in the database a DNA sequence that encodes the chicken homolog of rat L37.
Characterization and Nucleotide Sequence of CARB-6, a New Carbenicillin-Hydrolyzing β-Lactamase from Vibrio cholerae

PubMed Central

Choury, Danièle; Aubert, Gérald; Szajnert, Marie-France; Azibi, Kemal; Delpech, Marc; Paul, Gérard

1999-01-01

A clinical strain of Vibrio cholerae non-O1 non-O139 isolated in France produced a new β-lactamase with a pI of 5.35. The purified enzyme, with a molecular mass of 33,000 Da, was characterized. Its kinetic constants show it to be a carbenicillin-hydrolyzing enzyme comparable to the five previously reported CARB β-lactamases and to SAR-1, another carbenicillin-hydrolyzing β-lactamase that has a pI of 4.9 and that is produced by a V. cholerae strain from Tanzania. This β-lactamase is designated CARB-6, and the gene for CARB-6 could not be transferred to Escherichia coli K-12 by conjugation. The nucleotide sequence of the structural gene was determined by direct sequencing of PCR-generated fragments from plasmid DNA with four pairs of primers covering the whole sequence of the reference CARB-3 gene. The gene encodes a 288-amino-acid protein that shares 94% homology with the CARB-1, CARB-2, and CARB-3 enzymes, 93% homology with the Proteus mirabilis N29 enzyme, and 86.5% homology with the CARB-4 enzyme. The sequence of CARB-6 differs from those of CARB-3, CARB-2, CARB-1, N29, and CARB-4 at 15, 16, 17, 19, and 37 amino acid positions, respectively. All these mutations are located in the C-terminal region of the sequence and at the surface of the molecule, according to the crystal structure of the Staphylococcus aureus PC-1 β-lactamase. PMID:9925522
A new approach for cloning hLIF cDNA from genomic DNA isolated from the oral mucous membrane.

PubMed

Cui, Y H; Zhu, G Q; Chen, Q J; Wang, Y F; Yang, M M; Song, Y X; Wang, J G; Cao, B Y

2011-11-25

Complementary DNA (cDNA) is valuable for investigating protein structure and function in the study of life science, but it is difficult to obtain by traditional reverse transcription. We employed a novel strategy to clone human leukemia inhibitory factor (hLIF) gene cDNA from genomic DNA, which was directly isolated from the mucous membrane of mouth. The hLIF sequence, which is 609 bp long and is composed of three exons, can be acquired within a few hours by amplifying each exon and splicing all of them using overlap-PCR. This new approach developed is simple, time- and cost-effective, without RNA preparation or cDNA synthesis, and is not limited to the specific tissues for a particular gene and the expression level of the gene.
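The exon-joining idea above can be sketched in code. In overlap PCR, primers that span an exon-exon junction carry the end of one exon and the start of the next, so that separately amplified exons can be fused without the intervening introns. The sketch below is a simplified illustration on toy sequences. The overlap length and function names are assumptions, and real primer design would also consider strand orientation and melting temperature.

```python
def splice_exons(exons):
    """Concatenate exon sequences in order to give the intron-free sequence."""
    return "".join(exons)

def junction_primers(exons, overlap=10):
    """Junction-spanning sequences: the tail of one exon plus the head of the next."""
    return [left[-overlap:] + right[:overlap]
            for left, right in zip(exons, exons[1:])]

# Toy exons (made-up sequence, not hLIF).
exons = ["AAAACCCC", "GGGGTTTT", "ACGTACGT"]
print(splice_exons(exons))
print(junction_primers(exons, overlap=4))
```

Three exons need two junctions, and each junction sequence appears as a substring of the spliced product. The 609-bp hLIF sequence reported above is divisible by three (203 codons), as expected for a complete reading frame.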
A Universal Next-Generation Sequencing Protocol To Generate Noninfectious Barcoded cDNA Libraries from High-Containment RNA Viruses

PubMed Central

Moser, Lindsey A.; Ramirez-Carvajal, Lisbeth; Puri, Vinita; Pauszek, Steven J.; Matthews, Krystal; Dilley, Kari A.; Mullan, Clancy; McGraw, Jennifer; Khayat, Michael; Beeri, Karen; Yee, Anthony; Dugan, Vivien; Heise, Mark T.; Frieman, Matthew B.; Rodriguez, Luis L.; Bernard, Kristen A.; Wentworth, David E.

2016-01-01

ABSTRACT Several biosafety level 3 and/or 4 (BSL-3/4) pathogens are high-consequence, single-stranded RNA viruses, and their genomes, when introduced into permissive cells, are infectious. Moreover, many of these viruses are select agents (SAs), and their genomes are also considered SAs. For this reason, cDNAs and/or their derivatives must be tested to ensure the absence of infectious virus and/or viral RNA before transfer out of the BSL-3/4 and/or SA laboratory. This tremendously limits the capacity to conduct viral genomic research, particularly the application of next-generation sequencing (NGS). Here, we present a sequence-independent method to rapidly amplify viral genomic RNA while simultaneously abolishing both viral and genomic RNA infectivity across multiple single-stranded positive-sense RNA (ssRNA+) virus families. The process generates barcoded DNA amplicons that range in length from 300 to 1,000 bp, which cannot be used to rescue a virus and are stable to transport at room temperature. Our barcoding approach allows for up to 288 barcoded samples to be pooled into a single library and run across various NGS platforms without potential reconstitution of the viral genome. Our data demonstrate that this approach provides full-length genomic sequence information not only from high-titer virion preparations but it can also recover specific viral sequence from samples with limited starting material in the background of cellular RNA, and it can be used to identify pathogens from unknown samples. In summary, we describe a rapid, universal standard operating procedure that generates high-quality NGS libraries free of infectious virus and infectious viral RNA. IMPORTANCE This report establishes and validates a standard operating procedure (SOP) for select agents (SAs) and other biosafety level 3 and/or 4 (BSL-3/4) RNA viruses to rapidly generate noninfectious, barcoded cDNA amenable for next-generation sequencing (NGS). This eliminates the burden of testing all
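Pooling barcoded samples into one library means the reads have to be sorted back to their samples after sequencing. The sketch below shows that demultiplexing step in its simplest form. It assumes, purely for illustration, that each read begins with a fixed-length barcode and that matching is exact. The barcode sequences, sample names, and function name are hypothetical and are not taken from the published procedure.

```python
from collections import defaultdict

def demultiplex(reads, barcodes):
    """Group reads by sample using an exact-match barcode at the read start.

    `barcodes` maps barcode sequence to sample name; all barcodes are
    assumed to have the same length. Matched reads are returned with the
    barcode trimmed; unmatched reads are kept whole under "unassigned".
    """
    length = len(next(iter(barcodes)))
    bins = defaultdict(list)
    for read in reads:
        tag = read[:length]
        if tag in barcodes:
            bins[barcodes[tag]].append(read[length:])
        else:
            bins["unassigned"].append(read)
    return dict(bins)
```

A production pipeline would normally tolerate sequencing errors in the barcode, which this sketch leaves out for brevity.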
Characterization of Sri Lanka rabies virus isolates using nucleotide sequence analysis of nucleoprotein gene.

PubMed

Arai, Y T; Takahashi, H; Kameoka, Y; Shiino, T; Wimalaratne, O; Lodmell, D L

2001-01-01

Thirty-four suspected rabid brain samples from 2 humans, 24 dogs, 4 cats, 2 mongooses, 1 jackal and 1 water buffalo were collected in 1995-1996 in Sri Lanka. Total RNA was extracted directly from brain suspensions and examined using a one-step reverse transcription-polymerase chain reaction (RT-PCR) for the rabies virus nucleoprotein (N) gene. Twenty-eight samples were found positive for the virus N gene by RT-PCR and also for the virus antigens by fluorescent antibody (FA) test. Rabies virus isolates obtained from different animal species in different regions of Sri Lanka were genetically homogeneous. The sequences of the 203-nucleotide (nt) RT-PCR products obtained from 16 of 27 samples were identical. Sequences of 1350 nt of N genes of 14 RT-PCR products were determined. The Sri Lanka isolates under study formed a specific cluster that also included an earlier isolate from India but did not include the known isolates from China, Thailand, Malaysia, Israel, Iran, Oman, Saudi Arabia, Russia, Nepal, Philippines, Japan and from several other countries. These results suggest that one type of rabies virus is circulating among human, dog, cat, mongoose, jackal and water buffalo living near Colombo City and in five other remote regions in Sri Lanka.
The Nucleotide Sequence and Spliced pol mRNA Levels of the Nonprimate Spumavirus Bovine Foamy Virus

PubMed Central

Holzschu, Donald L.; Delaney, Mari A.; Renshaw, Randall W.; Casey, James W.

1998-01-01

We have determined the complete nucleotide sequence of a replication-competent clone of bovine foamy virus (BFV) and have quantitated the amount of spliced pol mRNA processed early in infection. The 544-amino-acid Gag protein precursor has little sequence similarity with its primate foamy virus homologs, but the putative nucleocapsid (NC) protein, like the primate NCs, contains the three glycine-arginine-rich regions that are postulated to bind genomic RNA during virion assembly. The BFV gag and pol open reading frames overlap, with pro and pol in the same translational frame. As with the human foamy virus (HFV) and feline foamy virus, we have detected a spliced pol mRNA by PCR. Quantitatively, this mRNA approximates the level of full-length genomic RNA early in infection. The integrase (IN) domain of reverse transcriptase does not contain the canonical HH-CC zinc finger motif present in all characterized retroviral INs, but it does contain a nearby histidine residue that could conceivably participate as a member of the zinc finger. The env gene encodes a protein that is over 40% identical in sequence to the HFV Env. By comparison, the Gag precursor of BFV is predicted to be only 28% identical to the HFV protein. PMID:9499074
Digital transcriptome profiling using selective hexamer priming for cDNA synthesis.

PubMed

Armour, Christopher D; Castle, John C; Chen, Ronghua; Babak, Tomas; Loerch, Patrick; Jackson, Stuart; Shah, Jyoti K; Dey, John; Rohl, Carol A; Johnson, Jason M; Raymond, Christopher K

2009-09-01

We developed a procedure for the preparation of whole transcriptome cDNA libraries depleted of ribosomal RNA from only 1 µg of total RNA. The method relies on a collection of short, computationally selected oligonucleotides, called 'not-so-random' (NSR) primers, to obtain full-length, strand-specific representation of nonribosomal RNA transcripts. In this study we validated the technique by profiling human whole brain and universal human reference RNA using ultra-high-throughput sequencing.
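The abstract above does not spell out how the primers are "computationally selected". As a rough illustration only, the sketch below assumes the rule is to enumerate all 4,096 hexamers and discard any that could anneal perfectly to a supplied rRNA sequence. The function names and the toy rRNA string are hypothetical and are not taken from the record.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def not_so_random_kmers(rrna_seqs, k=6):
    """Return all k-mers that have no perfect annealing site in rrna_seqs.

    Assumption: a first-strand primer anneals wherever its reverse
    complement occurs in the transcript, so a primer is dropped when its
    reverse complement is a k-mer of any rRNA sequence.
    """
    rrna_kmers = set()
    for seq in rrna_seqs:
        seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabet
        for i in range(len(seq) - k + 1):
            rrna_kmers.add(seq[i:i + k])
    candidates = ("".join(p) for p in product("ACGT", repeat=k))
    return [kmer for kmer in candidates
            if reverse_complement(kmer) not in rrna_kmers]


# Toy input (hypothetical, far shorter than any real rRNA).
toy_rrna = ["ACGUACGUAA"]
primers = not_so_random_kmers(toy_rrna)
```

With real rRNA sequences the retained set would be much smaller than with this toy input; the sketch is only meant to show the shape of such a filter.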
Uncommon nucleotide excision repair phenotypes revealed by targeted high-throughput sequencing.

PubMed

Calmels, Nadège; Greff, Géraldine; Obringer, Cathy; Kempf, Nadine; Gasnier, Claire; Tarabeux, Julien; Miguet, Marguerite; Baujat, Geneviève; Bessis, Didier; Bretones, Patricia; Cavau, Anne; Digeon, Béatrice; Doco-Fenzy, Martine; Doray, Bérénice; Feillet, François; Gardeazabal, Jesus; Gener, Blanca; Julia, Sophie; Llano-Rivas, Isabel; Mazur, Artur; Michot, Caroline; Renaldo-Robin, Florence; Rossi, Massimiliano; Sabouraud, Pascal; Keren, Boris; Depienne, Christel; Muller, Jean; Mandel, Jean-Louis; Laugel, Vincent

2016-03-22

Deficient nucleotide excision repair (NER) activity causes a variety of autosomal recessive diseases including xeroderma pigmentosum (XP), a disorder which predisposes to skin cancer, and the severe multisystem condition known as Cockayne syndrome (CS). In view of the clinical overlap between NER-related disorders, as well as the existence of multiple phenotypes and the numerous genes involved, we developed a new diagnostic approach based on the enrichment of 16 NER-related genes by multiplex amplification coupled with next-generation sequencing (NGS). Our test cohort consisted of 11 DNA samples, all with known mutations and/or nonpathogenic SNPs in two of the tested genes. We then used the same technique to analyse samples from a prospective cohort of 40 patients. Multiplex amplification and sequencing were performed using the AmpliSeq protocol on the Ion Torrent PGM (Life Technologies). We identified causative mutations in 17 out of the 40 patients (43%). Four patients showed biallelic mutations in the ERCC6(CSB) gene, five in the ERCC8(CSA) gene: most of them had classical CS features but some had very mild and incomplete phenotypes. A small cohort of 4 unrelated classic XP patients from the Basque country (Northern Spain) revealed a common splicing mutation in POLH (XP-variant), demonstrating a new founder effect in this population. Interestingly, our results also found ERCC2(XPD), ERCC3(XPB) or ERCC5(XPG) mutations in two cases of UV-sensitive syndrome and in two cases with mixed XP/CS phenotypes. Our study confirms that NGS is an efficient technique for the analysis of NER-related disorders on a molecular level. It is particularly useful for phenotypes with combined features or unusually mild symptoms. Targeted NGS used in conjunction with DNA repair functional tests and precise clinical evaluation permits rapid and cost-effective diagnosis in patients with NER-defects.

Transcripts of the NADH-dehydrogenase subunit 3 gene are differentially edited in Oenothera mitochondria.

PubMed Central

Schuster, W; Wissinger, B; Unseld, M; Brennicke, A

1990-01-01

A number of cytosines are altered to be recognized as uridines in transcripts of the nad3 locus in mitochondria of the higher plant Oenothera. Such nucleotide modifications can be found at 16 different sites within the nad3 coding region. Most of these alterations in the mRNA sequence change codon identities to specify amino acids better conserved in evolution. Individual cDNA clones differ in their degree of editing at five nucleotide positions, three of which are silent, while two lead to codon alterations specifying different amino acids. None of the cDNA clones analysed is maximally edited at all possible sites, suggesting slow processing or lowered stringency of editing at these nucleotides. Differentially edited transcripts could be editing intermediates or could code for differing polypeptides. Two edited nucleotides in an open reading frame located upstream of nad3 change two amino acids in the deduced polypeptide. Part of the well-conserved ribosomal protein gene rps12, also encoded downstream of nad3 in other plants, is lost in Oenothera mitochondria by recombination events. The functional rps12 protein must be imported from the cytoplasm since the deleted sequences of this gene are not found in the Oenothera mitochondrial genome. The pseudogene sequence is not edited at any nucleotide position. Images Fig. 3. Fig. 4. Fig. 7. PMID:1688531
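Editing sites of the kind described in this record show up as positions where the genomic sequence has C and the cDNA reads T. The sketch below is a minimal illustration that assumes the two sequences are already aligned without gaps; the function name and the example strings are invented for illustration and are not nad3 data.

```python
def c_to_u_editing_sites(genomic, cdna):
    """Return 1-based positions where genomic C appears as T in the cDNA.

    Assumes both strings are pre-aligned, gap-free and of equal length.
    A U in the transcript is read as T in the cDNA, so C-to-U editing
    appears as a C/T mismatch.
    """
    if len(genomic) != len(cdna):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(genomic.upper(), cdna.upper())
    return [pos for pos, (g, c) in enumerate(pairs, start=1)
            if g == "C" and c == "T"]


# Made-up example sequences (hypothetical).
sites = c_to_u_editing_sites("ATGCCATCA", "ATGTCATTA")
print(sites)  # [4, 8]
```

Running the same comparison over several independent cDNA clones and tabulating which positions are edited in each clone would give the per-clone editing differences the abstract refers to.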
Exploring single nucleotide polymorphism (SNP), microsatellite (SSR) and differentially expressed genes in the jellyfish (Rhopilema esculentum) by transcriptome sequencing.

PubMed

Li, Yunfeng; Zhou, Zunchun; Tian, Meilin; Tian, Yi; Dong, Ying; Li, Shilei; Liu, Weidong; He, Chongbo

2017-08-01

In this study, single nucleotide polymorphism (SNP), microsatellite (SSR) and differentially expressed genes (DEGs) in the oral parts, gonads, and umbrella parts of the jellyfish Rhopilema esculentum were analyzed by RNA-Seq technology. A total of 76.4 million raw reads and 72.1 million clean reads were generated from deep sequencing. Approximately 119,874 tentative unigenes and 149,239 transcripts were obtained. A total of 1,034,708 SNP markers were detected in the three tissues. For microsatellite mining, 5088 SSRs were identified from the unigene sequences. The most frequent repeat motifs were mononucleotide repeats, which accounted for 61.93%. Transcriptome comparison of the three tissues yielded a total of 8841 DEGs, of which 3560 were up-regulated and 5281 were down-regulated. This study represents the greatest sequencing effort carried out for a jellyfish and provides the first high-throughput transcriptomic resource for jellyfish. Copyright © 2017 Elsevier B.V. All rights reserved.
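The record does not say which tool or thresholds were used for microsatellite mining. As a hedged sketch of the general idea, the code below scans a sequence for tandem repeats of 1 to 3 nt motifs with a regular expression. The minimum repeat counts are hypothetical defaults chosen for illustration and are not the study's settings.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return (1-based start, motif, repeat count) for simple sequence repeats.

    min_repeats maps motif length to the minimum number of tandem copies;
    the defaults below are illustrative assumptions only.
    """
    if min_repeats is None:
        min_repeats = {1: 10, 2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # One captured motif followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip motifs that are themselves repeats of a shorter motif
            # (for example "AA"), which belong to a shorter motif class.
            if unit in (unit + unit)[1:-1]:
                continue
            hits.append((m.start() + 1, unit, len(m.group(0)) // unit_len))
    return hits


# Made-up test sequence: a 12-nt A run and seven AG repeats.
example = "GG" + "A" * 12 + "CT" + "AG" * 7 + "TTC"
ssrs = find_ssrs(example)
print(ssrs)  # [(3, 'A', 12), (17, 'AG', 7)]
```

Counting hits by motif length over all unigenes would give a motif-class breakdown comparable in form to the mononucleotide share reported above.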
Informatic selection of a neural crest-melanocyte cDNA set for microarray analysis

PubMed Central

Loftus, S. K.; Chen, Y.; Gooden, G.; Ryan, J. F.; Birznieks, G.; Hilliard, M.; Baxevanis, A. D.; Bittner, M.; Meltzer, P.; Trent, J.; Pavan, W.

1999-01-01

With cDNA microarrays, it is now possible to compare the expression of many genes simultaneously. To maximize the likelihood of finding genes whose expression is altered under the experimental conditions, it would be advantageous to be able to select clones for tissue-appropriate cDNA sets. We have taken advantage of the extensive sequence information in the dbEST expressed sequence tag (EST) database to identify a neural crest-derived melanocyte cDNA set for microarray analysis. Analysis of characterized genes with dbEST identified one library that contained ESTs representing 21 neural crest-expressed genes (library 198). The distribution of the ESTs corresponding to these genes was biased toward being derived from library 198. This is in contrast to the EST distribution profile for a set of control genes, characterized to be more ubiquitously expressed in multiple tissues (P < 1 × 10⁻⁹). From library 198, a subset of 852 clustered ESTs were selected that have a library distribution profile similar to that of the 21 neural crest-expressed genes. Microarray analysis demonstrated the majority of the neural crest-selected 852 ESTs (Mel1 array) were differentially expressed in melanoma cell lines compared with a non-neural crest kidney epithelial cell line (P < 1 × 10⁻⁸). This was not observed with an array of 1,238 ESTs that was selected without library origin bias (P = 0.204). This study presents an approach for selecting tissue-appropriate cDNAs that can be used to examine the expression profiles of developmental processes and diseases. PMID:10430933
Complete complementary DNA-derived amino acid sequence of canine cardiac phospholamban.

PubMed Central

Fujii, J; Ueno, A; Kitano, K; Tanaka, S; Kadoma, M; Tada, M

1987-01-01

Complementary DNA (cDNA) clones specific for phospholamban of sarcoplasmic reticulum membranes have been isolated from a canine cardiac cDNA library. The amino acid sequence deduced from the cDNA sequence indicates that phospholamban consists of 52 amino acid residues and lacks an amino-terminal signal sequence. The protein has an inferred mol wt 6,080 that is in agreement with its apparent monomeric mol wt 6,000, estimated previously by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Phospholamban contains two distinct domains, a hydrophilic region at the amino terminus (domain I) and a hydrophobic region at the carboxy terminus (domain II). We propose that domain I is localized at the cytoplasmic surface and offers phosphorylatable sites whereas domain II is anchored into the sarcoplasmic reticulum membrane. PMID:3793929
Complete nucleotide sequence of the freshwater unicellular cyanobacterium Synechococcus elongatus PCC 6301 chromosome: gene content and organization.

PubMed

Sugita, Chieko; Ogata, Koretsugu; Shikata, Masamitsu; Jikuya, Hiroyuki; Takano, Jun; Furumichi, Miho; Kanehisa, Minoru; Omata, Tatsuo; Sugiura, Masahiro; Sugita, Mamoru

2007-01-01

The entire genome of the unicellular cyanobacterium Synechococcus elongatus PCC 6301 (formerly Anacystis nidulans Berkeley strain 6301) was sequenced. The genome consisted of a circular chromosome 2,696,255 bp long. A total of 2,525 potential protein-coding genes, two sets of rRNA genes, 45 tRNA genes representing 42 tRNA species, and several genes for small stable RNAs were assigned to the chromosome by similarity searches and computer predictions. The translated products of 56% of the potential protein-coding genes showed sequence similarities to experimentally identified and predicted proteins of known function, and the products of 35% of the genes showed sequence similarities to the translated products of hypothetical genes. The remaining 9% of genes lacked significant similarities to genes for predicted proteins in the public DNA databases. Some 139 genes coding for photosynthesis-related components were identified. Thirty-seven genes for two-component signal transduction systems were also identified. This is the smallest number of such genes identified in cyanobacteria, except for marine cyanobacteria, suggesting that only simple signal transduction systems are found in this strain. The gene arrangement and nucleotide sequence of Synechococcus elongatus PCC 6301 were nearly identical to those of a closely related strain Synechococcus elongatus PCC 7942, except for the presence of a 188.6 kb inversion. The sequences as well as the gene information shown in this paper are available in the Web database, CYORF (http://www.cyano.genome.jp/).
Nucleotide Selectivity in Abiotic RNA Polymerization Reactions.

PubMed

Coari, Kristin M; Martin, Rebecca C; Jain, Kopal; McGown, Linda B

2017-09-01

In order to establish an RNA world on early Earth, the nucleotides must form polymers through chemical rather than biochemical reactions. The polymerization products must be long enough to perform catalytic functions, including self-replication, and to preserve genetic information. These functions depend not only on the length of the polymers, but also on their sequences. To date, studies of abiotic RNA polymerization generally have focused on routes to polymerization of a single nucleotide and lengths of the homopolymer products. Less work has been done on the selectivity of the reaction toward incorporation of some nucleotides over others in nucleotide mixtures. Such information is an essential step toward understanding the chemical evolution of RNA. To address this question, in the present work RNA polymerization reactions were performed in the presence of montmorillonite clay catalyst. The nucleotides included the monophosphates of adenosine, cytidine, guanosine, uridine and inosine. Experiments included reactions of mixtures of an imidazole-activated nucleotide (ImpX) with one or more unactivated nucleotides (XMP), of two or more ImpX, and of XMP that were activated in situ in the polymerization reaction itself. The reaction products were analyzed using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) to identify the lengths and nucleotide compositions of the polymerization products. The results show that the extent of polymerization, the degree of heteropolymerization vs. homopolymerization, and the composition of the polymeric products all vary among the different nucleotides and depend upon which nucleotides and how many different nucleotides are present in the mixture.
Nucleotide Selectivity in Abiotic RNA Polymerization Reactions

NASA Astrophysics Data System (ADS)

Coari, Kristin M.; Martin, Rebecca C.; Jain, Kopal; McGown, Linda B.

2017-09-01

In order to establish an RNA world on early Earth, the nucleotides must form polymers through chemical rather than biochemical reactions. The polymerization products must be long enough to perform catalytic functions, including self-replication, and to preserve genetic information. These functions depend not only on the length of the polymers, but also on their sequences. To date, studies of abiotic RNA polymerization generally have focused on routes to polymerization of a single nucleotide and lengths of the homopolymer products. Less work has been done on the selectivity of the reaction toward incorporation of some nucleotides over others in nucleotide mixtures. Such information is an essential step toward understanding the chemical evolution of RNA. To address this question, in the present work RNA polymerization reactions were performed in the presence of montmorillonite clay catalyst. The nucleotides included the monophosphates of adenosine, cytidine, guanosine, uridine and inosine. Experiments included reactions of mixtures of an imidazole-activated nucleotide (ImpX) with one or more unactivated nucleotides (XMP), of two or more ImpX, and of XMP that were activated in situ in the polymerization reaction itself. The reaction products were analyzed using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) to identify the lengths and nucleotide compositions of the polymerization products. The results show that the extent of polymerization, the degree of heteropolymerization vs. homopolymerization, and the composition of the polymeric products all vary among the different nucleotides and depend upon which nucleotides and how many different nucleotides are present in the mixture.
iCLIP: protein-RNA interactions at nucleotide resolution.

PubMed

Huppertz, Ina; Attig, Jan; D'Ambrogio, Andrea; Easton, Laura E; Sibley, Christopher R; Sugimoto, Yoichiro; Tajnik, Mojca; König, Julian; Ule, Jernej

2014-02-01

RNA-binding proteins (RBPs) are key players in the post-transcriptional regulation of gene expression. Precise knowledge about their binding sites is therefore critical to unravel their molecular function and to understand their role in development and disease. Individual-nucleotide resolution UV crosslinking and immunoprecipitation (iCLIP) identifies protein-RNA crosslink sites on a genome-wide scale. The high resolution and specificity of this method are achieved by an intramolecular cDNA circularization step that enables analysis of cDNAs that truncated at the protein-RNA crosslink sites. Here, we describe the improved iCLIP protocol and discuss critical optimization and control experiments that are required when applying the method to new RBPs. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
Characterization of rat calcitonin mRNA.

PubMed Central

Amara, S G; David, D N; Rosenfeld, M G; Roos, B A; Evans, R M

1980-01-01

A chimeric plasmid containing cDNA complementary to rat calcitonin mRNA has been constructed. Partial sequence analysis shows that the insert contains a nucleotide sequence encoding the complete amino acid sequence of calcitonin. Two basic amino acids precede and three basic amino acids follow the hormone sequence, suggesting that calcitonin is generated by the proteolytic cleavage of a larger precursor in a manner analogous to that of other small polypeptide hormones. The COOH-terminal proline, known to be amidated in the secreted hormone, is followed by a glycine in the precursor. The cloned calcitonin DNA was used to characterize the expression of calcitonin mRNA. Cytoplasmic mRNAs from calcitonin-producing rat medullary thyroid carcinoma lines and from normal rat thyroid glands contain a single species, 1050 nucleotides long, which hybridizes to the cloned calcitonin cDNA. The concentration of calcitonin mRNA sequences is greater in those tumors that produce larger amounts of immunoreactive calcitonin. RNAs from other endocrine tissues, including anterior and neurointermediate lobes of rat pituitary, contain no detectable calcitonin mRNA. Images PMID:6933496
Inferring Multiple Refugia and Phylogeographical Patterns in Pinus massoniana Based on Nucleotide Sequence Variation and DNA Fingerprinting

PubMed Central

Lin, Chung-Jian; Huang, Chi-Chung; Huang, Chao-Ching; Chiang, Yu-Chung; Chiang, Tzen-Yuh

2012-01-01

Background: Pinus massoniana, an ecologically and economically important conifer, is widespread across central and southern mainland China and Taiwan. In this study, we tested the central–marginal paradigm that predicts that the marginal populations tend to be less polymorphic than the central ones in their genetic composition, and examined a founder effect in the island population. Methodology/Principal Findings: We examined the phylogeography and population structuring of P. massoniana based on nucleotide sequences of the cpDNA atpB-rbcL intergenic spacer, intron regions of the AdhC2 locus, and microsatellite fingerprints. SAMOVA analysis of nucleotide sequences indicated that most genetic variants resided among geographical regions. High levels of genetic diversity in the marginal populations in the south region, a pattern seemingly contradicting the central–marginal paradigm, and the fixation of private haplotypes in most populations indicate that multiple refugia may have existed over the glacial maxima. STRUCTURE analyses on microsatellites revealed that the genetic structure of mainland populations was mediated by recent genetic exchanges, mostly via pollen flow, and that the genetic composition in the east region was intermixed between the south and west regions, a pattern likely shaped by gene introgression and maintenance of ancestral polymorphisms. As expected, the small island population in Taiwan was genetically differentiated from mainland populations. Conclusions/Significance: The marginal populations in the south region possessed divergent gene pools, suggesting that the past glaciations may have had little impact on these populations at low latitudes. Estimates of ancestral population sizes interestingly reflect a recent expansion in the mainland from a rather smaller population, a pattern that seemingly agrees with the pollen record. PMID:22952747
A novel representation of the conformational structure of transfer RNAs. Correlation of the folding patterns of the polynucleotide chain with the base sequence and the nucleotide backbone torsions.

PubMed Central

Srinivasan, A R; Yathindra, N

1977-01-01

A novel description of the conformational characteristics of all the individual nucleotides and the phosphodiesters in tRNAs is presented in the form of a circular plot. This representation furnishes information of the base sequence with the folding patterns of the polynucleotide chain as one traverses along the circumference and with the individual nucleotide and phosphodiester linkage torsions along the radii. The circular plot obtained for yeast tRNAPhe strikingly distinguishes the helical and the loop regions. The variation of the different nucleotide torsions along the entire chain length and their effect on the secondary helical and tertiary loop regions become readily apparent. PMID:339206
Variation in the Nucleotide Sequence of Cottontail Rabbit Papillomavirus a and b Subtypes Affects Wart Regression and Malignant Transformation and Level of Viral Replication in Domestic Rabbits

PubMed Central

Salmon, Jérôme; Nonnenmacher, Mathieu; Cazé, Sandrine; Flamant, Patricia; Croissant, Odile; Orth, Gérard; Breitburd, Françoise

2000-01-01

We previously reported the partial characterization of two cottontail rabbit papillomavirus (CRPV) subtypes with strikingly divergent E6 and E7 oncoproteins. We report now the complete nucleotide sequences of these subtypes, referred to as CRPVa4 (7,868 nucleotides) and CRPVb (7,867 nucleotides). The CRPVa4 and CRPVb genomes differed at 238 (3%) nucleotide positions, whereas CRPVa4 and the prototype CRPV differed by only 5 nucleotides. The most variable region (7% nucleotide divergence) included the long regulatory region (LRR) and the E6 and E7 genes. A mutation in the stop codon resulted in an 8-amino-acid-longer CRPVb E4 protein, and a nucleotide deletion reduced the coding capacity of the E5 gene from 101 to 25 amino acids. In domestic rabbits homozygous for a specific haplotype of the DRA and DQA genes of the major histocompatibility complex, warts induced by CRPVb DNA or a chimeric genome containing the CRPVb LRR/E6/E7 region showed an early regression, whereas warts induced by CRPVa4 or a chimeric genome containing the CRPVa4 LRR/E6/E7 region persisted and evolved into carcinomas. In contrast, most CRPVa, CRPVb, and chimeric CRPV DNA-induced warts showed no early regression in rabbits homozygous for another DRA-DQA haplotype. Little, if any, viral replication is usually observed in domestic rabbit warts. When warts induced by CRPVa and CRPVb virions and DNA were compared, the number of cells positive for viral DNA or capsid antigens was found to be greater by 1 order of magnitude for specimens induced by CRPVb. Thus, both sequence variation in the LRR/E6/E7 region and the genetic constitution of the host influence the expression of the oncogenic potential of CRPV. Furthermore, intratype variation may overcome to some extent the host restriction of CRPV replication in domestic rabbits. PMID:11044121
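The divergence figures quoted in this record can be checked with simple arithmetic. The sketch below assumes the denominator is the longer genome (7,868 nt), which the abstract does not state, and the helper for counting mismatches assumes pre-aligned, equal-length sequences. The function names are illustrative.

```python
def count_differences(seq_a, seq_b):
    """Count mismatched columns between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def percent_divergence(differences, length):
    """Differences expressed as a percentage of the compared length."""
    return 100.0 * differences / length


# Figures from the abstract; using 7,868 nt as denominator is an assumption.
a4_vs_b = percent_divergence(238, 7868)
a4_vs_prototype = percent_divergence(5, 7868)
print(round(a4_vs_b, 2), round(a4_vs_prototype, 2))  # 3.02 0.06
```

Because CRPVa4 and CRPVb differ in length by one nucleotide, a real comparison would need a gapped alignment, with a stated rule for whether gap columns count as differences.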
Molecular cloning and characterization of a cDNA encoding the gibberellin biosynthetic enzyme ent-kaurene synthase B from pumpkin (Cucurbita maxima L.).

PubMed

Yamaguchi, S; Saito, T; Abe, H; Yamane, H; Murofushi, N; Kamiya, Y

1996-08-01

The first committed step in the formation of diterpenoids leading to gibberellin (GA) biosynthesis is the conversion of geranylgeranyl diphosphate (GGDP) to ent-kaurene. ent-Kaurene synthase A (KSA) catalyzes the conversion of GGDP to copalyl diphosphate (CDP), which is subsequently converted to ent-kaurene by ent-kaurene synthase B (KSB). A full-length KSB cDNA was isolated from developing cotyledons in immature seeds of pumpkin (Cucurbita maxima L.). Degenerate oligonucleotide primers were designed from the amino acid sequences obtained from the purified protein to amplify a cDNA fragment, which was used for library screening. The isolated full-length cDNA was expressed in Escherichia coli as a fusion protein, which demonstrated the KSB activity to cyclize [3H]CDP to [3H]ent-kaurene. The KSB transcript was most abundant in growing tissues, but was detected in every organ in pumpkin seedlings. The deduced amino acid sequence shares significant homology with other terpene cyclases, including the conserved DDXXD motif, a putative divalent metal ion-diphosphate complex binding site. A putative transit peptide sequence that may target the translated product into the plastids is present in the N-terminal region.
Sequencing and analysis of 10967 full-length cDNA clones from Xenopus laevis and Xenopus tropicalis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Morin, R D; Chang, E; Petrescu, A

2005-10-31

Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection initiative. Here we present an analysis of 10967 clones (8049 from X. laevis and 2918 from X. tropicalis). The clone set contains 2013 orthologs between X. laevis and X. tropicalis as well as 1795 paralog pairs within X. laevis. 1199 are in-paralogs, believed to have resulted from an allotetraploidization event approximately 30 million years ago, and the remaining 546 are likely out-paralogs that have resulted from more ancient gene duplications, prior to the divergence between the two species. We do not detect any evidence for positive selection by the Yang and Nielsen maximum likelihood method of approximating dN/dS. However, dN/dS for X. laevis in-paralogs is elevated relative to X. tropicalis orthologs. This difference is highly significant, and indicates an overall relaxation of selective pressures on duplicated gene pairs. Within both groups of paralogs, we found evidence of subfunctionalization, manifested as differential expression of paralogous genes among tissues, as measured by EST information from public resources. We have observed, as expected, a higher instance of subfunctionalization in out-paralogs relative to in-paralogs.
Studies of a biochemical factory: tomato trichome deep expressed sequence tag sequencing and proteomics.

PubMed

Schilmiller, Anthony L; Miner, Dennis P; Larson, Matthew; McDowell, Eric; Gang, David R; Wilkerson, Curtis; Last, Robert L

2010-07-01

Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces beta-caryophyllene and alpha-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells.
Sequence, molecular properties, and chromosomal mapping of mouse lumican

NASA Technical Reports Server (NTRS)

Funderburgh, J. L.; Funderburgh, M. L.; Hevelone, N. D.; Stech, M. E.; Justice, M. J.; Liu, C. Y.; Kao, W. W.; Conrad, G. W.; Spooner, B. S. (Principal Investigator)

1995-01-01

PURPOSE. Lumican is a major proteoglycan of vertebrate cornea. This study characterizes mouse lumican, its molecular form, cDNA sequence, and chromosomal localization. METHODS. Lumican sequence was determined from cDNA clones selected from a mouse corneal cDNA expression library using a bovine lumican cDNA probe. Tissue expression and size of lumican mRNA were determined using Northern hybridization. Glycosidase digestion followed by Western blot analysis provided characterization of molecular properties of purified mouse corneal lumican. Chromosomal mapping of the lumican gene (Lcn) used Southern hybridization of a panel of genomic DNAs from an interspecific murine backcross. RESULTS. Mouse lumican is a 338-amino acid protein with high-sequence identity to bovine and chicken lumican proteins. The N-terminus of the lumican protein contains consensus sequences for tyrosine sulfation. A 1.9-kb lumican mRNA is present in cornea and several other tissues. Antibody against bovine lumican reacted with recombinant mouse lumican expressed in Escherichia coli and also detected high molecular weight proteoglycans in extracts of mouse cornea. Keratanase digestion of corneal proteoglycans released lumican protein, demonstrating the presence of sulfated keratan sulfate chains on mouse corneal lumican in vivo. The lumican gene (Lcn) was mapped to the distal region of mouse chromosome 10. The Lcn map site is in the region of a previously identified developmental mutant, eye blebs, affecting corneal morphology. CONCLUSIONS. This study demonstrates sulfated keratan sulfate proteoglycan in mouse cornea and describes the tools (antibodies and cDNA) necessary to investigate the functional role of this important corneal molecule using naturally occurring and induced mutants of the murine lumican gene.
Heat-shock response in a molluscan cell line: characterization of the response and cloning of an inducible HSP70 cDNA.

PubMed

Laursen, J R; di Liu, H; Wu, X J; Yoshino, T P

1997-11-01

Sublethal heat-shock of cells of the Bge (Biomphalaria glabrata embryonic) snail cell line resulted in increased or new expression of metabolically labeled polypeptides of approximately 21.5, 41, 70, and 74 kDa molecular mass. Regulation of this response appeared to be at the transcriptional level since a similar protein banding pattern was seen upon SDS-PAGE/fluorographic analysis of polypeptides produced by in vitro translation of total RNA from cells subjected to heat shock. Using a yeast (Saccharomyces cerevisiae) 70-kDa heat-shock protein (HSP70) probe to screen a cDNA library from heat-treated Bge cells, we isolated a full-length cDNA clone encoding a putative Bge HSP70. The cDNA was 2453 bp in length and contained an open reading frame of 1908 bp encoding a 636-amino-acid polypeptide with calculated molecular mass of 70,740 Da. Comparison of a conserved region of 209 amino acid residues revealed >80% identity between the deduced amino acid sequence of Bge HSP70 and that of yeast (81%), the human blood fluke Schistosoma mansoni (for which B. glabrata serves as intermediate host) (81%), Drosophila (81%), human (84%), and the marine gastropod Aplysia californica (88%, 90%). In addition to the extensive sequence homology, the identification of several eukaryotic HSP70 signature sequences and an N-linked glycosylation site characteristic of cytoplasmic HSPs strongly supports the identity of the Bge cDNA as encoding an authentic HSP70. Results of a Northern blot analysis, using Bge HSP70 clone-specific probes, indicated that expression of the gene was heat inducible and not constitutive. This is the first reported sequence of an inducible HSP70 from cells originating from a freshwater gastropod and provides a first step in the development of a genetic transformation system for molluscs of medical importance.
Nucleotide sequence of the Saccharomyces cerevisiae PUT4 proline-permease-encoding gene: similarities between CAN1, HIP1 and PUT4 permeases.

PubMed

Vandenbol, M; Jauniaux, J C; Grenson, M

1989-11-15

The complete nucleotide (nt) sequence of the PUT4 gene, whose product is required for high-affinity proline active transport in the yeast Saccharomyces cerevisiae, is presented. The sequence contains a single long open reading frame of 1881 nt, encoding a polypeptide with a calculated Mr of 68,795. The predicted protein is strongly hydrophobic and exhibits six potential glycosylation sites. Its hydropathy profile suggests the presence of twelve membrane-spanning regions flanked by hydrophilic N- and C-terminal domains. The N terminus does not resemble signal sequences found in secreted proteins. These features are characteristic of integral membrane proteins catalyzing translocation of ligands across cellular membranes. Protein sequence comparisons indicate strong resemblance to the arginine and histidine permeases of S. cerevisiae, but no marked sequence similarity to the proline permease of Escherichia coli or to other known prokaryotic or eukaryotic transport proteins. The strong similarity between the three yeast amino acid permeases suggests a common ancestor for the three proteins.
Molecular Cloning and Characterization of an Acetylcholinesterase cDNA in the Brown Planthopper, Nilaparvata lugens

PubMed Central

Yang, Zhifan; Chen, Jun; Chen, Yongqin; Jiang, Sijing

2010-01-01

A full-length cDNA encoding an acetylcholinesterase (AChE, EC 3.1.1.7) was cloned and characterized from the brown planthopper, Nilaparvata lugens Stål (Hemiptera: Delphacidae). The complete cDNA (2467 bp) contains a 1938-bp open reading frame encoding 646 amino acid residues. The amino acid sequence of the AChE deduced from the cDNA consists of 30 residues for a putative signal peptide and 616 residues for the mature protein with a predicted molecular weight of 69,418. The three residues (Ser242, Glu371, and His485) that putatively form the catalytic triad and the six Cys that form intra-subunit disulfide bonds are completely conserved, and 10 out of the 14 aromatic residues lining the active site gorge of the AChE are also conserved. Northern blot analysis of poly(A)+ RNA showed an approximately 2.6-kb transcript, and Southern blot analysis indicated that there is likely just a single copy of this gene in N. lugens. The deduced protein sequence is most similar to AChE of Nephotettix cincticeps with 83% amino acid identity. Phylogenetic analysis of 45 AChEs from 30 species showed that the deduced N. lugens AChE formed a cluster with the other 8 insect AChE2s. Additionally, the hypervariable region and amino acids specific to insect AChE2 are also present in the AChE of N. lugens. The results revealed that the AChE cDNA cloned in this work belongs to the insect AChE2 subgroup, which is orthologous to Drosophila AChE. Comparison of the AChEs between the susceptible and resistant strains revealed a point mutation, Gly185Ser, that is likely responsible for the insensitivity of the AChE to methamidophos in the resistant strain. PMID:20874389
