Tramontano, A; Macchiato, M F
1986-01-01
An algorithm to determine the probability that a reading frame codes for a protein is presented. It is based on the results of our previous studies on the thermodynamic characteristics of a translated reading frame. We also develop a prediction procedure to distinguish between coding and non-coding reading frames. The procedure is based on the characteristics of the putative product of the DNA sequence and not on periodicity characteristics of the sequence, so the prediction is not biased by the presence of overlapping translated reading frames or by the presence of translated reading frames on the complementary DNA strand. PMID:3753761
Büssow, Konrad; Hoffmann, Steve; Sievert, Volker
2002-12-19
Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single or lists of GenBank GI identifiers or accession numbers. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in FASTA or tab-delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information.
Lammers, P J; McLaughlin, S; Papin, S; Trujillo-Provencio, C; Ryncarz, A J
1990-01-01
An 11-kbp DNA element of unknown function interrupts the nifD gene in vegetative cells of Anabaena sp. strain PCC 7120. In developing heterocysts the nifD element excises from the chromosome via site-specific recombination between short repeat sequences that flank the element. The nucleotide sequence of the nifH-proximal half of the element was determined to elucidate the genetic potential of the element. Four open reading frames with the same relative orientation as the nifD element-encoded xisA gene were identified in the sequenced region. Each of the open reading frames was preceded by a reasonable ribosome-binding site and had biased codon utilization preferences consistent with low levels of expression. Open reading frame 3 was highly homologous with three cytochrome P-450 omega-hydroxylase proteins and showed regional homology to functionally significant domains common to the cytochrome P-450 superfamily. The sequence encoding open reading frame 2 was the most highly conserved portion of the sequenced region based on heterologous hybridization experiments with three genera of heterocystous cyanobacteria. PMID:2123860
Schaeffer, E; Sninsky, J J
1984-01-01
Proteins that are related evolutionarily may have diverged at the level of primary amino acid sequence while maintaining similar secondary structures. Computer analysis has been used to compare the open reading frames of the hepatitis B virus to those of the woodchuck hepatitis virus at the level of amino acid sequence, and to predict the relative hydrophilic character and the secondary structure of putative polypeptides. Similarity is seen at the levels of relative hydrophilicity and secondary structure, in the absence of sequence homology. These data reinforce the proposal that these open reading frames encode viral proteins. Computer analysis of this type can be more generally used to establish structural similarities between proteins that do not share obvious sequence homology as well as to assess whether an open reading frame is fortuitous or codes for a protein. PMID:6585835
Open Reading Frame Phylogenetic Analysis on the Cloud
2013-01-01
Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus. PMID:23671843
The organisation and interviral homologies of genes at the 3' end of tobacco rattle virus RNA1
Boccara, Martine; Hamilton, William D. O.; Baulcombe, David C.
1986-01-01
The RNA1 of tobacco rattle virus (TRV) has been cloned as cDNA and the nucleotide sequence determined of 2 kb from the 3'-terminal region. The sequence contains three long open reading frames. One of these starts 5' of the cDNA and probably corresponds to the carboxy-terminal sequence of a 170-K protein encoded on RNA1. The deduced protein sequence from this reading frame shows homology with the putative replicases of tobacco mosaic virus (TMV) and tricornaviruses. The location of the second open reading frame, which encodes a 29-K polypeptide, was shown by Northern blot analysis to coincide with a 1.6-kb subgenomic RNA. The validity of this reading frame was confirmed by showing that the cDNA extending over this region could be transcribed and translated in vitro to produce a polypeptide of the predicted size which co-migrates in electrophoresis with a translation product of authentic viral RNA. The sequence of this 29-K polypeptide showed homology with two regions in the 30-K protein of TMV. This homology includes positions in the TMV 30-K protein where mutations have been identified which affect the transport of virus between cells. The third open reading frame encodes a potential 16-K protein and was shown by Northern blot hybridisation to be contained within the region of a 0.7-kb subgenomic RNA which is found in cellular RNA of infected cells but not virus particles. The many similarities between TRV and TMV in viral morphology, gene organisation and sequence suggest that these two viral groups may share a common viral ancestor. PMID:16453668
Bioinformatic analysis suggests that the Orbivirus VP6 cistron encodes an overlapping gene
Firth, Andrew E
2008-01-01
Background: The genus Orbivirus includes several species that infect livestock – including Bluetongue virus (BTV) and African horse sickness virus (AHSV). These viruses have linear dsRNA genomes divided into ten segments, all of which have previously been assumed to be monocistronic. Results: Bioinformatic evidence is presented for a short overlapping coding sequence (CDS) in the Orbivirus genome segment 9, overlapping the VP6 cistron in the +1 reading frame. In BTV, a 77–79 codon AUG-initiated open reading frame (hereafter ORFX) is present in all 48 segment 9 sequences analysed. The pattern of base variations across the 48-sequence alignment indicates that ORFX is subject to functional constraints at the amino acid level (even when the constraints due to coding in the overlapping VP6 reading frame are taken into account; MLOGD software). In fact the translated ORFX shows greater amino acid conservation than the overlapping region of VP6. The ORFX AUG codon has a strong Kozak context in all 48 sequences. Each has only one or two upstream AUG codons, always in the VP6 reading frame, and (with a single exception) always with weak or medium Kozak context. Thus, in BTV, ORFX may be translated via leaky scanning. A long (83–169 codon) ORF is present in a corresponding location and reading frame in all other Orbivirus species analysed except Saint Croix River virus (SCRV; the most divergent). Again, the pattern of base variations across sequence alignments indicates multiple coding in the VP6 and ORFX reading frames. Conclusion: At ~9.5 kDa, the putative ORFX product in BTV is too small to appear on most published protein gels. Nonetheless, a review of past literature reveals a number of possible detections. We hope that presentation of this bioinformatic analysis will stimulate an attempt to experimentally verify the expression and functional role of ORFX, and hence lead to a greater understanding of the molecular biology of these important pathogens. PMID:18489030
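The weak, medium and strong Kozak contexts mentioned in this abstract can be illustrated with a minimal sketch. The scoring rule used here (a purine at position -3 and a G at position +4, relative to the A of the AUG at +1) is a commonly quoted rule of thumb and is an assumption; the abstract does not state which definition was used. The function name and the test sequences are hypothetical.

```python
def kozak_strength(seq: str, aug_index: int) -> str:
    """Classify the context of the AUG that starts at aug_index (0-based).

    Assumed rule of thumb (not taken from the abstract): a purine at -3 and
    a G at +4 give a strong context, exactly one of the two gives a medium
    context, and neither gives a weak context.
    """
    s = seq.upper().replace("U", "T")
    if s[aug_index:aug_index + 3] != "ATG":
        raise ValueError("no AUG at the given index")
    # Positions that fall outside the sequence are treated as non-matching.
    minus3 = s[aug_index - 3] if aug_index >= 3 else ""
    plus4 = s[aug_index + 3] if aug_index + 3 < len(s) else ""
    score = (minus3 in ("A", "G")) + (plus4 == "G")
    return ("weak", "medium", "strong")[score]


# Hypothetical toy sequences, each with an AUG starting at index 6.
print(kozak_strength("GCCACCAUGG", 6))
print(kozak_strength("UUUAUUAUGU", 6))
print(kozak_strength("UUUUUUAUGU", 6))
```

Under leaky scanning, upstream AUGs in weak or medium context would be bypassed by a fraction of ribosomes, which is why the classification of every upstream AUG matters for the argument above.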
Brown, T A; Davies, R W; Ray, J A; Waring, R B; Scazzocchio, C
1983-01-01
A 2830-bp segment of the mitochondrial genome of the fungus Aspergillus nidulans was sequenced and shown to contain two unidentified reading frames (URFs). These reading frames are 352 and 488 codons in length, and would specify unmodified proteins of mol. wts. 39,000 and 54,000, respectively. The derived amino acid sequences indicate that these genes are equivalent to the human mitochondrial URFs 1 and 4, with 39% amino acid homology for URF1 and 26% for URF4. Both URFs were shown by secondary structure predictions to code for predominantly beta-sheeted proteins with strong structural conservation between the fungal and human homologues. Counterparts of mammalian URFs have not previously been identified in non-mammalian genomes, and the discovery that A. nidulans possesses reading frames so closely homologous with URF1 and URF4 shows that these genes are of general functional importance in the mitochondria of diverse species. PMID:11894959
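The quoted molecular weights follow roughly from the codon counts. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not stated in the abstract, and ignores whether the codon counts include the stop codon.

```python
AVG_RESIDUE_MASS_DA = 110  # assumed rule-of-thumb average, not from the abstract


def approx_mol_wt(n_codons: int) -> int:
    """Rough molecular weight in daltons of an unmodified protein."""
    return n_codons * AVG_RESIDUE_MASS_DA


for n_codons in (352, 488):
    estimate = approx_mol_wt(n_codons)
    # Round to the nearest thousand to compare with the quoted values.
    print(n_codons, "codons ->", estimate, "Da, about", round(estimate, -3))
```

With this assumption the two reading frames come out near 39,000 and 54,000, consistent with the values given in the abstract.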
Sugita, Mamoru; Shinozaki, Kazuo; Sugiura, Masahiro
1985-01-01
The nucleotide sequence of a tRNA(Lys)(UUU) gene on tobacco (Nicotiana tabacum) chloroplast DNA has been determined. This gene is located 215 base pairs upstream from the gene for the 32,000-dalton thylakoid membrane protein on the same DNA strand and has a 2526-base-pair intron in the anticodon loop. The intron boundary sequence does not follow the G-U/A-G rule but is similar to those of tobacco chloroplast split genes for tRNA(Gly)(UCC) and ribosomal proteins L2 and S12. The intron contains one major open reading frame of 509 codons. The codon usage in the open reading frame resembles those observed in the genes for tobacco chloroplast proteins so far analyzed. The primary transcript of this tRNA gene is 2.7 kilobases long. PMID:16593561
Sugita, M; Shinozaki, K; Sugiura, M
1985-06-01
The nucleotide sequence of a tRNA(Lys)(UUU) gene on tobacco (Nicotiana tabacum) chloroplast DNA has been determined. This gene is located 215 base pairs upstream from the gene for the 32,000-dalton thylakoid membrane protein on the same DNA strand and has a 2526-base-pair intron in the anticodon loop. The intron boundary sequence does not follow the G-U/A-G rule but is similar to those of tobacco chloroplast split genes for tRNA(Gly)(UCC) and ribosomal proteins L2 and S12. The intron contains one major open reading frame of 509 codons. The codon usage in the open reading frame resembles those observed in the genes for tobacco chloroplast proteins so far analyzed. The primary transcript of this tRNA gene is 2.7 kilobases long.
The TGA codons are present in the open reading frame of selenoprotein P cDNA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hill, K.E.; Lloyd, R.S.; Read, R.
1991-03-11
The TGA codon in DNA has been shown to direct incorporation of selenocysteine into protein. Several proteins from bacteria and animals contain selenocysteine in their primary structures. Each of the cDNA clones of these selenoproteins contains one TGA codon in the open reading frame which corresponds to the selenocysteine in the protein. A cDNA clone for selenoprotein P (SeP), obtained from a γZAP rat liver library, was sequenced by the dideoxy termination method. The correct reading frame was determined by comparison of the deduced amino acid sequence with the amino acid sequence of several peptides from SeP. Using SeP labelled with ⁷⁵Se in vivo, the selenocysteine content of the peptides was verified by the collection of carboxymethylated ⁷⁷Se-selenocysteine as it eluted from the amino acid analyzer and determination of the radioactivity contained in the collected samples. Ten TGA codons are present in the open reading frame of the cDNA. Peptide fragmentation studies and the deduced sequence indicate that selenium-rich regions are located close to the carboxy terminus. Nine of the 10 selenocysteines are located in the terminal 26% of the sequence with four in the terminal 15 amino acids. The deduced sequence codes for a protein of 385 amino acids. Cleavage of the signal peptide gives the mature protein with 366 amino acids and a calculated mol wt of 41,052 Da. Searches of PIR and SWISSPROT protein databases revealed no similarity with glutathione peroxidase or other selenoproteins.
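Counting TGA codons in an open reading frame means counting only in-frame triplets, because the string TGA can also occur across codon boundaries. The sketch below shows that distinction on hypothetical toy sequences (not the SeP cDNA), and also derives the signal peptide length implied by the two protein lengths in the abstract.

```python
def inframe_codon_positions(orf: str, codon: str = "TGA") -> list:
    """Return 1-based codon numbers at which `codon` occurs in frame."""
    orf = orf.upper()
    return [i // 3 + 1 for i in range(0, len(orf) - 2, 3) if orf[i:i + 3] == codon]


# Hypothetical toy ORF: ATG TGA CCC TGA TAA
print(inframe_codon_positions("ATGTGACCCTGATAA"))

# Here TGA appears twice as a substring but never in frame: ATG ATG ACC
print(inframe_codon_positions("ATGATGACC"))

# Signal peptide length implied by the abstract: 385-residue precursor,
# 366-residue mature protein.
print(385 - 366)
```

The subtraction gives a 19-residue signal peptide, a figure that is implied by the abstract rather than stated in it.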
Gamo, F J; Lafuente, M J; Casamayor, A; Ariño, J; Aldea, M; Casas, C; Herrero, E; Gancedo, C
1996-06-15
We report the sequence of a 15.5 kb DNA segment located near the left telomere of chromosome XV of Saccharomyces cerevisiae. The sequence contains nine open reading frames (ORFs) longer than 300 bp. Three of them are internal to other ones. One corresponds to the gene LGT3 that encodes a putative sugar transporter. Three adjacent ORFs were separated by two stop codons in frame. These ORFs presented homology with the gene CPS1 that encodes carboxypeptidase S. The stop codons were not found in the same sequence derived from another yeast strain. Two other ORFs without significant homology in databases were also found. One of them, O0420, is very rich in serine and threonine and presents a series of repeated or similar amino acid stretches along the sequence.
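A length-filtered ORF scan of the kind implied by "ORFs longer than 300 bp" can be sketched as follows. The abstract does not say how ORFs were delimited, so this sketch assumes ATG-to-stop ORFs on both strands, counts the stop codon in the length, and reports only the first ATG in each stop-free stretch; the function names are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_len: int = 300) -> list:
    """Return (strand, frame, start, end) for ATG..stop ORFs longer than min_len bp.

    Coordinates are 0-based, end-exclusive, and refer to the strand being read.
    """
    results = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG after the previous stop
                elif start is not None and codon in STOPS:
                    if i + 3 - start > min_len:
                        results.append((strand, frame, start, i + 3))
                    start = None
    return results


# Hypothetical toy sequence: a 306-bp ORF (ATG, 100 alanine codons, TAA).
toy = "ATG" + "GCT" * 100 + "TAA"
print(find_orfs(toy))
```

Because each frame is scanned independently, an ORF lying inside a longer one in a different frame or on the other strand is still reported, which is how ORFs "internal to other ones" would show up in such a scan.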
Using expected sequence features to improve basecalling accuracy of amplicon pyrosequencing data.
Rask, Thomas S; Petersen, Bent; Chen, Donald S; Day, Karen P; Pedersen, Anders Gorm
2016-04-22
Amplicon pyrosequencing targets a known genetic region and thus inherently produces reads highly anticipated to have certain features, such as conserved nucleotide sequence, and in the case of protein coding DNA, an open reading frame. Pyrosequencing errors, consisting mainly of nucleotide insertions and deletions, are on the other hand likely to disrupt open reading frames. Such an inverse relationship between errors and expectation based on prior knowledge can be used advantageously to guide the process known as basecalling, i.e. the inference of nucleotide sequence from raw sequencing data. The new basecalling method described here, named Multipass, implements a probabilistic framework for working with the raw flowgrams obtained by pyrosequencing. For each sequence variant Multipass calculates the likelihood and nucleotide sequence of several most likely sequences given the flowgram data. This probabilistic approach enables integration of basecalling into a larger model where other parameters can be incorporated, such as the likelihood for observing a full-length open reading frame at the targeted region. We apply the method to 454 amplicon pyrosequencing data obtained from a malaria virulence gene family, where Multipass generates 20% more error-free sequences than current state-of-the-art methods, and provides sequence characteristics that allow generation of a set of high confidence error-free sequences. This novel method can be used to increase accuracy of existing and future amplicon sequencing data, particularly where extensive prior knowledge is available about the obtained sequences, for example in analysis of the immunoglobulin VDJ region where Multipass can be combined with a model for the known recombining germline genes. Multipass is available for Roche 454 data at http://www.cbs.dtu.dk/services/MultiPass-1.0, and the concept can potentially be implemented for other sequencing technologies as well.
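The idea of combining a per-candidate likelihood with a prior expectation of an intact reading frame can be illustrated with a toy Bayesian reweighting. This is not the Multipass implementation: the candidate sequences, their likelihoods, the prior value `p_orf` and the function names are all hypothetical, and the flowgram model that would produce the likelihoods is not shown.

```python
STOPS = {"TAA", "TAG", "TGA"}


def has_full_orf(seq: str) -> bool:
    """True if seq is a whole number of codons with no stop codon in frame 0."""
    if len(seq) % 3 != 0:
        return False
    return all(seq[i:i + 3] not in STOPS for i in range(0, len(seq), 3))


def posterior(candidates: dict, p_orf: float = 0.9) -> dict:
    """Reweight candidate likelihoods by an assumed prior on ORF integrity.

    candidates maps sequence -> likelihood of the raw data given that sequence.
    p_orf is the assumed prior probability that the true amplicon is an
    intact reading frame.
    """
    weighted = {}
    for seq, likelihood in candidates.items():
        prior = p_orf if has_full_orf(seq) else 1.0 - p_orf
        weighted[seq] = likelihood * prior
    total = sum(weighted.values())
    return {seq: w / total for seq, w in weighted.items()}


# Hypothetical candidates: the second has a one-base insertion that breaks
# the frame but fits the (made-up) raw data slightly better.
candidates = {"ATGGCTGCTGGT": 0.4, "ATGGCTTGCTGGT": 0.6}
post = posterior(candidates)
best = max(post, key=post.get)
print(best, round(post[best], 3))
```

In this toy case the prior outweighs the small likelihood advantage of the frame-breaking candidate, which is the qualitative behaviour the abstract describes.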
Irie, S; Doi, S; Yorifuji, T; Takagi, M; Yano, K
1987-01-01
The nucleotide sequence of the genes from Pseudomonas putida encoding oxidation of benzene to catechol was determined. Five open reading frames were found in the sequence. Four corresponding protein molecules were detected by a DNA-directed in vitro translation system. Escherichia coli cells containing the fragment with the four open reading frames transformed benzene to cis-benzene glycol, which is an intermediate of the oxidation of benzene to catechol. The relation between the product of each cistron and the components of the benzene oxidation enzyme system is discussed. PMID:3667527
A retrotransposable element from the mosquito Anopheles gambiae.
Besansky, N J
1990-01-01
A family of middle repetitive elements from the African malaria vector Anopheles gambiae is described. Approximately 100 copies of the element, designated T1Ag, are dispersed in the genome. Full-length elements are 4.6 kilobase pairs in length, but truncation of the 5' end is common. Nucleotide sequences of one full-length, two 5'-truncated, and two 5' ends of T1Ag elements were determined and aligned to define a consensus sequence. Sequence analysis revealed two long, overlapping open reading frames followed by a polyadenylation signal, AATAAA, and a tail consisting of tandem repetitions of the motif TGAAA. No direct or inverted long terminal repeats (LTRs) were detected. The first open reading frame, 442 amino acids in length, includes a domain resembling that of nucleic acid-binding proteins. The second open reading frame, 975 amino acids long, resembles the reverse transcriptases of a category of retrotransposable elements without LTRs, variously termed class II retrotransposons, class III elements or non-LTR retrotransposons. Similarity at the sequence and structural levels places T1Ag in this category. PMID:1689457
Jonniaux, J L; Coster, F; Purnelle, B; Goffeau, A
1994-12-01
We report the amino acid sequence of 13 open reading frames (ORF > 299 bp) located on a 21.7 kb DNA segment from the left arm of chromosome XIV of Saccharomyces cerevisiae. Five open reading frames had been entirely or partially sequenced previously: WHI3, GCR2, SPX19, SPX18 and a heat shock gene similar to SSB1. The products of 8 other ORFs are new putative proteins among which N1394 is probably a membrane protein. N1346 contains a leucine zipper pattern and the corresponding ORF presents an HAP (global regulator of respiratory genes) upstream activating sequence in the promoting region. N1386 shares homologies with the DNA structure-specific recognition protein family SSRPs and the corresponding ORF is preceded by an MCB (MluI cell cycle box) upstream activating factor.
Benne, R; De Vries, B F; Van den Burg, J; Klaver, B
1983-01-01
The nucleotide sequence of a 2.5-kb segment of the maxi-circle of Trypanosoma brucei mtDNA has been determined. The segment contains the gene for apocytochrome b, which displays about 25% homology at the amino acid level to the apocytochrome b gene from fungal and mammalian mtDNAs. Northern blot and S1 nuclease analyses have yielded accurate map positions of an RNA species in an area that coincides with the reading frame. The segment also contains two pairs of overlapping unassigned reading frames, which lack homology with any known mitochondrial gene or URF. The DNA sequence in these areas is AG-rich (70%), resulting in URFs with an unusually high level of glycine and charged amino acids (60%). They may not encode proteins, in spite of their size and the fact that abundant transcripts are mapped in these areas. PMID:6314266
Identification of a non-LTR retrotransposon from the gypsy moth
K.J. Garner; J.M. Slavicek
1999-01-01
A family of highly repetitive elements, named LDT1, has been identified in the gypsy moth, Lymantria dispar. The complete element is 5.4 kb in length and lacks long-terminal repeats. The element contains two open reading frames with a significant amino acid sequence similarity to several non-LTR retrotransposons. The first open reading frame contains...
Ohno, S
1984-01-01
Three outstanding properties uniquely qualify repeats of base oligomers as the primordial coding sequences of all polypeptide chains. First, when compared with randomly generated base sequences in general, they are more likely to have long open reading frames. Second, periodical polypeptide chains specified by such repeats are more likely to assume either alpha-helical or beta-sheet secondary structures than are polypeptide chains of random sequence. Third, provided that the number of bases in the oligomeric unit is not a multiple of 3, these internally repetitious coding sequences are impervious to randomly sustained base substitutions, deletions, and insertions. This is because the recurring periodicity of their polypeptide chains is given by three consecutive copies of the oligomeric unit translated in three different reading frames. Accordingly, when one reading frame is open, the other two are automatically open as well, all three being capable of coding for polypeptide chains of identical periodicity. Under this circumstance, a frame shift due to the deletion or insertion of a number of bases that is not a multiple of 3 fails to alter the down-stream amino acid sequence, and even a base change causing premature chain-termination can silence only one of the three potential coding units. Newly arisen coding sequences in modern organisms are oligomeric repeats, and most of the older genes retain various vestiges of their original internal repetitions. Some of the genes (e.g., oncogenes) have even inherited the property of being impervious to randomly sustained base changes.
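The claim that all three reading frames of an oligomeric repeat specify polypeptides of identical periodicity can be checked on a small example. The sketch below uses a hypothetical 7-base unit (7 is not a multiple of 3) chosen so that no frame contains a stop codon; the unit itself is an assumption and does not come from the abstract.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str, frame: int = 0) -> str:
    """Translate every complete codon of seq starting at the given frame."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3))


UNIT = "GCTGGAC"   # hypothetical 7-base oligomer
repeat = UNIT * 6  # 42 bases
peptides = [translate(repeat, frame) for frame in range(3)]
for frame, peptide in enumerate(peptides):
    print(frame, peptide)
```

Three copies of the unit (21 bases) make exactly seven codons, so each frame yields a peptide with a seven-residue period, and the three periods are cyclic rotations of one another. This is why a frameshift inside such a repeat leaves the downstream periodicity unchanged; the codon at the site of the change itself can still differ.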
Identification of the initiation site of poliovirus polyprotein synthesis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dorner, A.J.; Dorner, L.F.; Larsen, G.R.
1982-06-01
The complete nucleotide sequence of poliovirus RNA has a long open reading frame capable of encoding the precursor polyprotein NCVPOO. The first AUG codon in this reading frame is located 743 nucleotides from the 5' end of the RNA and is preceded by eight AUG codons in all three reading frames. Because all proteins that map at the amino terminus of the polyprotein (P1-1a, VPO, and VP4) are blocked at their amino termini and previous studies of ribosome binding have been inconclusive, direct identification of the initiation site of protein synthesis was difficult. We separated and identified all of the tryptic peptides of capsid protein VP4 and correlated these peptides with the amino acid sequence predicted to follow the AUG codon at nucleotide 743. Our data indicate that VP4 begins with a blocked glycine that is encoded immediately after the AUG codon at nucleotide 743. An S1 nuclease analysis of poliovirus mRNA failed to reveal a splice in the 5' region. We concluded that synthesis of poliovirus polyprotein is initiated at nucleotide 743, the first AUG codon in the long open reading frame.
Hall, R L; Moyer, R W
1991-01-01
Entomopoxvirus virions are frequently contained within crystalline occlusion bodies, which are composed primarily of a single protein, spheroidin, which is analogous to the polyhedrin protein of baculovirus. The spheroidin gene of Amsacta moorei entomopoxvirus was identified following the microsequencing of polypeptides generated from cyanogen bromide treatment of spheroidin and the subsequent synthesis of oligonucleotide hybridization probes. DNA sequencing of a 6.8-kb region of DNA containing the spheroidin gene showed that the spheroidin protein is derived from a 3.0-kb open reading frame potentially encoding a protein of 115 kDa. Three copies of the heptanucleotide, TTTTTNT, a sequence associated with early gene transcription in the vertebrate poxviruses, and four in-frame translational termination signals were found within 60 bp upstream of the putative spheroidin gene promoter (TAAATG). The spheroidin gene promoter region contains the sequence TAAATG, which is found in many late promoters of the vertebrate poxviruses and which serves as the site of transcriptional initiation, as shown by primer extension. Primer extension experiments also showed that spheroidin gene transcripts contain 5' poly(A) sequences typical of vertebrate poxvirus late transcripts. The 92 bases upstream of the initiating TAAATG are unusually A + T rich and contain only 7 G or C residues. An analysis of open reading frames around the spheroidin gene suggests that the colinear core of "essential genes" typical of the vertebrate poxviruses is absent in A. moorei entomopoxvirus. PMID:1942245
Harper, B; McClain, S; Ganko, E W
2012-08-01
Global regulatory agencies require bioinformatic sequence analysis as part of their safety evaluation for transgenic crops. Analysis typically focuses on encoded proteins and adjacent endogenous flanking sequences. Recently, regulatory expectations have expanded to include all reading frames of the inserted DNA. The intent is to provide biologically relevant results that can be used in the overall assessment of safety. This paper evaluates the relevance of assessing the allergenic potential of all DNA reading frames found in common food genes using methods considered for the analysis of T-DNA sequences used in transgenic crops. FASTA and BLASTX algorithms were used to compare genes from maize, rice, soybean, cucumber, melon, watermelon, and tomato using international regulatory guidance. Results show that BLASTX for maize yielded 7254 alignments that exceeded allergen similarity thresholds and 210,772 alignments that matched eight or more consecutive amino acids with an allergen; other crops produced similar results. This analysis suggests that each nontransgenic crop has a much greater potential for allergenic risk than what has been observed clinically. We demonstrate that a meaningful safety assessment is unlikely to be provided by using methods with inherently high frequencies of false positive alignments when broadly applied to all reading frames of DNA sequence.
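The criterion of "eight or more consecutive amino acids" shared with an allergen amounts to an exact 8-mer match. A minimal sketch of that check is given below; it uses plain set lookup rather than FASTA or BLASTX, and the two protein sequences are hypothetical toy strings, not real allergens.

```python
def shared_kmers(query: str, allergen: str, k: int = 8) -> list:
    """Return (position, k-mer) pairs where query shares an exact k-mer with allergen.

    Positions are 0-based offsets into the query.
    """
    allergen_kmers = {allergen[i:i + k] for i in range(len(allergen) - k + 1)}
    return [
        (i, query[i:i + k])
        for i in range(len(query) - k + 1)
        if query[i:i + k] in allergen_kmers
    ]


# Hypothetical toy sequences sharing a nine-residue stretch.
query = "MKTAYIAKQRQISFVK"
allergen = "GGAYIAKQRQIPP"
print(shared_kmers(query, allergen))
```

Applied to the translations of all six reading frames of many genes, a short exact-match window like this produces hits by chance alone, which is the false-positive concern the abstract raises.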
Sellem, C. H.; d'Aubenton-Carafa, Y.; Rossignol, M.; Belcour, L.
1996-01-01
The mitochondrial genome of 23 wild-type strains belonging to three different species of the filamentous fungus Podospora was examined. Among the 15 optional sequences identified are two intronic reading frames, nad1-i4-orf1 and cox1-i7-orf2. We show that the presence of these sequences was strictly correlated with tightly clustered nucleotide substitutions in the adjacent exon. This correlation applies to the presence or absence of closely related open reading frames (ORFs), found at the same genetic locations, in all the Pyrenomycete genera examined. The recent gain of these optional ORFs in the evolution of the genus Podospora probably accounts for such sequence differences. In the homoplasmic progeny from heteroplasmons constructed between Podospora strains differing by the presence of these optional ORFs, nad1-i4-orf1 and cox1-i7-orf2 appeared highly invasive. Sequence comparisons in the nad1-i4 intron of various strains of the Pyrenomycete family led us to propose a scenario of its evolution that includes several events of loss and gain of intronic ORFs. These results strongly reinforce the idea that group I intronic ORFs are mobile elements and that their transfer, and concomitant modification of the adjacent exon, could participate in the modular evolution of mitochondrial genomes. PMID:8725226
Sellem, C H; d'Aubenton-Carafa, Y; Rossignol, M; Belcour, L
1996-06-01
The mitochondrial genome of 23 wild-type strains belonging to three different species of the filamentous fungus Podospora was examined. Among the 15 optional sequences identified are two intronic reading frames, nad1-i4-orf1 and cox1-i7-orf2. We show that the presence of these sequences was strictly correlated with tightly clustered nucleotide substitutions in the adjacent exon. This correlation applies to the presence or absence of closely related open reading frames (ORFs), found at the same genetic locations, in all the Pyrenomycete genera examined. The recent gain of these optional ORFs in the evolution of the genus Podospora probably accounts for such sequence differences. In the homoplasmic progeny from heteroplasmons constructed between Podospora strains differing by the presence of these optional ORFs, nad1-i4-orf1 and cox1-i7-orf2 appeared highly invasive. Sequence comparisons in the nad1-i4 intron of various strains of the Pyrenomycete family led us to propose a scenario of its evolution that includes several events of loss and gain of intronic ORFs. These results strongly reinforce the idea that group I intronic ORFs are mobile elements and that their transfer, and concomitant modification of the adjacent exon, could participate in the modular evolution of mitochondrial genomes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tarlinton, D.; Strasser, A.; McLean, M.
1995-04-01
Mouse B cell precursors containing Ig DHJH junctions in one particular reading frame are selectively lost during B cell development. In this register, arbitrarily referred to as reading frame 2, DHJH junctions give rise to an open reading frame starting upstream of the DH element and including the DHJH-peptide fused to the constant region of IgM. Expression of this protein, called Dμ, has been strongly implicated in the loss of B cell precursors containing reading frame 2 DHJH junctions. In an attempt to elucidate the means of Dμ counterselection, we have examined the reading frame distribution of DHJH junctions in peripheral B cells from mice transgenic for either the human bcl-2 oncogene or for a functionally rearranged Ig μ heavy chain. In bcl-2 transgenic mice, reading frame 2 accounted for <5% of the DHJH junctions in peripheral B cells, a value not significantly different from controls. Reading frames 1 and 3 were equally represented among the remaining junctions. By contrast, the reading frame distribution of endogenous DHJH junctions in splenic B cells from Ig μ heavy chain transgenic mice showed no evidence of bias against Dμ encoding DHJH junctions. Reading frames 2 and 3 accounted for 27% and 30% of the sequenced DHJH junctions, respectively, and the remaining 43% were reading frame 1. Thus although the presence of BCL-2 cannot prevent the selective loss of reading frame 2 DHJH B cells, a functional μ heavy chain can. These results suggest that Dμ-expressing B cell precursors may be selectively lost because of the premature and inappropriate cessation of heavy chain gene rearrangement rather than because of the induction of an apoptotic process which can be blocked by BCL-2. 42 refs., 4 figs., 4 tabs.
CCC CGA is a weak translational recoding site in Escherichia coli.
Shu, Ping; Dai, Huacheng; Mandecki, Wlodek; Goldman, Emanuel
2004-12-08
Previously published experiments had indicated unexpected expression of a control vector in which a beta-galactosidase reporter was in the +1 reading frame relative to the translation start. This control vector contained the codon pair CCC CGA in the zero reading frame, raising the possibility that ribosomes rephased on this sequence, with peptidyl-tRNA(Pro) pairing with CCC in the +1 frame. This putative rephasing might also be exacerbated by the rare CGA Arg codon in the second position due to increased vacancy of the ribosomal A-site. To test this hypothesis, a series of site-directed mutants was constructed, including mutations in both the first and second codons of this codon pair. The results show that interrupting the continuous run of C residues with synonymous codon changes essentially abolishes the frameshift. Further, changing the rare Arg codon to a common Arg codon also reduces the frequency of the frameshift. These results provide strong support for the hypothesis that CCC CGA in the zero frame is indeed a weak translational frameshift site in Escherichia coli, with a 1-2% efficiency. Because the vector sequence also contains another CCC triplet in the +1 reading frame starting within the next codon after the CGA, our data also support possible contribution to expression of a +7 nucleotide ribosome hop into the same +1 reading frame. We also confirm here a previous report that CCC UGA is a translational frameshift site, in these experiments, with about 5% efficiency.
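The effect of a +1 slip at CCC CGA on the translated product can be sketched on a toy message. The model below assumes the simplest reading of the abstract: the ribosome decodes CCC, the peptidyl-tRNA re-pairs with the overlapping CCC one base downstream, and decoding continues in the +1 frame, which is equivalent to skipping one base after the CCC codon. The sequence and function names are hypothetical, and the +7 hop mentioned in the abstract is not modelled.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate from position 0 until the first stop codon or the end."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def plus_one_shift(seq: str, site: str = "CCCCGA") -> str:
    """Translate with a +1 slip at the first in-frame occurrence of site."""
    for i in range(0, len(seq) - len(site) + 1, 3):
        if seq[i:i + len(site)] == site:
            # Read through CCC, skip one base, continue in the +1 frame.
            return translate(seq[:i + 3]) + translate(seq[i + 4:])
    return translate(seq)


# Hypothetical toy message: ATG CCC CGA GCT TAA GGC TGA
toy = "ATGCCCCGAGCTTAAGGCTGA"
print(translate(toy))
print(plus_one_shift(toy))
```

In the toy message the zero-frame product stops at the TAA, while the shifted product reads past it in the +1 frame, mirroring how a reporter placed in the +1 frame could be expressed at a low level.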
Drosophila Nora virus capsid proteins differ from those of other picorna-like viruses.
Ekström, Jens-Ola; Habayeb, Mazen S; Srivastava, Vaibhav; Kieselbach, Thomas; Wingsle, Gunnar; Hultmark, Dan
2011-09-01
The recently discovered Nora virus from Drosophila melanogaster is a single-stranded RNA virus. Its published genomic sequence encodes a typical picorna-like cassette of replicative enzymes, but no capsid proteins similar to those in other picorna-like viruses. We have now done additional sequencing at the termini of the viral genome, extending it by 455 nucleotides at the 5' end, but no more coding sequence was found. The completeness of the final 12,333-nucleotide sequence was verified by the production of infectious virus from the cloned genome. To identify the capsid proteins, we purified Nora virus particles and analyzed their proteins by mass spectrometry. Our results show that the capsid is built from three major proteins, VP4A, B and C, encoded in the fourth open reading frame of the viral genome. The viral particles also contain traces of a protein from the third open reading frame, VP3. VP4A and B are not closely related to other picorna-like virus capsid proteins in sequence, but may form similar jelly roll folds. VP4C differs from the others and is predicted to have an essentially α-helical conformation. In a related virus, identified from EST database sequences from Nasonia parasitoid wasps, VP4C is encoded in a separate open reading frame, separated from VP4A and B by a frame-shift. This opens a possibility that VP4C is produced in non-equimolar quantities. Altogether, our results suggest that the Nora virus capsid has a different protein organization compared to the order Picornavirales. Copyright © 2011 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sellem, C.H.; Rossignol, M.; Belcour, L.
1996-06-01
The mitochondrial genome of 23 wild-type strains belonging to three different species of the filamentous fungus Podospora was examined. Among the 15 optional sequences identified are two intronic reading frames, nad1-i4-orf1 and cox1-i7-orf2. We show that the presence of these sequences was strictly correlated with tightly clustered nucleotide substitutions in the adjacent exon. This correlation applies to the presence or absence of closely related open reading frames (ORFs), found at the same genetic locations, in all the Pyrenomycete genera examined. The recent gain of these optional ORFs in the evolution of the genus Podospora probably accounts for such sequence differences. In the homoplasmic progeny from heteroplasmons constructed between Podospora strains differing by the presence of these optional ORFs, nad1-i4-orf1 and cox1-i7-orf2 appeared highly invasive. Sequence comparisons in the nad1-i4 intron of various strains of the Pyrenomycete family led us to propose a scenario of its evolution that includes several events of loss and gain of intronic ORFs. These results strongly reinforce the idea that group I intronic ORFs are mobile elements and that their transfer, and concomitant modification of the adjacent exon, could participate in the modular evolution of mitochondrial genomes. 46 refs., 5 figs., 2 tabs.
Bergeron, Danny; Lapointe, Catherine; Bissonnette, Cyntia; Tremblay, Guillaume; Motard, Julie; Roucou, Xavier
2013-01-01
Spinocerebellar ataxia type 1 is an autosomal dominant cerebellar ataxia associated with the expansion of a polyglutamine tract within the ataxin-1 (ATXN1) protein. Recent studies suggest that understanding the normal function of ATXN1 in cellular processes is essential to decipher the pathogenesis mechanisms in spinocerebellar ataxia type 1. We found an alternative translation initiation ATG codon in the +3 reading frame of human ATXN1 starting 30 nucleotides downstream of the initiation codon for ATXN1 and ending at nucleotide 587. This novel overlapping open reading frame (ORF) encodes a 21-kDa polypeptide termed Alt-ATXN1 (Alternative ATXN1) with a completely different amino acid sequence from ATXN1. We introduced a hemagglutinin tag in-frame with Alt-ATXN1 in ATXN1 cDNA and showed in cell culture the co-expression of both ATXN1 and Alt-ATXN1. Remarkably, Alt-ATXN1 colocalized and interacted with ATXN1 in nuclear inclusions. In contrast, in the absence of ATXN1 expression, Alt-ATXN1 displays a homogenous nucleoplasmic distribution. Alt-ATXN1 interacts with poly(A)+ RNA, and its nuclear localization is dependent on RNA transcription. Polyclonal antibodies raised against Alt-ATXN1 confirmed the expression of Alt-ATXN1 in human cerebellum expressing ATXN1. These results demonstrate that human ATXN1 gene is a dual coding sequence and that ATXN1 interacts with and controls the subcellular distribution of Alt-ATXN1. PMID:23760502
Nishimura, Yuki; Kamikawa, Ryoma; Hashimoto, Tetsuo; Inagaki, Yuji
2014-01-01
Mitochondrial (mt) genome sequences, which often bear introns, have been sampled from phylogenetically diverse eukaryotes. Thus, we can anticipate novel insights into intron evolution from previously unstudied mt genomes. We here investigated the origins and evolution of three introns in the mt genome of the haptophyte Chrysochromulina sp. NIES-1333, which was sequenced completely in this study. All the three introns were characterized as group II, on the basis of predicted secondary structure, and the conserved sequence motifs at the 5′ and 3′ termini. Our comparative studies on diverse mt genomes prompt us to propose that the Chrysochromulina mt genome laterally acquired the introns from mt genomes in distantly related eukaryotes. Many group II introns harbor intronic open reading frames for the proteins (intron-encoded proteins or IEPs), which likely facilitate the splicing of their host introns. However, we propose that a “free-standing,” IEP-like protein, which is not encoded within any introns in the Chrysochromulina mt genome, is involved in the splicing of the first cox1 intron that lacks any open reading frames. PMID:25054084
Complete genome sequence of a new maize-associated cytorhabdovirus
USDA-ARS's Scientific Manuscript database
A new 11,877 nt cytorhabdovirus sequence with 6 open reading frames has been identified in a maize sample. It shares 50 and 51% genome-wide nucleotide sequence identity with northern cereal mosaic cytorhabdovirus (NCMV) and barley yellow striate mosaic cytorhabdovirus (BYSMV), respectively....
Archaebacterial rhodopsin sequences: Implications for evolution
NASA Technical Reports Server (NTRS)
Lanyi, J. K.
1991-01-01
It was proposed over 10 years ago that the archaebacteria represent a separate kingdom which diverged very early from the eubacteria and eukaryotes. It follows that investigations of archaebacterial characteristics might reveal features of early evolution. So far, two genes, one for bacteriorhodopsin and another for halorhodopsin, both from Halobacterium halobium, have been sequenced. We cloned and sequenced the gene coding for the polypeptide of another one of these rhodopsins, a halorhodopsin in Natronobacterium pharaonis. Peptide sequencing of cyanogen bromide fragments, and immuno-reactions of the protein and synthetic peptides derived from the C-terminal gene sequence, confirmed that the open reading frame was the structural gene for the pharaonis halorhodopsin polypeptide. The flanking DNA sequences of this gene, as well as those of other bacterial rhodopsins, were compared to previously proposed archaebacterial consensus sequences. In pairwise comparisons of the open reading frame with DNA sequences for bacterio-opsin and halo-opsin from Halobacterium halobium, silent divergences were calculated. These indicate very considerable evolutionary distance between each pair of genes, even in the same organism. In spite of this, three protein sequences show extensive similarities, indicating strong selective pressures.
Lafuente, M J; Gamo, F J; Gancedo, C
1996-09-01
We have determined the sequence of a 10624 bp DNA segment located in the left arm of chromosome XV of Saccharomyces cerevisiae. The sequence contains eight open reading frames (ORFs) longer than 100 amino acids. Two of them do not present significant homology with sequences found in the databases. The product of ORF o0553 is identical to the protein encoded by the gene SMF1. Internal to it there is another ORF, o0555, that is apparently expressed. The proteins encoded by ORFs o0559 and o0565 are identical to ribosomal proteins S19.e and L18, respectively. ORF o0550 encodes a protein with an RNA binding signature including RNP motifs and stretches rich in asparagine, glutamine and arginine.
Pelsy, F.; Merdinoglu, D.
2002-09-01
A chromosome-walking strategy was used to sequence and characterize retrotransposons in the grapevine genome. The reconstitution of a family of retroelements, named Tvv1, was achieved by six successive steps. These elements share a single, highly conserved open reading frame 4,153 nucleotides-long, putatively encoding the gag, pro, int, rt and rh proteins. Comparison of the Tvv1 open reading frame coding potential with those of drosophila copia and tobacco Tnt1, revealed that Tvv1 is closely related to Ty 1 copia-like retrotransposons. A highly variable untranslated leader region, upstream of the open reading frame, allowed us to differentiate Tvv1 variants, which represent a family of at least 28 copies, in varying sizes. This internal region is flanked by two long terminal repeats in direct orientation, sized between 149 and 157 bp. Among elements theoretically sized from 4,970 to 5,550 bp, we describe the full-length sequence of a reference element Tvv1-1, 5,343 nucleotides-long. The full-length sequence of Tvv1-1 compared to pea PDR1 shows a 53.3% identity. In addition, both elements contain long terminal repeats of nearly the same size in which the U5 region could be entirely absent. Therefore, we assume that Tvv1 and PDR1 could constitute a particular class of short LTRs retroelements.
Circular codes revisited: a statistical approach.
Gonzalez, D L; Giannerini, S; Rosa, R
2011-04-21
In 1996 Arquès and Michel [1996. A complementary circular code in the protein coding genes. J. Theor. Biol. 182, 45-58] discovered the existence of a common circular code in eukaryote and prokaryote genomes. Since then, circular code theory has provoked great interest and undergone rapid development. In this paper we discuss some theoretical issues related to the synchronization properties of coding sequences and circular codes, with particular emphasis on the problem of retrieval and maintenance of the reading frame. Motivated by the theoretical discussion, we adopt a rigorous statistical approach in order to try to answer different questions. First, we investigate the covering capability of the whole class of 216 self-complementary, C(3) maximal codes with respect to a large set of coding sequences. The results indicate that, on average, the code proposed by Arquès and Michel has the best covering capability but, still, there exists a great variability among sequences. Second, we focus on this code and explore the role played by the proportion of the bases by means of a hierarchy of permutation tests. The results show the existence of a sort of optimization mechanism such that coding sequences are tailored so as to maximize or minimize the coverage of circular codes on specific reading frames. Such optimization clearly relates the function of circular codes to reading frame synchronization. Copyright © 2011 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Mishra, Bhavya; Schütz, Gunter M.; Chowdhury, Debashish
2016-06-01
We develop a stochastic model for the programmed frameshift of ribosomes synthesizing a protein while moving along an mRNA template. Normally the reading frame of a ribosome decodes successive triplets of nucleotides on the mRNA in a step-by-step manner. We focus on the programmed shift of the ribosomal reading frame, forward or backward, by only one nucleotide, which results in a fusion protein; it occurs when a ribosome temporarily loses its grip on its mRNA track. Special “slippery” sequences of nucleotides and also downstream secondary structures of the mRNA strand are believed to play key roles in programmed frameshift. Here we explore the role of a hitherto neglected parameter in regulating -1 programmed frameshift. Specifically, we demonstrate that the frameshift frequency can also be strongly regulated by the density of the ribosomes, all of which are engaged in simultaneous translation of the same mRNA, at and around the slippery sequence. Monte Carlo simulations support the analytical predictions obtained from a mean-field analysis of the stochastic dynamics.
Spliced RNA of woodchuck hepatitis virus.
Ogston, C W; Razman, D G
1992-07-01
Polymerase chain reaction was used to investigate RNA splicing in liver of woodchucks infected with woodchuck hepatitis virus (WHV). Two spliced species were detected, and the splice junctions were sequenced. The larger spliced RNA has an intron of 1300 nucleotides, and the smaller spliced sequence shows an additional downstream intron of 1104 nucleotides. We did not detect singly spliced sequences from which the smaller intron alone was removed. Control experiments showed that spliced sequences are present in both RNA and DNA in infected liver, showing that the viral reverse transcriptase can use spliced RNA as template. Spliced sequences were detected also in virion DNA prepared from serum. The upstream intron produces a reading frame that fuses the core to the polymerase polypeptide, while the downstream intron causes an inframe deletion in the polymerase open reading frame. Whereas the splicing patterns in WHV are superficially similar to those reported recently in hepatitis B virus, we detected no obvious homology in the coding capacity of spliced RNAs from these two viruses.
Lipinska, B; Rao, A S; Bolten, B M; Balakrishnan, R; Goldberg, E B
1989-01-01
We sequenced bacteriophage T4 genes 2 and 3 and the putative C-terminal portion of gene 50. They were found to have appropriate open reading frames directed counterclockwise on the T4 map. Mutations in genes 2 and 64 were shown to be in the same open reading frame, which we now call gene 2. This gene codes for a protein of 27,068 daltons. The open reading frame corresponding to gene 3 codes for a protein of 20,634 daltons. Appropriate bands on polyacrylamide gels were identified at 30 and 20 kilodaltons, respectively. We found that the product of the cloned gene 2 can protect T4 DNA double-stranded ends from exonuclease V action. PMID:2644202
Tobin, M B; Kovacevic, S; Madduri, K; Hoskins, J A; Skatrud, P L; Vining, L C; Stuttard, C; Miller, J R
1991-01-01
Lysine epsilon-aminotransferase (LAT) in the beta-lactam-producing actinomycetes is considered to catalyze the first step in the antibiotic biosynthetic pathway. Cloning of restriction fragments from Streptomyces clavuligerus, a beta-lactam producer, into Streptomyces lividans, a nonproducer that lacks LAT activity, led to the production of LAT in the host. DNA sequencing of restriction fragments containing the putative lat gene revealed a single open reading frame encoding a polypeptide with an Mr of approximately 49,000. Expression of this coding sequence in Escherichia coli led to the production of LAT activity. Hence, LAT activity in S. clavuligerus is derived from a single polypeptide. A second open reading frame began immediately downstream from lat. Comparison of this partial sequence with the sequences of delta-(L-alpha-aminoadipyl)-L-cysteinyl-D-valine (ACV) synthetases from Penicillium chrysogenum and Cephalosporium acremonium and with nonribosomal peptide synthetases (gramicidin S and tyrocidine synthetases) found similarities among the open reading frames. Since mapping of the putative N and C termini of S. clavuligerus pcbAB suggests that the coding region occupies approximately 12 kbp and codes for a polypeptide related in size to the fungal ACV synthetases, the molecular characterization of the beta-lactam biosynthetic cluster between pcbC and cefE (approximately 25 kbp) is nearly complete. PMID:1917855
Origins of Genes: "Big Bang" or Continuous Creation?
NASA Astrophysics Data System (ADS)
Keese, Paul K.; Gibbs, Adrian
1992-10-01
Many protein families are common to all cellular organisms, indicating that many genes have ancient origins. Genetic variation is mostly attributed to processes such as mutation, duplication, and rearrangement of ancient modules. Thus it is widely assumed that much of present-day genetic diversity can be traced by common ancestry to a molecular "big bang." A rarely considered alternative is that proteins may arise continuously de novo. One mechanism of generating different coding sequences is by "overprinting," in which an existing nucleotide sequence is translated de novo in a different reading frame or from noncoding open reading frames. The clearest evidence for overprinting is provided when the original gene function is retained, as in overlapping genes. Analysis of their phylogenies indicates which are the original genes and which are their informationally novel partners. We report here the phylogenetic relationships of overlapping coding sequences from steroid-related receptor genes and from tymovirus, luteovirus, and lentivirus genomes. For each pair of overlapping coding sequences, one is confined to a single lineage, whereas the other is more widespread. This suggests that the phylogenetically restricted coding sequence arose only in the progenitor of that lineage by translating an out-of-frame sequence to yield the new polypeptide. The production of novel exons by alternative splicing in thyroid receptor and lentivirus genes suggests that introns can be a valuable evolutionary source for overprinting. New genes and their products may drive major evolutionary changes.
The Status of Exon Skipping as a Therapeutic Approach to Duchenne Muscular Dystrophy
Lu, Qi-Long; Yokota, Toshifumi; Takeda, Shin'ichi; Garcia, Luis; Muntoni, Francesco; Partridge, Terence
2011-01-01
Duchenne muscular dystrophy (DMD) is associated with mutations in the dystrophin gene that disrupt the open reading frame, whereas the milder Becker's form is associated with mutations which leave an in-frame mRNA transcript that can be translated into a protein that includes the N- and C-terminal functional domains. It has been shown that by excluding specific exons at, or adjacent to, frame-shifting mutations, the open reading frame can be restored to an out-of-frame mRNA, leading to the production of a partially functional Becker-like dystrophin protein. Such targeted exclusion can be achieved by administration of oligonucleotides that are complementary to sequences that are crucial to normal splicing of the exon into the transcript. This principle has been validated in mouse and canine models of DMD with a number of variants of oligonucleotide analogue chemistries and by transduction with adeno-associated virus (AAV)-small nuclear RNA (snRNA) reagents encoding the antisense sequence. Two different oligonucleotide agents are now being investigated in human trials for splicing out of exon 51 with some early indications of success at the biochemical level. PMID:20978473
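The frame-restoration principle in the abstract above can be illustrated with a minimal sketch. It assumes only that a transcript stays in frame when the total length of the removed exons is a multiple of three; it ignores everything else that matters in practice (loss of functional domains, stop codons created at new junctions, splicing efficiency). The exon numbers and lengths are hypothetical and are not dystrophin data.

```python
# Hypothetical exon lengths in nucleotides (NOT real dystrophin exons).
EXON_LENGTHS = {1: 90, 2: 100, 3: 110, 4: 120}


def is_in_frame(exon_lengths, removed_exons):
    """Return True if removing the given exons leaves the downstream frame intact.

    Removing a block of exons preserves the reading frame only when the
    total number of removed nucleotides is divisible by 3.
    """
    removed = sum(exon_lengths[number] for number in removed_exons)
    return removed % 3 == 0


# A deletion of exon 2 alone (100 nt) shifts the frame ...
deletion_only = is_in_frame(EXON_LENGTHS, {2})
# ... but additionally skipping exon 3 (100 + 110 = 210 nt) restores it.
deletion_plus_skip = is_in_frame(EXON_LENGTHS, {2, 3})

print("deletion of exon 2 in frame:", deletion_only)
print("deletion of exon 2 + skip of exon 3 in frame:", deletion_plus_skip)
```

In this toy case the deletion alone removes 100 nt and shifts the frame, while the deletion plus the skipped neighbouring exon removes 210 nt and leaves the downstream codons readable.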
Wright, Imogen A.; Travers, Simon A.
2014-01-01
The challenge presented by high-throughput sequencing necessitates the development of novel tools for accurate alignment of reads to reference sequences. Current approaches focus on using heuristics to map reads quickly to large genomes, rather than generating highly accurate alignments in coding regions. Such approaches are, thus, unsuited for applications such as amplicon-based analysis and the realignment phase of exome sequencing and RNA-seq, where accurate and biologically relevant alignment of coding regions is critical. To facilitate such analyses, we have developed a novel tool, RAMICS, that is tailored to mapping large numbers of sequence reads to short lengths (<10 000 bp) of coding DNA. RAMICS utilizes profile hidden Markov models to discover the open reading frame of each sequence and aligns to the reference sequence in a biologically relevant manner, distinguishing between genuine codon-sized indels and frameshift mutations. This approach facilitates the generation of highly accurate alignments, accounting for the error biases of the sequencing machine used to generate reads, particularly at homopolymer regions. Performance improvements are gained through the use of graphics processing units, which increase the speed of mapping through parallelization. RAMICS substantially outperforms all other mapping approaches tested in terms of alignment quality while maintaining highly competitive speed performance. PMID:24861618
Gabe, Jeffrey D.; Dragon, Elizabeth; Chang, Ray-Jen; McCaman, Michael T.
1998-01-01
A tandem pair of nearly identical genes from Serpulina hyodysenteriae (B204) were cloned and sequenced. The full open reading frame of one gene and the partial open reading frame of the neighboring gene appear to encode secreted proteins which are homologous to, yet distinct from, the 39-kDa extracytoplasmic protein purified from the membrane fraction of S. hyodysenteriae. We have designated these newly identified genes vspA and vspB (for variable surface protein). PMID:9440540
Draft genome sequence of rice orange leaf phytoplasma from Guangdong, China
USDA-ARS's Scientific Manuscript database
The genome of rice orange leaf phytoplasma strain LD1 from Luoding City, Guangdong, P. R. China, was sequenced. The draft LD1 genome is 599,264 bp with GC content of 28.2%, 647 predicted open reading frames and 33 RNA genes....
Constructing high complexity synthetic libraries of long ORFs using in vitro selection
NASA Technical Reports Server (NTRS)
Cho, G.; Keefe, A. D.; Liu, R.; Wilson, D. S.; Szostak, J. W.
2000-01-01
We present a method that can significantly increase the complexity of protein libraries used for in vitro or in vivo protein selection experiments. Protein libraries are often encoded by chemically synthesized DNA, in which part of the open reading frame is randomized. There are, however, major obstacles associated with the chemical synthesis of long open reading frames, especially those containing random segments. Insertions and deletions that occur during chemical synthesis cause frameshifts, and stop codons in the random region will cause premature termination. These problems can together greatly reduce the number of full-length synthetic genes in the library. We describe a strategy in which smaller segments of the synthetic open reading frame are selected in vitro using mRNA display for the absence of frameshifts and stop codons. These smaller segments are then ligated together to form combinatorial libraries of long uninterrupted open reading frames. This process can increase the number of full-length open reading frames in libraries by up to two orders of magnitude, resulting in protein libraries with complexities of greater than 10(13). We have used this methodology to generate three types of displayed protein library: a completely random sequence library, a library of concatemerized oligopeptide cassettes with a propensity for forming amphipathic alpha-helical or beta-strand structures, and a library based on one of the most common enzymatic scaffolds, the alpha/beta (TIM) barrel. Copyright 2000 Academic Press.
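To give a feel for why stop codons limit long random open reading frames, the following sketch computes the expected fraction of stop-free sequences. It assumes fully random NNN codons, with all 64 triplets equally likely and 3 of them stops; this is an illustrative assumption, not the randomization scheme of the study, and it does not model frameshifts from synthesis errors. The segment lengths are hypothetical.

```python
STOP_CODONS = 3      # TAA, TAG, TGA in the standard genetic code
TOTAL_CODONS = 64    # all possible nucleotide triplets


def fraction_stop_free(n_codons):
    """Expected fraction of random NNN segments of n_codons with no stop codon."""
    return ((TOTAL_CODONS - STOP_CODONS) / TOTAL_CODONS) ** n_codons


# Hypothetical segment lengths, chosen only to show the trend.
for n in (20, 50, 100, 200):
    print(f"{n:4d} random codons: {fraction_stop_free(n):.4%} stop-free")
```

Under this assumption the stop-free fraction decays geometrically with length, which is consistent with the abstract's motivation for pre-selecting shorter segments before ligating them.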
Khrustalev, Vladislav Victorovich; Ermalovich, Marina Anatolyevna; Hübschen, Judith M; Khrustaleva, Tatyana Aleksandrovna
2017-12-21
In this study we used non-overlapping parts of the two long open reading frames coding for nonstructural (NS) and capsid (VP) proteins of all available sequences of the Parvovirus B19 subgenotype 1a genome and found out that the rates of A to G, C to T and A to T mutations are higher in the first long reading frame (NS) of the virus than in the second one (VP). This difference in mutational pressure directions for two parts of the same viral genome can be explained by the fact of transcription of just the first long reading frame during the lifelong latency in nonerythroid cells. Adenine deamination (producing A to G and A to T mutations) and cytosine deamination (producing C to T mutations) occur more frequently in transcriptional bubbles formed by DNA "plus" strand of the first open reading frame. These mutations can be inherited only in case of reactivation of the infectious virus due to the help of Adenovirus that allows latent Parvovirus B19 to start transcription of the second reading frame and then to replicate its genome by the rolling circle mechanism using the specific origin. Results of this study provide evidence that the genomes reactivated from latency make significant contributions to the variability of Parvovirus B19. Copyright © 2017 Elsevier Ltd. All rights reserved.
Montandon, P E; Vasserot, A; Stutz, E
1986-01-01
We retrieved a 1.6 kbp intron separating two exons of the psbC gene, which codes for the 44 kDa reaction center protein of photosystem II. This intron is 3 to 4 times the size of all previously sequenced Euglena gracilis chloroplast introns. It contains an open reading frame of 458 codons potentially coding for a basic protein of 54 kDa of as yet unknown function. The intron boundaries follow consensus sequences established for chloroplast introns related to class II and nuclear pre-mRNA introns. Its 3'-terminal segment has structural features similar to class II mitochondrial introns, with an invariant base A as possible branch point for lariat formation.
From Mosquitos to Humans: Genetic Evolution of Zika Virus
Wang, Lulan; Valderramos, Stephanie G.; Wu, Aiping; Ouyang, Songying; Li, Chunfeng; Brasil, Patricia; Bonaldo, Myrna; Coates, Thomas; Nielsen-Saines, Karin; Jiang, Taijiao; Aliyari, Roghiyh; Cheng, Genhong
2017-01-01
Initially isolated in 1947, Zika virus (ZIKV) has recently emerged as a significant public health concern. Sequence analysis of all 41 known ZIKV RNA open reading frames to date indicates that ZIKV has undergone significant changes in both protein and nucleotide sequences during the past half century. PMID:27091703
JavaScript DNA translator: DNA-aligned protein translations.
Perry, William L
2002-12-01
There are many instances in molecular biology when it is necessary to identify ORFs in a DNA sequence. While programs exist for displaying protein translations in multiple ORFs in alignment with a DNA sequence, they are often expensive, exist as add-ons to software that must be purchased, or are only compatible with a particular operating system. JavaScript DNA Translator is a shareware application written in JavaScript, a scripting language interpreted by the Netscape Communicator and Internet Explorer Web browsers, which makes it compatible with several different operating systems. While the program uses a familiar Web page interface, it requires no connection to the Internet since calculations are performed on the user's own computer. The program analyzes one or multiple DNA sequences and generates translations in up to six reading frames aligned to a DNA sequence, in addition to displaying translations as separate sequences in FASTA format. ORFs within a reading frame can also be displayed as separate sequences. Flexible formatting options are provided, including the ability to hide ORFs below a minimum size specified by the user. The program is available free of charge at the BioTechniques Software Library (www.Biotechniques.com).
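The six-frame translation and minimum-size ORF filtering described above can be sketched as follows. This is not the JavaScript DNA Translator code; it is a minimal illustration that assumes the standard genetic code, defines an ORF as a stretch from the first methionine to the next stop codon, and uses hypothetical function names and a hypothetical `min_len` parameter.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq):
    """Translate a DNA sequence in frames +1..+3 and -1..-3."""
    seq = seq.upper()
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


def orfs_in_frame(protein, min_len=1):
    """Return M-to-stop stretches of at least min_len residues in one frame."""
    found = []
    # Drop the last piece: it is not terminated by a stop codon.
    for segment in protein.split("*")[:-1]:
        start = segment.find("M")
        if start != -1 and len(segment) - start >= min_len:
            found.append(segment[start:])
    return found


frames = six_frames("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein, orfs_in_frame(protein))
```

Raising `min_len` hides short ORFs, mirroring the minimum-size option mentioned in the abstract; a real tool would also need to handle ambiguous bases and alternative genetic codes.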
Origins of genes: "big bang" or continuous creation?
Keese, P K; Gibbs, A
1992-01-01
Many protein families are common to all cellular organisms, indicating that many genes have ancient origins. Genetic variation is mostly attributed to processes such as mutation, duplication, and rearrangement of ancient modules. Thus it is widely assumed that much of present-day genetic diversity can be traced by common ancestry to a molecular "big bang." A rarely considered alternative is that proteins may arise continuously de novo. One mechanism of generating different coding sequences is by "overprinting," in which an existing nucleotide sequence is translated de novo in a different reading frame or from noncoding open reading frames. The clearest evidence for overprinting is provided when the original gene function is retained, as in overlapping genes. Analysis of their phylogenies indicates which are the original genes and which are their informationally novel partners. We report here the phylogenetic relationships of overlapping coding sequences from steroid-related receptor genes and from tymovirus, luteovirus, and lentivirus genomes. For each pair of overlapping coding sequences, one is confined to a single lineage, whereas the other is more widespread. This suggests that the phylogenetically restricted coding sequence arose only in the progenitor of that lineage by translating an out-of-frame sequence to yield the new polypeptide. The production of novel exons by alternative splicing in thyroid receptor and lentivirus genes suggests that introns can be a valuable evolutionary source for overprinting. New genes and their products may drive major evolutionary changes. PMID:1329098
Wright, Imogen A; Travers, Simon A
2014-07-01
The challenge presented by high-throughput sequencing necessitates the development of novel tools for accurate alignment of reads to reference sequences. Current approaches focus on using heuristics to map reads quickly to large genomes, rather than generating highly accurate alignments in coding regions. Such approaches are, thus, unsuited for applications such as amplicon-based analysis and the realignment phase of exome sequencing and RNA-seq, where accurate and biologically relevant alignment of coding regions is critical. To facilitate such analyses, we have developed a novel tool, RAMICS, that is tailored to mapping large numbers of sequence reads to short lengths (<10 000 bp) of coding DNA. RAMICS utilizes profile hidden Markov models to discover the open reading frame of each sequence and aligns to the reference sequence in a biologically relevant manner, distinguishing between genuine codon-sized indels and frameshift mutations. This approach facilitates the generation of highly accurate alignments, accounting for the error biases of the sequencing machine used to generate reads, particularly at homopolymer regions. Performance improvements are gained through the use of graphics processing units, which increase the speed of mapping through parallelization. RAMICS substantially outperforms all other mapping approaches tested in terms of alignment quality while maintaining highly competitive speed performance. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Complete genome sequence of yam chlorotic necrosis virus, a novel macluravirus infecting yam
USDA-ARS's Scientific Manuscript database
Complete genomic sequence of a novel member of the genus Macluravirus was determined from yam plants with chlorotic and necrotic symptoms in China. The genomic RNA consists of 8,261 nucleotides (nt) excluding the 3’-terminal poly(A) tail, containing one long open reading frame (ORF) encoding a larg...
The DNA region encoding biphenyl dioxygenase, the first enzyme in the biphenyl-polychlorinated biphenyl degradation pathway of Pseudomonas species strain LB400, was sequenced. Six open reading frames were identified, four of which are homologous to the components of toluene dioxy...
Albertini, A M; Caramori, T; Crabb, W D; Scoffone, F; Galizzi, A
1991-01-01
We cloned and sequenced 8.3 kb of Bacillus subtilis DNA corresponding to the flaA locus involved in flagellar biosynthesis, motility, and chemotaxis. The DNA sequence revealed the presence of 10 complete and 2 incomplete open reading frames. Comparison of the deduced amino acid sequences to data banks showed similarities of nine of the deduced products to a number of proteins of Escherichia coli and Salmonella typhimurium for which a role in flagellar functioning has been directly demonstrated. In particular, the sequence data suggest that the flaA operon codes for the M-ring protein, components of the motor switch, and the distal part of the basal-body rod. The gene order is remarkably similar to that described for region III of the enterobacterial flagellar regulon. One of the open reading frames was translated into a protein with 48% amino acid identity to S. typhimurium FliI and 29% identity to the beta subunit of E. coli ATP synthase. PMID:1828465
Jarausch, W; Saillard, C; Dosba, F; Bové, J M
1994-01-01
A 1.8-kb chromosomal DNA fragment of the mycoplasmalike organism (MLO) associated with apple proliferation was sequenced. Three putative open reading frames were observed on this fragment. The protein encoded by open reading frame 2 shows significant homologies with bacterial nitroreductases. From the nucleotide sequence four primer pairs for PCR were chosen to specifically amplify DNA from MLOs associated with European diseases of fruit trees. Primer pairs specific for (i) Malus-affecting MLOs, (ii) Malus- and Prunus-affecting MLOs, and (iii) Malus-, Prunus-, and Pyrus-affecting MLOs were obtained. Restriction enzyme analysis of the amplification products revealed restriction fragment length polymorphisms between Malus-, Prunus-, and Pyrus-affecting MLOs as well as between different isolates of the apple proliferation MLO. No amplification with any primer pair could be obtained with DNA from 12 different MLOs experimentally maintained in periwinkle. PMID:7916180
A local duplication of the Melanocortin receptor 1 locus in Astyanax
Gross, Joshua B.; Weagley, James; Stahl, Bethany A.; Ma, Li; Espinasa, Luis; McGaugh, Suzanne E.
2017-01-01
In this study, we report evidence of a novel duplication of Melanocortin receptor 1 (Mc1r) in the cavefish genome. This locus was discovered following the observation of excessive allelic diversity in a ~820 bp fragment of Mc1r amplified via degenerate PCR from a natural population of Astyanax aeneus fish from Guerrero, Mexico. The cavefish genome reveals the presence of two closely related Mc1r open reading frames separated by a 1.46 kb intergenic region. One open reading frame corresponds to the previously reported Mc1r receptor, and the other open reading frame (duplicate copy) is 975 bp in length, encoding a receptor of 325 amino acids. Sequence similarity analyses position both copies in the syntenic region of the single Mc1r locus in 16 representative craniate genomes spanning bony fish (including Astyanax) to mammals, suggesting we discovered tandem duplicates of this important gene. The two Mc1r copies share ~89% sequence similarity, and, within Astyanax, are more similar to one another compared to other melanocortin family members. Future studies will inform the precise functional significance of the duplicated Mc1r locus, and if this novel copy number variant may have adaptive significance for the Astyanax lineage. PMID:28738163
Ousterout, David G; Kabadi, Ami M; Thakore, Pratiksha I; Perez-Pinera, Pablo; Brown, Matthew T; Majoros, William H; Reddy, Timothy E; Gersbach, Charles A
2015-01-01
Duchenne muscular dystrophy (DMD) is caused by genetic mutations that result in the absence of dystrophin protein expression. Oligonucleotide-induced exon skipping can restore the dystrophin reading frame and protein production. However, this requires continuous drug administration and may not generate complete skipping of the targeted exon. In this study, we apply genome editing with zinc finger nucleases (ZFNs) to permanently remove essential splicing sequences in exon 51 of the dystrophin gene and thereby exclude exon 51 from the resulting dystrophin transcript. This approach can restore the dystrophin reading frame in ~13% of DMD patient mutations. Transfection of two ZFNs targeted to sites flanking the exon 51 splice acceptor into DMD patient myoblasts led to deletion of this genomic sequence. A clonal population was isolated with this deletion and following differentiation we confirmed loss of exon 51 from the dystrophin mRNA transcript and restoration of dystrophin protein expression. Furthermore, transplantation of corrected cells into immunodeficient mice resulted in human dystrophin expression localized to the sarcolemmal membrane. Finally, we quantified ZFN toxicity in human cells and mutagenesis at predicted off-target sites. This study demonstrates a powerful method to restore the dystrophin reading frame and protein expression by permanently deleting exons. PMID:25492562
Draft Genome Sequence of the d-Xylose-Fermenting Yeast Spathaspora arborariae UFMG-HM19.1AT
Lobo, Francisco P.; Gonçalves, Davi L.; Alves, Sergio L.; Gerber, Alexandra L.; de Vasconcelos, Ana Tereza R.; Basso, Luiz C.; Franco, Glória R.; Soares, Marco A.; Cadete, Raquel M.; Rosa, Carlos A.
2014-01-01
The draft genome sequence of the yeast Spathaspora arborariae UFMG-HM19.1AT (CBS 11463 = NRRL Y-48658) is presented here. The sequenced genome size is 12.7 Mb, consisting of 41 scaffolds containing a total of 5,625 predicted open reading frames, including many genes encoding enzymes and transporters involved in d-xylose fermentation. PMID:24435867
Improve homology search sensitivity of PacBio data by correcting frameshifts.
Du, Nan; Sun, Yanni
2016-09-01
Single-molecule, real-time sequencing (SMRT) developed by Pacific BioSciences produces longer reads than second-generation sequencing technologies such as Illumina. The long read length enables PacBio sequencing to close gaps in genome assembly, reveal structural variations, and identify gene isoforms with higher accuracy in transcriptomic sequencing. However, PacBio data has a high sequencing error rate, and most of the errors are insertion or deletion errors. During alignment-based homology search, insertion or deletion errors in genes will cause frameshifts and may lead only to marginal alignment scores and short alignments. As a result, it is hard to distinguish true alignments from random alignments, and the ambiguity will incur errors in structural and functional annotation. Existing frameshift correction tools are designed for data with a much lower error rate and are not optimized for PacBio data. As an increasing number of groups are using SMRT, there is an urgent need for dedicated homology search tools for PacBio data. In this work, we introduce Frame-Pro, a profile homology search tool for PacBio reads. Our tool corrects sequencing errors and also outputs the profile alignments of the corrected sequences against characterized protein families. We applied our tool to both simulated and real PacBio data. The results showed that our method enables more sensitive homology search, especially for PacBio data sets of low sequencing coverage. In addition, we can correct more errors compared with a popular error correction tool that does not rely on hybrid sequencing. The source code is freely available at https://sourceforge.net/projects/frame-pro/. Contact: yannisun@msu.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
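To illustrate why a single insertion error degrades a protein-level alignment, as described in the abstract above, here is a minimal sketch that translates a short made-up sequence before and after a one-base insertion. It uses the standard genetic code; the example sequence and the helper name `translate` are illustrative assumptions, and this is not the Frame-Pro algorithm.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate complete codons in frame 0; a trailing partial codon is ignored."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


if __name__ == "__main__":
    original = "ATGGCTGAAAAATGA"                 # made-up example read
    shifted = original[:4] + "C" + original[4:]  # one-base insertion error
    print(translate(original))  # MAEK*
    print(translate(shifted))   # MA*KM
```

Everything downstream of the insertion is read in a different frame, so the translated product diverges from the true protein (here with a premature stop), which is why uncorrected reads tend to give only short, low-scoring alignments.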
Analysis of genes involved in biosynthesis of the lantibiotic subtilin.
Klein, C; Kaletta, C; Schnell, N; Entian, K D
1992-01-01
Lantibiotics are peptide-derived antibiotics with high antimicrobial activity against pathogenic gram-positive bacteria. They are ribosomally synthesized and posttranslationally modified (N. Schnell, K.-D. Entian, U. Schneider, F. Götz, H. Zähner, R. Kellner, and G. Jung, Nature [London] 333:276-278, 1988). The most important lantibiotics are subtilin and the food preservative nisin, which both have a very similar structure. By using a hybridization probe specific for the structural gene of subtilin, spaS, the DNA region adjacent to spaS was isolated from Bacillus subtilis. Sequence analysis of a 4.9-kb fragment revealed several open reading frames with the same orientation as spaS. Downstream of spaS, no reading frames were present on the isolated XbaI fragment. Upstream of spaS, three reading frames, spaB, spaC, and spaT, were identified which showed strong homology to genes identified near the structural gene of the lantibiotic epidermin. The SpaT protein derived from the spaT sequence was homologous to hemolysin B of Escherichia coli, which indicated its possible function in subtilin transport. Gene deletions within spaB and spaC revealed subtilin-negative mutants, whereas spaT gene disruption mutants still produced subtilin. Remarkably, the spaT mutant colonies revealed a clumpy surface morphology on solid media. After growth on liquid media, spaT mutant cells agglutinated in the mid-logarithmic growth phase, forming longitudinal 3- to 10-fold-enlarged cells which aggregated. Aggregate formation preceded subtilin production and cells lost their viability, possibly as a result of intracellular subtilin accumulation. Our results clearly proved that reading frames spaB and spaC are essential for subtilin biosynthesis whereas spaT mutants are probably deficient in subtilin transport. PMID:1539969
Complete genome sequence of Paris mosaic necrosis virus, a distinct member of the genus Potyvirus
USDA-ARS's Scientific Manuscript database
The complete genomic sequence of a novel potyvirus was determined from Paris polyphylla var. yunnanensis. Its genomic RNA consists of 9,660 nucleotides (nt) excluding the 3’-terminal poly(A) tail, containing a single open reading frame (ORF) encoding a large polyprotein. The virus shares 52.1-69.7%...
Whole-genome sequence of “Candidatus Liberibacter solanacearum” strain R1 from California
USDA-ARS's Scientific Manuscript database
The draft whole-genome sequence of “Candidatus Liberibacter solanacearum” strain R1, isolated from a tomato plant in California, United States, is reported. The R1 strain genome is 1,204,257 bp in size (G+C content of 35.3%), encoding 1,101 open reading frames and 57 RNA genes....
The DNA region encoding biphenyl dioxygenase, the first enzyme in the biphenyl-polychlorinated biphenyl degradation pathway of Pseudomonas species strain LB400, was sequenced. Six open reading frames were identified, four of which are homologous to the components of toluene dioxy...
Complete Genome Sequence of the Mesoplasma florum W37 Strain
Baby, Vincent; Matteau, Dominick; Knight, Thomas F.
2013-01-01
Mesoplasma florum is a small-genome fast-growing mollicute that is an attractive model for systems and synthetic genomics studies. We report the complete 825,824-bp genome sequence of a second representative of this species, M. florum strain W37, which contains 733 predicted open reading frames and 35 stable RNAs. PMID:24285658
USDA-ARS's Scientific Manuscript database
The draft genome sequence of “Candidatus Liberibacter asiaticus” strain YCPsy from an Asian citrus psyllid (Diaphorina citri) in Guangdong of China is reported. The YCPsy strain has a genome size of 1,233,647 bp, 36.5% G+C content, 1,171 open reading frames (ORFs), and 53 RNAs....
Whole genome sequence of “Candidatus Profftella armatura” from Diaphorina citri in Guangdong, China
USDA-ARS's Scientific Manuscript database
The genome of “Candidatus Profftella armatura” strain YCPA, a symbiont of Asian citrus psyllid, from Guangdong, China, was sequenced. The strain chromosome was 457,565 bp with 24.3% G+C content, 364 predicted open reading frames (ORFs), and 38 RNA genes. The strain also contains a 5,458 bp plasmid, ...
Complete Genome Sequence of Pseudomonas aeruginosa Phage AAT-1
Andrade-Domínguez, Andrés
2016-01-01
Aspects of the interaction between phages and animals are of interest and importance for medical applications. Here, we report the genome sequence of the lytic Pseudomonas phage AAT-1, isolated from mammalian serum. AAT-1 is a double-stranded DNA phage, with a genome of 57,599 bp, containing 76 predicted open reading frames. PMID:27563032
A draft genome sequence of “Candidatus Liberibacter asiaticus” from California, USA
USDA-ARS's Scientific Manuscript database
The draft genome sequence of “Candidatus Liberibacter asiaticus” strain HHCA, collected from a lemon tree in California, USA, is reported. The HHCA strain has a genome size of 1,118,244 bp, with G+C content of 36.6%. The HHCA genome encodes 1,191 predicted open reading frames and 51 RNA genes....
Jado, Isabel; Fenoll, Asunción; Casal, Julio; Pérez, Amalia
2001-01-01
The gene encoding the pneumococcal surface adhesin A (PsaA) protein has been identified in three different viridans group streptococcal species. Comparative studies of the psaA gene identified in different pneumococcal isolates by sequencing PCR products showed a high degree of conservation among these strains. PsaA is encoded by an open reading frame of 930 bp. The analysis of this fragment in Streptococcus mitis, Streptococcus oralis, and Streptococcus anginosus strains revealed a sequence identity of 95, 94, and 90%, respectively, to the corresponding open reading frame of the previously reported Streptococcus pneumoniae serotype 6B strain. Our results confirm that psaA is present and detectable in heterologous bacterial species. The possible implications of these results for the suitability and potential use of PsaA in the identification and diagnosis of pneumococcal diseases are discussed. PMID:11527799
Hoffmann, Bernd; Schütze, Heike; Mettenleiter, Thomas C
2002-03-20
The complete genome of spring viremia of carp virus (SVCV) was cloned and the sequence of 11019 nucleotides was determined. It contains five open reading frames (ORFs) encoding the nucleoprotein N, phosphoprotein P, matrix protein M, glycoprotein G, and the viral RNA-dependent RNA polymerase L. Genes are organised in the order typical for rhabdoviruses: 3'-N-P-M-G-L-5'. The short leader and trailer regions of SVCV exhibit inverse complementarity and are similar to the respective 3' and 5' ends of the genome of vesicular stomatitis virus. To verify the predicted open reading frames, proteins were expressed in bacteria and analysed with a polyclonal anti-SVCV serum. Furthermore, monospecific antisera against the distinct viral proteins were generated. Comparison of genome and protein confirms the assignment of SVCV to the genus Vesiculovirus.
Lohmer, S; Maddaloni, M; Motto, M; Salamini, F; Thompson, R D
1993-01-01
The protein encoded by the Opaque-2 (O2) gene is a transcription factor, translated from an mRNA that possesses an unusually long 5' leader sequence containing three upstream open reading frames (uORFs). The efficiency of translation of O2 mRNA has been tested in vivo by a transient assay in which the level of activation of the b32 promoter, a natural target of O2 protein, is measured. We show that uORF-less O2 alleles possess a higher transactivation value than the wild-type allele and that the reduction in transactivation due to the uORFs is a cis-dominant effect. The data presented indicate that both uORF1 and uORF2 are involved in the reducing effect and suggest that both are likely to be translated. PMID:8439744
Saavedra-Lira, E; Pérez-Montfort, R
1994-05-16
We isolated three overlapping clones from a DNA genomic library of Entamoeba histolytica strain HM1:IMSS, whose translated nucleotide (nt) sequence shows similarities of 51, 48 and 47% with the amino acid (aa) sequences reported for the pyruvate phosphate dikinases from Bacteroides symbiosus, maize and Flaveria trinervia, respectively. The reading frame determined codes for a protein of 886 aa.
Complete Genome Sequence of EtG, the First Phage Sequenced from Erwinia tracheiphila.
Andrade-Domínguez, Andrés; Kolter, Roberto; Shapiro, Lori R
2018-02-22
Erwinia tracheiphila is the causal agent of bacterial wilt of cucurbits. Here, we report the genome sequence of the temperate phage EtG, which was isolated from an E. tracheiphila-infected cucumber plant. Phage EtG has a linear 30,413-bp double-stranded DNA genome with cohesive ends and 45 predicted open reading frames. Copyright © 2018 Andrade-Domínguez et al.
Transposition of an intron in yeast mitochondria requires a protein encoded by that intron.
Macreadie, I G; Scott, R M; Zinn, A R; Butow, R A
1985-06-01
The optional 1143 bp intron in the yeast mitochondrial 21S rRNA gene (omega +) is nearly quantitatively inserted in genetic crosses into 21S rRNA alleles that lack it (omega -). The intron contains an open reading frame that can encode a protein of 235 amino acids, but no function has been ascribed to this sequence. We previously found an in vivo double-strand break in omega - DNA at or close to the intron insertion site only in zygotes of omega + X omega - crosses that appears with the same kinetics as intron insertion. We now show that mutations in the intron open reading frame that would alter the translation product simultaneously inhibit nonreciprocal omega recombination and the in vivo double-strand break in omega - DNA. These results provide evidence that the open reading frame encodes a protein required for intron transposition and support the role of the double-strand break in the process.
Detection of a divergent variant of grapevine virus F by next-generation sequencing.
Molenaar, Nicholas; Burger, Johan T; Maree, Hans J
2015-08-01
The complete genome sequence of a South African isolate of grapevine virus F (GVF) is presented. It was first detected by metagenomic next-generation sequencing of field samples and validated through direct Sanger sequencing. The genome sequence of GVF isolate V5 consists of 7539 nucleotides and contains a poly(A) tail. It has a typical vitivirus genome arrangement that comprises five open reading frames (ORFs), which share only 88.96 % nucleotide sequence identity with the existing complete GVF genome sequence (JX105428).
Selfish DNA in protein-coding genes of Rickettsia.
Ogata, H; Audic, S; Barbe, V; Artiguenave, F; Fournier, P E; Raoult, D; Claverie, J M
2000-10-13
Rickettsia conorii, the aetiological agent of Mediterranean spotted fever, is an intracellular bacterium transmitted by ticks. Preliminary analyses of the nearly complete genome sequence of R. conorii have revealed 44 occurrences of a previously undescribed palindromic repeat (150 base pairs long) throughout the genome. Unexpectedly, this repeat was found inserted in-frame within 19 different R. conorii open reading frames likely to encode functional proteins. We found the same repeat in proteins of other Rickettsia species. The finding of a mobile element inserted in many unrelated genes suggests the potential role of selfish DNA in the creation of new protein sequences.
Using hidden Markov models and observed evolution to annotate viral genomes.
McCauley, Stephen; Hein, Jotun
2006-06-01
ssRNA (single stranded) viral genomes are generally constrained in length and utilize overlapping reading frames to maximally exploit the coding potential within the genome length restrictions. This overlapping coding phenomenon leads to complex evolutionary constraints operating on the genome. In regions which code for more than one protein, silent mutations in one reading frame generally have a protein coding effect in another. To maximize coding flexibility in all reading frames, overlapping regions are often compositionally biased towards amino acids which are 6-fold degenerate with respect to the 64 codon alphabet. Previous methodologies have used this fact in an ad hoc manner to look for overlapping genes by motif matching. In this paper differentiated nucleotide compositional patterns in overlapping regions are incorporated into a probabilistic hidden Markov model (HMM) framework which is used to annotate ssRNA viral genomes. This work focuses on single sequence annotation and applies an HMM framework to ssRNA viral annotation. A description of how the HMM is parameterized, whilst annotating within a missing data framework is given. A Phylogenetic HMM (Phylo-HMM) extension, as applied to 14 aligned HIV2 sequences is also presented. This evolutionary extension serves as an illustration of the potential of the Phylo-HMM framework for ssRNA viral genomic annotation. The single sequence annotation procedure (SSA) is applied to 14 different strains of the HIV2 virus. Further results on alternative ssRNA viral genomes are presented to illustrate more generally the performance of the method. The results of the SSA method are encouraging however there is still room for improvement, and since there is overwhelming evidence to indicate that comparative methods can improve coding sequence (CDS) annotation, the SSA method is extended to a Phylo-HMM to incorporate evolutionary information. 
The Phylo-HMM extension is applied to the same set of 14 HIV2 sequences which are pre-aligned. The performance improvement that results from including the evolutionary information in the analysis is illustrated.
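The abstract above refers to amino acids that are 6-fold degenerate with respect to the 64-codon alphabet. As a small worked check, the sketch below counts synonymous codons per amino acid in the standard genetic code; the helper name `n_fold` is hypothetical and nothing here reproduces the authors' HMM.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def n_fold(n: int) -> list[str]:
    """Return one-letter codes of amino acids encoded by exactly n codons."""
    counts = Counter(CODON_TABLE.values())
    return sorted(aa for aa, k in counts.items() if k == n and aa != "*")


if __name__ == "__main__":
    print(n_fold(6))  # ['L', 'R', 'S']
```

Leucine, arginine and serine are the only amino acids with six codons, so a composition bias towards them leaves more synonymous choices available in an overlapping frame.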
Iwanowicz, Luke R; Iwanowicz, Deborah D; Adams, Cynthia R; Galbraith, Heather; Aunins, Aaron; Cornman, Robert S
2017-10-12
Here, we report a draft genome sequence of a picorna-like virus associated with brook trout, Salvelinus fontinalis, gill tissue. The draft genome comprises 8,681 nucleotides, excluding the poly(A) tract, and contains two open reading frames. It is most similar to picorna-like viruses that infect invertebrates.
Sequence and analysis of the genome of a baculovirus pathogenic for Lymantria dispar
John Kuzio; Margot N. Pearson; Steve H. Harwood; C. Joel Funk; Jay T. Evans; James M. Slavicek; George F. Rohrmann
1999-01-01
The genome of the Lymantria dispar multinucleocapsid nucleopolyhedrovirus (LdMNPV) was sequenced and analyzed. It is composed of 161,046 bases with a G + C content of 57.5% and contains 163 putative open reading frames (ORFs) of ≥150 nucleotides. Homologs were found to 95 of the 155 genes predicted for the Autographa californica...
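Counts of putative ORFs above a length cutoff, such as the ≥150-nucleotide threshold quoted above, are typically produced by scanning all six reading frames. The following is a minimal sketch under stated assumptions: an ORF is taken here to run from the first ATG after a stop to the next in-frame stop codon (inclusive), overlapping and nested ORFs are not filtered, and the function names are hypothetical. It is not the procedure used by the LdMNPV authors, whose exact criteria are not given in the truncated abstract.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_len: int = 150) -> list[tuple[str, int, int, int]]:
    """Return (strand, frame, start, end) for ATG..stop ORFs of >= min_len nt.

    Coordinates are 0-based, end-exclusive, and relative to the strand scanned
    (the reverse complement for strand "-").
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # open an ORF at the first ATG
                elif start is not None and codon in STOPS:
                    if i + 3 - start >= min_len:   # length includes the stop codon
                        orfs.append((strand, frame, start, i + 3))
                    start = None                   # close it and keep scanning
    return orfs


if __name__ == "__main__":
    toy = "ATG" + "GCT" * 50 + "TAA"   # made-up 156-nt sequence
    print(find_orfs(toy))               # [('+', 0, 0, 156)]
```

Raising `min_len` trades sensitivity for fewer spurious short ORFs, which is presumably why genome reports state the cutoff alongside the ORF count.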
USDA-ARS's Scientific Manuscript database
The draft genome sequence of “Candidatus Liberibacter asiaticus” strain FL17, isolated from an HLB-affected citrus tree in central Florida, was determined. The FL17 genome comprised 1,227,253 bp with a G+C content of 36.5%, 1,175 predicted open reading frames, and 53 RNA genes....
USDA-ARS's Scientific Manuscript database
The draft genome sequence of “Candidatus Liberibacter asiaticus” strain TX2351 collected from ACP in South Texas has been determined. The TX2351 genome is 1,252,043 bp in size with a 36.5% G+C content, encoding 1,184 predicted open reading frames and 51 RNA genes....
Iwanowicz, Luke R.; Iwanowicz, Deborah; Adams, Cynthia; Galbraith, Heather S.; Aunins, Aaron W.; Cornman, Robert S.
2017-01-01
Here, we report a draft genome sequence of a picorna-like virus associated with brook trout, Salvelinus fontinalis, gill tissue. The draft genome comprises 8,681 nucleotides, excluding the poly(A) tract, and contains two open reading frames. It is most similar to picorna-like viruses that infect invertebrates.
Arrizubieta, Maite; Williams, Trevor; Caballero, Primitivo
2015-01-01
Helicoverpa armigera nucleopolyhedrovirus (HearNPV) has proved effective as the basis for various biological insecticides. Complete genome sequences of five Spanish HearNPV genotypes differed principally in the homologous regions (hrs) and the baculovirus repeat open reading frame (bro) genes, suggesting that they may be involved in the phenotypic differences observed among genotypes. PMID:26067949
Nucleotide sequence of a resistance breaking mutant of southern bean mosaic virus.
Lee, L; Anderson, E J
1998-01-01
SBMV-S is a resistance-breaking mutant of an Arkansas isolate of the bean strain of southern bean mosaic virus (SBMV-BARK) that is able to move systemically in Phaseolus vulgaris cvs. Pinto and Great Northern, whereas the wild-type SBMV-BARK causes local necrotic lesions and is restricted to the inoculated leaves of these hosts. Sequence analysis of the 4136 nucleotide genomes of SBMV-BARK and SBMV-S revealed seven nucleotide differences, but only four deduced amino acid changes. A single amino acid change occurred in the C-terminal region of the putative RNA-dependent RNA polymerase and three differences were identified in the N-terminal portion of the virus coat protein. SBMV-BARK and SBMV-S were compared with other sobemoviruses and were found to contain a high level of nucleotide sequence identity (91.3%) to SBMV-B. Unlike SBMV-B however, SBMV-BARK and SBMV-S contained four putative overlapping open reading frames, making them more similar in genome organization to the cowpea strain, SBMV-C. The possibility exists that mutations or even errors, that resulted in mis-identification of open reading frames, occurred in previously published information on nucleotide sequence and genomic organization for SBMV-B.
Expression of a non-coding RNA in ectromelia virus is required for normal plaque formation.
Esteban, David J; Upton, Chris; Bartow-McKenney, Casey; Buller, R Mark L; Chen, Nanhai G; Schriewer, Jill; Lefkowitz, Elliot J; Wang, Chunlin
2014-02-01
Poxviruses are dsDNA viruses with large genomes. Many genes in the genome remain uncharacterized, and recent studies have demonstrated that the poxvirus transcriptome includes numerous so-called anomalous transcripts not associated with open reading frames. Here, we characterize the expression and role of an apparently non-coding RNA in orthopoxviruses, which we call viral hairpin RNA (vhRNA). Using a bioinformatics approach, we predicted expression of a transcript not associated with an open reading frame that is likely to form a stem-loop structure due to the presence of a 21 nt palindromic sequence. Expression of the transcript as early as 2 h post-infection was confirmed by northern blot and analysis of publicly available vaccinia virus infected cell transcriptomes. The transcription start site was determined by RACE PCR and transcriptome analysis, and early and late promoter sequences were identified. Finally, to test the function of the transcript we generated an ectromelia virus knockout, which failed to form plaques in cell culture. The important role of the transcript in viral replication was further demonstrated using siRNA. Although the function of the transcript remains unknown, our work contributes to evidence of an increasingly complex poxvirus transcriptome, suggesting that transcripts such as vhRNA not associated with an annotated open reading frame can play an important role in viral replication.
Nandakumar, Subhiksha; Bae, Eunhae H; Khan, Arifa S
2017-08-17
The full-length genome sequence of a simian foamy virus (SFVmmu_K3T), isolated from a rhesus macaque (Macaca mulatta), was obtained using high-throughput sequencing. SFVmmu_K3T consisted of 12,983 bp and had a genomic organization similar to that of other SFVs, with long terminal repeats (LTRs) and open reading frames for Gag, Pol, Env, Tas, and Bet.
Large diversity of the piggyBac-like elements in the genome of Tribolium castaneum
Wang, Jianjun; Du, Yuzhou; Wang, Suzhi; Brown, Sue; Park, Yoonseong
2011-01-01
The piggyBac transposable element, originally discovered in the cabbage looper, Trichoplusia ni, has been widely used in insect transgenesis including the red flour beetle Tribolium castaneum. We surveyed piggyBac-like (PLE) sequences in the genome of Tribolium castaneum by homology searches using as queries the diverse PLE sequences that have been described previously. The search yielded a total of 32 piggyBac-like elements (TcPLEs) which were classified into 14 distinct groups. Most of the TcPLEs contain defective functional motifs in that they are lacking inverted terminal repeats or have disrupted open reading frames. Only one single copy of TcPLE1 appears to be intact with imperfect 16 bp inverted terminal repeats flanking an open reading frame encoding a transposase of 571 amino acid residues. Many copies of TcPLEs were found to be inserted into or close to other transposon-like sequences. This large diversity of TcPLEs with generally low copy numbers suggests multiple invasions of the TcPLEs over a long evolutionary time without extensive multiplications or occurrence of rapid loss of TcPLEs copies. PMID:18342253
ORF157 from the Archaeal Virus Acidianus Filamentous Virus 1 Defines a New Class of Nuclease
Goulet, Adeline; Pina, Mery; Redder, Peter; Prangishvili, David; Vera, Laura; Lichière, Julie; Leulliot, Nicolas; van Tilbeurgh, Herman; Ortiz-Lombardia, Miguel; Campanacci, Valérie; Cambillau, Christian
2010-01-01
Acidianus filamentous virus 1 (AFV1) (Lipothrixviridae) is an enveloped filamentous virus that was characterized from a crenarchaeal host. It infects Acidianus species that thrive in the acidic hot springs (>85°C and pH <3) of Yellowstone National Park, WY. The AFV1 20.8-kb, linear, double-stranded DNA genome encodes 40 putative open reading frames whose sequences generally show little similarity to other genes in the sequence databases. Because three-dimensional structures are more conserved than sequences and hence are more effective at revealing function, we set out to determine protein structures from putative AFV1 open reading frames (ORF). The crystal structure of ORF157 reveals an α+β protein with a novel fold that remotely resembles the nucleotidyltransferase topology. In vitro, AFV1-157 displays a nuclease activity on linear double-stranded DNA. Alanine substitution mutations demonstrated that E86 is essential to catalysis. AFV1-157 represents a novel class of nuclease, but its exact role in vivo remains to be determined. PMID:20200253
A new open reading frame in the genome of the cyanobacterium Synechocystis sp. PCC 6803
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lysenko, E.S.; Ogarkova, O.A.; Tarasov, V.A.
1995-02-01
A new open reading frame ORF242, coding for a 26.47-kDa polypeptide, was found in a DNA fragment of the cyanobacterium Synechocystis 6803, transforming a photosynthetic mutant to photoautotrophy and having homology with plant chloroplast DNA. In the 5' flanking region of ORF242, consensus sequences characteristic of a functioning gene were found. One copy of ORF242 is present in the Synechocystis 6803 genome. Insertion inactivation of ORF242 does not lead to a decrease in photosynthetic activity in cells of cyanobacteria but may influence the ratio between active complexes of photosystems I and II. 22 refs., 6 figs., 2 tabs.
1992-01-01
Mice expressing the minor lymphocyte stimulation antigens, Mls-1a, -2a, or -3a, singly on the B10.BR background have been generated. Mls phenotypes correlate with the integration of mouse mammary tumor viruses (MTV) in the mouse genome. The open reading frames within the 3' long terminal repeats of the integrated MTVs 1, 3, 6, and 13 encode V beta 3-specific superantigens. Sequence data for these viral superantigens is presented, indicating that it is the COOH-terminal portion of the viral superantigen that interacts with the T cell receptor V beta element. PMID:1309854
The VP35 and VP40 proteins of filoviruses. Homology between Marburg and Ebola viruses.
Bukreyev, A A; Volchkov, V E; Blinov, V M; Netesov, S V
1993-05-03
The fragments of genomic RNA sequences of Marburg (MBG) and Ebola (EBO) viruses are reported. These fragments were found to encode the VP35 and VP40 proteins. The canonic sequences were revealed before and after each open reading frame. It is suggested that these sequences are mRNA extremities and at the same time the regulatory elements for mRNA transcription. Homology between the MBG and EBO proteins was discovered.
Bryant, D A; de Lorimier, R; Lambert, D H; Dubbs, J M; Stirewalt, V L; Stevens, S E; Porter, R D; Tam, J; Jay, E
1985-01-01
The genes for the alpha- and beta-subunit apoproteins of allophycocyanin (AP) were isolated from the cyanelle genome of Cyanophora paradoxa and subjected to nucleotide sequence analysis. The AP beta-subunit apoprotein gene was localized to a 7.8-kilobase-pair Pst I restriction fragment from cyanelle DNA by hybridization with a tetradecameric oligonucleotide probe. Sequence analysis using that oligonucleotide and its complement as primers for the dideoxy chain-termination sequencing method confirmed the presence of both AP alpha- and beta-subunit genes on this restriction fragment. Additional oligonucleotide primers were synthesized as sequencing progressed and were used to determine rapidly the nucleotide sequence of a 1336-base-pair region of this cloned fragment. This strategy allowed the sequencing to be completed without a detailed restriction map and without extensive and time-consuming subcloning. The sequenced region contains two open reading frames whose deduced amino acid sequences are 81-85% homologous to cyanobacterial and red algal AP subunits whose amino acid sequences have been determined. The two open reading frames are in the same orientation and are separated by 39 base pairs. AP alpha is 5' to AP beta and both coding sequences are preceded by a polypurine, Shine-Dalgarno-type sequence. Sequences upstream from AP alpha closely resemble the Escherichia coli consensus promoter sequences and also show considerable homology to promoter sequences for several chloroplast-encoded psbA genes. A 56-base-pair palindromic sequence downstream from the AP beta gene could play a role in the termination of transcription or translation. The allophycocyanin apoprotein subunit genes are located on the large single-copy region of the cyanelle genome. PMID:2987916
USDA-ARS's Scientific Manuscript database
The genome sequence of the constricta strain of Potato yellow dwarf virus (CYDV) was determined to be 12,792 nucleotides long and organized into seven open reading frames with the gene order 3’-N-X-P-Y-M-G-L-5’, which encodes the nucleocapsid, phosphoprotein, movement, matrix, glycoprotein and RNA-d...
2017-01-01
Here, we report a draft genome sequence of a picorna-like virus associated with brook trout, Salvelinus fontinalis, gill tissue. The draft genome comprises 8,681 nucleotides, excluding the poly(A) tract, and contains two open reading frames. It is most similar to picorna-like viruses that infect invertebrates. PMID:29025930
USDA-ARS's Scientific Manuscript database
The draft genome sequence of Xylella fastidiosa subsp. multiplex Strain Griffin-1 isolated from a red oak tree (Quercus rubra) in Georgia, U.S.A. is reported. The bacterium has a genome size of 2,387,314 bp with 51.7% G+C content and comprises 2,903 predicted open reading frames (ORFs), and 50 RNA g...
USDA-ARS's Scientific Manuscript database
This study reports a de novo assembled draft genome sequence of Xylella fastidiosa subsp. multiplex strain BB01 causing blueberry bacterial leaf scorch in Georgia, USA. The BB01 genome is 2,517,579 bp with a G+C content of 51.8% and 2,943 open reading frames (ORFs) and 48 RNA genes....
USDA-ARS's Scientific Manuscript database
The genome sequence of Flavobacterium psychrophilum strain CSF259-93, isolated from rainbow trout (Oncorhynchus mykiss), consists of a single circular genome of 2,900,735 bp and 2,701 predicted open reading frames (ORFs). Strain CSF259-93 has been used to select a line of rainbow trout with increase...
Complete Genome Sequence of Pseudomonas aeruginosa Phage AAT-1.
Andrade-Domínguez, Andrés; Kolter, Roberto
2016-08-25
Aspects of the interaction between phages and animals are of interest and importance for medical applications. Here, we report the genome sequence of the lytic Pseudomonas phage AAT-1, isolated from mammalian serum. AAT-1 is a double-stranded DNA phage, with a genome of 57,599 bp, containing 76 predicted open reading frames. Copyright © 2016 Andrade-Domínguez and Kolter.
Durand, Pierre M; Oelofse, Andries J; Coetzer, Theresa L
2006-11-04
The completed genome sequences of the malaria parasites P. falciparum, P. y. yoelii and P. vivax have revealed some unusual features. P. falciparum contains the most AT rich genome sequenced so far--over 90% in some regions. In comparison, P. y. yoelii is approximately 77% and P. vivax is approximately 55% AT rich. The evolutionary reasons for these findings are unknown. Mobile genetic elements have a considerable impact on genome evolution but a thorough investigation of these elements in Plasmodium has not been undertaken. We therefore performed a comprehensive genome analysis of these elements and their derivatives in the three Plasmodium species. Whole genome analysis was performed using bioinformatic methods. Forty potential protein encoding sequences with features of transposable elements were identified in P. vivax, eight in P. y. yoelii and only six in P. falciparum. Further investigation of the six open reading frames in P. falciparum revealed that only one is potentially an active mobile genetic element. Most of the open reading frames identified in all three species are hypothetical proteins. Some represent annotated host proteins such as the putative telomerase reverse transcriptase genes in P. y. yoelii and P. falciparum. One of the P. vivax open reading frames identified in this study demonstrates similarity to telomerase reverse transcriptase and we conclude it to be the orthologue of this gene. There is a divergence in the frequencies of mobile genetic elements in the three Plasmodium species investigated. Despite the limitations of whole genome analytical methods, it is tempting to speculate that mobile genetic elements might have been a driving force behind the compositional bias of the P. falciparum genome.
Williams, N P; Mueller, P P; Hinnebusch, A G
1988-01-01
Translational control of GCN4 expression in the yeast Saccharomyces cerevisiae is mediated by multiple AUG codons present in the leader of GCN4 mRNA, each of which initiates a short open reading frame of only two or three codons. Upstream AUG codons 3 and 4 are required to repress GCN4 expression in normal growth conditions; AUG codons 1 and 2 are needed to overcome this repression in amino acid starvation conditions. We show that the regulatory function of AUG codons 1 and 2 can be qualitatively mimicked by the AUG codons of two heterologous upstream open reading frames (URFs) containing the initiation regions of the yeast genes PGK and TRP1. These AUG codons inhibit GCN4 expression when present singly in the mRNA leader; however, they stimulate GCN4 expression in derepressing conditions when inserted upstream from AUG codons 3 and 4. This finding supports the idea that AUG codons 1 and 2 function in the control mechanism as translation initiation sites and further suggests that suppression of the inhibitory effects of AUG codons 3 and 4 is a general consequence of the translation of URF 1 and 2 sequences upstream. Several observations suggest that AUG codons 3 and 4 are efficient initiation sites; however, these sequences do not act as positive regulatory elements when placed upstream from URF 1. This result suggests that efficient translation is only one of the important properties of the 5' proximal URFs in GCN4 mRNA. We propose that a second property is the ability to permit reinitiation following termination of translation and that URF 1 is optimized for this regulatory function. PMID:3065626
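As a rough illustration of what "multiple AUG codons in the leader, each initiating a short open reading frame" means computationally, the following sketch scans an mRNA leader for every AUG and reads in frame to the first stop codon. The function name `upstream_orfs` and the example leader are hypothetical and invented for illustration; the sequence is not the GCN4 leader, and nothing here is taken from the authors' methods.

```python
# Minimal sketch: list short upstream ORFs in an mRNA leader.
# The leader used below is a made-up toy sequence, not real GCN4 data.
STOPS = {"UAA", "UAG", "UGA"}

def upstream_orfs(leader):
    """Return (start_index, n_sense_codons) for each AUG followed by an in-frame stop."""
    rna = leader.upper().replace("T", "U")  # accept DNA or RNA spelling
    orfs = []
    for start in range(len(rna) - 2):
        if rna[start:start + 3] != "AUG":
            continue
        # Walk codon by codon from the codon after the AUG.
        for i in range(start + 3, len(rna) - 2, 3):
            if rna[i:i + 3] in STOPS:
                n_codons = (i - start) // 3  # AUG counted, stop codon excluded
                orfs.append((start, n_codons))
                break
    return orfs

toy_leader = "GGAUGGCUUGAAACCAUGCCCUAAGG"
print(upstream_orfs(toy_leader))
```

On the toy leader this reports two upstream ORFs of two sense codons each. An AUG with no in-frame stop inside the leader is deliberately not reported, since it would run into the main coding region.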
Schuster, W; Brennicke, A
1991-01-01
An intact gene for the ribosomal protein S19 (rps19) is absent from Oenothera mitochondria. The conserved rps19 reading frame found in the mitochondrial genome is interrupted by a termination codon. This rps19 pseudogene is cotranscribed with the downstream rps3 gene and is edited on both sides of the translational stop. Editing, however, changes the amino acid sequence at positions that were well conserved before editing. Other strange editings create translational stops in open reading frames coding for functional proteins. In coxI and rps3 mRNAs CGA codons are edited to UGA stop codons only five and three codons, respectively, downstream to the initiation codon. These aberrant editings in essential open reading frames and in the rps19 pseudogene appear to have been shifted to these positions from other editing sites. These observations suggest a requirement for a continuous evolutionary constraint on the editing specificities in plant mitochondria. PMID:1762921
A versatile and efficient high-throughput cloning tool for structural biology.
Geertsma, Eric R; Dutzler, Raimund
2011-04-19
Methods for the cloning of large numbers of open reading frames into expression vectors are of critical importance for challenging structural biology projects. Here we describe a system termed fragment exchange (FX) cloning that facilitates the high-throughput generation of expression constructs. The method is based on a class IIS restriction enzyme and negative selection markers. FX cloning combines attractive features of established recombination- and ligation-independent cloning methods: It allows the straightforward transfer of an open reading frame into a variety of expression vectors and is highly efficient and very economic in its use. In addition, FX cloning avoids the common but undesirable feature of significantly extending target open reading frames with cloning related sequences, as it leaves a minimal seam of only a single extra amino acid to either side of the protein. The method has proven to be very robust and suitable for all common pro- and eukaryotic expression systems. It considerably speeds up the generation of expression constructs compared to traditional methods and thus facilitates a broader expression screening.
Cioffi, Anna Valentina; Ferrara, Diana; Cubellis, Maria Vittoria; Aniello, Francesco; Corrado, Marcella; Liguori, Francesca; Amoroso, Alessandro; Fucci, Laura; Branno, Margherita
2002-08-01
Analysis of the genome structure of the Paracentrotus lividus (sea urchin) DNA methyltransferase (DNA MTase) gene showed the presence of an open reading frame, named METEX, in intron 7 of the gene. METEX expression is developmentally regulated, showing no correlation with DNA MTase expression. In fact, DNA MTase transcripts are present at high concentrations in the early developmental stages, while METEX is expressed at late stages of development. Two METEX cDNA clones (Met1 and Met2) that are different in the 3' end have been isolated in a cDNA library screening. The putative translated protein from Met2 cDNA clone showed similarity with Escherichia coli endonuclease III on the basis of sequence and predictive three-dimensional structure. The protein, overexpressed in E. coli and purified, had functional properties similar to the endonuclease specific for apurinic/apyrimidinic (AP) sites on the basis of the lyase activity. Therefore the open reading frame, present in intron 7 of the P. lividus DNA MTase gene, codes for a functional AP endonuclease designated SuAP1.
Human milk metagenome: a functional capacity analysis
2013-01-01
Background: Human milk contains a diverse population of bacteria that likely influences colonization of the infant gastrointestinal tract. Recent studies, however, have been limited to characterization of this microbial community by 16S rRNA analysis. In the present study, a metagenomic approach using Illumina sequencing of a pooled milk sample (ten donors) was employed to determine the genera of bacteria and the types of bacterial open reading frames in human milk that may influence bacterial establishment and stability in this primal food matrix. The human milk metagenome was also compared to that of breast-fed and formula-fed infants' feces (n = 5, each) and mothers' feces (n = 3) at the phylum level and at a functional level using open reading frame abundance. Additionally, immune-modulatory bacterial-DNA motifs were also searched for within human milk. Results: The bacterial community in human milk contained over 360 prokaryotic genera, with sequences aligning predominantly to the phyla of Proteobacteria (65%) and Firmicutes (34%), and the genera of Pseudomonas (61.1%), Staphylococcus (33.4%) and Streptococcus (0.5%). From assembled human milk-derived contigs, 30,128 open reading frames were annotated and assigned to functional categories. When compared to the metagenome of infants' and mothers' feces, the human milk metagenome was less diverse at the phylum level, and contained more open reading frames associated with nitrogen metabolism, membrane transport and stress response (P < 0.05). The human milk metagenome also contained a similar occurrence of immune-modulatory DNA motifs to that of infants' and mothers' fecal metagenomes. Conclusions: Our results further expand the complexity of the human milk metagenome and enforce the benefits of human milk ingestion on the microbial colonization of the infant gut and immunity. Discovery of immune-modulatory motifs in the metagenome of human milk indicates more exhaustive analyses of the functionality of the human milk metagenome are warranted. PMID:23705844
Yedavalli, Venkat R. K.; Chappey, Colombe; Matala, Erik; Ahmad, Nafees
1998-01-01
The human immunodeficiency virus type 1 (HIV-1) vif gene is conserved among most lentiviruses, suggesting that vif is important for natural infection. To determine whether an intact vif gene is positively selected during mother-to-infant transmission, we analyzed vif sequences from five infected mother-infant pairs following perinatal transmission. The coding potential of the vif open reading frame directly derived from uncultured peripheral blood mononuclear cell DNA was maintained in most of the 78,912 bp sequenced. Of the 137 clones analyzed, 123 (89.8%) carried an intact vif open reading frame. There was a low degree of heterogeneity of vif genes within mothers, within infants, and between epidemiologically linked mother-infant pairs. The distances between vif sequences were greater in epidemiologically unlinked individuals than in epidemiologically linked mother-infant pairs. Furthermore, the epidemiologically linked mother-infant pair vif sequences displayed similar patterns that were not seen in vif sequences from epidemiologically unlinked individuals. The functional domains, including the two cysteines at positions 114 and 133, a serine phosphorylation site at position 144, and the C-terminal basic amino acids essential for vif protein function, were highly conserved in most of the sequences. Phylogenetic analyses of 137 mother-infant pair vif sequences and 187 other available vif sequences from HIV-1 databases revealed distinct clusters for vif sequences from each mother-infant pair and for other vif sequences. Taken together, these findings suggest that vif plays an important role in HIV-1 infection and replication in mothers and their perinatally infected infants. PMID:9445004
The genome organisation and taxonomy of Sugarcane striate mosaic associated virus.
Thompson, N; Randles, J W
2001-08-01
Sugarcane striate mosaic associated virus (SCSMaV) has slightly flexuous 950 nm x 15 nm filamentous particles and is associated with sugarcane striate mosaic disease in central Queensland, Australia. We report the full sequence of its RNA genome, which comprises 5 open reading frames representing the polymerase, movement function proteins encoded in a triple gene block and coat protein. Phylogenetic analyses based on either the full nucleotide sequence, the polymerase protein, or the coat protein all placed SCSMaV in an intermediate position between the genera Foveavirus and Carlavirus, but outside both genera. In addition, the absence of a sixth open reading frame excludes it from the genus Carlavirus, and the coat protein is approximately half the size of the type member for the genus Foveavirus. Although SCSMaV was most closely allied to Cherry green ring mottle virus by genome analysis, the two viruses are morphologically and biologically dissimilar. SCSMaV may therefore represent a new plant virus taxon.
Generating an Open Reading Frame (ORF) Entry Clone and Destination Clone.
Reece-Hoyes, John S; Walhout, Albertha J M
2018-01-02
This protocol describes using the Gateway recombinatorial cloning system to create an Entry clone carrying an open reading frame (ORF) and then to transfer the ORF into a Destination vector. In this example, BP recombination is used to clone an ORF from a cDNA source into the Donor vector pDONR 221. The ORF from the resulting Entry clone is then transferred into the Destination vector pDEST-15; the product (the Destination clone) will express the ORF as an amino-terminal GST-fusion. The technique can be used as a guide for cloning any other DNA fragment of interest, such as a promoter sequence or 3' untranslated region (UTR), with substitutions of different genetic material such as genomic DNA, att sites, and vectors as required. The series of constructions and transformations requires 9-15 d, not including time that may be required for sequence confirmation, if desired/necessary. © 2018 Cold Spring Harbor Laboratory Press.
Cheng, Bing; Furtado, Agnelo
2017-01-01
Polyploidization contributes to the complexity of gene expression, resulting in numerous related but different transcripts. This study explored the transcriptome diversity and complexity of the tetraploid Arabica coffee (Coffea arabica) bean. Long-read sequencing (LRS) by PacBio Isoform sequencing (Iso-seq) was used to obtain full-length transcripts without the difficulty and uncertainty of assembly required for reads from short-read technologies. The tetraploid transcriptome was annotated and compared with data from the sub-genome progenitors. Caffeine and sucrose genes were targeted for case analysis. An isoform-level tetraploid coffee bean reference transcriptome with 95 995 distinct transcripts (average 3236 bp) was obtained. A total of 88 715 sequences (92.42%) were annotated with BLASTx against NCBI non-redundant plant proteins, including 34 719 high-quality annotations. Further BLASTn analysis against NCBI non-redundant nucleotide sequences, Coffea canephora coding sequences with UTR, C. arabica ESTs, and Rfam left 1213 sequences without hits, which are potential novel genes in coffee. Longer UTRs were captured, especially in the 5' UTRs, facilitating the identification of upstream open reading frames. The LRS also revealed more and longer transcript variants in key caffeine and sucrose metabolism genes from this polyploid genome. Long sequences (>10 kilobases) were poorly annotated. LRS technology shows the limitations of previous studies. It provides an important tool to produce a reference transcriptome including more of the diversity of full-length transcripts to help understand the biology and support the genetic improvement of polyploid species such as coffee. PMID:29048540
Whole-Genome Sequence of "Candidatus Profftella armatura" from Diaphorina citri in Guangdong, China.
Wu, F; Deng, X; Liang, G; Huang, J; Cen, Y; Chen, J
2015-11-05
The genome of "Candidatus Profftella armatura" strain YCPA from Diaphorina citri in Guangdong, China, was sequenced. The strain has a chromosome of 457,565 bp, 24.3% G+C content, 364 predicted open reading frames (ORFs), and 38 RNAs, and a plasmid, pYCPA54, of 5,458 bp with 23.9% G+C content and 5 ORFs. Copyright © 2015 Wu et al.
Dobinson, K F; Harris, R E; Hamer, J E
1993-01-01
The fungal phytopathogen Magnaporthe grisea parasitizes a wide variety of gramineous hosts. In the course of investigating the genetic relationship between pathogen genotype and host specificity we identified a retroelement that is present in some strains of M. grisea that infect finger millet and goosegrass (members of the plant genus Eleusine). The element, designated grasshopper (grh), is present in multiple copies and dispersed throughout the genome. DNA sequence analysis showed that grasshopper contains 198 base pair direct, long terminal repeats (LTRs) with features characteristic of retroviral and retrotransposon LTRs. Within the element we identified an open reading frame with sequences homologous to the reverse transcriptase, RNaseH, and integrase domains of retroelement pol genes. Comparison of the open reading frame with sequences from other retroelements showed that grh is related to the gypsy family of retrotransposons. Comparisons of the distribution of the grasshopper element with other dispersed repeated DNA sequences in M. grisea indicated that grasshopper was present in a broadly dispersed subgroup of Eleusine pathogens, suggesting that the element was acquired subsequent to the evolution of this host-specific form. We present arguments that the amplification of different retroelements within populations of M. grisea is a consequence of the clonal organization of the fungal populations.
Fission yeast retrotransposon Tf1 integration is targeted to 5' ends of open reading frames.
Behrens, Ralf; Hayles, Jacky; Nurse, Paul
2000-12-01
Target site selection of transposable elements is usually not random but involves some specificity for a DNA sequence or a DNA binding host factor. We have investigated the target site selection of the long terminal repeat-containing retrotransposon Tf1 from the fission yeast Schizosaccharomyces pombe. By monitoring induced transposition events we found that Tf1 integration sites were distributed throughout the genome. Mapping these insertions revealed that Tf1 did not integrate into open reading frames, but occurred preferentially in longer intergenic regions with integration biased towards a region 100-420 bp upstream of the translation start site. Northern blot analysis showed that transcription of genes adjacent to Tf1 insertions was not significantly changed. PMID:11095681
Wu, L-P; Yang, T; Liu, H-W; Postman, J; Li, R
2018-05-01
A large contig with sequence similarities to several nucleorhabdoviruses was identified by high-throughput sequencing analysis from a black currant (Ribes nigrum L.) cultivar. The complete genome sequence of this new nucleorhabdovirus is 14,432 nucleotides long. Its genomic organization is very similar to those of unsegmented plant rhabdoviruses, containing six open reading frames in the order 3'-N-P-P3-M-G-L-5'. The virus, which is provisionally named "black currant-associated rhabdovirus", is 41-52% identical in its genome nucleotide sequence to other nucleorhabdoviruses and may represent a new species in the genus Nucleorhabdovirus.
Zhang, Peipei; Liu, Yan; Liu, Wenwen; Cao, Mengji; Massart, Sebastien; Wang, Xifeng
2017-01-01
To identify the pathogens responsible for leaf yellowing symptoms on wheat samples collected from Jinan, China, we tested for the presence of three known barley/wheat yellow dwarf viruses (BYDV-GAV, -PAV, WYDV-GPV) (most likely pathogens) using RT-PCR. A sample that tested negative for the three viruses was selected for small RNA sequencing. Twenty-five million sequences were generated, among which 5% were of viral origin. A novel polerovirus was discovered and temporarily named wheat leaf yellowing-associated virus (WLYaV). The full genome of WLYaV corresponds to 5,772 nucleotides (nt), with six AUG-initiated open reading frames, one non-AUG-initiated open reading frame, and three untranslated regions, showing typical features of the family Luteoviridae. Sequence comparison and phylogenetic analyses suggested that WLYaV had the closest relationship with sugarcane yellow leaf virus (ScYLV), but the identities of full genomic nucleotides and deduced amino acid sequence of coat protein (CP) were 64.9 and 86.2%, respectively, below the species demarcation thresholds (90%) in the family Luteoviridae. Furthermore, agroinoculation of Nicotiana benthamiana leaves with a cDNA clone of WLYaV caused yellowing symptoms on the plant. Our study adds a new polerovirus that is associated with wheat leaf yellowing disease, which would help to identify and control pathogens of wheat. PMID:28932215
Earl, P L; Jones, E V; Moss, B
1986-01-01
A 5400-base-pair segment of the vaccinia virus genome was sequenced and an open reading frame of 938 codons was found precisely where the DNA polymerase had been mapped by transfer of a phosphonoacetate-resistance marker. A single nucleotide substitution changing glycine at position 347 to aspartic acid accounts for the drug resistance of the mutant vaccinia virus. The 5' end of the DNA polymerase mRNA was located 80 base pairs before the methionine codon initiating the open reading frame. Correspondence between the predicted Mr 108,577 polypeptide and the 110,000 purified enzyme indicates that little or no proteolytic processing occurs. Extensive homology, extending over 435 amino acids, was found upon comparing the DNA polymerase of vaccinia virus and DNA polymerase of Epstein-Barr virus. A highly conserved sequence of 14 amino acids in the carboxyl-terminal regions of the above DNA polymerases is also present at a similar location in adenovirus DNA polymerase. This structure, which is predicted to form a turn flanked by beta-pleated sheets, may form part of an essential binding or catalytic site that accounts for its presence in DNA polymerases of poxviruses, herpesviruses, and adenoviruses. PMID:3012524
Phylogenetic tree construction using trinucleotide usage profile (TUP).
Chen, Si; Deng, Lih-Yuan; Bowman, Dale; Shiau, Jyh-Jen Horng; Wong, Tit-Yee; Madahian, Behrouz; Lu, Henry Horng-Shing
2016-10-06
It has been a challenging task to build a genome-wide phylogenetic tree for a large group of species containing a large number of genes with long nucleotide sequences. The most popular method, called feature frequency profile (FFP-k), finds the frequency distribution for all words of certain length k over the whole genome sequence using (overlapping) windows of the same length. For a satisfactory result, the recommended word length (k) ranges from 6 to 15 and it may not be a multiple of 3 (codon length). The total number of possible words needed for FFP-k can range from 4^6 = 4096 to 4^15. We propose a simple improvement over the popular FFP method using only a typical word length of 3. A new method, called Trinucleotide Usage Profile (TUP), is proposed based only on the (relative) frequency distribution using non-overlapping windows of length 3. The total number of possible words needed for TUP is 4^3 = 64, which is much less than the total count for the recommended optimal "resolution" for FFP. To build a phylogenetic tree, we propose first representing each of the species by a TUP vector and then using an appropriate distance measure between pairs of the TUP vectors for the tree construction. In particular, we propose summarizing a DNA sequence by a matrix of three rows corresponding to three reading frames, recording the frequency distribution of the non-overlapping words of length 3 in each reading frame. We also provide a numerical measure for comparing trees constructed with various methods. Compared to the FFP method, our empirical study showed that the proposed TUP method is more capable of building phylogenetic trees with a stronger biological support. We further provide some justifications on this from the information theory viewpoint. Unlike the FFP method, the TUP method takes advantage of the fact that the start of the first reading frame is (usually) known. Without this information, the FFP method could only rely on the frequency distribution of overlapping words, which is the average (or mixture) of the frequency distributions of three possible reading frames. Consequently, we show (from the entropy viewpoint) that the FFP procedure could dilute important gene information and therefore provides less accurate classification.
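The three-row summary described in this abstract can be sketched in a few lines. The code below is an illustration only, not the authors' implementation: the names `tup_matrix` and `manhattan` are hypothetical, and the Manhattan distance is an assumed stand-in for the "appropriate distance measure" that the abstract leaves unspecified.

```python
# Minimal sketch of a trinucleotide usage profile (TUP) as a 3 x 64 matrix.
# Row f holds relative frequencies of non-overlapping 3-mers read in frame f.
from itertools import product

CODONS = ["".join(p) for p in product("ACGT", repeat=3)]  # 4**3 = 64 words
INDEX = {c: i for i, c in enumerate(CODONS)}

def tup_matrix(seq):
    """Return three rows (one per reading frame) of 64 relative frequencies."""
    seq = seq.upper()
    rows = []
    for frame in range(3):
        counts = [0] * 64
        # Step by 3 so that windows within a frame do not overlap.
        for i in range(frame, len(seq) - 2, 3):
            word = seq[i:i + 3]
            if word in INDEX:  # skip words containing ambiguous bases such as N
                counts[INDEX[word]] += 1
        total = sum(counts)
        rows.append([c / total if total else 0.0 for c in counts])
    return rows

def manhattan(a, b):
    """Assumed distance between two TUP matrices (sum of absolute differences)."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

m = tup_matrix("ATGATGATG")
print(m[0][INDEX["ATG"]], m[1][INDEX["TGA"]], m[2][INDEX["GAT"]])
```

For the repeat `ATGATGATG`, each frame sees a single word (ATG, TGA and GAT respectively), so the three rows differ completely. An overlapping-window count with k = 3 would instead mix all three words into one distribution, which is the dilution effect the abstract attributes to FFP.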
The Apis mellifera filamentous virus genome
USDA-ARS's Scientific Manuscript database
A complete reference genome of the Apis mellifera Filamentous virus (AmFV) was determined using Illumina Hiseq sequencing. The AmFV genome is a double-stranded DNA molecule of approximately 498,500 nucleotides with a GC content of 50.8%. It encompasses 251 non-overlapping open reading frames (ORFs), e...
Effects of the HN gene c-terminal extensions on the Newcastle disease virus virulence
USDA-ARS's Scientific Manuscript database
The hemagglutinin-neuraminidase (HN) of Newcastle disease virus (NDV) is a multifunctional protein that has receptor recognition, neuraminidase and fusion promotion activities. Sequence analysis revealed that the HN gene of many extremely low virulence NDV strains encodes a larger open reading frame...
PRIMARY STRUCTURE OF THE P450 LANOSTEROL DEMETHYLASE GENE FROM SACCHAROMYCES CEREVISIAE
We have sequenced the structural gene and flanking regions for lanosterol 14 alpha-demethylase (14DM) from Saccharomyces cerevisiae. An open reading frame of 530 codons encodes a 60.7-kDa protein. When this gene is disrupted by integrative transformation, the resulting strain req...
Masuda, Isao; Matsuzaki, Motomichi; Kita, Kiyoshi
2010-10-01
Diverse mitochondrial (mt) genetic systems have evolved independently of the more uniform nuclear system and often employ modified genetic codes. The organization and genetic system of dinoflagellate mt genomes are particularly unusual and remain an evolutionary enigma. We determined the sequence of full-length cytochrome c oxidase subunit 1 (cox1) mRNA of the earliest diverging dinoflagellate Perkinsus and show that this gene resides in the mt genome. Apparently, this mRNA is not translated in a single reading frame with standard codon usage. Our examination of the nucleotide sequence and three-frame translation of the mRNA suggest that the reading frame must be shifted 10 times, at every AGG and CCC codon, to yield a consensus COX1 protein. We suggest two possible mechanisms for these translational frameshifts: a ribosomal frameshift in which stalled ribosomes skip the first bases of these codons or specialized tRNAs recognizing non-triplet codons, AGGY and CCCCU. Regardless of the mechanism, active and efficient machinery would be required to tolerate the frameshifts predicted in Perkinsus mitochondria. To our knowledge, this is the first evidence of translational frameshifts in protist mitochondria and, by far, is the most extensive case in mitochondria.
MetAMOS: a modular and open source metagenomic assembly and analysis pipeline
2013-01-01
We describe MetAMOS, an open source and modular metagenomic assembly and analysis pipeline. MetAMOS represents an important step towards fully automated metagenomic analysis, starting with next-generation sequencing reads and producing genomic scaffolds, open-reading frames and taxonomic or functional annotations. MetAMOS can aid in reducing assembly errors, commonly encountered when assembling metagenomic samples, and improves taxonomic assignment accuracy while also reducing computational cost. MetAMOS can be downloaded from: https://github.com/treangen/MetAMOS. PMID:23320958
Genome sequences of a mouse-avirulent and a mouse-virulent strain of Ross River virus.
Faragher, S G; Meek, A D; Rice, C M; Dalgarno, L
1988-04-01
The nucleotide sequence of the genomic RNA of a mouse-avirulent strain of Ross River virus, RRV NB5092 (isolated in 1969), has been determined and the corresponding sequence for the prototype mouse-virulent strain, RRV T48 (isolated in 1959), has been completed. The RRV NB5092 genome is approximately 11,674 nucleotides in length, compared with 11,853 nucleotides for RRV T48. RRV NB5092 and RRV T48 have the same genome organization. For both viruses an untranslated region of 80 nucleotides at the 5' end of the genome is followed by a 7440-nucleotide open reading frame which is interrupted after 5586 nucleotides by a single opal termination codon. By homology with other alphaviruses, the 5586-nucleotide open reading frame encodes the nonstructural proteins nsP1, nsP2, and nsP3; a fourth nonstructural protein, nsP4, is produced by read-through of the opal codon. The RRV nonstructural proteins show strong homology with the corresponding proteins of Sindbis virus and Semliki Forest virus in terms of size, net charge, and hydropathy characteristics. However, homology is not uniform between or within the proteins; nsP1, nsP2, and nsP4 contain extended domains which are highly conserved between alphaviruses, while the C-terminal region of nsP3 shows little conservation in sequence or length between alphaviruses. An untranslated "junction" region of 44 nucleotides (for RRV NB5092) or 47 nucleotides (for RRV T48) separates the nonstructural and structural protein coding regions. The structural proteins (capsid-E3-E2-6K-E1) are translated from an open reading frame of 3762 nucleotides which is followed by a 3'-untranslated region of approximately 348 nucleotides (for RRV NB5092) or 524 nucleotides (for RRV T48). Excluding deletions and insertions, the genomes of RRV NB5092 and RRV T48 differ at 284 nucleotides, representing a sequence divergence of 2.38%. 
Sequence deletions or insertions were found only in the noncoding regions and include a 173-nucleotide deletion in the 3'-untranslated region of RRV NB5092, compared with RRV T48. In the coding regions, most of the nucleotide differences are silent; there are 36 amino acid differences in the nonstructural proteins and 12 in the structural proteins. The distribution of amino acid differences between the two RRV strains correlates with the location of domains which are poorly conserved in sequence between alphaviruses. The possible role of amino acid differences in envelope glycoproteins E1 and E2 in determining the different antigenic and biological properties of RRV NB5092 and RRV T48 is discussed.
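The region lengths quoted in this abstract can be cross-checked against the two reported genome sizes. The short Python sketch below simply adds them up. The dictionary keys and the function name are illustrative labels rather than terms from the paper, and the abstract gives the 3'-untranslated lengths only as approximate.

```python
# Region lengths in nucleotides, as quoted in the abstract.
# Key names are illustrative labels, not terminology from the paper.
REGIONS = {
    "NB5092": {"utr5": 80, "nonstructural_orf": 7440, "junction": 44,
               "structural_orf": 3762, "utr3": 348},
    "T48": {"utr5": 80, "nonstructural_orf": 7440, "junction": 47,
            "structural_orf": 3762, "utr3": 524},
}


def genome_length(strain):
    """Sum the quoted region lengths for one strain."""
    return sum(REGIONS[strain].values())


for strain in REGIONS:
    print(strain, genome_length(strain))

# Whole-codon counts for the reading frames mentioned in the abstract:
# up to the opal codon, the full nonstructural frame, the structural frame.
for orf_nt in (5586, 7440, 3762):
    print(orf_nt, "nt =", orf_nt // 3, "codons, remainder", orf_nt % 3)
```

Under these figures the sums come to 11,674 and 11,853 nucleotides, matching the genome lengths stated for RRV NB5092 and RRV T48.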
Zurawski, Gerard; Bohnert, Hans J.; Whitfeld, Paul R.; Bottomley, Warwick
1982-01-01
The gene for the so-called Mr 32,000 rapidly labeled photosystem II thylakoid membrane protein (here designated psbA) of spinach (Spinacia oleracea) chloroplasts is located on the chloroplast DNA in the large single-copy region immediately adjacent to one of the inverted repeat sequences. In this paper we show that the size of the mRNA for this protein is ≈ 1.25 kilobases and that the direction of transcription is towards the inverted repeat unit. The nucleotide sequence of the gene and its flanking regions is presented. The only large open reading frame in the sequence codes for a protein of Mr 38,950. The nucleotide sequence of psbA from Nicotiana debneyi also has been determined, and comparison of the sequences from the two species shows them to be highly conserved (>95% homology) throughout the entire reading frame. Conservation of the amino acid sequence is absolute, there being no changes in a total of 353 residues. This leads us to conclude that the primary translation product of psbA must be a protein of Mr 38,950. The protein is characterized by the complete absence of lysine residues and is relatively rich in hydrophobic amino acids, which tend to be clustered. Transcription of spinach psbA starts about 86 base pairs before the first ATG codon. Immediately upstream from this point there is a sequence typical of that found in E. coli promoters. An almost identical sequence occurs in the equivalent region of N. debneyi DNA. PMID:16593262
Tahir, Muhammad N; Lockhart, Ben; Grinstead, Samuel; Mollov, Dimitre
2017-04-01
Bermuda grass samples were examined by transmission electron microscopy and 28-30 nm spherical virus particles were observed. Total RNA from these plants was subjected to high-throughput sequencing (HTS). The nearly full genome sequence of a panicovirus was identified from one HTS scaffold. Sanger sequencing was used to confirm the HTS results and complete the genome sequence of 4404 nt. This virus was provisionally named Bermuda grass latent virus (BGLV). Its predicted open reading frames follow the typical arrangement of the genus Panicovirus. Based on sequence comparisons and phylogenetic analyses BGLV differs from other viruses and therefore taxonomically it is a new member of the genus Panicovirus, family Tombusviridae.
Reinhardt, Josephine A.; Wanjiru, Betty M.; Brant, Alicia T.; Saelao, Perot; Begun, David J.; Jones, Corbin D.
2013-01-01
How non-coding DNA gives rise to new protein-coding genes (de novo genes) is not well understood. Recent work has revealed the origins and functions of a few de novo genes, but common principles governing the evolution or biological roles of these genes are unknown. To better define these principles, we performed a parallel analysis of the evolution and function of six putatively protein-coding de novo genes described in Drosophila melanogaster. Reconstruction of the transcriptional history of de novo genes shows that two de novo genes emerged from novel long non-coding RNAs that arose at least 5 MY prior to evolution of an open reading frame. In contrast, four other de novo genes evolved a translated open reading frame and transcription within the same evolutionary interval suggesting that nascent open reading frames (proto-ORFs), while not required, can contribute to the emergence of a new de novo gene. However, none of the genes arose from proto-ORFs that existed long before expression evolved. Sequence and structural evolution of de novo genes was rapid compared to nearby genes and the structural complexity of de novo genes steadily increases over evolutionary time. Despite the fact that these genes are transcribed at a higher level in males than females, and are most strongly expressed in testes, RNAi experiments show that most of these genes are essential in both sexes during metamorphosis. This lethality suggests that protein coding de novo genes in Drosophila quickly become functionally important. PMID:24146629
AbouHaidar, Mounir Georges; Venkataraman, Srividhya; Golshani, Ashkan; Liu, Bolin; Ahmad, Tauqeer
2014-10-07
The highly structured (64% GC) covalently closed circular (CCC) RNA (220 nt) of the virusoid associated with rice yellow mottle virus codes for a 16-kDa highly basic protein using novel modalities for coding, translation, and gene expression. This CCC RNA is the smallest among all known viroids and virusoids and the only one that codes proteins. Its sequence possesses an internal ribosome entry site and is directly translated through two (or three) completely overlapping ORFs (shifting to a new reading frame at the end of each round). The initiation and termination codons overlap UGAUGA (underline highlights the initiation codon AUG within the combined initiation-termination sequence). Termination codons can be ignored to obtain larger read-through proteins. This circular RNA with no noncoding sequences is a unique natural supercompact "nanogenome."
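The frame shift "at the end of each round" follows from the genome length: 220 is not a multiple of 3, so a ribosome that keeps reading triplets around the circle lands in a different frame on each circuit. The sketch below is purely arithmetic and assumes continuous triplet reading from position 0. It does not model the internal ribosome entry site or the UGAUGA signal, and the function name is hypothetical.

```python
GENOME_NT = 220  # length of the circular RNA, from the abstract


def circle_frame(round_index, genome_nt=GENOME_NT):
    """Frame (0, 1 or 2, in circle coordinates) of the first codon that
    starts within circuit `round_index`, assuming uninterrupted triplet
    reading that began at position 0 of the circle."""
    # Linear start of the first codon at or beyond the start of this circuit.
    linear_start = ((round_index * genome_nt + 2) // 3) * 3
    return (linear_start % genome_nt) % 3


for r in range(4):
    print("circuit", r, "-> frame", circle_frame(r))

# Three full circuits contain a whole number of codons.
print(3 * GENOME_NT // 3, "codons in three circuits")
```

All three frames are visited once before the pattern repeats on the fourth circuit, which is consistent with the two (or three) completely overlapping ORFs described above.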
Tao, Jie; Li, Benqiang; Zhang, Chunling; Liu, Huili
2016-11-10
Two porcine epidemic diarrhea virus (PEDV) strains, JSLS-1/2015 and JS-2/2015, were isolated from piglets with watery diarrhea in South China. The two genomic sequences were highly homologous to the attenuated DR13 strain. Furthermore, JSLS-1/2015 contains a 24-amino-acid deletion in open reading frame 1b, which was first reported in PEDV isolates.
Structure, sequence and expression of the hepatitis delta (δ) viral genome
NASA Astrophysics Data System (ADS)
Wang, Kang-Sheng; Choo, Qui-Lim; Weiner, Amy J.; Ou, Jing-Hsiung; Najarian, Richard C.; Thayer, Richard M.; Mullenbach, Guy T.; Denniston, Katherine J.; Gerin, John L.; Houghton, Michael
1986-10-01
Biochemical and electron microscopic data indicate that the human hepatitis δ viral agent contains a covalently closed circular and single-stranded RNA genome that has certain similarities with viroid-like agents from plants. The sequence of the viral genome (1,678 nucleotides) has been determined and an open reading frame within the complementary strand has been shown to encode an antigen that binds specifically to antisera from patients with chronic hepatitis δ viral infections.
USDA-ARS's Scientific Manuscript database
Clematis chlorotic mottle virus (ClCMV) is a previously undescribed virus associated with yellow mottling and veining, chlorotic ring spots, line pattern mosaics, and flower distortion and discoloration on ornamental Clematis. The ClCMV genome is 3,880 nt in length with 5 putative open reading frames...
cDNA encoding a polypeptide including a hevein sequence
Raikhel, N.V.; Broekaert, W.F.; Namhai Chua; Kush, A.
1993-02-16
A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids.
Southern Tomato Virus: The Link between the Families Totiviridae and Partitiviridae
USDA-ARS's Scientific Manuscript database
A dsRNA virus with a genome of 3.5 kb was isolated from field and greenhouse-grown tomato plants of different cultivars and geographic locations in North America. Cloning and sequencing of the viral genome showed the presence of two partially overlapping open reading frames (ORFs) and a genomic orga...
Bijective transformation circular codes and nucleotide exchanging RNA transcription.
Michel, Christian J; Seligmann, Hervé
2014-04-01
The C(3) self-complementary circular code X identified in genes of prokaryotes and eukaryotes is a set of 20 trinucleotides enabling reading frame retrieval and maintenance, i.e. a framing code (Arquès and Michel, 1996; Michel, 2012, 2013). Some mitochondrial RNAs correspond to DNA sequences when RNA transcription systematically exchanges between nucleotides (Seligmann, 2013a,b). We study here the 23 bijective transformation codes ΠX of X which may code nucleotide exchanging RNA transcription as suggested by this mitochondrial observation. The 23 bijective transformation codes ΠX are C(3) trinucleotide circular codes; seven of them are also self-complementary. Furthermore, several correlations are observed between the Reading Frame Retrieval (RFR) probability of bijective transformation codes ΠX and the different biological properties of ΠX related to their numbers of RNAs in GenBank's EST database, their polymerization rate, their number of amino acids and the chirality of amino acids they code. Results suggest that the circular code X, with the functions of reading frame retrieval and maintenance in regular RNA transcription, may also have, through its bijective transformation codes ΠX, the same functions in nucleotide exchanging RNA transcription. Associations with properties such as amino acid chirality suggest that the RFR of X and its bijective transformations molded the origins of the genetic code's machinery.
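The count of 23 transformations is the number of non-identity bijections of the four-letter alphabet (4! - 1). The sketch below enumerates them and applies them to a set of 20 trinucleotides. The listed set is the code X as it is commonly given in the literature and should be treated as an assumption to verify against the cited papers. The helper names are hypothetical, and the sketch checks only self-complementarity, not circularity.

```python
from itertools import permutations

# Assumption: the 20 trinucleotides of X as commonly listed in the
# literature; verify against Arquès and Michel (1996) before relying on it.
X = frozenset("""AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC
                 GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC""".split())

BASES = "ACGT"
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(codon):
    return "".join(PAIR[b] for b in reversed(codon))


def is_self_complementary(code):
    """True if the code contains the reverse complement of each member."""
    return {reverse_complement(c) for c in code} == set(code)


def transform(code, mapping):
    """Apply a letter-to-letter bijection to every trinucleotide."""
    return frozenset("".join(mapping[b] for b in c) for c in code)


# All 4! - 1 = 23 non-identity bijections of {A, C, G, T}.
bijections = [dict(zip(BASES, p)) for p in permutations(BASES)
              if "".join(p) != BASES]

# Bijections that commute with base pairing necessarily map a
# self-complementary code to another self-complementary code.
commuting = [m for m in bijections
             if all(m[PAIR[b]] == PAIR[m[b]] for b in BASES)]

print(len(X), is_self_complementary(X))
print(len(bijections), len(commuting))
print(all(is_self_complementary(transform(X, m)) for m in commuting))
```

Exactly seven non-identity bijections commute with base pairing, which may be the origin of the seven self-complementary transformation codes mentioned in the abstract. The sketch does not test whether other bijections also yield self-complementary sets.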
Origin of noncoding DNA sequences: molecular fossils of genome evolution
DOE Office of Scientific and Technical Information (OSTI.GOV)
Naora, H.; Miyahara, K.; Curnow, R.N.
The total amount of noncoding sequences on chromosomes of contemporary organisms varies significantly from species to species. The authors propose a hypothesis for the origin of these noncoding sequences that assumes that (i) an approx. 0.55-kilobase (kb)-long reading frame composed the primordial gene and (ii) a 20-kb-long single-stranded polynucleotide is the longest molecule (as a genome) that was polymerized at random and without a specific template in the primordial soup/cell. The statistical distribution of stop codons allows examination of the probability of generating reading frames of approx. 0.55 kb in this primordial polynucleotide. This analysis reveals that with three stop codons, a run of at least 0.55-kb equivalent length of nonstop codons would occur in 4.6% of 20-kb-long polynucleotide molecules. They attempt to estimate the total amount of noncoding sequences that would be present on the chromosomes of contemporary species assuming that present-day chromosomes retain the prototype primordial genome structure. Theoretical estimates thus obtained for most eukaryotes do not differ significantly from those reported for these specific organisms, with only a few exceptions. Furthermore, analysis of possible stop-codon distributions suggests that life on earth would not exist, at least in its present form, had two or four stop codons been selected early in evolution.
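The kind of calculation described here can be sketched as a small dynamic program. The authors' actual model is not given in the abstract, so everything in this Python sketch is an assumption: codons are drawn uniformly from the 64 triplets, only one reading frame is considered, 0.55 kb is taken as 183 codons, and 20 kb as 6666 codons. The function name is hypothetical.

```python
def prob_open_run(n_codons, run_codons, n_stops=3):
    """Probability that a random sequence of `n_codons` codons (uniform over
    the 64 triplets, one fixed reading frame) contains at least one run of
    `run_codons` consecutive non-stop codons."""
    p_open = (64 - n_stops) / 64
    p_stop = n_stops / 64
    # state[k] = P(no qualifying run yet, current open run has length k)
    state = [1.0] + [0.0] * (run_codons - 1)
    for _ in range(n_codons):
        survivors = sum(state)
        # A stop codon resets the run; an open codon extends it by one.
        # Runs that would reach `run_codons` drop out of `state` (success).
        state = [survivors * p_stop] + [x * p_open for x in state[:-1]]
    return 1.0 - sum(state)


# Assumed conversions: 0.55 kb ~ 183 codons, 20 kb ~ 6666 codons in one frame.
for stops in (2, 3, 4):
    print(stops, "stop codons:", round(prob_open_run(6666, 183, stops), 4))
```

Under these assumptions the three-stop case comes out at a few percent, the same order as the 4.6% quoted, while two stop codons make long open runs common and four make them rare. The exact figures depend on the assumed conversions.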
Dupont, L; Boizet-Bonhoure, B; Coddeville, M; Auvray, F; Ritzenthaler, P
1995-01-01
Temperate phage mv4 integrates its DNA into the chromosome of Lactobacillus delbrueckii subsp. bulgaricus strains via site-specific recombination. Nucleotide sequencing of a 2.2-kb attP-containing phage fragment revealed the presence of four open reading frames. The larger open reading frame, close to the attP site, encoded a 427-amino-acid polypeptide with similarity in its C-terminal domain to site-specific recombinases of the integrase family. Comparison of the sequences of attP, bacterial attachment site attB, and host-phage junctions attL and attR identified a 17-bp common core sequence, where strand exchange occurs during recombination. Analysis of the attB sequence indicated that the core region overlaps the 3' end of a tRNA(Ser) gene. Phage mv4 DNA integration into the tRNA(Ser) gene preserved an intact tRNA(Ser) gene at the attL site. An integration vector based on the mv4 attP site and int gene was constructed. This vector transforms a heterologous host, L. plantarum, through site-specific integration into the tRNA(Ser) gene of the genome and will be useful for development of an efficient integration system for a number of additional bacterial species in which an identical tRNA gene is present. PMID:7836291
Demonstration of retrotransposition of the Tf1 element in fission yeast.
Levin, H L; Boeke, J D
1992-03-01
Tf1, a retrotransposon from fission yeast, has LTRs and coding sequences resembling the protease, reverse transcriptase and integrase domains of retroviral pol genes. A unique aspect of Tf1 is that it contains a single open reading frame whereas other retroviruses and retrotransposons usually possess two or more open reading frames. To determine whether Tf1 can transpose, we overproduced Tf1 transcripts encoded by a plasmid copy of the element marked with a neo gene. Approximately 0.1-4.0% of the cell population acquired chromosomally inherited resistance to G418. DNA blot analysis demonstrated that such strains had acquired both Tf1 and neo specific sequences within a restriction fragment of the same size; the size of this restriction fragment varied between different isolates. Structural analysis of the cloned DNA flanking the Tf1-neo element of two transposition candidates with the same regions in the parent strain showed that the ability to grow on G418 was due to transposition of Tf1-neo and not other types of recombination events.
Greenblatt, R.J.; Quackenbush, S.L.; Casey, R.N.; Rovnak, J.; Balazs, G.H.; Work, Thierry M.; Casey, J.W.; Sutton, C.A.
2005-01-01
Fibropapillomatosis (FP) of marine turtles is an emerging neoplastic disease associated with infection by a novel turtle herpesvirus, fibropapilloma-associated turtle herpesvirus (FPTHV). This report presents 23 kb of the genome of an FPTHV infecting a Hawaiian green turtle (Chelonia mydas). By sequence homology, the open reading frames in this contig correspond to herpes simplex virus genes UL23 through UL36. The order, orientation, and homology of these putative genes indicate that FPTHV is a member of the Alphaherpesvirinae. The UL27-, UL30-, and UL34-homologous open reading frames from FPTHVs infecting nine FP-affected marine turtles from seven geographic areas and three turtle species (C. mydas, Caretta caretta, and Lepidochelys olivacea) were compared. A high degree of nucleotide sequence conservation was found among these virus variants. However, geographic variations were also found: the FPTHVs examined here form four groups, corresponding to the Atlantic Ocean, west Pacific, mid-Pacific, and east Pacific. Our results indicate that FPTHV was established in marine turtle populations prior to the emergence of FP as it is currently known.
Immortalized Muscle Cell Model to Test the Exon Skipping Efficacy for Duchenne Muscular Dystrophy
Nguyen, Quynh
2017-01-01
Duchenne muscular dystrophy (DMD) is a lethal genetic disorder that most commonly results from mutations disrupting the reading frame of the dystrophin (DMD) gene. Among the therapeutic approaches employed, exon skipping using antisense oligonucleotides (AOs) is one of the most promising strategies. This strategy aims to restore the reading frame, thus producing a truncated, yet functioning dystrophin protein. In 2016, the Food and Drug Administration (FDA) conditionally approved the first AO-based drug, eteplirsen (Exondys 51), developed for DMD exon 51 skipping. An accurate and reproducible method to quantify exon skipping efficacy is essential for evaluating the therapeutic potential of different AO sequences. However, previous in vitro screening studies have been hampered by the limited proliferative capacity and insufficient amounts of dystrophin expressed by primary muscle cell lines that have been the main system used to evaluate AO sequences. In this paper, we illustrate the challenges associated with primary muscle cell lines and describe a novel approach that utilizes immortalized cell lines to quantitatively evaluate the exon skipping efficacy in in vitro studies. PMID:29035327
Homology of aspartyl- and lysyl-tRNA synthetases.
Gampel, A; Tzagoloff, A
1989-01-01
The yeast nuclear gene MSD1 coding for mitochondrial aspartyl-tRNA synthetase has been cloned and sequenced. The identity of the gene is confirmed by the following evidence. (i) The primary structure of the protein derived from the gene sequence is similar to that of the yeast cytoplasmic aspartyl-tRNA synthetase. (ii) In situ disruption of MSD1 in a respiratory-competent haploid strain of yeast induces a pleiotropic phenotype consistent with a lesion in mitochondrial protein synthesis. (iii) Mitochondria from a mutant with a disrupted chromosomal copy of MSD1 are unable to acylate mitochondrial aspartyl-tRNA. The primary structures of the cytoplasmic and mitochondrial aspartyl-tRNA synthetases are similar to the yeast cytoplasmic lysyl-tRNA synthetase, suggesting that the two types of synthetases may have a common evolutionary origin. Searches of the current protein banks also have revealed a high degree of sequence similarity of the lysyl-tRNA synthetase to the product of the Escherichia coli herC gene and to the partial sequence of a protein encoded by an unidentified reading frame located adjacent to the E. coli frdA gene. Based on the sequence similarities and the map positions of the herC and frdA loci, we propose herC to be the structural gene of the constitutively expressed lysyl-tRNA synthetase of E. coli and the unidentified reading frame to be the structural gene of the heat-inducible lysyl-tRNA synthetase. PMID:2668951
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jackson, P.J.; Walthers, E.A.; Richmond, K.L.
1997-04-01
PCR analysis of 198 Bacillus anthracis isolates revealed a variable region of DNA sequence differing in length among the isolates. Five polymorphisms differed by the presence of two to six copies of the 12-bp tandem repeat 5'-CAATATCAACAA-3'. This variable-number tandem repeat (VNTR) region is located within a larger sequence containing one complete open reading frame that encodes a putative 30-kDa protein. Length variation did not change the reading frame of the encoded protein and only changed the copy number of a 4-amino-acid sequence (QYQQ) from 2 to 6. The structure of the VNTR region suggests that these multiple repeats are generated by recombination or polymerase slippage. Protein structures predicted from the reverse-translated DNA sequence suggest that any structural changes in the encoded protein are confined to the region encoded by the VNTR sequence. Copy number differences in the VNTR region were used to define five different B. anthracis alleles. Characterization of 198 isolates revealed allele frequencies of 6.1, 17.7, 59.6, 5.6, and 11.1% sequentially from shorter to longer alleles. The high degree of polymorphism in the VNTR region provides a criterion for assigning isolates to five allelic categories. There is a correlation between categories and geographic distribution. Such molecular markers can be used to monitor the epidemiology of anthrax outbreaks in domestic and native herbivore populations. 22 refs., 4 figs., 3 tabs.
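Why the repeat copy number leaves the reading frame intact can be shown directly: the repeat unit is 12 bp, a multiple of 3, and it translates to the four residues QYQQ under the standard genetic code. The sketch below builds the standard codon table and translates two to six copies. It assumes the repeat is read in frame from its first base, which is consistent with the QYQQ sequence reported; the function and variable names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(dna):
    """Translate whole codons from position 0; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


REPEAT = "CAATATCAACAA"  # 12-bp repeat unit from the abstract

for copies in range(2, 7):
    region = REPEAT * copies
    # Length is always a multiple of 3, so downstream codons stay in frame.
    print(copies, len(region), len(region) % 3, translate(region))
```

Each additional copy adds exactly four codons, so alleles differ only in the number of QYQQ units and not in the frame of the flanking sequence.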
Guo, Chun-Teng; McClean, Stephen; Shaw, Chris; Rao, Ping-Fan; Ye, Ming-Yu; Bjourson, Anthony J
2013-05-01
One novel Kunitz BPTI-like peptide designated as BBPTI-1, with chymotrypsin inhibitory activity was identified from the venom of Burmese Daboia russelii siamensis. It was purified by three steps of chromatography including gel filtration, cation exchange and reversed phase. A partial N-terminal sequence of BBPTI-1, HDRPKFCYLPADPGECLAHMRSF was obtained by automated Edman degradation and a Ki value of 4.77 nM determined. Cloning of BBPTI-1 including the open reading frame and 3' untranslated region was achieved from cDNA libraries derived from lyophilized venom using a 3' RACE strategy. In addition a cDNA sequence, designated as BBPTI-5, was also obtained. Alignment of cDNA sequences showed that BBPTI-5 exhibited an identical sequence to BBPTI-1 cDNA except for an eight nucleotide deletion in the open reading frame. Gene variations that represented deletions in the BBPTI-5 cDNA resulted in a novel protease inhibitor analog. Amino acid sequence alignment revealed that deduced peptides derived from cloning of their respective precursor cDNAs from libraries showed high similarity and homology with other Kunitz BPTI proteinase inhibitors. BBPTI-1 and BBPTI-5 consist of 60 and 66 amino acid residues respectively, including six conserved cysteine residues. As these peptides have been reported to have influence on the processes of coagulation, fibrinolysis and inflammation, their potential application in biomedical contexts warrants further investigation.
Abraham, S; Solomon, W B
2000-09-19
We used a subtractive hybridization protocol to identify novel expressed sequence tags (ESTs) corresponding to mRNAs whose expression was induced upon exposure of the human leukemia cell line K562 to the phorbol ester 12-O-tetradecanoylphorbol-13-acetate (TPA). The complete open reading frame of one of the novel ESTs, named TIG-1, was obtained by screening K562 cell and placental cDNA libraries. The deduced open reading frame of the TIG-1 cDNA encodes a glutamine repeat-rich protein with a predicted molecular weight of 63 kDa. The predicted open reading frame also contains a consensus bipartite nuclear localization signal, though no specific DNA-binding domain is found. The corresponding TIG-1 mRNA is ubiquitously expressed. Placental tissue expresses the TIG-1 mRNA at a level 200 times that of the lowest expressing tissues, such as kidney and lung. There is also preferential TIG-1 mRNA expression in cells of bone-marrow lineage. In vitro transcription/translation of the TIG-1 cDNA yielded a polypeptide with an apparent molecular weight of 97 kDa. Using polyclonal antibodies obtained from a rabbit immunized with the carboxy-terminal portion of bacterially expressed TIG-1 protein, a polypeptide with a molecular weight of 97 kDa was identified by Western blot analyses of protein lysates obtained from K562 cells. Cotransfection assays of K562 cells, using a GAL4-TIG-1 fusion gene and GAL4 operator-CAT, indicate that the TIG-1 protein may have transcriptional regulatory activity when tethered to DNA. We hypothesize that this novel glutamine-rich protein participates in a protein complex that regulates gene transcription. It has been demonstrated by Naar et al. (Naar, A.M., Beaurang, P.A., Zhou, S., Abraham, S., Solomon, W.B., Tjian, R., 1999, Composite co-activator ARC mediates chromatin-directed transcriptional activation. Nature 398, 828-830) that the amino acid sequences of peptide fragments obtained from a polypeptide found in a complex of proteins that alters chromatin structure (ARC) are identical to portions of the deduced open reading frame of TIG-1 mRNA.
Mercado-Blanco, J; García, F; Fernández-López, M; Olivares, J
1993-01-01
Melanin production by Rhizobium meliloti GR4 is linked to nonsymbiotic plasmid pRmeGR4b (140 MDa). Transfer of this plasmid to GR4-cured derivatives or to Agrobacterium tumefaciens enables these bacteria to produce melanin. Sequence analysis of a 3.5-kb PstI fragment of plasmid pRmeGR4b has revealed the presence of a 1,481-bp open reading frame that codes for a protein whose sequence shows strong homology to two conserved regions involved in copper binding in tyrosinases and hemocyanins. In vitro-coupled transcription-translation experiments showed that this open reading frame codes for a 55-kDa polypeptide. Melanin production in GR4 is not under the control of the RpoN-NifA regulatory system, unlike that in R. leguminosarum bv. phaseoli 8002. The GR4 tyrosinase gene could be expressed in Escherichia coli under the control of the lacZ promoter. To avoid confusion with mel genes (for melibiose), a change of the name of the previously reported mel genes of R. leguminosarum bv. phaseoli and other organisms to mep genes (for melanin production) is proposed. PMID:8366027
Tuteja, Reetu; Saxena, Rachit K.; Davila, Jaime; Shah, Trushar; Chen, Wenbin; Xiao, Yong-Li; Fan, Guangyi; Saxena, K. B.; Alverson, Andrew J.; Spillane, Charles; Town, Christopher; Varshney, Rajeev K.
2013-01-01
The hybrid pigeonpea (Cajanus cajan) breeding technology based on cytoplasmic male sterility (CMS) is currently unique among legumes and displays major potential for yield increase. CMS is defined as a condition in which a plant is unable to produce functional pollen grains. The novel chimeric open reading frames (ORFs) produced as a result of mitochondrial genome rearrangements are considered to be the main cause of CMS. To identify these CMS-related ORFs in pigeonpea, we sequenced the mitochondrial genomes of three C. cajan lines (the male-sterile line ICPA 2039, the maintainer line ICPB 2039, and the hybrid line ICPH 2433) and of the wild relative (Cajanus cajanifolius ICPW 29). A single, circular-mapping molecule of length 545.7 kb was assembled and annotated for the ICPA 2039 line. Sequence annotation predicted 51 genes, including 34 protein-coding and 17 RNA genes. Comparison of the mitochondrial genomes from different Cajanus genotypes identified 31 ORFs, which differ between lines within which CMS is present or absent. Among these chimeric ORFs, 13 were identified by comparison of the related male-sterile and maintainer lines. These ORFs display features that are known to trigger CMS in other plant species and to represent the most promising candidates for CMS-related mitochondrial rearrangements in pigeonpea. PMID:23792890
Wang, C S; Chao, S Y; Ku, C C; Wen, C M; Shih, H H
2009-06-01
Viruses belonging to the genus Megalocytivirus in the family Iridoviridae are one of the major agents causing mass mortalities in marine and freshwater fish in Asian countries. Outbreaks of iridovirus disease have been reported among various fish species in Taiwan. However, the genotypes of these iridoviruses have not yet been determined. In this study, seven megalocytivirus isolates from four fish species: king grouper, Epinephelus lanceolatus (Bloch), barramundi perch, Lates calcarifer (Bloch), silver sea bream, Rhabdosargus sarba (Forsskal), and common ponyfish, Leiognathus equulus (Forsskal), cultured in three different regions of Taiwan were collected. The full open reading frame encoding the viral major capsid protein gene was amplified using PCR. The PCR products of approximately 1581 bp were cloned and the nucleotide sequences were phylogenetically analysed. Results showed that all seven PCR products contained a unique open reading frame with 1362 nucleotides and encoded a structural protein with 453 amino acids. Even though the nucleotide sequences were not identical, these seven megalocytiviruses were classified into one cluster and showed very high homology with red sea bream iridovirus (RSIV) with more than 97% identity. Thus, the seven iridovirus strains isolated from cultured marine fish in Taiwan were closer to the RSIV genotype than the infectious spleen and kidney necrosis virus genotype.
A Fast Event Preprocessor and Sequencer for the Simbol-X Low Energy Detector
NASA Astrophysics Data System (ADS)
Schanz, T.; Tenzer, C.; Maier, D.; Kendziorra, E.; Santangelo, A.
2009-05-01
The Simbol-X Low Energy Detector (LED), a 128×128 pixel DEPFET (Depleted Field Effect Transistor) array, will be read out at a very high rate (8000 frames/second) and therefore requires very fast on-board electronics. We present FPGA-based LED camera electronics consisting of an Event Preprocessor (EPP) for on-board data preprocessing and filtering of the Simbol-X low-energy detector and a related Sequencer (SEQ) to generate the necessary signals to control the readout.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S
2013-06-25
A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.
Genome analysis and identification of gelatinase encoded gene in Enterobacter aerogenes
NASA Astrophysics Data System (ADS)
Shahimi, Safiyyah; Mutalib, Sahilah Abdul; Khalid, Rozida Abdul; Repin, Rul Aisyah Mat; Lamri, Mohd Fadly; Bakar, Mohd Faizal Abu; Isa, Mohd Noor Mat
2016-11-01
In this study, the genome sequence of Enterobacter aerogenes was analysed bioinformatically to identify genes encoding gelatinase. E. aerogenes was isolated from hot spring water and produces gelatinase that is species-specific to porcine and fish gelatin. This bacterium therefore offers the possibility of producing enzymes specific to the gelatin of each species. Enterobacter aerogenes was partially genome sequenced, giving a total sequence size of 5.0 mega base pairs (Mbp). The pre-processing pipeline reported 87.6 Mbp of total reads, 68.8 Mbp of high-quality reads, and a high-quality percentage of 78.58%. Genome assembly produced 120 contigs, with 67.5% of contigs over 1 kilobase pair (kbp), an N50 contig length of 124,856 bp, and a GC content of 55.17%. About 4,705 protein-coding genes were identified by protein prediction analysis. The two candidate genes selected had the highest percentage identity to gelatinase enzymes available in the Swiss-Prot and NCBI online databases. They were NODE_9_length_26866_cov_148.013245_12, containing a 1029-base-pair (bp) sequence encoding 342 amino acids, and NODE_24_length_155103_cov_177.082458_62, containing a 717-bp sequence encoding 238 amino acids. Two pairs of primers (forward and reverse) were then designed based on the open reading frames (ORFs) of the selected genes. Genome analysis of E. aerogenes thus identified genes encoding gelatinase.
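The nucleotide and amino acid lengths reported for the two candidate genes can be checked for internal consistency: an ORF of N codons plus one stop codon spans 3 × (N + 1) bp. The sketch below applies that rule. It assumes the reported bp lengths include the stop codon, which the abstract does not state, and the function name is hypothetical.

```python
def orf_matches_protein(orf_bp, protein_aa, includes_stop=True):
    """Check that an ORF length in bp is consistent with a protein length,
    optionally counting one terminal stop codon (an assumption here)."""
    codons = protein_aa + (1 if includes_stop else 0)
    return orf_bp == 3 * codons


# Candidate genes and lengths as given in the abstract.
candidates = {
    "NODE_9_length_26866_cov_148.013245_12": (1029, 342),
    "NODE_24_length_155103_cov_177.082458_62": (717, 238),
}

for name, (orf_bp, protein_aa) in candidates.items():
    print(name, orf_matches_protein(orf_bp, protein_aa))
```

Both candidates satisfy the relation when the stop codon is counted, so the reported lengths are mutually consistent under that assumption.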
Kawano, Yasuhiro; Neeley, Shane; Adachi, Kei; Nakai, Hiroyuki
2013-01-01
Overlapping open reading frames (ORFs) in viral genomes undergo co-evolution; however, how individual amino acids coded by overlapping ORFs are structurally, functionally, and co-evolutionarily constrained remains difficult to address by conventional homologous sequence alignment approaches. We report here a new experimental and computational evolution-based methodology to address this question and report its preliminary application to elucidating a mode of co-evolution of the frame-shifted overlapping ORFs in the adeno-associated virus (AAV) serotype 2 viral genome. These ORFs encode both capsid VP protein and non-structural assembly-activating protein (AAP). To show proof of principle of the new method, we focused on the evolutionarily conserved QVKEVTQ and KSKRSRR motifs, a pair of overlapping heptapeptides in VP and AAP, respectively. In the new method, we first identified a large number of capsid-forming VP3 mutants and functionally competent AAP mutants of these motifs from mutant libraries by experimental directed evolution under no co-evolutionary constraints. We used Illumina sequencing to obtain a large dataset and then statistically assessed the viability of VP and AAP heptapeptide mutants. The obtained heptapeptide information was then integrated into an evolutionary algorithm, with which VP and AAP were co-evolved from random or native nucleotide sequences in silico. As a result, we demonstrate that these two heptapeptide motifs could exhibit high degeneracy if coded by separate nucleotide sequences, and elucidate how overlap-evoked co-evolutionary constraints play a role in making the VP and AAP heptapeptide sequences into the present shape. 
Specifically, we demonstrate that two valine (V) residues and β-strand propensity in QVKEVTQ are structurally important, the strongly negative and hydrophilic nature of KSKRSRR is functionally important, and overlap-evoked co-evolution imposes strong constraints on serine (S) residues in KSKRSRR, despite high degeneracy of the motifs in the absence of co-evolutionary constraints.
High-Efficiency "-1" and "-2" Ribosomal Frameshiftings Revealed by Force Spectroscopy.
Tsai, Te-Wei; Yang, Haopeng; Yin, Heng; Xu, Shoujun; Wang, Yuhong
2017-06-16
Ribosomal frameshifting is a rare but ubiquitous process that is being studied extensively. Meanwhile, frameshifting motifs without any secondary mRNA structures were identified but rarely studied experimentally. We report unambiguous observation of highly efficient "-1" and "-2" frameshiftings on a GA7G slippery mRNA without the downstream secondary structure, using force-induced remnant magnetization spectroscopy combined with unique probing schemes. The result represents the first experimental evidence of multiple frameshifting steps. It is also one of the rare reports of the "-2" frameshifting. Our assay removed the ambiguity of transcriptional slippage involvement in other frameshifting assays. Two significant insights for the frameshifting mechanism were revealed. First, EF-G·GTP is indispensable to frameshifting. Although EF-G·GDPCP has been shown to prompt translocation before, we found that it could not induce frameshifting. This implies that the GTP hydrolysis is responsible for the codon-anticodon re-pairing in frameshifting, which corroborates our previous mechanical force measurement of EF-G·GTP. Second, translation in all three reading frames of the slippery sequence can be induced by the corresponding in-frame aminoacyl tRNAs. Although A-site tRNA is known to affect the partition between "0" and "-1" frameshifting, it has not been reported that all three reading frames can be translated by their corresponding tRNAs. The in vitro results were confirmed by toe-printing assay and protein sequencing.
Overlapping reading frames at the LYS5 locus in the yeast Yarrowia lipolytica.
Xuan, J W; Fournier, P; Declerck, N; Chasles, M; Gaillardin, C
1990-01-01
Mutants affected at the LYS5 locus of Yarrowia lipolytica lack detectable saccharopine dehydrogenase (SDH) activity. The LYS5 gene has previously been cloned, and we present here the sequence of the 2.5-kilobase-pair (kb) DNA fragment complementing the lys5 mutation. Two large antiparallel open reading frames (ORF1 and ORF2) were observed, flanked by potential transcription signals. Both ORFs appear to be transcribed, but several lines of evidence suggest that only ORF2 is translated and encodes SDH. (i) The global amino acid compositions of Saccharomyces cerevisiae SDH and of the putative ORF2 product are similar and that of ORF1 is dissimilar. (ii) An in-frame translational fusion of ORF2 with the Escherichia coli lacZ gene was introduced into yeast cells and resulted in a beta-galactosidase activity regulated similarly to SDH; no beta-galactosidase activity was obtained with an in-frame fusion of ORF1 with lacZ. (iii) The introduction of a stop codon at the beginning of ORF2 prevented SDH expression in yeast cells, whereas no phenotypic effect was observed when ORF1 translation was blocked. PMID:2388625
PRIMARY STRUCTURE OF THE CYTOCHROME P450 LANOSTEROL 14A-DEMETHYLASE GENE FROM CANDIDA TROPICALIS
We report the nucleotide sequence of the gene and flanking DNA for the cytochrome P450 lanosterol 14 alpha-demethylase (14DM) from the yeast Candida tropicalis ATCC750. An open reading frame (ORF) of 528 codons encoding a 60.9-kD protein is identified. This ORF includes a charact...
Complete Genome Sequence of a Genomovirus Associated with Common Bean Plant Leaves in Brazil.
Lamas, Natalia Silva; Fontenele, Rafaela Salgado; Melo, Fernando Lucas; Costa, Antonio Felix; Varsani, Arvind; Ribeiro, Simone Graça
2016-11-10
A new genomovirus has been identified in three common bean plants in Brazil. This virus has a circular genome of 2,220 nucleotides and 3 major open reading frames. It shares 80.7% genome-wide pairwise identity with a genomovirus recovered from Tongan fruit bat guano.
Genome Sequence of JangDynasty, a Newly Isolated Mycobacteriophage
Jang, Casey; Kalaj, Nancy; Hwang, Brian; Hughes, Lorelei; Yang, Connie; Pak, Thomas; Kim, John; Han, Dong Yoon; Tedjakusnadi, Jason; Fernandez, Nicholas; Dean, Natasha; Muthiah, Arun; Sutter, Nathaniel B.
2018-01-01
JangDynasty is a bacteriophage that infects Mycobacterium smegmatis mc²155. It has a genome length of 70,883 bp, with 124 predicted open reading frames (ORFs), 42 of which have known functions. JangDynasty belongs to cluster O, and like other cluster O phages, it is a siphovirus with a prolate capsid. PMID:29798914
USDA-ARS's Scientific Manuscript database
Cinnamoyl-CoA reductase (CCR) is an important enzyme for lignin biosynthesis as it catalyzes the first specific committed step in monolignol biosynthesis. We have cloned a full length coding sequence of CCR from kenaf (Hibiscus cannabinus L.), which contains a 1,020-bp open reading frame (ORF), enco...
AbouHaidar, Mounir Georges; Venkataraman, Srividhya; Golshani, Ashkan; Liu, Bolin; Ahmad, Tauqeer
2014-01-01
The highly structured (64% GC) covalently closed circular (CCC) RNA (220 nt) of the virusoid associated with rice yellow mottle virus codes for a 16-kDa highly basic protein using novel modalities for coding, translation, and gene expression. This CCC RNA is the smallest among all known viroids and virusoids and the only one that codes proteins. Its sequence possesses an internal ribosome entry site and is directly translated through two (or three) completely overlapping ORFs (shifting to a new reading frame at the end of each round). The initiation and termination codons overlap UGAUGA (underline highlights the initiation codon AUG within the combined initiation-termination sequence). Termination codons can be ignored to obtain larger read-through proteins. This circular RNA with no noncoding sequences is a unique natural supercompact “nanogenome.” PMID:25253891
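The statement that translation shifts to a new reading frame at the end of each round follows from the genome length: 220 is not a multiple of 3. The sketch below illustrates only this arithmetic; the function names are hypothetical, and the code makes no claim about how the authors modelled translation.

```python
GENOME_LENGTH_NT = 220  # circular RNA length quoted in the abstract


def frame_offset_after(rounds: int, length: int = GENOME_LENGTH_NT) -> int:
    """Reading-frame offset (0, 1 or 2) relative to the starting frame
    after a ribosome has travelled `rounds` full circuits of the circle."""
    return (rounds * length) % 3


def rounds_until_frame_repeats(length: int = GENOME_LENGTH_NT) -> int:
    """Smallest number of circuits after which the original frame recurs."""
    rounds = 1
    while frame_offset_after(rounds, length) != 0:
        rounds += 1
    return rounds


if __name__ == "__main__":
    for r in (1, 2, 3):
        print(f"after {r} round(s): frame offset {frame_offset_after(r)}")
    print("rounds until the starting frame recurs:", rounds_until_frame_repeats())
```

Because 220 leaves a remainder of 1 when divided by 3, each circuit enters a different frame and the starting frame recurs only after three circuits, which is consistent with the two (or three) completely overlapping ORFs described above.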
Le Chevanton, L; Leblon, G
1989-04-15
We cloned the ura5 gene coding for the orotate phosphoribosyl transferase from the ascomycete Sordaria macrospora by heterologous probing of a Sordaria genomic DNA library with the corresponding Podospora anserina sequence. The Sordaria gene was expressed in an Escherichia coli pyrE mutant strain defective for the same enzyme, and expression was shown to be promoted by plasmid sequences. The nucleotide sequence of the 1246-bp DNA fragment encompassing the region of homology with the Podospora gene has been determined. This sequence contains an open reading frame of 699 nucleotides. The deduced amino acid sequence shows 72% similarity with the corresponding Podospora protein.
Complete genome sequence of duck Tembusu virus, isolated from Muscovy ducks in southern China.
Zhu, Wanjun; Chen, Jidang; Wei, Chunya; Wang, Heng; Huang, Zhen; Zhang, Minze; Tang, Fengfeng; Xie, Jiexiong; Liang, Huanbin; Zhang, Guihong; Su, Shuo
2012-12-01
We report here the complete genomic sequence of the duck Tembusu virus (DTMUV) WJ-1 strain, isolated from Muscovy ducks. This is the first complete genome sequence of DTMUV reported in southern China. Compared with the other strains (TA, GH-2, YY5, and ZJ-407) that were previously found in eastern China, WJ-1 bears a few differences in the nucleotide and amino acid sequences. We found that there are 47 mutations of amino acids encoded by the whole open reading frame (ORF) among these five strains. The whole-genome sequence of DTMUV will help in understanding the epidemiology and molecular characteristics of duck Tembusu virus in southern China.
Takai, Ken; Horikoshi, Koki
1999-01-01
Molecular phylogenetic analysis of a naturally occurring microbial community in a deep-subsurface geothermal environment indicated that the phylogenetic diversity of the microbial population in the environment was extremely limited and that only hyperthermophilic archaeal members closely related to Pyrobaculum were present. All archaeal ribosomal DNA sequences contained intron-like sequences, some of which had open reading frames with repeated homing-endonuclease motifs. The sequence similarity analysis and the phylogenetic analysis of these homing endonucleases suggested the possible phylogenetic relationship among archaeal rRNA-encoded homing endonucleases. PMID:10584021
Nucleotide sequence of the gene determining plasmid-mediated citrate utilization.
Ishiguro, N; Sato, G
1985-01-01
The citrate utilization determinant from transposon Tn3411 has been cloned and sequenced, and its polypeptide products have been characterized in minicell experiments. The nucleotide sequence was determined for a 2,047-base-pair BglII restriction endonuclease fragment that includes the citrate determinant. This region contains an open reading frame that would encode a 431-amino-acid very hydrophobic polypeptide and which is preceded by a reasonable ribosomal binding site. However, the single polypeptide found in minicell experiments had an apparent molecular weight of 35,000 on sodium dodecyl sulfate-polyacrylamide gel electrophoresis. PMID:2999087
Self-complementary circular codes in coding theory.
Fimmel, Elena; Michel, Christian J; Starman, Martin; Strüngmann, Lutz
2018-04-01
Self-complementary circular codes are involved in pairing genetic processes. A maximal [Formula: see text] self-complementary circular code X of trinucleotides was identified in genes of bacteria, archaea, eukaryotes, plasmids and viruses (Michel in Life 7(20):1-16 2017, J Theor Biol 380:156-177, 2015; Arquès and Michel in J Theor Biol 182:45-58 1996). In this paper, self-complementary circular codes are investigated using the graph theory approach recently formulated in Fimmel et al. (Philos Trans R Soc A 374:20150058, 2016). A directed graph [Formula: see text] associated with any code X mirrors the properties of the code. In the present paper, we demonstrate a necessary condition for the self-complementarity of an arbitrary code X in terms of the graph theory. The same condition has been proven to be sufficient for codes which are circular and of large size [Formula: see text] trinucleotides, in particular for maximal circular codes ([Formula: see text] trinucleotides). For codes of small-size [Formula: see text] trinucleotides, some very rare counterexamples have been constructed. Furthermore, the length and the structure of the longest paths in the graphs associated with the self-complementary circular codes are investigated. It has been proven that the longest paths in such graphs determine the reading frame for the self-complementary circular codes. By applying this result, the reading frame in any arbitrary sequence of trinucleotides is retrieved after at most 15 nucleotides, i.e., 5 consecutive trinucleotides, from the circular code X identified in genes. Thus, an X motif of a length of at least 15 nucleotides in an arbitrary sequence of trinucleotides (not necessarily all of them belonging to X) uniquely defines the reading (correct) frame, an important criterion for analyzing the X motifs in genes in the future.
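Self-complementarity, the property discussed above, means that the reverse complement of every trinucleotide in the code is itself a member of the code. The sketch below checks that property directly on sets of trinucleotides. It does not implement the graph-theoretic condition from the paper, and the two small example codes are hypothetical illustrations rather than the code X identified in genes.

```python
# Watson-Crick complement map for DNA nucleotides.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(trinucleotide: str) -> str:
    """Reverse complement of a DNA word, e.g. AAC -> GTT."""
    return trinucleotide.translate(COMPLEMENT)[::-1]


def is_self_complementary(code) -> bool:
    """True if the reverse complement of every word in `code` is in `code`."""
    words = set(code)
    return all(reverse_complement(word) in words for word in words)


if __name__ == "__main__":
    toy_closed = {"AAC", "GTT", "GTA", "TAC"}  # hypothetical example code
    toy_open = {"AAC", "GTA"}                  # hypothetical example code
    print(is_self_complementary(toy_closed))
    print(is_self_complementary(toy_open))
```

The first toy set is closed under reverse complementation (AAC pairs with GTT, GTA with TAC), whereas the second is not, since the reverse complement of AAC is GTT.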
Nucleotide sequence of the gag gene and gag-pol junction of feline leukemia virus.
Laprevotte, I; Hampe, A; Sherr, C J; Galibert, F
1984-01-01
The nucleotide sequence of the gag gene of feline leukemia virus and its flanking sequences were determined and compared with the corresponding sequences of two strains of feline sarcoma virus and with that of the Moloney strain of murine leukemia virus. A high degree of nucleotide sequence homology between the feline leukemia virus and murine leukemia virus gag genes was observed, suggesting that retroviruses of domestic cats and laboratory mice have a common, proximal evolutionary progenitor. The predicted structure of the complete feline leukemia virus gag gene precursor suggests that the translation of nonglycosylated and glycosylated gag gene polypeptides is initiated at two different AUG codons. These initiator codons fall in the same reading frame and are separated by a 222-base-pair segment which encodes an amino terminal signal peptide. The nucleotide sequence predicts the order of amino acids in each of the individual gag-coded proteins (p15, p12, p30, p10), all of which derive from the gag gene precursor. Stable stem-and-loop secondary structures are proposed for two regions of viral RNA. The first falls within sequences at the 5' end of the viral genome, together with adjacent palindromic sequences which may play a role in dimer linkage of RNA subunits. The second includes coding sequences at the gag-pol junction and is proposed to be involved in translation of the pol gene product. Sequence analysis of the latter region shows that the gag and pol genes are translated in different reading frames. Classical consensus splice donor and acceptor sequences could not be localized to regions which would permit synthesis of the expected gag-pol precursor protein. Alternatively, we suggest that the pol gene product (RNA-dependent DNA polymerase) could be translated by a frameshift suppressing mechanism which could involve cleavage modification of stems and loops in a manner similar to that observed in tRNA processing. PMID:6328019
Laing, William A.; Martínez-Sánchez, Marcela; Wright, Michele A.; Bulley, Sean M.; Brewster, Di; Dare, Andrew P.; Rassam, Maysoon; Wang, Daisy; Storey, Roy; Macknight, Richard C.; Hellens, Roger P.
2015-01-01
Ascorbate (vitamin C) is an essential antioxidant and enzyme cofactor in both plants and animals. Ascorbate concentration is tightly regulated in plants, partly to respond to stress. Here, we demonstrate that ascorbate concentrations are determined via the posttranscriptional repression of GDP-l-galactose phosphorylase (GGP), a major control enzyme in the ascorbate biosynthesis pathway. This regulation requires a cis-acting upstream open reading frame (uORF) that represses the translation of the downstream GGP open reading frame under high ascorbate concentration. Disruption of this uORF stops the ascorbate feedback regulation of translation and results in increased ascorbate concentrations in leaves. The uORF is predicted to initiate at a noncanonical codon (ACG rather than AUG) and encode a 60- to 65-residue peptide. Analysis of ribosome protection data from Arabidopsis thaliana showed colocation of high levels of ribosomes with both the uORF and the main coding sequence of GGP. Together, our data indicate that the noncanonical uORF is translated and encodes a peptide that functions in the ascorbate inhibition of translation. This posttranslational regulation of ascorbate is likely an ancient mechanism of control as the uORF is conserved in GGP genes from mosses to angiosperms. PMID:25724639
Termination and read-through proteins encoded by genome segment 9 of Colorado tick fever virus.
Mohd Jaafar, Fauziah; Attoui, Houssam; De Micco, Philippe; De Lamballerie, Xavier
2004-08-01
Genome segment 9 (Seg-9) of Colorado tick fever virus (CTFV) is 1884 bp long and contains a large open reading frame (ORF; 1845 nt in length overall), although a single in-frame stop codon (at nt 1052-1054) reduces the ORF coding capacity by approximately 40 %. However, analyses of highly conserved RNA sequences in the vicinity of the stop codon indicate that it belongs to a class of 'leaky terminators'. The third nucleotide positions of codons situated both before and after the stop codon show the highest variability, suggesting that both regions are translated during virus replication. This also suggests that the stop signal is functionally leaky, allowing read-through translation to occur. Indeed, both the truncated 'termination' protein and the full-length 'read-through' protein (VP9 and VP9', respectively) were detected in CTFV-infected cells, in cells transfected with a plasmid expressing only Seg-9 protein products, and in the in vitro translation products from undenatured Seg-9 ssRNA. The ratios of full-length and truncated proteins generated suggest that read-through may be down-regulated by other viral proteins. Western blot analysis of infected cells and purified CTFV showed that VP9 is a structural component of the virion, while VP9' is a non-structural protein.
Automatic draft reading based on image processing
NASA Astrophysics Data System (ADS)
Tsujii, Takahiro; Yoshida, Hiromi; Iiguni, Youji
2016-10-01
In marine transportation, a draft survey is a means to determine the quantity of bulk cargo. Automatic draft reading based on computer image processing has been proposed. However, the conventional draft mark segmentation may fail when the video sequence has many other regions than draft marks and a hull, and the estimated waterline is inherently higher than the true one. To solve these problems, we propose an automatic draft reading method that uses morphological operations to detect draft marks and estimate the waterline for every frame with Canny edge detection and a robust estimation. Moreover, we emulate surveyors' draft reading process for getting the understanding of a shipper and a receiver. In an experiment in a towing tank, the draft reading error of the proposed method was <1 cm, showing the advantage of the proposed method. It is also shown that accurate draft reading has been achieved in a real-world scene.
Using Reading Frames: An Example from "The Waste Land."
ERIC Educational Resources Information Center
Chandran, Narayana
1995-01-01
Discusses the use of reading frames in teaching "The Waste Land" in India. Suggests that there is nothing more exciting in the classroom than a reading frame that affords correlated, intertextual recognitions. (RS)
Aymerich, T; Holo, H; Håvarstein, L S; Hugas, M; Garriga, M; Nes, I F
1996-01-01
A new bacteriocin has been isolated from an Enterococcus faecium strain. The bacteriocin, termed enterocin A, was purified to homogeneity as judged by sodium dodecyl sulfate-polyacrylamide gel electrophoresis, N-terminal amino acid sequencing, and mass spectrometry analysis. By combining the data obtained from amino acid and DNA sequencing, the primary structure of enterocin A was determined. It consists of 47 amino acid residues, and the molecular weight was calculated to be 4,829, assuming that the four cysteine residues form intramolecular disulfide bridges. This molecular weight was confirmed by mass spectrometry analysis. The amino acid sequence of enterocin A shared significant homology with a group of bacteriocins (now termed pediocin-like bacteriocins) isolated from a variety of lactic acid-producing bacteria, which include members of the genera Lactobacillus, Pediococcus, Leuconostoc, and Carnobacterium. Sequencing of the structural gene of enterocin A, which is located on the bacterial chromosome, revealed an N-terminal leader sequence of 18 amino acid residues, which was removed during the maturation process. The enterocin A leader belongs to the double-glycine leaders which are found among most other small nonlantibiotic bacteriocins, some lantibiotics, and colicin V. Downstream of the enterocin A gene was located a second open reading frame, encoding a putative protein of 103 amino acid residues. This gene may encode the immunity factor of enterocin A, and it shares 40% identity with a similar open reading frame in the operon of leucocin AUL 187, another pediocin-like bacteriocin. PMID:8633865
Baker, C S; Vant, M D; Dalebout, M L; Lento, G M; O'Brien, S J; Yuhki, N
2006-05-01
The molecular diversity and phylogenetic relationships of two class II genes of the baleen whale major histocompatibility complex were investigated and compared to toothed whales and out-groups. Amplification of the DQB exon 2 provided sequences showing high within-species and between-species nucleotide diversity and uninterrupted reading frames consistent with functional class II loci found in related mammals (e.g., ruminants). Cloning of amplified products indicated gene duplication in the humpback whale and triplication in the southern right whale, with average nucleotide diversity of 5.9 and 6.3%, respectively, for alleles of each species. Significantly higher nonsynonymous divergence at sites coding for peptide binding (32% for humpback and 40% for southern right) suggested that these loci were subject to positive (overdominant) selection. A population survey of humpback whales detected 23 alleles, differing by up to 21% of their inferred amino acid sequences. Amplification of the DRB exon 2 resulted in two groups of sequences. One was most similar to the DRB3 of the cow and present in all whales screened to date, including toothed whales. The second was most similar to the DRB2 of the cow and was found only in the bowhead and right whales. Both loci showed low diversity among species and apparent loss of function or altered function including interruption of reading frames. Finally, comparison of inferred protein sequence of the DRB3-like locus suggested convergence with the DQB, perhaps resulting from intergenic conversion or recombination.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Damiani, R.D. Jr.; Wessler, S.R.
1993-09-01
The R/B genes of maize encode a family of basic helix-loop-helix proteins that determine where and when the anthocyanin-pigment pathway will be expressed in the plant. Previous studies showed that allelic diversity among family members reflects differences in gene expression, specifically in transcription initiation. The authors present evidence that the R gene Lc is under translational control. They demonstrate that the 235-nt transcript leader of Lc represses expression 25- to 30-fold in an in vivo assay. Repression is mediated by the presence in cis of a 38-codon upstream open reading frame. Furthermore, the coding capacity of the upstream open reading frame influences the magnitude of repression. It is proposed that translational control does not contribute to tissue specificity but prevents overexpression of the Lc protein. The diversity of promoter and 5' untranslated leader sequences among the R/B genes provides an opportunity to study the coevolution of transcriptional and translational mechanisms of gene regulation. 36 refs., 5 figs.
Uda, Kouji; Ishida, Mikako; Matsui, Tohru; Suzuki, Tomohiko
2010-10-01
Arginine kinase (AK), which catalyzes the reversible transfer of phosphate from ATP to arginine to yield phosphoarginine and ADP, is widely distributed throughout the invertebrates. We determined the cDNA sequence of AK from the tardigrade (water bear) Macrobiotus occidentalis, cloned the sequence into pET30b plasmid, and expressed it in Escherichia coli as a 6x His-tag-fused protein. The cDNA is 1377 bp, has an open reading frame of 1080 bp, and has 5′- and 3′-untranslated regions of 116 and 297 bp, respectively. The open reading frame encodes a 359-amino acid protein containing the 12 residues considered necessary for substrate binding in Limulus AK. This is the first AK sequence from a tardigrade. From fragmented and non-annotated sequences available from DNA databases, we assembled 46 complete AK sequences: 26 from arthropods (including 19 from Insecta), 11 from nematodes, 4 from mollusks, 2 from cnidarians and 2 from onychophorans. No onychophoran sequences have been reported previously. The phylogenetic trees of 104 AKs indicated clearly that Macrobiotus AK (from the phylum Tardigrada) shows close affinity with Epiperipatus and Euperipatoides AKs (from the phylum Onychophora), and therefore forms a sister group with the arthropod AKs. Recombinant 6x His-tagged Macrobiotus AK was successfully expressed as a soluble protein, and the kinetic constants (K(m), K(d), V(max) and k(cat)) were determined for the forward reaction. Comparison of these kinetic constants with those of AKs from other sources (arthropods, mollusks and nematodes) indicated that Macrobiotus AK is unique in that it has the highest values for k(cat) and K(d)/K(m) (indicative of synergistic substrate binding) of all characterized AKs.
Pedersen, M S; Fahnøe, U; Hansen, T A; Pedersen, A G; Jenssen, H; Bukh, J; Schønning, K
2018-06-01
The current treatment options for hepatitis C virus (HCV), based on direct acting antivirals (DAA), are dependent on virus genotype and previous treatment experience. Treatment failures have been associated with detection of resistance-associated substitutions (RASs) in the DAA targets of HCV, the NS3, NS5A and NS5B proteins. The aim was to develop a next-generation sequencing-based method that provides genotype and detection of HCV NS3, NS5A, and NS5B RASs without prior knowledge of sample genotype. In total, 101 residual plasma samples from patients with HCV covering 10 different viral subtypes across 4 genotypes with viral loads of 3.84-7.61 Log IU/mL were included. All samples were de-identified and consequently prior treatment status for patients was unknown. Almost full open reading frame amplicons (∼ 9 kb) were generated using RT-PCR with a single primer set. The resulting amplicons were sequenced with high throughput sequencing and analysed using an in-house developed script for detecting RASs. The method successfully amplified and sequenced 94% (95/101) of samples with an average coverage of 14,035; four of six failed samples were genotype 4a. Samples analysed twice yielded reproducible nucleotide frequencies across all sites. RASs were detected in 21/95 (22%) samples at a 15% threshold. The method identified one patient infected with two genotype 2b variants, and the presence of subgenomic deletion variants in 8 (8.4%) of 95 successfully sequenced samples. The presented method may provide identification of HCV genotype, RASs detection, and detect multiple HCV infection without prior knowledge of sample genotype.
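The abstract reports RAS detection at a 15% threshold but does not describe the in-house script. As an illustration only, the sketch below shows one plausible way to apply such a frequency threshold at a single amino acid position. The function names, the input format (one residue letter per read covering the position) and the example counts are all assumptions, not details from the study.

```python
from collections import Counter

RAS_THRESHOLD = 0.15  # the 15% threshold quoted in the abstract


def variant_frequencies(residues: str) -> dict:
    """Fraction of reads supporting each residue at one position.

    `residues` holds one amino acid letter per read (assumed input format).
    """
    counts = Counter(residues)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aa: n / total for aa, n in counts.items()}


def call_substitutions(residues: str, reference: str,
                       threshold: float = RAS_THRESHOLD) -> list:
    """Non-reference residues whose read frequency reaches the threshold."""
    freqs = variant_frequencies(residues)
    return sorted(aa for aa, freq in freqs.items()
                  if aa != reference and freq >= threshold)


if __name__ == "__main__":
    # Hypothetical pileup: 80 reads with Y, 18 with H, 2 with C.
    pileup = "Y" * 80 + "H" * 18 + "C" * 2
    print(call_substitutions(pileup, reference="Y"))
```

In this made-up pileup, H is supported by 18% of reads and is reported, while C at 2% falls below the threshold. A real pipeline would also need read alignment, quality filtering and a catalogue of known resistance positions, none of which is sketched here.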
The ferredoxin-thioredoxin reductase variable subunit gene from Anacystis nidulans.
Szekeres, M; Droux, M; Buchanan, B B
1991-03-01
The ferredoxin-thioredoxin reductase variable subunit gene of Anacystis nidulans was cloned, and its nucleotide sequence was determined. A single-copy 219-bp open reading frame encoded a protein of 73 amino acid residues, with a calculated Mr of 8,400. The monocistronic transcripts were represented in a 400-base and a less abundant 300-base mRNA form.
USDA-ARS's Scientific Manuscript database
The Drosophila melanogaster 91-R and 91-C strains are of common origin, however, 91-R has been intensely selected for dichlorodiphenyltrichloroethane (DDT) resistance over six decades while 91-C has been maintained as the non-selected control strain. These fly strains represent a unique genetic res...
Genome Sequence of JangDynasty, a Newly Isolated Mycobacteriophage.
Jang, Casey; Kalaj, Nancy; Hwang, Brian; Hughes, Lorelei; Yang, Connie; Pak, Thomas; Kim, John; Han, Dong Yoon; Tedjakusnadi, Jason; Fernandez, Nicholas; Dean, Natasha; Muthiah, Arun; Sutter, Nathaniel B; Diaz, Arturo
2018-05-24
JangDynasty is a bacteriophage that infects Mycobacterium smegmatis mc²155. It has a genome length of 70,883 bp, with 124 predicted open reading frames (ORFs), 42 of which have known functions. JangDynasty belongs to cluster O, and like other cluster O phages, it is a siphovirus with a prolate capsid.
Production of foot-and-mouth disease virus capsid proteins by the TEV protease.
Puckette, Michael; Smith, Justin D; Gabbert, Lindsay; Schutta, Christopher; Barrera, José; Clark, Benjamin A; Neilan, John G; Rasmussen, Max
2018-06-10
Protective immunity to viral pathogens often includes production of neutralizing antibodies to virus capsid proteins. Many viruses produce capsid proteins by expressing a precursor polyprotein and related protease from a single open reading frame. The foot-and-mouth disease virus (FMDV) expresses a 3C protease (3Cpro) that cleaves a P1 polyprotein intermediate into individual capsid proteins, but the FMDV 3Cpro also degrades many host cell proteins and reduces the viability of host cells, including subunit vaccine production cells. To overcome the limitations of using a wild-type 3Cpro in FMDV subunit vaccine expression systems, we altered the protease restriction sequences within a FMDV P1 polyprotein to enable production of FMDV capsid proteins by the Tobacco Etch Virus NIa protease (TEVpro). Separate TEVpro and modified FMDV P1 proteins were produced from a single open reading frame by an intervening FMDV 2A sequence. The modified FMDV P1 polyprotein was successfully processed by the TEVpro in both mammalian and bacterial cells. More broadly, this method of polyprotein production and processing may be adapted to other recombinant expression systems, especially plant-based expression.
Isolation of a novel human papillomavirus (type 51) from a cervical condyloma
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nuovo, G.J.; Crum, C.P.; Levine, R.U.
1988-04-01
The authors cloned the DNA from a novel human papillomavirus (HPV) present in a cervical condyloma. When DNA from this isolate was hybridized at high stringency with HPV types 1 through 50 (HPV-1 through HPV-50), it showed weak homology with HPV-6 and -16 and stronger homology with HPV-26. A detailed restriction endonuclease map was prepared which showed marked differences from the maps for other HPVs that have been isolated from the female genital tract. Reassociation kinetic analysis revealed that HPV-26 and this new isolate were less than 10% homologous; hence, the new isolate is a novel strain of HPV. The approximate positions of the open reading frames of the new strain were surmised by hybridization with probes derived from individual open reading frames of HPV-16. In an analysis of 175 genital biopsies from patients with abnormal Papanicolaou smears, sequences hybridizing under highly stringent conditions to probes from this novel HPV type were found in 4.2, 6.1, and 2.4% of biopsies containing normal squamous epithelium, condylomata, and intraepithelial neoplasia, respectively. In addition, sequences homologous to probes from this novel isolate were detected in one of five cervical carcinomas examined.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vieira, P.; De Waal-Malefyt, R.; Dang, M.N.
1991-02-15
The authors demonstrated the existence of human cytokine synthesis inhibitory factor (CSIF) (interleukin 10 (IL-10)). cDNA clones encoding human IL-10 (hIL-10) were isolated from a tetanus toxin-specific human T-cell clone. Like mouse IL-10, hIL-10 exhibits strong DNA and amino acid sequence homology to an open reading frame in the Epstein-Barr virus, BCRFI. hIL-10 and the BCRFI product inhibit cytokine synthesis by activated human peripheral blood mononuclear cells and by a mouse Th1 clone. Both hIL-10 and mouse IL-10 sustain the viability of a mouse mast cell line in culture, but BCRFI lacks comparable activity in this assay, suggesting that BCRFI may have conserved only a subset of hIL-10 activities.
Vernal, Javier; Serpa, Viviane I; Tavares, Carolina; Souza, Emanuel M; Pedrosa, Fábio O; Terenzi, Hernán
2007-05-01
An open reading frame encoding a protein similar in size and sequence to the Escherichia coli single-stranded DNA binding protein (SSB protein) was identified in the Herbaspirillum seropedicae genome. This open reading frame was cloned into the expression plasmid pET14b. The SSB protein from H. seropedicae, named Hs_SSB, was overexpressed in E. coli strain BL21(DE3) and purified to homogeneity. Mass spectrometry data confirmed the identity of this protein. The apparent molecular mass of the native Hs_SSB was estimated by gel filtration, suggesting that the native protein is a tetramer made up of four similar subunits. The purified protein binds to single-stranded DNA (ssDNA) in a similar manner to other SSB proteins. The production of this recombinant protein in good yield opens up the possibility of obtaining its 3D-structure and will help further investigations into DNA metabolism.
Singh, Aditya; Bhatia, Prateek
2016-12-01
Sanger sequencing platforms, such as Applied Biosystems instruments, generate chromatogram files. Generally, each region of a sequence is sequenced with both forward and reverse primers, which yields 2 sequences that must be aligned and merged into a consensus before mutation detection studies. This work is cumbersome and takes time, especially if the gene is large with many exons. Hence, we devised a rapid automated command system to filter, build, and align consensus sequences and also optionally extract exonic regions, translate them in all frames, and perform an amino acid alignment starting from raw sequence data within a very short time. At full capability, the Automated Mutation Analysis Pipeline (ASAP) can read "*.ab1" chromatogram files through a command line interface, convert them to the FASTQ format, trim the low-quality regions, reverse-complement the reverse sequence, create a consensus sequence, extract the exonic regions using a reference exonic sequence, translate the sequence in all frames, and align the nucleic acid and amino acid sequences to reference nucleic acid and amino acid sequences, respectively. All intermediate files are kept and can be used for further analysis. ASAP is available as a Python 3.x executable at https://github.com/aditya-88/ASAP. The version described in this paper is 0.28.
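Two of the steps named in this abstract, reverse-complementing the reverse read and translating a sequence in all frames, can be illustrated with a minimal Python sketch. This is not ASAP's code: the function names are hypothetical, the standard genetic code is assumed, and incomplete trailing codons are simply dropped.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate from position 0; unknown codons become 'X', stops become '*'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Translate the three forward (+1..+3) and three reverse (-1..-3) frames."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTAA").items():
        print(name, protein)
```

A production pipeline would more likely rely on an established library for these operations; the sketch only makes the frame bookkeeping explicit.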
Lorenz, Felix K. M.; Wilde, Susanne; Voigt, Katrin; Kieback, Elisa; Mosetter, Barbara; Schendel, Dolores J.; Uckert, Wolfgang
2015-01-01
Codon optimization of nucleotide sequences is a widely used method to achieve high levels of transgene expression for basic and clinical research. Until now, immunological side effects have not been described. To trigger T cell responses against human papillomavirus, we incubated T cells with dendritic cells that were pulsed with RNA encoding the codon-optimized E7 oncogene. All T cell receptors isolated from responding T cell clones recognized target cells expressing the codon-optimized E7 gene but not the wild type E7 sequence. Epitope mapping revealed recognition of a cryptic epitope from the +3 alternative reading frame of codon-optimized E7, which is not encoded by the wild type E7 sequence. The introduction of a stop codon into the +3 alternative reading frame protected the transgene product from recognition by T cell receptor gene-modified T cells. This is the first experimental study demonstrating that codon optimization can render a transgene artificially immunogenic through generation of a dominant cryptic epitope. This finding may be of great importance for the clinical field of gene therapy to avoid rejection of gene-corrected cells and for the design of DNA- and RNA-based vaccines, where codon optimization may artificially add a strong immunogenic component to the vaccine. PMID:25799237
Lisboa, Bianca Cristina Garcia; Machado, Tamara da Rocha; Pimenta, Daniel Carvalho; Han, Sang Won
2007-02-01
Human cytidine deaminase (HCD) catalyzes the deamination of cytidine or deoxycytidine to uridine or deoxyuridine, respectively. The genomic sequence of HCD spans 31 kb with 4 exons and several alternative splicing signals, but an alternative form of HCD has yet to be reported. Here we describe the cloning and characterization of a small form of HCD, HSCD, which is likely to be a product of alternative splicing of HCD. The alignment of DNA sequences shows that HSCD matches HCD in 2 parts, except for a deletion of 170 bp. Based on the HCD genome organization, exons 1 and 4 should be joined and all sequences of introns and exons 2 and 3 should be deleted by splicing. This alternative splicing shifted the translation of the reading frame from the point of splicing. The estimated molecular mass is 9.8 kDa, and this value was confirmed by Western blot and mass spectroscopy after expressing the gene fused with glutathione S-transferase in the pGEX vector. The deletion and shift of the reading frame caused a loss of HCD activity, which was confirmed by enzyme assay and also with NIH3T3 cells modified to express HSCD and challenged with cytosine arabinoside. In this work we describe the identification and characterization of HSCD, which is the product of alternative splicing of the HCD gene.
Bäumer, Sebastian; Lentes, Sabine; Gottschalk, Gerhard; Deppenmeier, Uwe
2002-03-01
Analysis of genome sequence data from the methanogenic archaeon Methanosarcina mazei Gö1 revealed the existence of two open reading frames encoding proton-translocating pyrophosphatases (PPases). These open reading frames are linked by a 750-bp intergenic region containing TC-rich stretches and are transcribed in opposite directions. The corresponding polypeptides are referred to as Mvp1 and Mvp2 and consist of 671 and 676 amino acids, respectively. Both enzymes represent extremely hydrophobic, integral membrane proteins with 15 predicted transmembrane segments and an overall amino acid sequence similarity of 50.1%. Multiple sequence alignments revealed that Mvp1 is closely related to eukaryotic PPases, whereas Mvp2 shows highest homologies to bacterial PPases. Northern blot experiments with RNA from methanol-grown cells harvested in the mid-log growth phase indicated that only Mvp2 was produced under these conditions. Analysis of washed membranes showed that Mvp2 had a specific activity of 0.34 U/mg of protein. Proton translocation experiments with inverted membrane vesicles prepared from methanol-grown cells showed that hydrolysis of 1 mol of pyrophosphate was coupled to the translocation of about 1 mol of protons across the cytoplasmic membrane. Appropriate conditions for mvp1 expression have not yet been determined. The pyrophosphatases of M. mazei Gö1 represent the first examples of this enzyme class in methanogenic archaea and may be part of their energy-conserving system.
Translational control of Nrf2 within the open reading frame
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perez-Leal, Oscar, E-mail: operez@temple.edu; Barrero, Carlos A.; Merali, Salim, E-mail: smerali@temple.edu
2013-07-19
Highlights: •Identification of a novel Nrf2 translational repression mechanism. •The repressor is within the 3′ portion of the Nrf2 ORF. •The translation of Nrf2 or eGFP is reduced by the regulatory element. •The translational repression can be reversed with synonymous codon substitutions. •The molecular mechanism requires the mRNA sequence, but not the encoded amino acids. -- Abstract: Nuclear Factor Erythroid 2-Related Factor 2 (Nrf2) is a transcription factor that is essential for the regulation of an effective antioxidant and detoxifying response. The regulation of its activity can occur at transcription, translation and post-translational levels. Evidence suggests that under environmental stress conditions, new synthesis of Nrf2 is required, a process that is regulated by translational control and is not fully understood. Here we describe the identification of a novel molecular process that under basal conditions strongly represses the translation of Nrf2 within the open reading frame (ORF). This mechanism is dependent on the mRNA sequence within the 3′ portion of the ORF of Nrf2 but not on the encoded amino acid sequence. The Nrf2 translational repression can be reversed with the use of synonymous codon substitutions. This discovery suggests an additional layer of control to explain the low Nrf2 concentration in the quiescent state.
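The idea behind synonymous codon substitution, changing the mRNA sequence while leaving the encoded amino acids untouched, can be shown with a small Python sketch. The abstract does not say which codons were substituted; the toy sequence, the function names, and the rule of picking the alphabetically first alternative codon are all assumptions made for illustration only.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf):
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    orf = orf.upper()
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf) - 2, 3))


def recode_synonymously(orf):
    """Swap each codon for its alphabetically first synonym, if one exists.

    Codons without synonyms (ATG, TGG) are kept unchanged. Stop codons are
    also swapped for another stop codon.
    """
    orf = orf.upper()
    recoded = []
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        alternatives = sorted(
            c for c, aa in CODON_TABLE.items()
            if aa == CODON_TABLE[codon] and c != codon
        )
        recoded.append(alternatives[0] if alternatives else codon)
    return "".join(recoded)


if __name__ == "__main__":
    original = "ATGGCTAAATGA"  # toy ORF, not an Nrf2 sequence
    recoded = recode_synonymously(original)
    print(original, translate(original))
    print(recoded, translate(recoded))
```

The two nucleotide sequences differ while their translations are identical, which is the property that lets such substitutions separate an mRNA-level effect from a protein-level one.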
The complete genomic sequence of a tentative new polerovirus identified in barley in South Korea.
Zhao, Fumei; Lim, Seungmo; Yoo, Ran Hee; Igori, Davaajargal; Kim, Sang-Min; Kwak, Do Yeon; Kim, Sun Lim; Lee, Bong Choon; Moon, Jae Sun
2016-07-01
The complete nucleotide sequence of a new barley polerovirus, tentatively named barley virus G (BVG), which was isolated in Gimje, South Korea, has been determined using an RNA sequencing technique combined with polymerase chain reaction methods. The viral genomic RNA of BVG is 5,620 nucleotides long and contains six typical open reading frames commonly observed in other poleroviruses. Sequence comparisons revealed that BVG is most closely related to maize yellow dwarf virus-RMV, with the highest amino acid identities being less than 90 % for all of the corresponding proteins. These results suggested that BVG is a member of a new species in the genus Polerovirus.
Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E
1985-01-01
The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopoxviruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames, most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A and T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815
NASA Technical Reports Server (NTRS)
Lopez, J. C.; Ryan, S.; Blankenship, R. E.
1996-01-01
The sequence of the Chloroflexus aurantiacus open reading frame thought to be the C. aurantiacus homolog of the Rhodobacter capsulatus bchG gene is reported. The BchG gene product catalyzes esterification of bacteriochlorophyllide a by geranylgeraniol-PPi during bacteriochlorophyll a biosynthesis. Homologs from Arabidopsis thaliana, Synechocystis sp. strain PCC6803, and C. aurantiacus were identified in database searches. Profile analysis identified three related polyprenyltransferase enzymes which attach an aliphatic alcohol PPi to an aromatic substrate. This suggests a broader relationship between chlorophyll synthases and other polyprenyltransferases.
cDNA encoding a polypeptide including a hevein sequence
Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil
1993-02-16
A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a pu GOVERNMENT RIGHTS This application was funded under Department of Energy Contract DE-AC02-76ER01338. The U.S. Government has certain rights under this application and any patent issuing thereon.
Coexpression of the KCNA3B gene product with Kv1.5 leads to a novel A-type potassium channel.
Leicher, T; Bähring, R; Isbrandt, D; Pongs, O
1998-12-25
Shaker-related voltage-gated potassium (Kv) channels may be heterooligomers consisting of membrane-integral alpha-subunits associated with auxiliary cytoplasmic beta-subunits. In this study we have cloned the human Kvbeta3.1 subunit and the corresponding KCNA3B gene. Identification of sequence-tagged sites in the gene mapped KCNA3B to band p13.1 of human chromosome 17. Comparison of the KCNA1B, KCNA2B, and KCNA3B gene structures showed that the three Kvbeta genes have very disparate lengths varying from ≥350 kb (KCNA1B) to approximately 7 kb (KCNA3B). Yet, the exon patterns of the three genes, which code for the seven known mammalian Kvbeta subunits, are very similar. The Kvbeta1 and Kvbeta2 splice variants are generated by alternative use of 5'-exons. Mouse Kvbeta4, a potential splice variant of Kvbeta3, is a read-through product where the open reading frame starts within the sequence intervening between Kvbeta3 exons 7 and 8. The human KCNA3B sequence does not contain a mouse Kvbeta4-like open reading frame. Human Kvbeta3 mRNA is specifically expressed in the brain, where it is predominantly detected in the cerebellum. The heterologous coexpression of human Kv1.5 and Kvbeta3.1 subunits in Chinese hamster ovary cells yielded a novel Kv channel mediating very fast inactivating (A-type) outward currents upon depolarization. Thus, the expression of Kvbeta3.1 subunits potentially extends the possibilities to express diverse A-type Kv channels in the human brain.
Jailani, A Abdul Kader; Solanki, Vikas; Roy, Anirban; Sivasudha, T; Mandal, Bikash
2017-04-02
A highly infectious clone of Cucumber green mottle mosaic virus (CGMMV), a cucurbit-infecting tobamovirus, was used to design gene expression vectors. Two versions of the vector were examined for their efficacy in expressing the green fluorescent protein (GFP) in Nicotiana benthamiana. When the GFP gene was inserted at the stop codon of the coat protein (CP) gene of the CGMMV genome without any read-through codon, systemic expression of GFP, as well as virion formation and systemic symptom expression, was obtained in N. benthamiana. The qRT-PCR analysis showed a 23-fold increase of GFP over actin at 10 days post inoculation (dpi), which increased to 45-fold at 14 dpi, after which GFP expression declined significantly. Further, we show that when most of the CP sequence is deleted, retaining only the first 105 nucleotides, the shortened vector containing GFP in frame with the original CP open reading frame (ORF) resulted in a 234-fold increase of GFP expression over actin at 5 dpi in N. benthamiana without the formation of virions or disease symptoms. Our study demonstrated that a simple manipulation of the CP gene in the CGMMV genome while preserving the translational frame of CP resulted in a virus-free, rapid and efficient foreign protein expression system in the plant. The CGMMV-based vectors developed in this study may be potentially useful for the production of edible vaccines in cucurbits. Copyright © 2017 Elsevier B.V. All rights reserved.
Chien, Maw-Sheng; Gilbert, Teresa L.; Huang, Chienjin; Landolt, Marsha L.; O'Hara, Patrick J.; Winton, James R.
1992-01-01
The complete sequence coding for the 57-kDa major soluble antigen of the salmonid fish pathogen, Renibacterium salmoninarum, was determined. The gene contained an open reading frame of 1671 nucleotides coding for a protein of 557 amino acids with a calculated Mr value of 57190. The first 26 amino acids constituted a signal peptide. The deduced sequence for amino acid residues 27–61 was in agreement with the 35 N-terminal amino acid residues determined by microsequencing, suggesting the protein is synthesized as a 557-amino acid precursor and processed to produce a mature protein of Mr 54505. Two regions of the protein contained imperfect direct repeats. The first region contained two copies of an 81-residue repeat, the second contained five copies of an unrelated 25-residue repeat. Also, a perfect inverted repeat (including three in-frame UAA stop codons) was observed at the carboxyl-terminus of the gene.
Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius
Al-Swailem, Abdulaziz M.; Shehata, Maher M.; Abu-Duhier, Faisel M.; Al-Yamani, Essam J.; Al-Busadah, Khalid A.; Al-Arawi, Mohammed S.; Al-Khider, Ali Y.; Al-Muhaimeed, Abdullah N.; Al-Qahtani, Fahad H.; Manee, Manee M.; Al-Shomrani, Badr M.; Al-Qhtani, Saad M.; Al-Harthi, Amer S.; Akdemir, Kadir C.; Otu, Hasan H.
2010-01-01
Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ∼40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. Accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism. PMID:20502665
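The abstract reports an ORF length criterion (ORF > 300 bp) without describing the tool used. As a rough illustration of what such a filter involves, the following Python sketch scans the three forward frames for ATG-to-stop ORFs and keeps those at or above a minimum length. The function name, the choice of ATG as the only start codon, the inclusive threshold, and the restriction to the forward strand are assumptions, not details from the paper.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_len=300):
    """Return (start, end) pairs of forward-strand ORFs of at least min_len nt.

    Coordinates are 0-based and half-open; each ORF runs from the first ATG
    in a frame through the next in-frame stop codon, stop included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:
                    orfs.append((start, i + 3))
                start = None  # close the ORF and look for the next ATG
    return orfs


if __name__ == "__main__":
    # Toy contig: 2 nt of leader, ATG, 100 alanine codons, a stop codon, 1 nt.
    contig = "CC" + "ATG" + "GCT" * 100 + "TAA" + "G"
    print(find_orfs(contig))
```

To cover the complementary strand as well, the same scan could be run on the reverse complement of the contig.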
Hiesel, Rudolf; Schobel, Werner; Schuster, Wolfgang; Brennicke, Axel
1987-01-01
Two loci encoding subunit III of the cytochrome oxidase (COX) in Oenothera mitochondria have been identified from a cDNA library of mitochondrial transcripts. A 657-bp sequence block upstream from the open reading frame is also present in the two copies of the COX subunit I gene and is presumably involved in homologous sequence rearrangement. The proximal points of sequence rearrangements are located 3 bp upstream from the COX I and 1139 bp upstream from the COX III initiation codons. The 5'-termini of both COX I and COX III mRNAs have been mapped in this common sequence confining the promoter region for the Oenothera mitochondrial COX I and COX III genes to the homologous sequence block. PMID:15981332
Sequencing and phylogenetic analysis of tobacco virus 2, a polerovirus from Nicotiana tabacum.
Zhou, Benguo; Wang, Fang; Zhang, Xuesong; Zhang, Lina; Lin, Huafeng
2017-07-01
The complete genome sequence of a new virus, provisionally named tobacco virus 2 (TV2), was determined and identified from leaves of tobacco (Nicotiana tabacum) exhibiting leaf mosaic, yellowing, and deformity, in Anhui Province, China. The genome sequence of TV2 comprises 5,979 nucleotides, with 87% nucleotide sequence identity to potato leafroll virus (PLRV). Its genome organization is similar to that of PLRV, containing six open reading frames (ORFs) that potentially encode proteins with putative functions in cell-to-cell movement and suppression of RNA silencing. Phylogenetic analysis of the nucleotide sequence placed TV2 alongside members of the genus Polerovirus in the family Luteoviridae. To the best of our knowledge, this study is the first report of a complete genome sequence of a new polerovirus identified in tobacco.
Nowrousian, Minou; Würtz, Christian; Pöggeler, Stefanie; Kück, Ulrich
2004-03-01
One of the most challenging parts of large scale sequencing projects is the identification of functional elements encoded in a genome. Recently, studies of genomes of up to six different Saccharomyces species have demonstrated that a comparative analysis of genome sequences from closely related species is a powerful approach to identify open reading frames and other functional regions within genomes [Science 301 (2003) 71, Nature 423 (2003) 241]. Here, we present a comparison of selected sequences from Sordaria macrospora to their corresponding Neurospora crassa orthologous regions. Our analysis indicates that due to the high degree of sequence similarity and conservation of overall genomic organization, S. macrospora sequence information can be used to simplify the annotation of the N. crassa genome.
Porcine parvovirus: DNA sequence and genome organization.
Ranz, A I; Manclús, J J; Díaz-Aroca, E; Casal, J I
1989-10-01
We have determined the nucleotide sequence of an almost full-length clone of porcine parvovirus (PPV). The sequence is 4973 nucleotides (nt) long. The 3' end of virion DNA shows a Y-shaped configuration homologous to rodent parvoviruses. The 5' end of virion DNA shows a repetition of 127 nt at the carboxy terminus of the capsid proteins. The overall organization of the PPV genome is similar to those of other autonomous parvoviruses. There are two large open reading frames (ORFs) that almost entirely cover the genome, both located in the same frame of the complementary strand. The left ORF encodes the non-structural protein NS1 and the right ORF encodes the capsid proteins (VP1, VP2 and VP3). Promoter analysis, location of splicing sites and putative amino acid sequences for the viral proteins show a high homology of PPV with feline panleukopenia virus and canine parvoviruses (FPV and CPV) and rodent parvovirus. Therefore we conclude that PPV is related to the Kilham rat virus (KRV) group of autonomous parvoviruses formed by KRV, minute virus of mice, Lu III, H-1, FPV and CPV.
Peng, Jing; Peng, Futian; Zhu, Chunfu; Wei, Shaochong
2008-06-01
A putative isopentenyltransferase (IPT) encoding gene was identified from a pingyitiancha (Malus hupehensis Rehd.) expressed sequence tag database, and the full-length gene was cloned by RACE. Based on expression profile and sequence alignment, the nucleotide sequence of the clone, named MhIPT3, was most similar to AtIPT3, an IPT gene in Arabidopsis. The full-length cDNA contained a 963-bp open reading frame encoding a protein of 321 amino acids with a molecular mass of 37.3 kDa. Sequence analysis of genomic DNA revealed the absence of introns in the frame. Quantitative real-time PCR analysis demonstrated that the gene was expressed in roots, stems and leaves. Application of nitrate to roots of nitrogen-deprived seedlings strongly induced expression of MhIPT3 and was accompanied by the accumulation of cytokinins, whereas MhIPT3 expression was little affected by ammonium application to roots of nitrogen-deprived seedlings. Application of nitrate to leaves also up-regulated the expression of MhIPT3 and corresponded closely with the accumulation of isopentenyladenine and isopentenyladenosine in leaves.
A hard-to-read font reduces the framing effect in a large sample.
Korn, Christoph W; Ries, Juliane; Schalk, Lennart; Oganian, Yulia; Saalbach, Henrik
2018-04-01
How can apparent decision biases, such as the framing effect, be reduced? Intriguing findings within recent years indicate that foreign language settings reduce framing effects, which has been explained in terms of deeper cognitive processing. Because hard-to-read fonts have been argued to trigger deeper cognitive processing, so-called cognitive disfluency, we tested whether hard-to-read fonts reduce framing effects. We found no reliable evidence for an effect of hard-to-read fonts on four framing scenarios in a laboratory (final N = 158) and an online study (N = 271). However, in a preregistered online study with a rather large sample (N = 732), a hard-to-read font reduced the framing effect in the classic "Asian disease" scenario (in a one-sided test). This suggests that hard-to-read fonts can modulate decision biases, albeit with rather small effect sizes. Overall, our findings stress the importance of large samples for the reliability and replicability of modulations of decision biases.
A novel helper phage enabling construction of genome-scale ORF-enriched phage display libraries.
Gupta, Amita; Shrivastava, Nimisha; Grover, Payal; Singh, Ajay; Mathur, Kapil; Verma, Vaishali; Kaur, Charanpreet; Chaudhary, Vijay K
2013-01-01
Phagemid-based expression of cloned genes fused to the gIIIP coding sequence and rescue using helper phages, such as VCSM13, has been used extensively for constructing large antibody phage display libraries. However, for randomly primed cDNA and gene fragment libraries, this system encounters reading frame problems wherein only one of 18 phages display the translated foreign peptide/protein fused to phagemid-encoded gIIIP. The elimination of phages carrying out-of-frame inserts is vital in order to improve the quality of phage display libraries. In this study, we designed a novel helper phage, AGM13, which carries trypsin-sensitive sites within the linker regions of gIIIP. This renders the phage highly sensitive to trypsin digestion, which abolishes its infectivity. For open reading frame (ORF) selection, the phagemid-borne phages are rescued using AGM13, so that clones with in-frame inserts express fusion proteins with phagemid-encoded trypsin-resistant gIIIP, which becomes incorporated into the phages along with a few copies of AGM13-encoded trypsin-sensitive gIIIP. In contrast, clones with out-of-frame inserts produce phages carrying only AGM13-encoded trypsin-sensitive gIIIP. Trypsin treatment of the phage population renders the phages with out-of-frame inserts non-infectious, whereas phages carrying in-frame inserts remain fully infectious and can hence be enriched by infection. This strategy was applied efficiently at a genome scale to generate an ORF-enriched whole genome fragment library from Mycobacterium tuberculosis, in which nearly 100% of the clones carried in-frame inserts after selection. The ORF-enriched libraries were successfully used for identification of linear and conformational epitopes for monoclonal antibodies specific to mycobacterial proteins.
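The abstract gives the "one of 18" figure without deriving it. One common way to arrive at it, offered here as an assumption rather than as the authors' stated reasoning, is to count the ways a random fragment can be inserted: 2 orientations, 3 possible reading frames at the upstream junction, and 3 at the downstream junction with gIIIP. Only one of these combinations keeps the insert in frame at both ends. The short Python enumeration below makes the count explicit; it ignores stop codons inside the insert, and all names are illustrative.

```python
from fractions import Fraction
from itertools import product

ORIENTATIONS = ("sense", "antisense")
FRAME_OFFSETS = (0, 1, 2)  # offset 0 means the junction is in frame


def in_frame_fraction():
    """Fraction of random inserts that are in frame at both junctions."""
    combos = list(product(ORIENTATIONS, FRAME_OFFSETS, FRAME_OFFSETS))
    productive = [
        (orientation, upstream, downstream)
        for orientation, upstream, downstream in combos
        if orientation == "sense" and upstream == 0 and downstream == 0
    ]
    return Fraction(len(productive), len(combos))


if __name__ == "__main__":
    print(in_frame_fraction())
```

Under these assumptions the enumeration yields 1 productive combination out of 18, which matches the figure quoted in the abstract.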
The ferredoxin-thioredoxin reductase variable subunit gene from Anacystis nidulans.
Szekeres, M; Droux, M; Buchanan, B B
1991-01-01
The ferredoxin-thioredoxin reductase variable subunit gene of Anacystis nidulans was cloned, and its nucleotide sequence was determined. A single-copy 219-bp open reading frame encoded a protein of 73 amino acid residues, with a calculated Mr of 8,400. The monocistronic transcripts were represented in a 400-base and a less abundant 300-base mRNA form. PMID:1705544
In Vivo-Induced Genes in Pseudomonas aeruginosa
Handfield, Martin; Lehoux, Dario E.; Sanschagrin, François; Mahan, Michael J.; Woods, Donald E.; Levesque, Roger C.
2000-01-01
In vivo expression technology was used for testing Pseudomonas aeruginosa in the rat lung model of chronic infection and in a mouse model of systemic infection. Three of the eight ivi proteins found showed sequence identity to known virulence factors involved in iron acquisition via an open reading frame (called pvdI) implicated in pyoverdine biosynthesis, membrane biogenesis (FtsY), and adhesion (Hag2). PMID:10722644
Draft Genome Sequence of Marinobacter sp. Strain ANT_B65, Isolated from Antarctic Marine Sponge.
de França, Paula; Camilo, Esther; Fantinatti-Garboginni, Fabiana
2018-01-04
Marinobacter sp. strain ANT_B65 was isolated from a sponge collected in King George Island, Antarctica. The draft genome of 4,173,840 bp encodes 3,743 protein-coding open reading frames. The genome will provide insights into the strain's potential use in the production of natural products. Copyright © 2018 de França et al.
USDA-ARS's Scientific Manuscript database
The open reading frames of 19 cytochrome P450 monooxygenase (CYP) genes were sequenced from Chironomus tentans, a commonly used freshwater invertebrate model. Functional analysis of CtCYP6EX3 confirmed its atrazine-induced oxidative activation for chlorpyrifos by using a nanoparticle-based RNA inter...
Akins, R A; Grant, D M; Stohl, L L; Bottorff, D A; Nargang, F E; Lambowitz, A M
1988-11-05
The Mauriceville and Varkud mitochondrial plasmids of Neurospora are closely related, closed circular DNAs (3.6 and 3.7 kb, respectively; 1 kb = 10(3) bases or base-pairs), whose characteristics suggest relationships to mitochondrial DNA introns and retrotransposons. Here, we characterized the structure of the Varkud plasmid, determined its complete nucleotide sequence and mapped its major transcripts. The Mauriceville and Varkud plasmids have more than 97% positional identity. Both plasmids contain a 710 amino acid open reading frame that encodes a reverse transcriptase-like protein. The amino acid sequence of this open reading frame is strongly conserved between the two plasmids (701/710 amino acids) as expected for a functionally important protein. Both plasmids have a 0.4 kb region that contains five PstI palindromes and a direct repeat of approximately 160 base-pairs. Comparison of sequences in this region suggests that the Varkud plasmid has diverged less from a common ancestor than has the Mauriceville plasmid. Two major transcripts of the Varkud plasmid were detected by Northern hybridization experiments: a full-length linear RNA of 3.7 kb and an additional prominent transcript of 4.9 kb, 1.2 kb longer than monomer plasmid. Remarkably, we find that the 4.9 kb transcript is a hybrid RNA consisting of the full-length 3.7 kb Varkud plasmid transcript plus a 5' leader of 1.2 kb that is derived from the 5' end of the mitochondrial small rRNA. This and other findings suggest that the Varkud plasmid, like certain RNA viruses, has a mechanism for joining heterologous RNAs to the 5' end of its major transcript, and that, under some circumstances, nucleotide sequences in mitochondria may be recombined at the RNA level.
Burke, W D; Calalang, C C; Eickbush, T H
1987-01-01
Two classes of DNA elements interrupt a fraction of the rRNA repeats of Bombyx mori. We have analyzed by genomic blotting and sequence analysis one class of these elements which we have named R2. These elements occupy approximately 9% of the rDNA units of B. mori and appear to be homologous to the type II rDNA insertions detected in Drosophila melanogaster. Approximately 25 copies of R2 exist within the B. mori genome, of which at least 20 are located at a precise location within otherwise typical rDNA units. Nucleotide sequence analysis has revealed that the 4.2-kilobase-pair R2 element has a single large open reading frame, occupying over 82% of the total length of the element. The central region of this 1,151-amino-acid open reading frame shows homology to the reverse transcriptase enzymes found in retroviruses and certain transposable elements. Amino acid homology of this region is highest to the mobile line 1 elements of mammals, followed by the mitochondrial type II introns of fungi, and the pol gene of retroviruses. Less homology exists with transposable elements of D. melanogaster and Saccharomyces cerevisiae. Two additional regions of sequence homology between L1 and R2 elements were also found outside the reverse transcriptase region. We suggest that the R2 elements are retrotransposons that are site specific in their insertion into the genome. Such mobility would enable these elements to occupy a small fraction of the rDNA units of B. mori despite their continual elimination from the rDNA locus by sequence turnover. PMID:2439905
Characterization of sams genes of Amoeba proteus and the endosymbiotic X-bacteria.
Jeon, Taeck J; Jeon, Kwang W
2003-01-01
As a result of harboring obligatory bacterial endosymbionts, the xD strain of Amoeba proteus no longer produces its own S-adenosylmethionine synthetase (SAMS). When symbiont-free D amoebae are infected with symbionts (X-bacteria), the amount of amoeba SAMS decreases to a negligible level within four weeks, but about 47% of the SAMS activity, which apparently comes from another source, is still detected. Complete nucleotide sequences of sams genes of D and xD amoebae are presented and show that there are no differences between the two. Long-established xD amoebae contain an intact sams gene and thus the loss of xD amoeba's SAMS is not due to the loss of the gene itself. The open reading frame of the amoeba's sams gene has 1,281 nucleotides, encoding SAMS of 426 amino acids with a mass of 48 kDa and pI of 6.5. The amino acid sequence of amoeba SAMS is longer than the SAMS of other organisms by having an extra internal stretch of 28 amino acids. The 5'-flanking region of amoeba sams contains consensus-binding sites for several transcription factors that are related to the regulation of sams genes in E. coli and yeast. The complete nucleotide sequence of the symbiont's sams gene is also presented. The open reading frame of X-bacteria sams is 1,146 nucleotides long, encoding SAMS of 381 amino acids with a mass of 41 kDa and pI of 6.0. The X-bacteria SAMS has 45% sequence identity with that of A. proteus.
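The ORF lengths and protein sizes quoted in this abstract can be cross-checked with simple codon arithmetic. The sketch below is only an illustration and assumes that the reported nucleotide counts include the terminating stop codon; the function name `encoded_residues` is hypothetical and does not come from the paper.

```python
def encoded_residues(orf_nt: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of `orf_nt` nucleotides.

    Assumption: when `includes_stop` is True, the final codon is a stop
    codon and therefore does not contribute a residue.
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# Values quoted in the abstract: 1,281 nt (amoeba) and 1,146 nt (X-bacteria).
print(encoded_residues(1281))  # 426
print(encoded_residues(1146))  # 381
```

Under that assumption, both results agree with the 426 and 381 amino acids stated for the amoeba and X-bacteria SAMS.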
deFUME: Dynamic exploration of functional metagenomic sequencing data.
van der Helm, Eric; Geertz-Hansen, Henrik Marcus; Genee, Hans Jasper; Malla, Sailesh; Sommer, Morten Otto Alexander
2015-07-31
Functional metagenomic selections represent a powerful technique that is widely applied for identification of novel genes from complex metagenomic sources. However, whereas hundreds to thousands of clones can be easily generated and sequenced over a few days of experiments, analyzing the data is time consuming and constitutes a major bottleneck for experimental researchers in the field. Here we present the deFUME web server, an easy-to-use web-based interface for processing, annotation and visualization of functional metagenomics sequencing data, tailored to meet the requirements of non-bioinformaticians. The web-server integrates multiple analysis steps into one single workflow: read assembly, open reading frame prediction, and annotation with BLAST, InterPro and GO classifiers. Analysis results are visualized in an online dynamic web-interface. The deFUME webserver provides a fast track from raw sequence to a comprehensive visual data overview that facilitates effortless inspection of gene function, clustering and distribution. The webserver is available at cbs.dtu.dk/services/deFUME/ and the source code is distributed at github.com/EvdH0/deFUME.
Numerical classification of coding sequences
NASA Technical Reports Server (NTRS)
Collins, D. W.; Liu, C. C.; Jukes, T. H.
1992-01-01
DNA sequences coding for protein may be represented by counts of nucleotides or codons. A complete reading frame may be abbreviated by its base count, e.g. A76C158G121T74, or with the corresponding codon table, e.g. (AAA)0(AAC)1(AAG)9 ... (TTT)0. We propose that these numerical designations be used to augment current methods of sequence annotation. Because base counts and codon tables do not require revision as knowledge of function evolves, they are well-suited to act as cross-references, for example to identify redundant GenBank entries. These descriptors may be compared, in place of DNA sequences, to extract homologous genes from large databases. This approach permits rapid searching with good selectivity.
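The two numerical designations described above are easy to compute. The following is a minimal sketch, not the authors' software: the function names are hypothetical, the codon order (AAA, AAC, AAG, ... TTT) is inferred from the example in the abstract, and any trailing bases that do not fill a complete codon are ignored by assumption.

```python
from collections import Counter
from itertools import product


def base_count_label(seq: str) -> str:
    """Return a label such as 'A76C158G121T74' for a reading frame."""
    counts = Counter(seq.upper())
    return "".join(f"{base}{counts.get(base, 0)}" for base in "ACGT")


def codon_table_label(seq: str) -> str:
    """Return a label such as '(AAA)0(AAC)1(AAG)9...(TTT)0'."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # drop an incomplete trailing codon
    codons = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    ordered = ("".join(c) for c in product("ACGT", repeat=3))
    return "".join(f"({codon}){codons.get(codon, 0)}" for codon in ordered)


example = "ATGAAGAAGTTCTAA"  # toy reading frame: ATG AAG AAG TTC TAA
print(base_count_label(example))         # A7C1G3T4
print(codon_table_label(example)[:21])   # (AAA)0(AAC)0(AAG)2(AA
```

Because both labels are plain strings, two database entries can be compared for likely redundancy with an equality test before any sequence alignment is attempted.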
NASA Astrophysics Data System (ADS)
Cho, Hoonkyung; Chun, Joohwan; Song, Sungchan
2016-09-01
Tracking dim moving targets in infrared image sequences in the presence of high clutter and noise has recently been under intensive investigation. The track-before-detect (TBD) algorithm, which processes the image sequence over a number of frames before deciding on the target track and existence, is known to be especially attractive in very low SNR environments (⩽ 3 dB). In this paper, we briefly present a three-dimensional (3-D) TBD with dynamic programming (TBD-DP) algorithm using multiple IR image sensors. Since the traditional two-dimensional TBD algorithm cannot track and detect the target along the viewing direction, we use 3-D TBD with multiple sensors and also strictly analyze the detection performance (false alarm and detection probabilities) based on the Fisher-Tippett-Gnedenko theorem. The 3-D TBD-DP algorithm, which does not require a separate image registration step, uses the pixel intensity values jointly read off from multiple image frames to compute the merit function required in the DP process. Therefore, we also establish the relationship between the pixel coordinates of the image frame and the reference coordinates.
Molecular cloning of chitinase 33 (chit33) gene from Trichoderma atroviride
Matroudi, S.; Zamani, M.R.; Motallebi, M.
2008-01-01
In this study Trichoderma atroviride was selected as an overproducer of chitinase among 30 different isolates of Trichoderma sp. on the basis of chitinase specific activity. From this isolate the genomic and cDNA clones encoding chit33 have been isolated and sequenced. Comparison of the genomic and cDNA sequences to define the gene structure indicates that this gene contains three short introns and an open reading frame coding for a protein of 321 amino acids. The deduced amino acid sequence includes a 19 aa putative signal peptide. Homology between this sequence and other reported Trichoderma Chit33 proteins is discussed. The coding sequence of the chit33 gene was cloned in the pEt26b(+) expression vector and expressed in E. coli. PMID:24031242
Bowen, D; Littlechild, J A; Fothergill, J E; Watson, H C; Hall, L
1988-01-01
Using oligonucleotide probes derived from amino acid sequencing information, the structural gene for phosphoglycerate kinase from the extreme thermophile, Thermus thermophilus, was cloned in Escherichia coli and its complete nucleotide sequence determined. The gene consists of an open reading frame corresponding to a protein of 390 amino acid residues (calculated Mr 41,791) with an extreme bias for G or C (93.1%) in the codon third base position. Comparison of the deduced amino acid sequence with that of the corresponding mesophilic yeast enzyme indicated a number of significant differences. These are discussed in terms of the unusual codon bias and their possible role in enhanced protein thermal stability. PMID:3052437
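The third-position G or C bias reported above (often called GC3) can be measured directly from an open reading frame. The sketch below is illustrative only: the function name is hypothetical, and it counts every complete codon in the supplied sequence, including a stop codon if one is present, which may differ from how the authors tallied their 93.1% figure.

```python
def gc3_fraction(orf: str) -> float:
    """Fraction of codons whose third base is G or C."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3   # ignore an incomplete trailing codon
    third_bases = orf[2:usable:3]      # bases at indices 2, 5, 8, ...
    if not third_bases:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


toy_orf = "ATGAAGAAGTTCTAA"  # codons: ATG AAG AAG TTC TAA
print(gc3_fraction(toy_orf))  # 0.8
```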
Pseudoexon activation increases phenotype severity in a Becker muscular dystrophy patient.
Greer, Kane; Mizzi, Kayla; Rice, Emily; Kuster, Lukas; Barrero, Roberto A; Bellgard, Matthew I; Lynch, Bryan J; Foley, Aileen Reghan; O Rathallaigh, Eoin; Wilton, Steve D; Fletcher, Sue
2015-07-01
We report a dystrophinopathy patient with an in-frame deletion of DMD exons 45-47, and therefore a genetic diagnosis of Becker muscular dystrophy, who presented with a more severe than expected phenotype. Analysis of the patient DMD mRNA revealed an 82 bp pseudoexon, derived from intron 44, that disrupts the reading frame and is expected to yield a nonfunctional dystrophin. Since the sequence of the pseudoexon and canonical splice sites does not differ from the reference sequence, we concluded that the genomic rearrangement promoted recognition of the pseudoexon, causing a severe dystrophic phenotype. We characterized the deletion breakpoints and identified motifs that might influence selection of the pseudoexon. We concluded that the donor splice site was strengthened by juxtaposition of intron 47, and loss of intron 44 silencer elements, normally located downstream of the pseudoexon donor splice site, further enhanced pseudoexon selection and inclusion in the DMD transcript in this patient.
Nucleotide sequence and genetic organization of barley stripe mosaic virus RNA gamma.
Gustafson, G; Hunter, B; Hanau, R; Armour, S L; Jackson, A O
1987-06-01
The complete nucleotide sequences of RNA gamma from the Type and ND18 strains of barley stripe mosaic virus (BSMV) have been determined. The sequences are 3164 (Type) and 2791 (ND18) nucleotides in length. Both sequences contain a 5'-noncoding region (87 or 88 nucleotides) which is followed by a long open reading frame (ORF1). A 42-nucleotide intercistronic region separates ORF1 from a second, shorter open reading frame (ORF2) located near the 3'-end of the RNA. There is a high degree of homology between the Type and ND18 strains in the nucleotide sequence of ORF1. However, the Type strain contains a 366 nucleotide direct tandem repeat within ORF1 which is absent in the ND18 strain. Consequently, the predicted translation product of Type RNA gamma ORF1 (mol wt 87,312) is significantly larger than that of ND18 RNA gamma ORF1 (mol wt 74,011). The amino acid sequence of the ORF1 polypeptide contains homologies with putative RNA polymerases from other RNA viruses, suggesting that this protein may function in replication of the BSMV genome. The nucleotide sequence of RNA gamma ORF2 is nearly identical in the Type and ND18 strains. ORF2 codes for a polypeptide with a predicted molecular weight of 17,209 (Type) or 17,074 (ND18) which is known to be translated from a subgenomic (sg) RNA. The initiation point of this sgRNA has been mapped to a location 27 nucleotides upstream of the ORF2 initiation codon in the intercistronic region between ORF1 and ORF2. The sgRNA is not coterminal with the 3'-end of the genomic RNA, but instead contains heterogeneous poly(A) termini up to 150 nucleotides long (J. Stanley, R. Hanau, and A. O. Jackson, 1984, Virology 139, 375-383). In the genomic RNA gamma, ORF2 is followed by a short poly(A) tract and a 238-nucleotide tRNA-like structure.
A mechanism for exon skipping caused by nonsense or missense mutations in BRCA1 and other genes.
Liu, H X; Cartegni, L; Zhang, M Q; Krainer, A R
2001-01-01
Point mutations can generate defective and sometimes harmful proteins. The nonsense-mediated mRNA decay (NMD) pathway minimizes the potential damage caused by nonsense mutations. In-frame nonsense codons located at a minimum distance upstream of the last exon-exon junction are recognized as premature termination codons (PTCs), targeting the mRNA for degradation. Some nonsense mutations cause skipping of one or more exons, presumably during pre-mRNA splicing in the nucleus; this phenomenon is termed nonsense-mediated altered splicing (NAS), and its underlying mechanism is unclear. By analyzing NAS in BRCA1, we show here that inappropriate exon skipping can be reproduced in vitro, and results from disruption of a splicing enhancer in the coding sequence. Enhancers can be disrupted by single nonsense, missense and translationally silent point mutations, without recognition of an open reading frame as such. These results argue against a nuclear reading-frame scanning mechanism for NAS. Coding-region single-nucleotide polymorphisms (cSNPs) within exonic splicing enhancers or silencers may affect the patterns or efficiency of mRNA splicing, which may in turn cause phenotypic variability and variable penetrance of mutations elsewhere in a gene.
Detection of the High-Level Aminoglycoside Resistance Gene aph(2")-Ib in Enterococcus faecium
Kao, Susan J.; You, Il; Clewell, Don B.; Donabedian, Susan M.; Zervos, Marcus J.; Petrin, Joanne; Shaw, Karen J.; Chow, Joseph W.
2000-01-01
A new high-level gentamicin resistance gene, designated aph(2")-Ib, was cloned from Enterococcus faecium SF11770. The deduced amino acid sequence of the 897-bp open reading frame of aph(2")-Ib shares homology with the aminoglycoside-modifying enzymes AAC(6′)-APH(2"), APH(2")-Ic, and APH(2")-Id. The observed phosphotransferase activity is designated APH(2")-Ib. PMID:10991878
DOE Office of Scientific and Technical Information (OSTI.GOV)
Claffey, K.P.; Herrera, V.L.; Brecher, P.
1987-12-01
A fatty acid binding protein (FABP) has been identified and characterized in rat heart, but the function and regulation of this protein are unclear. In this study the cDNA for rat heart FABP was cloned from a lambda gt11 library. Sequencing of the cDNA showed an open reading frame coding for a protein with 133 amino acids and a calculated size of 14,776 daltons. Several differences were found between the sequence determined from the cDNA and that reported previously by protein sequencing techniques. Northern blot analysis using rat heart FABP cDNA as a probe established the presence of an abundant mRNA in rat heart about 0.85 kilobases in length. This mRNA was detected, but was not abundant, in fetal heart tissue. Tissue distribution studies showed a similar mRNA species in red, but not white, skeletal muscle. In general, the mRNA tissue distribution was similar to that of the protein detected by Western immunoblot analysis, suggesting that heart FABP expression may be regulated at the transcriptional level. S1 nuclease mapping studies confirmed that the mRNA hybridized to rat heart FABP cDNA was identical in heart and red skeletal muscle throughout the entire open reading frame. The structural differences between heart FABP and other members of this multigene family may be related to the functional requirements of oxidative muscle for fatty acids as a fuel source.
Wu, S W; De Lencastre, H
1999-01-01
Screening of a library of Tn551 insertional mutants selected for reduction in the methicillin resistance level of the parental Staphylococcus aureus strain COL resulted in the isolation of mutant RUSA266 in which the minimal inhibitory concentration (MIC) of the parent was reduced from 1,600 to 1.5 micrograms/mL. Cloning and sequencing of the vicinity of the insertion site omega 726 identified an open reading frame (orf1365) encoding a very large polypeptide of more than 1,365 amino acids. A unique feature of the deduced amino acid sequence was the presence of multiple tandem repeats of 75 amino acids in the polypeptide, reminiscent of the structure of high-molecular-weight cell-surface proteins EF* and Emb identified in some streptococcal strains. Mutant RUSA266 with the inactivated gene, which we shall provisionally refer to as mrp (for multiple repeat polypeptide), produced a peptidoglycan with altered muropeptide composition, and both the reduced antibiotic resistance and the altered cell wall composition were co-transduced in back-crosses into the parental strain COL. Additional sequencing upstream of mrp has revealed that this gene was part of a five-gene cluster occupying a 9.2-kb region of the staphylococcal chromosome and was composed of glmM (directly upstream of mrp), two open reading frames orf310 and orf269 coding for two hypothetical proteins, and the gene encoding the staphylococcal arginase (arg). Transcriptional analysis demonstrated that the five genes in the cluster were transcribed together.
The Apis mellifera Filamentous Virus Genome
Gauthier, Laurent; Cornman, Scott; Hartmann, Ulrike; Cousserans, François; Evans, Jay D.; de Miranda, Joachim R.; Neumann, Peter
2015-01-01
A complete reference genome of the Apis mellifera Filamentous virus (AmFV) was determined using Illumina Hiseq sequencing. The AmFV genome is a double stranded DNA molecule of approximately 498,500 nucleotides with a GC content of 50.8%. It encompasses 247 non-overlapping open reading frames (ORFs), equally distributed on both strands, which cover 65% of the genome. While most of the ORFs lacked threshold sequence alignments to reference protein databases, twenty-eight were found to display significant homologies with proteins present in other large double stranded DNA viruses. Remarkably, 13 ORFs had strong similarity with typical baculovirus domains such as PIFs (per os infectivity factor genes: pif-1, pif-2, pif-3 and p74) and BRO (Baculovirus Repeated Open Reading Frame). The putative AmFV DNA polymerase is of type B, but is only distantly related to those of the baculoviruses. The ORFs encoding proteins involved in nucleotide metabolism had the highest percent identity to viral proteins in GenBank. Other notable features include the presence of several collagen-like, chitin-binding, kinesin and pacifastin domains. Due to the large size of the AmFV genome and the inconsistent affiliation with other large double stranded DNA virus families infecting invertebrates, AmFV may belong to a new virus family. PMID:26184284
Hayman, G T; Beck von Bodman, S; Kim, H; Jiang, P; Farrand, S K
1993-01-01
The acc region, subcloned from pTiC58 of classical nopaline and agrocinopine A and B Agrobacterium tumefaciens C58, allowed agrobacteria to grow using agrocinopine B as the sole source of carbon and energy. acc is approximately 6 kb in size. It consists of at least five genes, accA through accE, as defined by complementation analysis using subcloned fragments and transposon insertion mutations of acc carried on different plasmids within the same cell. All five regions are required for agrocin 84 sensitivity, and at least four are required for agrocinopine and agrocin 84 uptake. The complementation results are consistent with the hypothesis that each of the five regions is separately transcribed. Maxicell experiments showed that the first of these genes, accA, encodes a 60-kDa protein. Analysis of osmotic shock fractions showed this protein to be located in the periplasm. The DNA sequence of the accA region revealed an open reading frame encoding a predicted polypeptide of 59,147 Da. The amino acid sequence encoded by this open reading frame is similar to the periplasmic binding proteins OppA and DppA of Escherichia coli and Salmonella typhimurium and OppA of Bacillus subtilis. PMID:8366042
The isolation of cDNAs from OATL1 at Xp11.2 using a 480-kb YAC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Geraghty, M.T.; Brody, L.C.; Martin, L.S.
1993-05-01
Using an ornithine-δ-aminotransferase (OAT) cDNA, the authors identified five YACs that cover two nonadjacent OAT-related loci in Xp11.2-p11.3, designated OATL1 (distal) and OATL2 (proximal). Because several retinal degenerative disorders map to this region, they used YAC2 (480 kb), which covers the most distal part of OATL1, as a probe to screen a retinal cDNA library. From 8 × 10^4 plaques screened, they isolated 13 clones. Two were OAT cDNAs. The remaining 11 were divided into eight groups by cross-hybridization. Groups 1-4 contain cDNAs that originate from single-copy X-linked genes in YAC2. Each has an open reading frame of >500 bp and detects one or more transcripts on a Northern blot. The gene for each was sublocalized and ordered in YAC2. The cDNAs in groups 5-8 contained two or more Alu sequences, had no open reading frames, and did not detect transcripts. The cDNAs from groups 1-4 provide expressed sequence tags and identify candidate genes for the genetic disorders that map to this region. 28 refs., 5 figs., 1 tab.
Untiveros, Milton; Olspert, Allan; Artola, Katrin
2016-01-01
The single‐stranded, positive‐sense RNA genome of viruses in the genus Potyvirus encodes a large polyprotein that is cleaved to yield 10 mature proteins. The first three cleavage products are P1, HCpro and P3. An additional short open reading frame (ORF), called pipo, overlaps the P3 region of the polyprotein ORF. Four related potyviruses infecting sweet potato (Ipomoea batatas) are predicted to contain a third ORF, called pispo, which overlaps the 3′ third of the P1 region. Recently, pipo has been shown to be expressed via polymerase slippage at a conserved GA6 sequence. Here, we show that pispo is also expressed via polymerase slippage at a GA6 sequence, with higher slippage efficiency (∼5%) than at the pipo site (∼1%). Transient expression of recombinant P1 or the ‘transframe’ product, P1N‐PISPO, in Nicotiana benthamiana suppressed local RNA silencing (RNAi), but only P1N‐PISPO inhibited short‐distance movement of the silencing signal. These results reveal that polymerase slippage in potyviruses is not limited to pipo expression, but can be co‐opted for the evolution and expression of further novel gene products. PMID:26757490
Complete genome sequence of keunjorong mosaic virus, a potyvirus from Cynanchum wilfordii.
Nam, Moon; Lee, Joo-Hee; Choi, Hong Soo; Lim, Hyoun-Sub; Moon, Jae Sun; Lee, Su-Heon
2013-08-01
We have determined the complete genome sequence of keunjorong mosaic virus (KjMV). The KjMV genome is composed of 9,611 nucleotides, excluding the 3'-terminal poly(A) tail. It contains two open reading frames (ORFs), with the large one encoding a polyprotein of 3,070 amino acids and the small overlapping ORF encoding a PIPO protein of 81 amino acids. The KjMV genome shared the highest nucleotide sequence identity (57.5 %) with pepper mottle virus and freesia mosaic virus, two members of the genus Potyvirus. Based on the phylogenetic relatedness to known potyviruses, KjMV appears to be a member of a new species in the genus Potyvirus.
NASA Technical Reports Server (NTRS)
Kerley, James J. (Inventor); Eklund, Wayne D. (Inventor)
1992-01-01
A device for holding reading materials for use by readers without arm mobility is presented. The device is adapted to hold the reading materials in position for reading with the pages displayed to enable turning by use of a rubber tipped stick that is held in the mouth and has a pair of rectangular frames. The frames are for holding and positioning the reading materials opened in reading posture with the pages displayed at a substantially unobstructed sighting position for reading. The pair of rectangular frames are connected to one another by a hinge so the angle between the frames may be varied thereby varying the inclination of the reading material. A pair of bent spring mounted wires for holding opposing pages of the reading material open for reading without substantial visual interference of the pages is mounted to the base. The wires are also adjustable to the thickness of the reading material and have a variable friction adjustment. This enables the force of the wires against the pages to be varied and permits the reader to manipulate the pages with the stick.
Pathogenic and multidrug-resistant Escherichia fergusonii from broiler chicken.
Forgetta, V; Rempel, H; Malouin, F; Vaillancourt, R; Topp, E; Dewar, K; Diarra, M S
2012-02-01
An Escherichia spp. isolate, ECD-227, was previously identified from the broiler chicken as a phylogenetically divergent and multidrug-resistant Escherichia coli possessing numerous virulence genes. In this study, whole genome sequencing and comparative genome analysis was used to further characterize this isolate. The presence of known and putative antibiotic resistance and virulence open reading frames were determined by comparison to pathogenic (E. coli O157:H7 TW14359, APEC O1:K1:H7, and UPEC UTI89) and nonpathogenic species (E. coli K-12 MG1655 and Escherichia fergusonii ATCC 35469). The assembled genome size of 4.87 Mb was sequenced to 18-fold depth of coverage and predicted to contain 4,376 open reading frames. Phylogenetic analysis of 537 open reading frames present across 110 enteric bacterial species identifies ECD-227 to be E. fergusonii. The genome of ECD-227 contains 5 plasmids showing similarity to known E. coli and Salmonella enterica plasmids. The presence of virulence and antibiotic resistance genes were identified and localized to the chromosome and plasmids. The mutation in gyrA (S83L) involved in fluoroquinolone resistance was identified. The Salmonella-like plasmids harbor antibiotic resistance genes on a class I integron (aadA, qacEΔ-sul1, aac3-VI, and sulI) as well as numerous virulence genes (iucABCD, sitABCD, cib, traT). In addition to the genome analysis, the virulence of ECD-227 was evaluated in a 1-d-old chick model. In the virulence assay, ECD-227 was found to induce 18 to 30% mortality in 1-d-old chicks after 24 h and 48 h of infection, respectively. This study documents an avian multidrug-resistant and virulent E. fergusonii. The existence of several resistance genes to multiple classes of antibiotics indicates that infection caused by ECD-227 would be difficult to treat using antimicrobials currently available for poultry.
DOE Office of Scientific and Technical Information (OSTI.GOV)
White, D.A.; Zilinskas, B.A.
1991-08-01
The authors now report the nucleotide sequence of the cytosolic Cu/Zn SOD cloned from a λgt11 cDNA library constructed from mRNA extracted from leaves of 7- to 10-d pea seedlings (Pisum sativum L.). The clone was isolated using a 22-base synthetic oligonucleotide complementary to the amino acid sequence CGIIGLQG. This sequence, found at the protein's carboxy terminus, is highly conserved among plant cytosolic Cu/Zn SODs but not chloroplastic Cu/Zn SODs. The 738-base pair sequence contains an open reading frame specifying 152 codons and a predicted Mr of 18,024 D. The deduced amino acid sequence is highly homologous (79-82% identity) with the sequences of other known plant cytosolic Cu/Zn SODs but less highly conserved (63-65%) when compared with several chloroplastic Cu/Zn SODs including pea (10).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Panning, B.; Smiley, J.R.
1993-06-01
Alu elements are the single most abundant class of dispersed repeated sequences in the human genome, comprising 5-10% of the mass of human DNA. This report demonstrates that Ad5 infection strongly stimulates Pol III transcription of human Alu elements in HeLa and 293 cells. In contrast to the cases of Ad5-induced Pol III transcriptional activation, this process requires the E1b 58-kDa protein and the products of E4 open reading frames (ORFs) 3 and 6 in addition to the E1a 289-residue product. These findings suggest novel regulatory properties of the Ad5 E1b and E4 proteins and raise the possibility that analogous cellular trans-acting factors serve to modulate Alu expression in vivo.
Minimum probe length for unique identification of all open reading frames in a microbial genome
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sokhansanj, B A; Ng, J; Fitch, J P
2000-03-05
In this paper, we determine the minimum hybridization probe length needed to uniquely identify at least 95% of the open reading frames (ORFs) in an organism. We analyze the whole genome sequences of 17 species: 11 bacteria, 4 archaea, and 2 eukaryotes. We also present a mathematical model for minimum probe length based on assuming that all ORFs are random, of constant length, and contain an equal distribution of bases. The model accurately predicts the minimum probe length for all species, but it incorrectly predicts that all ORFs may be uniquely identified. However, a probe length of just 9 bases is adequate to identify over 95% of the ORFs for all 15 prokaryotic species we studied. Using a minimum probe length, while accepting that some ORFs may not be identified and that data will be lost due to hybridization error, may result in significant savings in microarray and oligonucleotide probe design.
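One way to make the notion of a minimum probe length concrete is to search for the smallest k at which a target fraction of ORFs each contain at least one k-mer found in no other ORF. The sketch below is a simplified illustration under that assumption, not the authors' procedure: it ignores reverse complements and hybridization error, and the function names and the `max_k` cutoff are hypothetical.

```python
from collections import defaultdict


def fraction_identifiable(orfs, k):
    """Fraction of ORFs that own at least one k-mer absent from all other ORFs."""
    if not orfs:
        raise ValueError("need at least one ORF")
    owners = defaultdict(set)  # k-mer -> indices of ORFs containing it
    for idx, seq in enumerate(orfs):
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(idx)
    identifiable = {next(iter(s)) for s in owners.values() if len(s) == 1}
    return len(identifiable) / len(orfs)


def min_probe_length(orfs, target=0.95, max_k=30):
    """Smallest k reaching the target fraction, or None if max_k is exceeded."""
    for k in range(1, max_k + 1):
        if fraction_identifiable(orfs, k) >= target:
            return k
    return None


toy_orfs = ["ACGT", "CGTA", "GTAC"]  # toy sequences, far shorter than real ORFs
print(fraction_identifiable(toy_orfs, 3))  # two of the three ORFs have a unique 3-mer
print(min_probe_length(toy_orfs))          # 4
```

In the toy set, the middle sequence shares both of its 3-mers with a neighbour, so it only becomes identifiable at k = 4; real genomes would show the same effect for closely related or duplicated ORFs.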
Splicing of a group II intron involved in the conjugative transfer of pRS01 in lactococci.
Mills, D A; McKay, L L; Dunny, G M
1996-06-01
Analysis of a region involved in the conjugative transfer of the lactococcal conjugative element pRS01 has revealed a bacterial group II intron. Splicing of this lactococcal intron (designated Ll.ltrB) in vivo resulted in the ligation of two exon messages (ltrBE1 and ltrBE2) which encoded a putative conjugative relaxase essential for the transfer of pRS01. Like many group II introns, the Ll.ltrB intron possessed an open reading frame (ltrA) with homology to reverse transcriptases. Remarkably, sequence analysis of ltrA suggested a greater similarity to open reading frames encoded by eukaryotic mitochondrial group II introns than to those identified to date from other bacteria. Several insertional mutations within ltrA resulted in plasmids exhibiting a conjugative transfer-deficient phenotype. These results provide the first direct evidence for splicing of a prokaryotic group II intron in vivo and suggest that conjugative transfer is a mechanism for group II intron dissemination in bacteria.
DOE Office of Scientific and Technical Information (OSTI.GOV)
White, Richard A.; Brown, Joseph M.; Colby, Sean M.
ATLAS (Automatic Tool for Local Assembly Structures) is a comprehensive multiomics data analysis pipeline that is massively parallel and scalable. ATLAS contains a modular analysis pipeline for assembly, annotation, quantification and genome binning of metagenomics and metatranscriptomics data and a framework for reference metaproteomic database construction. ATLAS transforms raw sequence data into functional and taxonomic data at the microbial population level and provides genome-centric resolution through genome binning. ATLAS provides robust taxonomy based on majority voting of protein coding open reading frames rolled-up at the contig level using modified lowest common ancestor (LCA) analysis. ATLAS is user-friendly, easy to install through bioconda, maintained as open source on GitHub, and implemented in Snakemake for modular, customizable workflows.
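The abstract does not spell out how the majority vote and the modified LCA are combined, so the following is only one plausible reading, written as a sketch: each ORF on a contig contributes a lineage, and the contig is assigned the deepest taxon still supported by more than half of its ORFs. The function name, the threshold, and the lineage representation are all assumptions and are not taken from the ATLAS code base.

```python
from collections import Counter


def majority_lca(orf_lineages, threshold=0.5):
    """Deepest lineage prefix supported by more than `threshold` of all ORFs.

    `orf_lineages` is a list of tuples ordered from the broadest rank
    to the most specific one, with one tuple per ORF on the contig.
    """
    total = len(orf_lineages)
    candidates = list(orf_lineages)
    consensus = []
    depth = 0
    while candidates:
        names = Counter(l[depth] for l in candidates if len(l) > depth)
        if not names:
            break
        name, votes = names.most_common(1)[0]
        if votes / total <= threshold:
            break  # no strict majority at this rank; stop descending
        consensus.append(name)
        candidates = [l for l in candidates if len(l) > depth and l[depth] == name]
        depth += 1
    return tuple(consensus)


contig_orfs = [
    ("Bacteria", "Firmicutes", "Bacillus"),
    ("Bacteria", "Firmicutes", "Bacillus"),
    ("Bacteria", "Firmicutes", "Clostridium"),
    ("Bacteria", "Proteobacteria", "Escherichia"),
]
print(majority_lca(contig_orfs))  # ('Bacteria', 'Firmicutes')
```

A strict LCA over the same four ORFs would stop at Bacteria; the majority rule lets a single discordant ORF be outvoted, which is presumably the robustness the abstract refers to.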
URF6, Last Unidentified Reading Frame of Human mtDNA, Codes for an NADH Dehydrogenase Subunit
NASA Astrophysics Data System (ADS)
Chomyn, Anne; Cleeter, Michael W. J.; Ragan, C. Ian; Riley, Marcia; Doolittle, Russell F.; Attardi, Giuseppe
1986-10-01
The polypeptide encoded in URF6, the last unassigned reading frame of human mitochondrial DNA, has been identified with antibodies to peptides predicted from the DNA sequence. Antibodies prepared against highly purified respiratory chain NADH dehydrogenase from beef heart or against the cytoplasmically synthesized 49-kilodalton iron-sulfur subunit isolated from this enzyme complex, when added to a deoxycholate or a Triton X-100 mitochondrial lysate of HeLa cells, specifically precipitated the URF6 product together with the six other URF products previously identified as subunits of NADH dehydrogenase. These results strongly point to the URF6 product as being another subunit of this enzyme complex. Thus, almost 60% of the protein coding capacity of mammalian mitochondrial DNA is utilized for the assembly of the first enzyme complex of the respiratory chain. The absence of such information in yeast mitochondrial DNA dramatizes the variability in gene content of different mitochondrial genomes.
Computational discovery of small open reading frames in Bacillus lehensis
NASA Astrophysics Data System (ADS)
Zainuddin, Nurhafizhoh; Illias, Rosli Md.; Mahadi, Nor Muhammad; Firdaus-Raih, Mohd
2015-09-01
Bacillus lehensis is a Gram-positive and endospore-forming alkalitolerant bacterial strain. In recent years there has been increasing interest in alkaliphilic bacteria and their ability to grow under extreme conditions as well as their ability to serve various important functions in industrial biology, especially enzyme production. Small open reading frames (sORFs) have emerged as important regulators in various biological roles such as tumor progression, hormone signalling and stress response. Over the past decade, many biocomputational tools have been developed to predict genes in bacterial genomes. In this study, three software tools were used to predict sORFs (≤ 80 aa) in B. lehensis by using the whole genome sequence. We used comparative analysis to identify the sORFs in B. lehensis that are conserved across all other bacterial genomes. We extended the analysis by performing homology analysis against a protein database. This study established the sORFs in B. lehensis that are conserved across bacteria and that might have important biological functions which still remain elusive.
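The abstract does not name the three prediction tools or their settings, so the following is only a minimal sketch of what a naive sORF scan looks like, not the authors' procedure. It assumes ATG as the only start codon, the standard stop codons, and a hypothetical lower bound `min_aa` to suppress trivial hits; the function name `find_sorfs` is illustrative.

```python
# Naive six-frame scan for small ORFs (ATG ... stop), as an illustration only.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sorfs(seq, max_aa=80, min_aa=10):
    """Return (strand, frame, start, end, aa_length) tuples for ORFs whose
    length in amino acids lies in [min_aa, max_aa].

    Coordinates are 0-based, end-exclusive, and refer to the scanned strand.
    The first ATG after the previous stop is taken as the start (assumption).
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    aa_len = (i - start) // 3  # codons before the stop
                    if min_aa <= aa_len <= max_aa:
                        hits.append((strand, frame, start, i + 3, aa_len))
                    start = None
    return hits


if __name__ == "__main__":
    demo = "CC" + "ATG" + "GCT" * 11 + "TAA" + "G"
    print(find_sorfs(demo))
```

Real gene finders add ribosome-binding-site models, alternative start codons and codon-usage statistics, which is why a comparative step such as the conservation filter described in the abstract is typically needed to reduce false positives from a scan like this.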
Popcorn Story Frames from a Multicultural Perspective.
ERIC Educational Resources Information Center
DiLella, Carol Ann
Popcorn story frames from a multicultural perspective are holistic outlines that in the reading/writing process facilitate comprehension for all cultures learning to read and write stories. Popcorn story frames are structured and modeled in a horizontal fashion just like popcorn pops in a horizontal fashion. The frames are designed for learners…
Georges, Arthur; Li, Qiye; Lian, Jinmin; O'Meally, Denis; Deakin, Janine; Wang, Zongji; Zhang, Pei; Fujita, Matthew; Patel, Hardip R; Holleley, Clare E; Zhou, Yang; Zhang, Xiuwen; Matsubara, Kazumi; Waters, Paul; Graves, Jennifer A Marshall; Sarre, Stephen D; Zhang, Guojie
2015-01-01
The lizards of the family Agamidae are one of the most prominent elements of the Australian reptile fauna. Here, we present a genomic resource built on the basis of a wild-caught male ZZ central bearded dragon Pogona vitticeps. The genomic sequence for P. vitticeps, generated on the Illumina HiSeq 2000 platform, comprised 317 Gbp (179X raw read depth) from 13 insert libraries ranging from 250 bp to 40 kbp. After filtering for low-quality and duplicated reads, 146 Gbp of data (83X) was available for assembly. Exceptionally high levels of heterozygosity (0.85 % of single nucleotide polymorphisms plus sequence insertions or deletions) complicated assembly; nevertheless, 96.4 % of reads mapped back to the assembled scaffolds, indicating that the assembly included most of the sequenced genome. Length of the assembly was 1.8 Gbp in 545,310 scaffolds (69,852 longer than 300 bp), the longest being 14.68 Mbp. N50 was 2.29 Mbp. Genes were annotated on the basis of de novo prediction, similarity to the green anole Anolis carolinensis, Gallus gallus and Homo sapiens proteins, and P. vitticeps transcriptome sequence assemblies, to yield 19,406 protein-coding genes in the assembly, 63 % of which had intact open reading frames. Our assembly captured 99 % (246 of 248) of core CEGMA genes, with 93 % (231) being complete. The quality of the P. vitticeps assembly is comparable or superior to that of other published squamate genomes, and the annotated P. vitticeps genome can be accessed through a genome browser available at https://genomics.canberra.edu.au.
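The N50 figure quoted above is a length-weighted median of scaffold sizes. As a minimal sketch (the function name `n50` and the toy lengths are illustrative and are not taken from the P. vitticeps assembly), it can be computed as follows:

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    print(n50([80, 70, 50, 40, 30, 20]))
```

Because the statistic is weighted by length, a large number of very short scaffolds (as in this assembly, where most of the 545,310 scaffolds are under 300 bp) has little effect on N50.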
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N [San Leandro, CA; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA; Young, Jennifer A [Berkeley, CA; Clague, David S [Livermore, CA
2011-01-18
A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
Kennedy, Melissa A; Moore, Emily; Wilkes, Rebecca P; Citino, Scott B; Kania, Stephen A
2006-04-01
To analyze the 7a7b genes of the feline coronavirus (FCoV) of cheetahs, which are believed to play a role in virulence of this virus. Biologic samples collected during a 4-year period from 5 cheetahs at the same institution and at 1 time point from 4 cheetahs at different institutions. Samples were first screened for FCoV via a reverse transcription-PCR procedure involving primers that encompassed the 3'-untranslated region. Samples that yielded positive assay results were analyzed by use of primers that targeted the 7a7b open reading frames. The nucleotide sequences of the 7a7b amplification products were determined and analyzed. In most isolates, substantial deletional mutations in the 7a gene were detected that would result in aberrant or no expression of the 7a product because of altered reading frames. Although the 7b gene was also found to contain mutations, these were primarily point mutations resulting in minor amino acid changes. The coronavirus associated with 1 cheetah with feline infectious peritonitis had intact 7a and 7b genes. The data suggest that mutations arise readily in the 7a region and may remain stable in FCoV of cheetahs. In contrast, an intact 7b gene may be necessary for in vivo virus infection and replication. Persistent infection with FCoV in a cheetah population results in continued virus circulation and may lead to a quasispecies of virus variants.
Ohm-Laursen, Line; Nielsen, Morten; Larsen, Stine R; Barington, Torben
2006-01-01
Antibody diversity is created by imprecise joining of the variability (V), diversity (D) and joining (J) gene segments of the heavy and light chain loci. Analysis of rearrangements is complicated by somatic hypermutations and uncertainty concerning the sources of gene segments and the precise way in which they recombine. It has been suggested that D genes with irregular recombination signal sequences (DIR) and chromosome 15 open reading frames (OR15) can replace conventional D genes, that two D genes or inverted D genes may be used and that the repertoire can be further diversified by heavy chain V gene (VH) replacement. Safe conclusions require large, well-defined sequence samples and algorithms minimizing stochastic assignment of segments. Two computer programs were developed for analysis of heavy chain joints. JointHMM is a profile hidden Markov model, while JointML is a maximum-likelihood-based method taking the lengths of the joint and the mutational status of the VH gene into account. The programs were applied to a set of 6329 clonally unrelated rearrangements. A conventional D gene was found in 80% of unmutated sequences and 64% of mutated sequences, while D-gene assignment was kept below 5% in artificial (randomly permutated) rearrangements. No evidence for the use of DIR, OR15, multiple D genes or VH replacements was found, while inverted D genes were used in less than 1‰ of the sequences. JointML was shown to have a higher predictive performance for D-gene assignment in mutated and unmutated sequences than four other publicly available programs. An online version 1·0 of JointML is available at http://www.cbs.dtu.dk/services/VDJsolver. PMID:17005006
Fukumori, F; Saint, C P
1997-01-01
A 9,233-bp HindIII fragment of the aromatic amine catabolic plasmid pTDN1, isolated from a derivative of Pseudomonas putida mt-2 (UCC22), confers the ability to degrade aniline on P. putida KT2442. The fragment encodes six open reading frames which are arranged in the same direction. Their 5' upstream region is part of the direct-repeat sequence of pTDN1. Nucleotide sequence of 1.8 kb of the repeat sequence revealed only a single base pair change compared to the known sequence of IS1071 which is involved in the transposition of the chlorobenzoate genes (C. Nakatsu, J. Ng, R. Singh, N. Straus, and C. Wyndham, Proc. Natl. Acad. Sci. USA 88:8312-8316, 1991). Four open reading frames encode proteins with considerable homology to proteins found in other aromatic-compound degradation pathways. On the basis of sequence similarity, these genes are proposed to encode the large and small subunits of aniline oxygenase (tdnA1 and tdnA2, respectively), a reductase (tdnB), and a LysR-type regulatory gene (tdnR). The putative large subunit has a conserved [2Fe-2S]R Rieske-type ligand center. Two genes, tdnQ and tdnT, which may be involved in amino group transfer, are localized upstream of the putative oxygenase genes. The tdnQ gene product shares about 30% similarity with glutamine synthetases; however, a pUC-based plasmid carrying tdnQ did not support the growth of an Escherichia coli glnA strain in the absence of glutamine. TdnT possesses domains that are conserved among amidotransferases. The tdnQ, tdnA1, tdnA2, tdnB, and tdnR genes are essential for the conversion of aniline to catechol. PMID:8990291
Farajzadeh-Sheikh, Ahmad; Jolodar, Abbas; Ghaemmaghami, Shamsedin
2013-01-01
Scorpion venom glands produce some antimicrobial peptides (AMP) that can rapidly kill a broad range of microbes and have additional activities that impact on the quality and effectiveness of innate responses and inflammation. In this study, we reported the identification of a cDNA sequence encoding cysteine-free antimicrobial peptides isolated from venomous glands of this species. Total RNA was extracted from the Iranian Mesobuthus eupeus venom glands, and cDNA was synthesized by using the modified oligo(dT). The cDNA was used as the template for a semi-nested RT-PCR technique. PCR products were used for direct nucleotide sequencing and the results were compared with the GenBank database. A 213 bp cDNA fragment encoding the entire coding region of an antimicrobial toxin from the Iranian scorpion M. eupeus venom glands was isolated. The full-length sequence of the coding region was 210 bp and contained an open reading frame of 70 amino acids with a predicted molecular mass of 7970.48 Da and theoretical pI of 9.10. The open reading frame consists of 210 bp encoding a precursor of 70 amino acid residues, including a signal peptide of 23 residues, a propeptide of 7 residues, and a mature peptide of 34 residues with no disulfide bridge. The peptide has detectable sequence identity to the Lesser Asian Mesobuthus eupeus MeVAMP-2 (98%), MeVAMP-9 (60%) and several previously described AMPs from other scorpion venoms including Mesobuthus martensii (94%) and Buthus occitanus israelis (82%). The secondary structure of the peptide mainly consisted of α-helical structure, which was generally conserved in previously reported scorpion counterparts. The phylogenetic analysis showed that the Iranian MeAMP-like toxin was similar but not identical to venom antimicrobial peptides from the lesser Asian scorpion Mesobuthus eupeus.
Nucleotide sequences of Japanese isolates of citrus vein enation virus.
Nakazono-Nagaoka, Eiko; Fujikawa, Takashi; Iwanami, Toru
2017-03-01
The genomic sequences of five Japanese isolates of citrus vein enation virus (CVEV) that induce vein enation were determined and compared with that of the Spanish isolate VE-1. The nucleotide sequences of all Japanese isolates were 5,983 nt in length. The genomic RNA of Japanese isolates had five potential open reading frames (ORF 0, ORF 1, ORF 2, ORF 3, and ORF 5) in the positive-sense strand. The nucleotide sequence identity among the Japanese isolates and Spanish isolate VE-1 ranged from 98.0% to 99.8%. Comparison of the partial amino acid sequences of ten Japanese isolates and three Spanish isolates suggested that four amino acid residues, at positions 83, 104, and 113 in ORF 2 and position 41 in ORF 5, might be unique to some Japanese isolates.
Genomewide Function Conservation and Phylogeny in the Herpesviridae
Albà, M. Mar; Das, Rhiju; Orengo, Christine A.; Kellam, Paul
2001-01-01
The Herpesviridae are a large group of well-characterized double-stranded DNA viruses for which many complete genome sequences have been determined. We have extracted protein sequences from all predicted open reading frames of 19 herpesvirus genomes. Sequence comparison and protein sequence clustering methods have been used to construct herpesvirus protein homologous families. This resulted in 1692 proteins being clustered into 243 multiprotein families and 196 singleton proteins. Predicted functions were assigned to each homologous family based on genome annotation and published data and each family classified into seven broad functional groups. Phylogenetic profiles were constructed for each herpesvirus from the homologous protein families and used to determine conserved functions and genomewide phylogenetic trees. These trees agreed with molecular-sequence-derived trees and allowed greater insight into the phylogeny of ungulate and murine gammaherpesviruses. PMID:11156614
Wang, Ping; Ingram-Smith, Cheryl; Hadley, Jill A.; Miller, Karen J.
1999-01-01
Periplasmic cyclic β-glucans of Rhizobium species provide important functions during plant infection and hypo-osmotic adaptation. In Sinorhizobium meliloti (also known as Rhizobium meliloti), these molecules are highly modified with phosphoglycerol and succinyl substituents. We have previously identified an S. meliloti Tn5 insertion mutant, S9, which is specifically impaired in its ability to transfer phosphoglycerol substituents to the cyclic β-glucan backbone (M. W. Breedveld, J. A. Hadley, and K. J. Miller, J. Bacteriol. 177:6346–6351, 1995). In the present study, we have cloned, sequenced, and characterized this mutation at the molecular level. By using the Tn5 flanking sequences (amplified by inverse PCR) as a probe, an S. meliloti genomic library was screened, and two overlapping cosmid clones which functionally complement S9 were isolated. A 3.1-kb HindIII-EcoRI fragment found in both cosmids was shown to fully complement mutant S9. Furthermore, when a plasmid containing this 3.1-kb fragment was used to transform Rhizobium leguminosarum bv. trifolii TA-1JH, a strain which normally synthesizes only neutral cyclic β-glucans, anionic glucans containing phosphoglycerol substituents were produced, consistent with the functional expression of an S. meliloti phosphoglycerol transferase gene. Sequence analysis revealed the presence of two major, overlapping open reading frames within the 3.1-kb fragment. Primer extension analysis revealed that one of these open reading frames, ORF1, was transcribed and its transcription was osmotically regulated. This novel locus of S. meliloti is designated the cgm (cyclic glucan modification) locus, and the product encoded by ORF1 is referred to as CgmB. PMID:10419956
Sawada, Koichi; Kokeguchi, Susumu; Hongyo, Hiroshi; Sawada, Satoko; Miyamoto, Manabu; Maeda, Hiroshi; Nishimura, Fusanori; Takashiba, Shogo; Murayama, Yoji
1999-01-01
Subtractive hybridization was employed to isolate specific genes from virulent Porphyromonas gingivalis strains that are possibly related to abscess formation. The genomic DNA from the virulent strain P. gingivalis W83 was subtracted with DNA from the avirulent strain ATCC 33277. Three clones unique to strain W83 were isolated and sequenced. The cloned DNA fragments were 885, 369, and 132 bp and had slight homology with only Bacillus stearothermophilus IS5377, which is a putative transposase. The regions flanking the cloned DNA fragments were isolated and sequenced, and the gene structure around the clones was revealed. These three clones were located side-by-side in a gene reported as an outer membrane protein. The three clones interrupt the open reading frame of the outer membrane protein gene. This inserted DNA, consisting of three isolated clones, was designated IS1598, which was 1,396 bp (i.e., a 1,158-bp open reading frame) in length and was flanked by 16-bp terminal inverted repeats and a 9-bp duplicated target sequence. IS1598 was detected in P. gingivalis W83, W50, and FDC 381 by Southern hybridization. All three P. gingivalis strains have been shown to possess abscess-forming ability in animal models. However, IS1598 was not detected in avirulent strains of P. gingivalis, including ATCC 33277. The IS1598 may interrupt the synthesis of the outer membrane protein, resulting in changes in the structure of the bacterial outer membrane. The IS1598 isolated in this study is a novel insertion element which might be a specific marker for virulent P. gingivalis strains. PMID:10531208
Purification, cDNA cloning, and regulation of lysophospholipase from rat liver.
Sugimoto, H; Hayashi, H; Yamashita, S
1996-03-29
A lysophospholipase was purified 506-fold from rat liver supernatant. The preparation gave a single 24-kDa protein band on SDS-polyacrylamide gel electrophoresis. The enzyme hydrolyzed lysophosphatidylcholine, lysophosphatidylethanolamine, lysophosphatidylinositol, lysophosphatidylserine, and 1-oleoyl-2-acetyl-sn-glycero-3-phosphocholine at pH 6-8. The purified enzyme was used for the preparation of antibody and peptide sequencing. A cDNA clone was isolated by screening a rat liver lambda gt11 cDNA library with the antibody, followed by the selection of further extended clones from a lambda gt10 library. The isolated cDNA was 2,362 base pairs in length and contained an open reading frame encoding 230 amino acids with a Mr of 24,708. The peptide sequences determined were found in the reading frame. When the cDNA was expressed in Escherichia coli cells as the beta-galactosidase fusion, lysophosphatidylcholine-hydrolyzing activity was markedly increased. The deduced amino acid sequence showed significant similarity to Pseudomonas fluorescens esterase A and Spirulina platensis esterase. The three sequences contained the GXSXG consensus at similar positions. The transcript was found in various tissues with the following order of abundance: spleen, heart, kidney, brain, lung, stomach, and testis = liver. In contrast, the enzyme protein was abundant in the following order: testis, liver, kidney, heart, stomach, lung, brain, and spleen. Thus the mRNA abundance disagreed with the level of the enzyme protein in liver, testis, and spleen. When HL-60 cells were induced to differentiate into granulocytes with dimethyl sulfoxide, the 24-kDa lysophospholipase protein increased significantly, but the mRNA abundance remained essentially unchanged. Thus a posttranscriptional control mechanism is present for the regulation of 24-kDa lysophospholipase.
Li, Guang-Qi; Zang, Xiao-Nan; Zhang, Xue-Cheng; Lu, Ning; Ding, Yan; Gong, Le; Chen, Wen-Chao
2014-03-15
To study the response of Gracilaria lemaneiformis to heat stress, two key enzymes - ubiquitin-activating enzyme (E1) and ubiquitin-conjugating enzyme (E2) - of the Ubiquitin/26S proteasome pathway (UPP) were studied in three strains of G. lemaneiformis-wild type, heat-tolerant cultivar 981 and heat-tolerant cultivar 07-2. The full length DNA sequence of E1 contained only one exon. The open reading frame (ORF) sequence was 981 nucleotides encoding 326 amino acids, which contained conserved ATP binding sites (LYDRQIRLWGLE, ELAKNVLLAGV, LKEMN, VVCAI) and the ubiquitin-activating domains (VVCAI…LMTEAC, VFLDLGDEYSYQ, AIVGGMWGRE). The gene sequence of E2 contained four exons and three introns. The sum of the four exons gave an open reading frame sequence of 444 nucleotides encoding 147 amino acids, which contained a conserved ubiquitin-activating domain (GSICLDIL), ubiquitin-conjugating domains (RIYHPNIN, KVLLSICSLL, DDPLV) and ubiquitin-ligase (E3) recognition sites (KRI, YPF, WSP). Real-time-PCR analysis of transcription levels of E1 and E2 under heat shock conditions (28°C and 32°C) showed that in wild type, transcriptions of E1 and E2 were up-regulated at 28°C, while at 32°C, transcriptions of the two enzymes were below the normal level. In cultivar 981 and cultivar 07-2 of G. lemaneiformis, the transcription levels of the two enzymes were up-regulated at 32°C, and transcription level of cultivar 07-2 was even higher than that of cultivar 981. These results suggest that the UPP plays an important role in high temperature resistance of G. lemaneiformis and the bioactivity of UPP is directly related to the heat-resistant ability of G. lemaneiformis. Copyright © 2013 Elsevier B.V. All rights reserved.
USDA-ARS's Scientific Manuscript database
Translation of influenza A virus PB1-F2 occurs in a second open reading frame (ORF) of the PB1 gene segment. PB1-F2 has been implicated in regulation of polymerase activity, immunopathology, susceptibility to secondary bacterial infection, and induction of apoptosis. Experimental evidence of PB1-F2 ...
The Role of eIF4E Activity in Breast Cancer
2010-08-01
ORF, open reading frame; qPCR, quantitative PCR; RACE, rapid amplification of cDNA ends; RT, reverse transcriptase; uORF, upstream ORF; UTR...were also performed using template lacking RT (reverse transcriptase): products were either undetectable or greatly reduced (>30000-fold less product...have previously shown that a 5’UTR expressed from the human AXIN2 gene contains a sixty nucleotide sequence that is predicted to form a stable stem
PreTIS: A Tool to Predict Non-canonical 5’ UTR Translational Initiation Sites in Human and Mouse
Reuter, Kerstin; Helms, Volkhard
2016-01-01
Translation of mRNA sequences into proteins typically starts at an AUG triplet. In rare cases, translation may also start at alternative non–AUG codons located in the annotated 5’ UTR which leads to an increased regulatory complexity. Since ribosome profiling detects translational start sites at the nucleotide level, the properties of these start sites can then be used for the statistical evaluation of functional open reading frames. We developed a linear regression approach to predict in–frame and out–of–frame translational start sites within the 5’ UTR from mRNA sequence information together with their translation initiation confidence. Predicted start codons comprise AUG as well as near–cognate codons. The underlying datasets are based on published translational start sites for human HEK293 and mouse embryonic stem cells that were derived by the original authors from ribosome profiling data. The average prediction accuracy of true vs. false start sites for HEK293 cells was 80%. When applied to mouse mRNA sequences, the same model predicted translation initiation sites observed in mouse ES cells with an accuracy of 76%. Moreover, we illustrate the effect of in silico mutations in the flanking sequence context of a start site on the predicted initiation confidence. Our new webservice PreTIS visualizes alternative start sites and their respective ORFs and predicts their ability to initiate translation. Solely, the mRNA sequence is required as input. PreTIS is accessible at http://service.bioinformatik.uni-saarland.de/pretis. PMID:27768687
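The first step behind a predictor of this kind, enumerating candidate start codons in the 5' UTR and labelling each as in-frame or out-of-frame relative to the annotated start, can be sketched as below. This is not the PreTIS regression model, whose features and coefficients are not given in the abstract; the function name `candidate_starts` and the assumption that near-cognate codons are those differing from AUG at exactly one position are ours.

```python
from itertools import product

# Near-cognate codons: exactly one mismatch to AUG (nine codons in total).
NEAR_COGNATE = {
    "".join(c)
    for c in product("ACGU", repeat=3)
    if sum(x != y for x, y in zip(c, "AUG")) == 1
}


def candidate_starts(mrna, cds_start):
    """List (position, codon, in_frame) for AUG and near-cognate codons that
    lie entirely within the 5' UTR.

    `cds_start` is the 0-based index of the annotated start codon; a
    candidate is in-frame if its distance to `cds_start` is a multiple of 3.
    """
    mrna = mrna.upper().replace("T", "U")
    out = []
    for i in range(max(0, cds_start - 2)):
        codon = mrna[i:i + 3]
        if codon == "AUG" or codon in NEAR_COGNATE:
            out.append((i, codon, (cds_start - i) % 3 == 0))
    return out


if __name__ == "__main__":
    print(candidate_starts("GGCUGCCCAUGAAA", cds_start=8))
```

A scoring model would then be applied to each candidate, for example using the flanking sequence context that the abstract mentions.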
Bricheux, G; Brugerolle, G
1997-08-01
The parasitic protozoan Trichomonas vaginalis is known to contain the ubiquitous and highly conserved protein actin. A genomic library and a cDNA library have been screened to identify and clone the actin gene(s) of T. vaginalis. The nucleotide sequence of one gene and its flanking regions have been determined. The open reading frame encodes a protein of 376 amino acids. The sequence is not interrupted by any introns and the promoter could be represented by a 10 bp motif close to a consensus motif also found upstream of most sequenced T. vaginalis genes. The five different clones isolated from the cDNA library have similar sequences and encode three actin proteins differing only by one or two amino acids. A phylogenetic analysis of 31 actin sequences by distance matrix and parsimony methods, using centractin as outgroup, gives congruent trees with Parabasala branching above Diplomonadida.
A genetic scale of reading frame coding.
Michel, Christian J
2014-08-21
The reading frame coding (RFC) of codes (sets) of trinucleotides is a genetic concept which has been largely ignored during the last 50 years. A first objective is the definition of a new and simple statistical parameter PrRFC for analysing the probability (efficiency) of reading frame coding (RFC) of any trinucleotide code. A second objective is to reveal different classes and subclasses of trinucleotide codes involved in reading frame coding: the circular codes of 20 trinucleotides and the bijective genetic codes of 20 trinucleotides coding the 20 amino acids. This approach allows us to propose a genetic scale of reading frame coding which ranges from 1/3 with the random codes (RFC probability identical in the three frames) to 1 with the comma-free circular codes (RFC probability maximal in the reading frame and null in the two shifted frames). This genetic scale shows, in particular, the reading frame coding probabilities of the 12,964,440 circular codes (PrRFC=83.2% in average), the 216 C(3) self-complementary circular codes (PrRFC=84.1% in average) including the code X identified in eukaryotic and prokaryotic genes (PrRFC=81.3%) and the 339,738,624 bijective genetic codes (PrRFC=61.5% in average) including the 52 codes without permuted trinucleotides (PrRFC=66.0% in average). Otherwise, the reading frame coding probabilities of each trinucleotide code coding an amino acid with the universal genetic code are also determined. The four amino acids Gly, Lys, Phe and Pro are coded by codes (not circular) with RFC probabilities equal to 2/3, 1/2, 1/2 and 2/3, respectively. The amino acid Leu is coded by a circular code (not comma-free) with a RFC probability equal to 18/19. The 15 other amino acids are coded by comma-free circular codes, i.e. with RFC probabilities equal to 1. The identification of coding properties in some classes of trinucleotide codes studied here may bring new insights in the origin and evolution of the genetic code. 
Copyright © 2014 Elsevier Ltd. All rights reserved.
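The comma-free property mentioned in this abstract can be checked directly: a trinucleotide code is comma-free if no codon of the code appears in either shifted frame of any two concatenated codons. The sketch below tests only that property and does not compute the PrRFC parameter, whose definition is not given in the abstract. The codon sets are those of the standard genetic code; the function name `is_comma_free` is illustrative.

```python
from itertools import product


def is_comma_free(code):
    """Return True if no codon of `code` occurs in frame +1 or +2 of the
    concatenation of any two codons of `code` (including a codon with itself).
    """
    code = set(code)
    for c1, c2 in product(code, repeat=2):
        pair = c1 + c2
        if pair[1:4] in code or pair[2:5] in code:
            return False
    return True


# Synonymous codon sets from the standard genetic code (DNA alphabet).
GLY = {"GGA", "GGC", "GGG", "GGT"}
LEU = {"CTA", "CTC", "CTG", "CTT", "TTA", "TTG"}
SER = {"TCA", "TCC", "TCG", "TCT", "AGC", "AGT"}

if __name__ == "__main__":
    for name, code in (("Gly", GLY), ("Leu", LEU), ("Ser", SER)):
        print(name, is_comma_free(code))
```

This agrees with the classification in the abstract for these three cases: the Gly set fails because GGG repeats onto itself in every frame, the Leu set fails because, for example, CTC followed by TTA contains CTT in a shifted frame, and the Ser set passes.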
Tsakou, Eugenia; Agathagelidis, Andreas; Boudjoghra, Myriam; Raff, Thorsten; Dagklis, Antonis; Chatzouli, Maria; Smilevska, Tatjana; Bourikas, George; Merle-Beral, Helene; Manioudaki-Kavallieratou, Eleni; Anagnostopoulos, Achilles; Brüggemann, Monika; Davi, Frederic; Stamatopoulos, Kostas; Belessi, Chrysoula
2012-01-01
The frequent occurrence of stereotyped heavy complementarity-determining region 3 (VH CDR3) sequences among unrelated cases with chronic lymphocytic leukemia (CLL) is widely taken as evidence for antigen selection. Stereotyped VH CDR3 sequences are often defined by the selective association of certain immunoglobulin heavy diversity (IGHD) genes in specific reading frames with certain immunoglobulin heavy joining (IGHJ) genes. To gain insight into the mechanisms underlying VH CDR3 restrictions and also determine the developmental stage when restrictions in VH CDR3 are imposed, we analyzed partial IGHD-IGHJ rearrangements (D-J) in 829 CLL cases and compared the productively rearranged D-J joints (that is, in-frame junctions without junctional stop codons) to (a) the productive immunoglobulin heavy variable (IGHV)-IGHD-IGHJ rearrangements (V-D-J) from the same cases and (b) 174 D-J rearrangements from 160 precursor B-cell acute lymphoblastic leukemia cases (pre-B acute lymphoblastic leukemia [ALL]). Partial D-J rearrangements were detected in 272/829 CLL cases (32.8%). Sequence analysis was feasible in 238 of 272 D-J rearrangements; 198 of 238 (83.2%) were productively rearranged. The D-J joints in CLL did not differ significantly from those in pre-B ALL, except for higher frequency of the IGHD7-27 and IGHJ6 genes in the latter. Among CLL carrying productively rearranged D-J, comparison of the IGHD gene repertoire in productive V-D-J versus D-J revealed the following: (a) overuse of IGHD reading frames encoding hydrophilic peptides among V-D-J and (b) selection of the IGHD3-3 and IGHD6-19 genes in V-D-J junctions. These results document that the IGHD and IGHJ gene biases in the CLL expressed VH CDR3 repertoire are not stochastic but are directed by selection operating at the immunoglobulin protein level. PMID:21968789
Grohmann, L; Brennicke, A; Schuster, W
1992-01-01
The Oenothera mitochondrial genome contains only a gene fragment for ribosomal protein S12 (rps12), while other plants encode a functional gene in the mitochondrion. The complete Oenothera rps12 gene is located in the nucleus. The transit sequence necessary to target this protein to the mitochondrion is encoded by a 5'-extension of the open reading frame. Comparison of the amino acid sequence encoded by the nuclear gene with the polypeptides encoded by edited mitochondrial cDNA and genomic sequences of other plants suggests that gene transfer between mitochondrion and nucleus started from edited mitochondrial RNA molecules. Mechanisms and requirements of gene transfer and activation are discussed. PMID:1454526
de Bellocq, J Goüy; Leirs, H
2009-09-01
Sequences of the complete open reading frame (ORF) for rodent major histocompatibility complex (MHC) class II genes are rare. Multimammate rat (Mastomys natalensis) complementary DNA (cDNA) encoding the alpha and beta chains of the MHC class II DQ gene was cloned from a rapid amplification of cDNA ends (RACE) cDNA library. The ORFs consist of 801 and 771 bp encoding 266 and 256 amino acid residues for DQB and DQA, respectively. The genomic structure of Mana-DQ genes is globally analogous to that described for other rodents except for the insertion of a serine residue in the signal peptide of Mana-DQB, which is unique among known rodents.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Machlin, S.M.; Hanson, R.S.
The nucleotide sequence of a cloned 2.5-kilobase-pair SmaI fragment containing the methanol dehydrogenase (MDH) structural gene from Methylobacterium organophilum XX was determined. A single open reading frame with a coding capacity of 626 amino acids (molecular weight, 66,000) was identified on one strand, and N-terminal sequencing of purified MDH revealed that 27 of these residues constituted a putative signal peptide. Primer extension mapping of in vivo transcripts indicated that the start of mRNA synthesis was 160 to 170 base pairs upstream of the ATG codon. Northern (RNA) blot analysis further demonstrated that the transcript was 2.1 kilobase pairs in length and therefore appeared to encode only MDH.
Molecular cloning and nucleotide sequence of CYP6BF1 from the diamondback moth, Plutella xylostella
Li, Hongshan; Dai, Huaguo; Wei, Hui
2005-01-01
A novel cDNA clone encoding a cytochrome P450 was screened from the insecticide-susceptible strain of Plutella xylostella (L.) (Lepidoptera:Yponomeutidae). The nucleotide sequence of the clone, designated CYP6BF1, was determined. This is the first full-length sequence of the CYP6 family from Plutella xylostella (L.). The cDNA is 1661 bp in length and contains an open reading frame from base pairs 26 to 1570, encoding a protein of 514 amino acid residues. It is similar to the other insect P450s in gene family 6, including CYP6AE1 from Depressaria pastinacella (46%). The GenBank accession number is AY971374. PMID:17119627
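The relation between the reported ORF coordinates and the protein length can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the coordinates are 1-based and inclusive and that the ORF span includes the stop codon, neither of which is stated in the abstract. The function name orf_residue_count is hypothetical.

```python
def orf_residue_count(start: int, end: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF.

    Assumptions (not taken from the abstract): coordinates are 1-based
    and inclusive, and the span ends with a stop codon unless
    includes_stop is set to False.
    """
    length = end - start + 1              # ORF length in nucleotides
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length // 3
    # The stop codon does not add a residue to the protein.
    return codons - 1 if includes_stop else codons


# Base pairs 26 to 1570 span 1545 nt, i.e. 515 codons including the stop.
print(orf_residue_count(26, 1570))  # 514
```

Under these assumptions the result agrees with the 514 residues reported for CYP6BF1.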
Drewlo, Sascha; Brämer, Christian O.; Madkour, Mohamed; Mayer, Frank; Steinbüchel, Alexander
2001-01-01
On complex medium Escherichia coli strains carrying hybrid plasmid pBEC/EE:11.0, pSKBEC/BE:9.0, pSKBEC/PP:3.3, or pSKBEC/PP:2.4 harboring genomic DNA of Ralstonia eutropha HF39 produced a blue pigment characterized as indigo by several chemical and spectroscopic methods. A 1,251-bp open reading frame (bec) was cloned and sequenced. The deduced amino acid sequence of bec showed only weak similarities to short-chain acyl-coenzyme A dehydrogenases, and the gene product catalyzed formation of indoxyl, a reactive preliminary stage for production of indigo. PMID:11282658
Shaw, D R; Richter, H; Giorda, R; Ohmachi, T; Ennis, H L
1989-09-01
A Dictyostelium discoideum repetitive element composed of long repeats of the codon (AAC) is found in developmentally regulated transcripts. The concentration of (AAC) sequences is low in mRNA from dormant spores and growing cells and increases markedly during spore germination and multicellular development. The sequence hybridizes to many different-sized Dictyostelium DNA restriction fragments, indicating that it is scattered throughout the genome. Four cDNA clones isolated contain (AAC) sequences in the deduced coding region. Interestingly, the (AAC)-rich sequences are present in all three reading frames in the deduced proteins, i.e., AAC (asparagine), ACA (threonine) and CAA (glutamine). Three of the clones contain only one of these in-frame so that the individual proteins carry either asparagine, threonine, or glutamine clusters, not mixtures. However, one clone is both glutamine- and asparagine-rich. The (AAC) portion of the transcripts is reiterated 300 times in the haploid genome while the other portions of the cDNAs represent single copy genes, whose sequences show no similarity other than the (AAC) repeats. The repeated sequence is similar to the opa or M sequence found in Drosophila melanogaster notch and homeo box genes and in fly developmentally regulated transcripts. The transcripts are present on polysomes, suggesting that they are translated. Although the function of these repeats is unknown, long amino acid repeats are a characteristic feature of extracellular proteins of lower eukaryotes.
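The statement that an (AAC) repeat yields asparagine, threonine or glutamine depending on the reading frame can be illustrated with a short sketch. The codon table below holds only the three codons that occur in a pure (AAC) repeat, and the function name frame_residues is hypothetical; this is not the authors' analysis.

```python
# Standard genetic code entries for the three codons present in an (AAC) repeat.
CODONS = {"AAC": "Asn", "ACA": "Thr", "CAA": "Gln"}


def frame_residues(seq: str) -> dict:
    """Map each forward reading frame (0, 1, 2) to the set of residues it encodes."""
    result = {}
    for frame in range(3):
        # Split the sequence into complete codons starting at this frame offset.
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        result[frame] = {CODONS[c] for c in codons}
    return result


repeat = "AAC" * 5
for frame, residues in frame_residues(repeat).items():
    print(frame, sorted(residues))
```

Each frame of the pure repeat encodes a single residue type, consistent with the description of proteins carrying asparagine, threonine, or glutamine clusters rather than mixtures.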
Regis, David P.; Dobaño, Carlota; Quiñones-Olson, Paola; Liang, Xiaowu; Graber, Norma L.; Stefaniak, Maureen E.; Campo, Joseph J.; Carucci, Daniel J.; Roth, David A.; He, Huaping; Felgner, Philip L.; Doolan, Denise L.
2009-01-01
We have evaluated a technology called Transcriptionally Active PCR (TAP) for high throughput identification and prioritization of novel target antigens from genomic sequence data using the Plasmodium parasite, the causative agent of malaria, as a model. First, we adapted the TAP technology for the highly AT-rich Plasmodium genome, using well-characterized P. falciparum and P. yoelii antigens and a small panel of uncharacterized open reading frames from the P. falciparum genome sequence database. We demonstrated that TAP fragments encoding six well-characterized P. falciparum antigens and five well-characterized P. yoelii antigens could be amplified in an equivalent manner from both plasmid DNA and genomic DNA templates, and that uncharacterized open reading frames could also be amplified from genomic DNA template. Second, we showed that the in vitro expression of the TAP fragments was equivalent or superior to that of supercoiled plasmid DNA encoding the same antigen. Third, we evaluated the in vivo immunogenicity of TAP fragments encoding a subset of the model P. falciparum and P. yoelii antigens. We found that antigen-specific antibody and cellular immune responses induced by the TAP fragments in mice were equivalent or superior to those induced by the corresponding plasmid DNA vaccines. Finally, we developed and demonstrated proof-of-principle for an in vitro humoral immunoscreening assay for down-selection of novel target antigens. These data support the potential of a TAP approach for rapid high throughput functional screening and identification of potential candidate vaccine antigens from genomic sequence data. PMID:18164079
Genomic analysis of the symbiotic marine crenarchaeon, Cenarchaeum symbiosum
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hallam, Steven J.; Konstantinidis, Konstantinos T.; Brochier, Celine
2006-06-24
Crenarchaea are ubiquitous and abundant microbial constituents of soils, sediments, lakes and ocean waters, yet relatively little is known about their fundamental evolutionary, ecological, and physiological properties. To better describe the ubiquitous nonthermophilic Crenarchaea, we analyzed the genome sequence of one representative, the uncultivated sponge symbiont, Cenarchaeum symbiosum. C. symbiosum genotypes coinhabiting the same host partitioned into two dominant populations, corresponding to previously described a- and b-type ribosomal RNA variants. Although syntenic, overlapping a- and b-type ribotypes harbored significant genetic variability. A single tiling path comprising the dominant a-type genotype was assembled, and used to explore the biological properties of C. symbiosum and its planktonic relatives. Out of a total of 2,066 predicted open reading frames, 36% were more highly conserved with other Archaea. The remainder partitioned between bacteria (18%), eukaryotes (1.5%) and viruses (0.1%). A total of 525 open reading frames were more highly conserved with sequences derived from marine environmental genomic surveys, most probably representing orthologous genes found in free-living planktonic Crenarchaea. The remaining genes partitioned between functional RNAs (2.4%), and hypotheticals (42%) with limited homology to known functional genes. The latter category likely contains genes specifically involved in mediating archaeal-sponge symbiosis. Phylogenetic analyses placed C. symbiosum as a basal crenarchaeon, sharing specific genomic features in common with either Crenarchaea, Euryarchaea, or both. The genome sequence of C. symbiosum reflects a unique and unusual evolutionary, physiological, and ecological history, one remarkably distinct from that of any other previously known microbial lineage.
Regis, David P; Dobaño, Carlota; Quiñones-Olson, Paola; Liang, Xiaowu; Graber, Norma L; Stefaniak, Maureen E; Campo, Joseph J; Carucci, Daniel J; Roth, David A; He, Huaping; Felgner, Philip L; Doolan, Denise L
2008-03-01
We have evaluated a technology called transcriptionally active PCR (TAP) for high throughput identification and prioritization of novel target antigens from genomic sequence data using the Plasmodium parasite, the causative agent of malaria, as a model. First, we adapted the TAP technology for the highly AT-rich Plasmodium genome, using well-characterized P. falciparum and P. yoelii antigens and a small panel of uncharacterized open reading frames from the P. falciparum genome sequence database. We demonstrated that TAP fragments encoding six well-characterized P. falciparum antigens and five well-characterized P. yoelii antigens could be amplified in an equivalent manner from both plasmid DNA and genomic DNA templates, and that uncharacterized open reading frames could also be amplified from genomic DNA template. Second, we showed that the in vitro expression of the TAP fragments was equivalent or superior to that of supercoiled plasmid DNA encoding the same antigen. Third, we evaluated the in vivo immunogenicity of TAP fragments encoding a subset of the model P. falciparum and P. yoelii antigens. We found that antigen-specific antibody and cellular immune responses induced by the TAP fragments in mice were equivalent or superior to those induced by the corresponding plasmid DNA vaccines. Finally, we developed and demonstrated proof-of-principle for an in vitro humoral immunoscreening assay for down-selection of novel target antigens. These data support the potential of a TAP approach for rapid high throughput functional screening and identification of potential candidate vaccine antigens from genomic sequence data.
Shu, Jinshuai; Liu, Yumei; Li, Zhansheng; Zhang, Lili; Fang, Zhiyuan; Yang, Limei; Zhuang, Mu; Zhang, Yangyong; Lv, Honghao
2015-01-01
We previously discovered carpelloid stamens when breeding cytoplasmic male sterile lines in broccoli (Brassica oleracea var. italica). In this study, hybrids and multiple backcrosses were produced from different cytoplasmic male sterile carpelloid stamen sources and maintainer lines. Carpelloid stamens caused dysplasia of the flower structure and led to hooked or coiled siliques with poor seed setting, which were inherited in a maternal fashion. Using four distinct carpelloid stamens and twelve distinct normal stamens from cytoplasmic male sterile sources and one maintainer, we used 21 mitochondrial simple sequence repeat (mtSSR) primers and 32 chloroplast SSR primers to identify a mitochondrial marker, mtSSR2, that can differentiate between the cytoplasm of carpelloid and normal stamens. Thereafter, mtSSR2 was used to identify another 34 broccoli accessions, with an accuracy rate of 100%. Analysis of the polymorphic sequences revealed that the mtSSR2 open reading frame of carpelloid stamen sterile sources had a deletion of 51 bases (encoding 18 amino acids) compared with normal stamen materials. The open reading frame is located in the coding region of orf125 and orf108 of the mitochondrial genomes in Brassica crops and had the highest similarity with Raphanus sativus and Brassica carinata. The current study has not only identified a useful molecular marker to detect the cytoplasm of carpelloid stamens during broccoli breeding, but it also provides evidence that the mitochondrial genome is maternally inherited and provides a basis for studying the effect of the cytoplasm on flower organ development in plants. PMID:26407159
Bäumer, Sebastian; Lentes, Sabine; Gottschalk, Gerhard; Deppenmeier, Uwe
2002-01-01
Analysis of genome sequence data from the methanogenic archaeon Methanosarcina mazei Gö1 revealed the existence of two open reading frames encoding proton-translocating pyrophosphatases (PPases). These open reading frames are linked by a 750-bp intergenic region containing TC-rich stretches and are transcribed in opposite directions. The corresponding polypeptides are referred to as Mvp1 and Mvp2 and consist of 671 and 676 amino acids, respectively. Both enzymes represent extremely hydrophobic, integral membrane proteins with 15 predicted transmembrane segments and an overall amino acid sequence similarity of 50.1%. Multiple sequence alignments revealed that Mvp1 is closely related to eukaryotic PPases, whereas Mvp2 shows highest homologies to bacterial PPases. Northern blot experiments with RNA from methanol-grown cells harvested in the mid-log growth phase indicated that only Mvp2 was produced under these conditions. Analysis of washed membranes showed that Mvp2 had a specific activity of 0.34 U mg (protein)⁻¹. Proton translocation experiments with inverted membrane vesicles prepared from methanol-grown cells showed that hydrolysis of 1 mol of pyrophosphate was coupled to the translocation of about 1 mol of protons across the cytoplasmic membrane. Appropriate conditions for mvp1 expression could not be determined yet. The pyrophosphatases of M. mazei Gö1 represent the first examples of this enzyme class in methanogenic archaea and may be part of their energy-conserving system. Abbreviations: DCCD, N,N′-dicyclohexylcarbodiimide; PPase, inorganic pyrophosphatase; PPi, inorganic pyrophosphate; Δp, proton motive force. PMID:15803653
Production and pathogenicity of hepatitis C virus core gene products
Li, Hui-Chun; Ma, Hsin-Chieh; Yang, Chee-Hing; Lo, Shih-Yen
2014-01-01
Hepatitis C virus (HCV) is a major cause of chronic liver diseases, including steatosis, cirrhosis and hepatocellular carcinoma, and its infection is also associated with insulin resistance and type 2 diabetes mellitus. HCV, belonging to the Flaviviridae family, is a small enveloped virus whose positive-stranded RNA genome encodes a polyprotein. The HCV core protein is cleaved first at residue 191 by the host signal peptidase and further cleaved by the host signal peptide peptidase at about residue 177 to generate the mature core protein (a.a. 1-177) and the cleaved peptide (a.a. 178-191). Core protein could induce insulin resistance, steatosis and even hepatocellular carcinoma through various mechanisms. The peptide (a.a. 178-191) may play a role in the immune response. The polymorphism of this peptide is associated with cellular lipid drop accumulation, contributing to steatosis development. In addition to the conventional open reading frame (ORF), an ORF in the +1 frame overlaps with the core protein-coding sequence and encodes the alternative reading frame proteins (ARFP or core+1). ARFP/core+1/F protein could enhance hepatocyte growth and may regulate iron metabolism. In this review, we briefly summarize the current knowledge regarding the production of different core gene products and their roles in viral pathogenesis. PMID:24966583
The Role of eIF4E Activity in Breast Cancer
2011-08-01
protein; ORF, open reading frame; qPCR, quantitative PCR; RACE, rapid amplification of cDNA ends; RT, reverse transcriptase; uORF, upstream ORF; UTR...Reactions were also performed using template lacking RT (reverse transcriptase): products were either undetectable or greatly reduced (>30000-fold less...that a 5’UTR expressed from the human AXIN2 gene contains a sixty nucleotide sequence that is predicted to form a stable stem-loop structure6. This
Ogihara, Shinji; Saito, Ryoichi; Sawabe, Etsuko; Kozakai, Takahiro; Shima, Mari; Aiso, Yoshibumi; Fujie, Toshihide; Nukui, Yoko; Koike, Ryuji; Hagihara, Michio; Tohda, Shuji
2018-04-01
The recently developed PCR-based open reading frame typing (POT) method is a useful molecular typing tool. Here, we evaluated the performance of POT for molecular typing of methicillin-resistant Staphylococcus aureus (MRSA) isolates and compared its performance to those of multilocus sequence typing (MLST) and Staphylococcus protein A gene typing (spa typing). Thirty-seven MRSA isolates were collected between July 2012 and May 2015. MLST, spa typing, and POT were performed, and their discriminatory powers were evaluated using Simpson's index analysis. The MRSA isolates were classified into 11, 18, and 33 types by MLST, spa typing, and POT, respectively. The predominant strains identified by MLST, spa typing, and POT were ST8 and ST764, t002, and 93-191-127, respectively. The discriminatory power of MLST, spa typing, and POT was 0.853, 0.875, and 0.992, respectively, indicating that POT had the highest discriminatory power. Moreover, the results of MLST and spa were available after 2 days, whereas that of POT was available in 5 h. Furthermore, POT is rapid and easy to perform and interpret. Therefore, POT is a superior molecular typing tool for monitoring nosocomial transmission of MRSA.
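Discriminatory power based on Simpson's index is commonly computed as the probability that two isolates drawn without replacement belong to different types. The sketch below assumes that form, D = 1 - sum(n_j (n_j - 1)) / (N (N - 1)), which the abstract does not spell out. The type counts are invented for illustration: they only share the totals of 37 isolates and 33 types, and are not the study's data.

```python
def simpson_diversity(type_counts):
    """Simpson's index of diversity for a list of isolates-per-type counts.

    Assumed form: D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)),
    where n_j is the number of isolates of type j and N is the total.
    """
    total = sum(type_counts)
    if total < 2:
        raise ValueError("at least two isolates are required")
    same_type_pairs = sum(n * (n - 1) for n in type_counts)
    return 1 - same_type_pairs / (total * (total - 1))


# Hypothetical distribution: one type with 3 isolates, two types with 2,
# and 30 singleton types (37 isolates, 33 types in total).
counts = [3, 2, 2] + [1] * 30
print(round(simpson_diversity(counts), 3))
```

The index approaches 1 as isolates spread over more types, which is why a method that resolves 33 types among 37 isolates scores higher than one that resolves 11.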
Pieper-Fürst, U.; Madkour, M. H.; Mayer, F.; Steinbüchel, A.
1994-01-01
The N-terminal amino acid sequence of the polyhydroxyalkanoic acid (PHA) granule-associated M(r)-15,500 protein of Rhodococcus ruber (the GA14 protein) was analyzed. The sequence revealed that the corresponding structural gene is represented by open reading frame 3, encoding a protein with a calculated M(r) of 14,175 which was recently localized downstream of the PHA synthase gene (U. Pieper and A. Steinbüchel, FEMS Microbiol. Lett. 96:73-80, 1992). A recombinant strain of Escherichia coli XL1-Blue carrying the hybrid plasmid (pSKXA10*) with open reading frame 3 overexpressed the GA14 protein. The GA14 protein was subsequently purified in a three-step procedure including chromatography on DEAE-Sephacel, phenyl-Sepharose CL-4B, and Superose 12. Determination of the molecular weight by gel filtration as well as electron microscopic studies indicates that a tetrameric structure of the recombinant, native GA14 protein is most likely. Immunoelectron microscopy demonstrated a localization of the GA14 protein at the periphery of PHA granules as well as close to the cell membrane in R. ruber. Investigations of PHA-leaky and PHA-negative mutants of R. ruber indicated that expression of the GA14 protein depended strongly on PHA synthesis. PMID:8021220
Molecular cloning and expression of rat liver bile acid CoA ligase.
Falany, Charles N; Xie, Xiaowei; Wheeler, James B; Wang, Jin; Smith, Michelle; He, Dongning; Barnes, Stephen
2002-12-01
Bile acid CoA ligase (BAL) is responsible for catalyzing the first step in the conjugation of bile acids with amino acids. Sequencing of putative rat liver BAL cDNAs identified a cDNA (rBAL-1) possessing a 51 nucleotide 5'-untranslated region, an open reading frame of 2,070 bases encoding a 690 aa protein with a molecular mass of 75,960 Da, and a 138 nucleotide 3'-nontranslated region followed by a poly(A) tail. Identity of the cDNA was established by: 1) the rBAL-1 open reading frame encoded peptides obtained by chemical sequencing of the purified rBAL protein; 2) expressed rBAL-1 protein comigrated with purified rBAL during SDS-polyacrylamide gel electrophoresis; and 3) rBAL-1 expressed in insect Sf9 cells had enzymatic properties that were comparable to the enzyme isolated from rat liver. Evidence for a relationship between fatty acid and bile acid metabolism is suggested by specific inhibition of rBAL-1 by cis-unsaturated fatty acids and its high homology to a human very long chain fatty acid CoA ligase. In summary, these results indicate that the cDNA for rat liver BAL has been isolated and expression of the rBAL cDNA in insect Sf9 cells results in a catalytically active enzyme capable of utilizing several different bile acids as substrates.
Mandal, Bijoy Kumar; Kim, Tai-hoon
2013-01-01
We design an algorithm for a bioengine. The program enables searching for optimal alignments between two sequences, the host sequence (normal plant) and the query sequence (virus). Searching for homologues has become a routine operation on biological sequences, performed here in 4 × 4 combinations with different subsequences (word sizes). The program takes advantage of the high degree of homology between such sequences to construct an alignment of the matching regions. Its main aim is to detect overlapping reading frames. The program also makes it possible to identify highly infected clones by selecting the highest matching regions with minimal gap or mismatch zones, and to find unique virus clone matches. It is a small, portable, interactive front-end program intended for finding the regions of matching between the host sequence and query subsequences. All operations are carried out in a fraction of a second, depending on the required task and on the sequence length. PMID:24000321
The challenges of sequencing by synthesis.
Fuller, Carl W; Middendorf, Lyle R; Benner, Steven A; Church, George M; Harris, Timothy; Huang, Xiaohua; Jovanovich, Stevan B; Nelson, John R; Schloss, Jeffery A; Schwartz, David C; Vezenov, Dmitri V
2009-11-01
DNA sequencing-by-synthesis (SBS) technology, using a polymerase or ligase enzyme as its core biochemistry, has already been incorporated in several second-generation DNA sequencing systems with significant performance. Notwithstanding the substantial success of these SBS platforms, challenges continue to limit the ability to reduce the cost of sequencing a human genome to $100,000 or less. Achieving dramatically reduced cost with enhanced throughput and quality will require the seamless integration of scientific and technological effort across disciplines within biochemistry, chemistry, physics and engineering. The challenges include sample preparation, surface chemistry, fluorescent labels, optimizing the enzyme-substrate system, optics, instrumentation, understanding tradeoffs of throughput versus accuracy, and read-length/phasing limitations. By framing these challenges in a manner accessible to a broad community of scientists and engineers, we hope to solicit input from the broader research community on means of accelerating the advancement of genome sequencing technology.
Xin, Min; Zhang, Peipei; Liu, Wenwen; Ren, Yingdang; Cao, Mengji; Wang, Xifeng
2017-10-01
The complete nucleotide sequence of a novel positive single-stranded (+ss) RNA virus, tentatively named watermelon virus A (WVA), was determined using a combination of three methods: RNA sequencing, small RNA sequencing, and Sanger sequencing. The full genome of WVA comprises 8,372 nucleotides (nt), excluding the poly(A) tail, and contains four open reading frames (ORFs). The largest ORF, ORF1, encodes a putative replication-associated polyprotein (RP) with three conserved domains. ORF2 and ORF4 encode a movement protein (MP) and coat protein (CP), respectively. The putative product encoded by ORF3, with an estimated molecular mass of 25 kDa, has no significant similarity with other proteins. Identity and phylogenetic analyses indicate that WVA is a new virus, closely related to members of the family Betaflexiviridae. However, the final taxonomic allocation of WVA within the family is yet to be determined.
Gutiérrez, Pablo A; Alzate, Juan F; Montoya, Mauricio Marín
2015-06-01
Transcriptome analysis of a Cape gooseberry (Physalis peruviana) plant with leaf symptoms of a mild yellow mosaic typical of a viral disease revealed an infection with Potato virus X (PVX). The genome sequence of the PVX-Physalis isolate comprises 6435 nt and exhibits higher sequence similarity to members of the Eurasian group of PVX (~95 %) than to the American group (~77 %). Genome organization is similar to other PVX isolates with five open reading frames coding for proteins RdRp, TGBp1, TGBp2, TGBp3, and CP. 5' and 3' untranslated regions revealed all regulatory motifs typically found in PVX isolates. The PVX-Physalis genome is the only complete sequence available for a Potexvirus in Colombia and is a new addition to the restricted number of available sequences of PVX isolates infecting plant species different to potato.
Ducote, Matthew J.; Prakash, Shubha; Pettis, Gregg S.
2000-01-01
Efficient interbacterial transfer of streptomycete plasmid pIJ101 requires the pIJ101 tra gene, as well as a cis-acting plasmid function known as clt. Here we show that the minimal pIJ101 clt locus consists of a sequence no greater than 54 bp in size that includes essential inverted-repeat and direct-repeat sequences and is located in close proximity to the 3′ end of the korB regulatory gene. Evidence that sequences extending beyond the minimal locus and into the korB open reading frame influence clt transfer function and demonstration that clt-korB sequences are intrinsically curved raise the possibility that higher-order structuring of DNA and protein within this plasmid region may be an inherent feature of efficient pIJ101 transfer. PMID:11073933
Ducote, M J; Prakash, S; Pettis, G S
2000-12-01
Efficient interbacterial transfer of streptomycete plasmid pIJ101 requires the pIJ101 tra gene, as well as a cis-acting plasmid function known as clt. Here we show that the minimal pIJ101 clt locus consists of a sequence no greater than 54 bp in size that includes essential inverted-repeat and direct-repeat sequences and is located in close proximity to the 3' end of the korB regulatory gene. Evidence that sequences extending beyond the minimal locus and into the korB open reading frame influence clt transfer function and demonstration that clt-korB sequences are intrinsically curved raise the possibility that higher-order structuring of DNA and protein within this plasmid region may be an inherent feature of efficient pIJ101 transfer.
Kumar, Rajesh; Grover, Sunita; Kaushik, Jai K; Batish, Virender Kumar
2014-01-01
Lactobacillus plantarum is a flexible and versatile microorganism that inhabits a variety of niches, and its genome may express up to four bsh genes to maximize its survival in the mammalian gut. However, the ecological significance of multiple bsh genes in L. plantarum is still not clearly understood. Hence, this study demonstrated the disruption of the bile salt hydrolase (bsh1) gene due to the insertion of a transposable element in L. plantarum Lp20, a wild strain of human fecal origin. Surprisingly, L. plantarum strain Lp20 produced a ∼2.0 kb bsh1 amplicon, compared with the normal-size (∼1.0 kb) bsh1 amplicon of Bsh(+) L. plantarum Lp21. Strain Lp20 exhibited minimal Bsh activity in spite of having intact bsh2, bsh3 and bsh4 genes in its genome and hence had a Bsh(-) phenotype. Cloning and sequence characterization of the Lp20 bsh1 gene predicted four individual open reading frames (ORFs) within this region. BLAST analysis of ORF1 and ORF2 revealed significant sequence similarity to the L. plantarum bsh1 gene, while ORF3 and ORF4 showed high sequence homology to IS30-family transposases. Since an IS30-related transposon element was inserted within the Lp20 bsh1 gene in reverse orientation (3'-5'), it introduced several stop codons and disrupted the protein reading frames of both Bsh1 and the transposase. Inverted terminal repeats (GGCAGATTG) of the transposon mediated its insertion at nt positions 255-263 and 1301-1309 of the Lp20 bsh1 gene. In conclusion, insertion of an IS30-related transposon within the bsh1 gene sequence of L. plantarum strain Lp20 destroyed the integrity and functionality of the Bsh1 enzyme. Additionally, this transposon DNA sequence remains active among various Lactobacillus spp. and hence has the potential to be explored in the development of an efficient insertion mutagenesis system.
O'Farrell, C. L.; Strom, M.S.
1999-01-01
Virulence mechanisms utilized by the salmonid fish pathogen Renibacterium salmoninarum are poorly understood. One potential virulence factor is p57 (also designated MSA for major soluble antigen), an abundant 57 kDa soluble protein that is predominantly localized on the bacterial cell surface with significant levels released into the extracellular milieu. Previous studies of an attenuated strain, MT 239, indicated that it differs from virulent strains in the amount of surface-associated p57. In this report, we show that overall expression of p57 in R. salmoninarum MT 239 is considerably reduced as compared to a virulent strain, ATCC 33209. The amount of cell-associated p57 is decreased while the level of p57 in the culture supernatant is nearly equivalent between the strains. To determine if the lowered amount of cell-associated p57 was due to a sequence defect in p57, a genetic comparison was performed. Two copies of the gene encoding p57 (msa1 and msa2) were found in 33209 and MT 239, as well as in several other virulent isolates. Both copies from 33209 and MT 239 were cloned and sequenced and found to be identical to each other, and identical between the 2 strains. A comparison of msa1 and msa2 within each strain showed that their sequences diverge 40 base pairs 5' to the open reading frame, while sequences 3' to the open reading frame are essentially identical for at least 225 base pairs. Northern blot analysis showed no difference in steady state levels of msa mRNA between the 2 strains. These data suggest that while cell-surface localization of p57 may be important for R. salmoninarum virulence, the differences in localization and total p57 expression between 33209 and MT 239 are not due to differences in msa sequence or differences in steady state transcript levels.
Cytochrome oxidase subunit II gene in mitochondria of Oenothera has no intron
Hiesel, Rudolf; Brennicke, Axel
1983-01-01
The cytochrome oxidase subunit II gene has been localized in the mitochondrial genome of Oenothera berteriana and the nucleotide sequence has been determined. The coding sequence contains 777 bp and, unlike the corresponding gene in Zea mays, is not interrupted by an intron. No TGA codon is found within the open reading frame. The codon CGG, as in the maize gene, is used in place of tryptophan codons of corresponding genes in other organisms. At position 742 in the Oenothera sequence the TGG of maize is changed into a CGG codon, where Trp is conserved as the amino acid in other organisms. Homologous sequences occur more than once in the mitochondrial genome as several mitochondrial DNA species hybridize with DNA probes of the cytochrome oxidase subunit II gene. PMID:16453484
Identification and Cloning of gusA, Encoding a New β-Glucuronidase from Lactobacillus gasseri ADH
Russell, W. M.; Klaenhammer, T. R.
2001-01-01
The gusA gene, encoding a new β-glucuronidase enzyme, has been cloned from Lactobacillus gasseri ADH. This is the first report of a β-glucuronidase gene cloned from a bacterial source other than Escherichia coli. A plasmid library of L. gasseri chromosomal DNA was screened for complementation of an E. coli gus mutant. Two overlapping clones that restored β-glucuronidase activity in the mutant strain were sequenced and revealed three complete and two partial open reading frames. The largest open reading frame, spanning 1,797 bp, encodes a 597-amino-acid protein that shows 39% identity to β-glucuronidase (GusA) of E. coli K-12 (EC 3.2.1.31). The other two complete open reading frames, which are arranged to be separately transcribed, encode a putative bile salt hydrolase and a putative protein of unknown function with similarities to MerR-type regulatory proteins. Overexpression of GusA was achieved in a β-glucuronidase-negative L. gasseri strain by expressing the gusA gene, subcloned onto a low-copy-number shuttle vector, from the strong Lactobacillus P6 promoter. GusA was also expressed in E. coli from a pET expression system. Preliminary characterization of the GusA protein from crude cell extracts revealed that the enzyme was active across an acidic pH range and a broad temperature range. An analysis of other lactobacilli identified β-glucuronidase activity and gusA homologs in other L. gasseri isolates but not in other Lactobacillus species tested. PMID:11229918
Bain, Christine; Parroche, Peggy; Lavergne, Jean Pierre; Duverger, Blandine; Vieux, Claude; Dubois, Valérie; Komurian-Pradel, Florence; Trépo, Christian; Gebuhrer, Lucette; Paranhos-Baccala, Glaucia; Penin, François; Inchauspé, Geneviève
2004-01-01
In vitro studies have described the synthesis of an alternative reading frame form of the hepatitis C virus (HCV) core protein that was named F protein or ARFP (alternative reading frame protein) and includes a domain coded by the +1 open reading frame of the RNA core coding region. The expression of this protein in HCV-infected patients remains controversial. We have analyzed peripheral blood from 47 chronically or previously HCV-infected patients for the presence of T lymphocytes and antibodies specific to the ARFP. Anti-ARFP antibodies were detected in 41.6% of the patients infected with various HCV genotypes. Using a specific ARFP 99-amino-acid polypeptide as well as four ARFP predicted class I-restricted 9-mer peptides, we show that 20% of the patients display specific lymphocytes capable of producing gamma interferon, interleukin-10, or both cytokines. Patients harboring three different viral genotypes (1a, 1b, and 3) carried T lymphocytes reactive to genotype 1b-derived peptides. In longitudinal analysis of patients receiving therapy, both core and ARFP-specific T-cell- and B-cell-mediated responses were documented. The magnitude and kinetics of the HCV antigen-specific responses differed and were not linked with viremia or therapy outcome. These observations provide strong and new arguments in favor of the synthesis, during natural HCV infection, of an ARFP derived from the core sequence. Moreover, the present data provide the first demonstration of the presence of T-cell-mediated immune responses directed to this novel HCV antigen. PMID:15367612
Tn5401, a new class II transposable element from Bacillus thuringiensis.
Baum, J A
1994-01-01
A new class II (Tn3-like) transposable element, designated Tn5401, was recovered from a sporulation-deficient variant of Bacillus thuringiensis subsp. morrisoni EG2158 following its insertion into a recombinant plasmid. Sequence analysis of the insert revealed a 4,837-bp transposon with two large open reading frames, in the same orientation, encoding proteins of 36 kDa (306 residues) and 116 kDa (1,005 residues) and 53-bp terminal inverted repeats. The deduced amino acid sequence for the 36-kDa protein shows 24% sequence identity with the TnpI recombinase of the B. thuringiensis transposon Tn4430, a member of the phage integrase family of site-specific recombinases. The deduced amino acid sequence for the 116-kDa protein shows 42% sequence identity with the transposase of Tn3 but only 28% identity with the TnpA transposase of Tn4430. Two small open reading frames of unknown function, designated orf1 (85 residues) and orf2 (74 residues), were also identified. Southern blot analysis indicated that Tn5401, in contrast to Tn4430, is not commonly found among different subspecies of B. thuringiensis and is not typically associated with known insecticidal crystal protein genes. Transposition was studied with B. thuringiensis by using plasmid pEG922, a temperature-sensitive shuttle vector containing Tn5401. Tn5401 transposed to both chromosomal and plasmid target sites but displayed an apparent preference for plasmid sites. Transposition was replicative and resulted in the generation of a 5-bp duplication at the target site. Transcriptional start sites within Tn5401 were mapped by primer extension analysis. Two promoters, designated PL and PR, direct the transcription of orf1-orf2 and tnpI-tnpA, respectively, and are negatively regulated by TnpI. Sequence comparison of the promoter regions of Tn5401 and Tn4430 suggests that the conserved sequence element ATGTCCRCTAAY mediates TnpI binding and cointegrate resolution. 
The same element is contained within the 53-bp terminal inverted repeats, thus accounting for their unusual lengths and suggesting an additional role for TnpI in regulating Tn5401 transposition. PMID:7514590
Identification of a novel vitivirus from grapevines in New Zealand.
Blouin, Arnaud G; Keenan, Sandi; Napier, Kathryn R; Barrero, Roberto A; MacDiarmid, Robin M
2018-01-01
We report a sequence of a novel vitivirus from Vitis vinifera obtained using two high-throughput sequencing (HTS) strategies on RNA. The initial discovery from small-RNA sequencing was confirmed by HTS of the total RNA and Sanger sequencing. The new virus has a genome structure similar to the one reported for other vitiviruses, with five open reading frames (ORFs) coding for the conserved domains described for members of that genus. Phylogenetic analysis of the complete genome sequence confirmed its affiliation to the genus Vitivirus, with the closest described viruses being grapevine virus E (GVE) and Agave tequilana leaf virus (ATLV). However, the virus we report is distinct and shares only 51% amino acid sequence identity with GVE in the replicase polyprotein and 66.8% amino acid sequence identity with ATLV in the coat protein. This is well below the threshold determined by the ICTV for species demarcation, and we propose that this virus represents a new species. It is provisionally named "grapevine virus G".
Molecular variability analysis of five new complete cacao swollen shoot virus genomic sequences.
Muller, E; Sackey, S
2005-01-01
Cacao swollen shoot virus (CSSV), a member of the family Caulimoviridae, genus Badnavirus, occurs in all the main cacao-growing areas of West Africa. We amplified, cloned and sequenced complete genomes of five new isolates, two originating from Togo and three originating from Ghana. The genomes of these five newly sequenced isolates all contain the five putative open reading frames I, II, III, X and Y described for the first sequenced CSSV isolate, Agou1, originating from Togo. Their genomes have been aligned with the genome of Agou1. The nucleotide and amino acid sequence identities between isolates have been calculated and a phylogenetic analysis has been made including other pararetroviruses. Maximum nucleotide sequence variability between complete genomes of CSSV isolates was 29.4%. Geographical differentiation between isolates appears more important than differentiation between mild and severe isolates. ORF X differs greatly in size and sequence between the Togolese isolates Nyongbo2 and Agou1 and the four other isolates; its functional role is therefore clearly questionable.
Magiorkinis, E; Paraskevis, D; Pavlopoulou, I D; Kantzanou, M; Haida, C; Hatzakis, A; Boletis, I N
2013-08-01
The purpose of this study was to present a fatal case of fulminant hepatitis B (FHB) that developed in a renal transplant recipient, immunized against hepatitis B, 1 year post transplantation. Polymerase chain reaction amplification and full genome sequencing were performed to investigate whether specific mutations were associated with hepatitis B virus (HBV) transmission and FHB. Molecular analysis revealed multiple mutations in various open reading frames of HBV, the most important being the G145R escape mutation and a frameshift mutation-insertion (1838insA) within the pre-C/C reading frame. Our results highlight the possibility of developing FHB, despite previous immunization against HBV or administration of hyperimmune gammaglobulin, because of the selection of escape virus mutants. The current literature and guidelines regarding renal transplantation from hepatitis B surface antigen (HBsAg)-positive to HBsAg-negative patients were also reviewed. © 2013 John Wiley & Sons A/S.
Zong, Li; Qin, Yanli; Jia, Haodi; Ye, Lei; Wang, Yongxiang; Zhang, Jiming; Wands, Jack R; Tong, Shuping; Li, Jisu
2017-05-01
Hepatitis B virus (HBV) transcribes two subsets of 3.5-kb RNAs: precore RNA for hepatitis B e antigen (HBeAg) expression, and pregenomic RNA for core and P protein translation as well as genome replication. HBeAg expression could be prevented by mutations in the precore region, while an upstream open reading frame (uORF) has been proposed as a negative regulator of core protein translation. We employed replication competent HBV DNA constructs and transient transfection experiments in Huh7 cells to verify the uORF effect and to explore the alternative function of precore RNA. Optimized Kozak sequence for the uORF or extra ATG codons as present in some HBV genotypes reduced core protein expression. G1896A nonsense mutation promoted more efficient core protein expression than mutated precore ATG, while a +1 frameshift mutation was ineffective. In conclusion, various HBeAg-negative precore mutations and mutations affecting uORF differentially regulate core protein expression and genome replication. Copyright © 2017 Elsevier Inc. All rights reserved.
Wein, Nicolas; Vulin, Adeline; Falzarano, Maria S; Szigyarto, Christina Al-Khalili; Maiti, Baijayanta; Findlay, Andrew; Heller, Kristin N; Uhlén, Mathias; Bakthavachalu, Baskar; Messina, Sonia; Vita, Giuseppe; Passarelli, Chiara; Brioschi, Simona; Bovolenta, Matteo; Neri, Marcella; Gualandi, Francesca; Wilton, Steve D; Rodino-Klapac, Louise R; Yang, Lin; Dunn, Diane M; Schoenberg, Daniel R; Weiss, Robert B; Howard, Michael T; Ferlini, Alessandra; Flanigan, Kevin M
2014-09-01
Most mutations that truncate the reading frame of the DMD gene cause loss of dystrophin expression and lead to Duchenne muscular dystrophy. However, amelioration of disease severity has been shown to result from alternative translation initiation beginning in DMD exon 6 that leads to expression of a highly functional N-truncated dystrophin. Here we demonstrate that this isoform results from usage of an internal ribosome entry site (IRES) within exon 5 that is glucocorticoid inducible. We confirmed IRES activity by both peptide sequencing and ribosome profiling in muscle from individuals with minimal symptoms despite the presence of truncating mutations. We generated a truncated reading frame upstream of the IRES by exon skipping, which led to synthesis of a functional N-truncated isoform in both human subject-derived cell lines and in a new DMD mouse model, where expression of the truncated isoform protected muscle from contraction-induced injury and corrected muscle force to the same level as that observed in control mice. These results support a potential therapeutic approach for patients with mutations within the 5' exons of DMD.
The Nucleotide Sequence and Spliced pol mRNA Levels of the Nonprimate Spumavirus Bovine Foamy Virus
Holzschu, Donald L.; Delaney, Mari A.; Renshaw, Randall W.; Casey, James W.
1998-01-01
We have determined the complete nucleotide sequence of a replication-competent clone of bovine foamy virus (BFV) and have quantitated the amount of spliced pol mRNA processed early in infection. The 544-amino-acid Gag protein precursor has little sequence similarity with its primate foamy virus homologs, but the putative nucleocapsid (NC) protein, like the primate NCs, contains the three glycine-arginine-rich regions that are postulated to bind genomic RNA during virion assembly. The BFV gag and pol open reading frames overlap, with pro and pol in the same translational frame. As with the human foamy virus (HFV) and feline foamy virus, we have detected a spliced pol mRNA by PCR. Quantitatively, this mRNA approximates the level of full-length genomic RNA early in infection. The integrase (IN) domain of reverse transcriptase does not contain the canonical HH-CC zinc finger motif present in all characterized retroviral INs, but it does contain a nearby histidine residue that could conceivably participate as a member of the zinc finger. The env gene encodes a protein that is over 40% identical in sequence to the HFV Env. By comparison, the Gag precursor of BFV is predicted to be only 28% identical to the HFV protein. PMID:9499074
Howitt, Crispin A.; Udall, Pacer K.; Vermaas, Wim F. J.
1999-01-01
Analysis of the genome of Synechocystis sp. strain PCC 6803 reveals three open reading frames (slr0851, slr1743, and sll1484) that may code for type 2 NAD(P)H dehydrogenases (NDH-2). The sequence similarity between the translated open reading frames and NDH-2s from other organisms is low, generally not exceeding 30% identity. However, NAD(P)H and flavin adenine dinucleotide binding motifs are conserved in all three putative NDH-2s in Synechocystis sp. strain PCC 6803. The three open reading frames were cloned, and deletion constructs were made for each. An expression construct containing one of the three open reading frames, slr1743, was able to functionally complement an Escherichia coli mutant lacking both NDH-1s and NDH-2s. Therefore, slr0851, slr1743, and sll1484 have been designated ndbA, ndbB, and ndbC, respectively. Strains that lacked one or more of the ndb genes were created in wild-type and photosystem (PS) I-less backgrounds. Deletion of ndb genes led to small changes in photoautotrophic growth rates and respiratory activities. Electron transfer rates into the plastoquinone pool in thylakoids in darkness were consistent with the presence of a small amount of NDH-2 activity in thylakoids. No difference was observed between wild-type and the Ndb-less strains in the banding patterns seen on native gels when stained for either NADH or NADPH dehydrogenase activity, indicating that the Ndb proteins do not accumulate to high levels. A striking phenotype of the PS I-less background strains lacking one or more of the NDH-2s is that they were able to grow at high light intensities that were lethal to the control strain but they retained normal PS II activity. We suggest that the Ndb proteins in Synechocystis sp. strain PCC 6803 are redox sensors and that they play a regulatory role responding to the redox state of the plastoquinone pool. PMID:10383967
NASA Technical Reports Server (NTRS)
Kaine, B. P.; Mehr, I. J.; Woese, C. R.
1994-01-01
Through random search, a gene from Thermococcus celer has been identified and sequenced that appears to encode a transcription-associated protein (110 amino acid residues). The sequence has clear homology to approximately the last half of an open reading frame reported previously for Sulfolobus acidocaldarius [Langer, D. & Zillig, W. (1993) Nucleic Acids Res. 21, 2251]. The protein translations of these two archaeal genes in turn are homologs of a small subunit found in eukaryotic RNA polymerase I (A12.2) and the counterpart of this from RNA polymerase II (B12.6). Homology is also seen with the eukaryotic transcription factor TFIIS, but it involves only the terminal 45 amino acids of the archaeal proteins. Evolutionary implications of these homologies are discussed.
Genetic characterization of frameshift suppressors with new decoding properties.
Hughes, D; Thompson, S; O'Connor, M; Tuohy, T; Nichols, B P; Atkins, J F
1989-01-01
Suppressor mutants that cause ribosomes to shift reading frame at specific and new sequences are described. Suppressors for trpE91, the only known suppressible -1 frameshift mutant, have been isolated in Escherichia coli and in Salmonella typhimurium. E. coli hopR acts on trpE91 within the 9-base-pair sequence GGA GUG UGA, is dominant, and is located at min 52 on the chromosome. Its Salmonella homolog maps at an equivalent position and arises as a rarer class in that organism as compared with E. coli. The Salmonella suppressor, hopE, believed to be in a duplicate copy of the same gene, maps at min 17. The +1 suppressor, sufT, acts at the nonmonotonous sequence CCGU, is dominant, and maps at min 59 on the Salmonella chromosome. PMID:2644219
Genomic analysis of WCP30 Phage of Weissella cibaria for Dairy Fermented Foods.
Lee, Young-Duck; Park, Jong-Hyun
2017-01-01
In this study, we report the morphogenetic analysis and genome sequence of a new WCP30 phage of Weissella cibaria, isolated from a fermented food. Based on its morphology, as observed by transmission electron microscopy, WCP30 phage belongs to the family Siphoviridae. Genomic analysis of WCP30 phage showed that it had a 33,697-bp double-stranded DNA genome with 41.2% G+C content. Bioinformatics analysis of the genome revealed 35 open reading frames. A BLASTN search showed that WCP30 phage had low sequence similarity compared to other phages infecting lactic acid bacteria. This is the first report of the morphological features and complete genome sequence of WCP30 phage, which may be useful for controlling the fermentation of dairy foods.
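The G+C content quoted above is a simple base-composition ratio. The following is a minimal sketch of that arithmetic, assuming ambiguous bases such as N are excluded from the denominator; the function name `gc_content` is hypothetical and is not taken from the study.

```python
def gc_content(seq: str) -> float:
    """Return G+C content as a percentage of the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # assumption: ambiguity codes such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total
```

Applied to a complete genome sequence read from a FASTA file, a function like this would yield a single percentage comparable to the figure reported in the abstract.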
Puli'uvea, Christopher; Khan, Subuhi; Chang, Wee-Leong; Valmonte, Gardette; Pearson, Michael N; Higgins, Colleen M
2017-02-01
We present the first complete genome of vanilla mosaic virus (VanMV). The VanMV genomic structure is consistent with that of a potyvirus, containing a single open reading frame (ORF) encoding a polyprotein of 3139 amino acids. Motif analyses indicate the polyprotein can be cleaved into the expected ten individual proteins; other recognised potyvirus motifs are also present. As expected, the VanMV genome shows high sequence similarity to the published Dasheen mosaic virus (DsMV) genome sequences; comparisons with DsMV continue to support VanMV as a vanilla-infecting strain of DsMV. Phylogenetic analyses indicate that VanMV and DsMV share a common ancestor, with VanMV having the closest relationship with DsMV strains from the South Pacific.
Vasala, A; Dupont, L; Baumann, M; Ritzenthaler, P; Alatossava, T
1993-01-01
Virulent phage LL-H and temperate phage mv4 are two related bacteriophages of Lactobacillus delbrueckii. The gene clusters encoding structural proteins of these two phages have been sequenced and further analyzed. Six open reading frames (ORF-1 to ORF-6) were detected. Protein sequencing and Western immunoblotting experiments confirmed that ORF-3 (g34) encoded the main capsid protein Gp34. The presence of a putative late promoter in front of the phage LL-H g34 gene was suggested by primer extension experiments. Comparative sequence analysis between phage LL-H and phage mv4 revealed striking similarities in the structure and organization of this gene cluster, suggesting that the genes encoding phage structural proteins belong to a highly conservative module. PMID:8497043
RAPSearch: a fast protein similarity search tool for short reads
2011-01-01
Background: Next Generation Sequencing (NGS) is producing enormous corpuses of short DNA reads, affecting emerging fields like metagenomics. Protein similarity search, a key step to achieve annotation of protein-coding genes in these short reads and identification of their biological functions, faces daunting challenges because of the very sizes of the short read datasets. Results: We developed a fast protein similarity search tool RAPSearch that utilizes a reduced amino acid alphabet and suffix array to detect seeds of flexible length. For short reads (translated in 6 frames) we tested, RAPSearch achieved ~20-90 times speedup as compared to BLASTX. RAPSearch missed only a small fraction (~1.3-3.2%) of BLASTX similarity hits, but it also discovered additional homologous proteins (~0.3-2.1%) that BLASTX missed. By contrast, BLAT, a tool that is even slightly faster than RAPSearch, had significant loss of sensitivity as compared to RAPSearch and BLAST. Conclusions: RAPSearch is implemented as open-source software and is accessible at http://omics.informatics.indiana.edu/mg/RAPSearch. It enables faster protein similarity search. The application of RAPSearch in metagenomics has also been demonstrated. PMID:21575167
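The abstract mentions that short reads are translated in 6 frames before the protein search. The sketch below illustrates only that pre-processing step, assuming the standard genetic code and unambiguous DNA input; it does not reproduce RAPSearch's reduced amino acid alphabet or suffix array, and the names `six_frame`, `translate` and `reverse_complement` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame(seq: str) -> dict:
    """Map frame labels (+1..+3, -1..-3) to the translated peptide strings."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames
```

Each read therefore produces six candidate peptide strings, and a protein-level search tool would compare all of them against the protein database.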
Czeizler, Amalia; Garbarino, Ellen
2017-01-01
The research extends construal theory by testing if a match between the temporal construal framing of a blood donation decision and a blood donation request leads to higher donation intentions than a mismatch. Results show participants considering future donation who read an abstract donation request have significantly higher donation intentions than those who read a concrete request. Conversely, participants considering donating today who read a concrete donation request have significantly higher donation intentions than those who read an abstract request. This study confirms the importance of matching the construal framing of the communication to the temporal framing of the decision.
Chalick, Michael; Jacobi, Oded; Pichinuk, Edward; Garbar, Christian; Bensussan, Armand; Meeker, Alan; Ziv, Ravit; Zehavi, Tania; Smorodinsky, Nechama I; Hilkens, John; Hanisch, Franz-Georg; Rubinstein, Daniel B; Wreschner, Daniel H
2016-01-01
Translation of mRNA in alternate reading frames (ARF) is a naturally occurring process heretofore underappreciated as a generator of protein diversity. The MUC1 gene encodes MUC1-TM, a signal-transducing trans-membrane protein highly expressed in human malignancies. Here we show that an AUG codon downstream to the MUC1-TM initiation codon initiates an alternate reading frame thereby generating a novel protein, MUC1-ARF. MUC1-ARF, like its MUC1-TM 'parent' protein, contains a tandem repeat (VNTR) domain. However, the amino acid sequence of the MUC1-ARF tandem repeat as well as N- and C- sequences flanking it differ entirely from those of MUC1-TM. In vitro protein synthesis assays and extensive immunohistochemical as well as western blot analyses with MUC1-ARF specific monoclonal antibodies confirmed MUC1-ARF expression. Rather than being expressed at the cell membrane like MUC1-TM, immunostaining showed that MUC1-ARF protein localizes mainly in the nucleus: Immunohistochemical analyses of MUC1-expressing tissues demonstrated MUC1-ARF expression in the nuclei of secretory luminal epithelial cells. MUC1-ARF expression varies in different malignancies. While the malignant epithelial cells of pancreatic cancer show limited expression, in breast cancer tissue MUC1-ARF demonstrates strong nuclear expression. Proinflammatory cytokines upregulate expression of MUC1-ARF protein and co-immunoprecipitation analyses demonstrate association of MUC1-ARF with SH3 domain-containing proteins. Mass spectrometry performed on proteins coprecipitating with MUC1-ARF demonstrated Glucose-6-phosphate 1-dehydrogenase (G6PD) and Dynamin 2 (DNM2). These studies not only reveal that the MUC1 gene generates a previously unidentified MUC1-ARF protein, they also show that just like its 'parent' MUC1-TM protein, MUC1-ARF is apparently linked to signaling and malignancy, yet a definitive link to these processes and the roles it plays awaits a precise identification of its molecular functions. 
Comprising at least 524 amino acids, MUC1-ARF is, furthermore, the longest ARF protein heretofore described. PMID:27768738
Assiri, Abdullah M.; Biggs, Holly M.; Abedi, Glen R.; Lu, Xiaoyan; Bin Saeed, Abdulaziz; Abdalla, Osman; Mohammed, Mutaz; Al-Abdely, Hail M.; Algarni, Homoud S.; Alhakeem, Raafat F.; Almasri, Malak M.; Alsharef, Ali A.; Nooh, Randa; Erdman, Dean D.; Gerber, Susan I.; Watson, John T.
2016-01-01
During July–August 2015, the number of cases of Middle East respiratory syndrome (MERS) reported from Saudi Arabia increased dramatically. We reviewed the 143 confirmed cases from this period and classified each based upon likely transmission source. We found that the surge in cases resulted predominantly (90%) from secondary transmission largely attributable to an outbreak at a single healthcare facility in Riyadh. Genome sequencing of MERS coronavirus from 6 cases demonstrated continued circulation of the recently described recombinant virus. A single unique frameshift deletion in open reading frame 5 was detected in the viral sequence from 1 case. PMID:27704019
Complete genome sequence of a novel avian paramyxovirus isolated from wild birds in South Korea.
Jeong, Jipseol; Kim, Youngsik; An, Injung; Wang, Seung-Jun; Kim, Yongkwan; Lee, Hyun-Jeong; Choi, Kang-Seuk; Im, Se-Pyeong; Min, Wongi; Oem, Jae-Ku; Jheong, Weonhwa
2018-01-01
A novel avian paramyxovirus (APMV), Cheonsu1510, was isolated from wild bird feces in South Korea and serologically and genetically characterized. In hemagglutination inhibition tests, antiserum against Cheonsu1510 showed low reactivity with other APMVs and vice versa. The complete genome of Cheonsu1510 comprised 15,408 nucleotides, contained six open reading frames (3'-N-P-M-F-HN-L-5'), and showed low sequence identity to other APMVs (< 63%) and a unique genomic composition. Phylogenetic analysis revealed that Cheonsu1510 was related to but distinct from APMV-1, -9, and -15. These results suggest that Cheonsu1510 represents a new APMV serotype, APMV-17.
Nagai, Makoto; Omatsu, Tsutomu; Aoki, Hiroshi; Otomaru, Konosuke; Uto, Takehiko; Koizumi, Motoya; Minami-Fukuda, Fujiko; Takai, Hikaru; Murakami, Toshiaki; Masuda, Tsuneyuki; Yamasato, Hiroshi; Shiokawa, Mai; Tsuchiaka, Shinobu; Naoi, Yuki; Sano, Kaori; Okazaki, Sachiko; Katayama, Yukie; Oba, Mami; Furuya, Tetsuya; Shirai, Junsuke; Mizutani, Tetsuya
2015-10-01
A viral metagenomics approach was used to investigate fecal samples of Japanese calves with and without diarrhea. Of the different viral pathogens detected, read counts gave nearly complete astrovirus-related RNA sequences in 15 of the 146 fecal samples collected in three distinct areas (Hokkaido, Ishikawa, and Kagoshima Prefectures) between 2009 and 2015. Due to the lack of genetic information about bovine astroviruses (BoAstVs) in Japan, these sequences were analyzed in this study. Nine of the 15 Japanese BoAstVs were closely related to Chinese BoAstVs and clustered into a lineage (tentatively named lineage 1) in all phylogenetic trees. Three of 15 strains were phylogenetically separate from lineage 1, showing low sequence identities, and clustered instead with an American strain isolated from cattle with respiratory disease (tentatively named lineage 2). Interestingly, two of 15 strains clustered with lineage 1 in the open reading frame (ORF)1a and ORF1b regions, while they clustered with lineage 2 in the ORF2 region. Remarkably, one of 15 strains exhibited low amino acid sequence similarity to other BoAstVs and was clustered separately with porcine astrovirus type 5 in all trees, and ovine astrovirus in the ORF2 region, suggesting past interspecies transmission.
Cloning and sequence analysis of Hemonchus contortus HC58cDNA.
Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li
2007-06-01
The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with an open reading frame of 717 bp encoding a precursor of 239 amino acids, a protein of approximately 27 kDa. Analysis of the amino acid sequence revealed the conserved cysteine, histidine and asparagine residues, the occluding loop pattern, the hemoglobinase motif and the glutamine of the oxyanion hole that are characteristic of cathepsin B-like proteases (CBL). Comparison of the predicted amino acid sequences showed that the protein shared 33.5-58.7% identity with cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA was a member of the papain family.
Stephenson, F H; Ballard, B T; Boyer, H W; Rosenberg, J M; Greene, P J
1989-12-21
The RsrI endonuclease, a type-II restriction endonuclease (ENase) found in Rhodobacter sphaeroides, is an isoschizomer of the EcoRI ENase. A clone containing an 11-kb BamHI fragment was isolated from an R. sphaeroides genomic DNA library by hybridization with synthetic oligodeoxyribonucleotide probes based on the N-terminal amino acid (aa) sequence of RsrI. Extracts of E. coli containing a subclone of the 11-kb fragment display RsrI activity. Nucleotide sequence analysis reveals an 831-bp open reading frame encoding a polypeptide of 277 aa. A 50% identity exists within a 266-aa overlap between the deduced aa sequences of RsrI and EcoRI. Regions of 75-100% aa sequence identity correspond to key structural and functional regions of EcoRI. The type-II ENases have many common properties, and a common origin might have been expected. Nevertheless, this is the first demonstration of aa sequence similarity between ENases produced by different organisms.
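Figures such as "50% identity within a 266-aa overlap" are computed from a pairwise alignment. The following is a minimal sketch of that calculation, assuming the two sequences are already aligned to equal length with `-` as the gap character and that gapped columns are excluded from the overlap; the abstract does not state which convention was used, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aln_a: str, aln_b: str, gap: str = "-"):
    """Return (percent identity, overlap length) for two aligned sequences.

    Columns in which either sequence has a gap are skipped (an assumption;
    other conventions divide by the full alignment length instead).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    overlap = 0
    for x, y in zip(aln_a, aln_b):
        if x == gap or y == gap:
            continue  # gapped column: not part of the compared overlap
        overlap += 1
        if x == y:
            matches += 1
    if overlap == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / overlap, overlap
```

Under this convention, 133 identical residues across 266 compared positions would give the 50% figure quoted above.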
Vandenbol, M; Jauniaux, J C; Grenson, M
1989-11-15
The complete nucleotide (nt) sequence of the PUT4 gene, whose product is required for high-affinity proline active transport in the yeast Saccharomyces cerevisiae, is presented. The sequence contains a single long open reading frame of 1881 nt, encoding a polypeptide with a calculated Mr of 68,795. The predicted protein is strongly hydrophobic and exhibits six potential glycosylation sites. Its hydropathy profile suggests the presence of twelve membrane-spanning regions flanked by hydrophilic N- and C-terminal domains. The N terminus does not resemble signal sequences found in secreted proteins. These features are characteristic of integral membrane proteins catalyzing translocation of ligands across cellular membranes. Protein sequence comparisons indicate strong resemblance to the arginine and histidine permeases of S. cerevisiae, but no marked sequence similarity to the proline permease of Escherichia coli or to other known prokaryotic or eukaryotic transport proteins. The strong similarity between the three yeast amino acid permeases suggests a common ancestor for the three proteins.
Ren, Qian; Au, Hilda H.T.; Wang, Qing S.; Lee, Seonghoon; Jan, Eric
2014-01-01
The dicistrovirus intergenic internal ribosome entry site (IGR IRES) directly recruits the ribosome and initiates translation using a non-AUG codon. A subset of IGR IRESs initiates translation in either of two overlapping open reading frames (ORFs), resulting in expression of the 0 frame viral structural polyprotein and an overlapping +1 frame ORFx. A U–G base pair adjacent to the anticodon-like pseudoknot of the IRES directs +1 frame translation. Here, we show that the U-G base pair is not absolutely required for +1 frame translation. Extensive mutagenesis demonstrates that 0 and +1 frame translation can be uncoupled. Ribonucleic acid (RNA) structural probing analyses reveal that the mutant IRESs adopt distinct conformations. Toeprinting analysis suggests that the reading frame is selected at a step downstream of ribosome assembly. We propose a model whereby the IRES adopts conformations to occlude the 0 frame aminoacyl-tRNA thereby allowing delivery of the +1 frame aminoacyl-tRNA to the A site to initiate translation of ORFx. This study provides a new paradigm for programmed recoding mechanisms that increase the coding capacity of a viral genome. PMID:25038250
Ehrmann, M A; Vogel, R E
2001-11-01
An insertion sequence has been identified in the genome of Lactobacillus sanfranciscensis DSM 20451T as a segment of 1351 nucleotides containing 37-bp imperfect terminal inverted repeats. The sequence of this element encodes two out-of-phase, overlapping open reading frames, orfA and orfB, from which three putative proteins are produced. OrfAB is a transframe protein produced by -1 translational frameshifting between orfA and orfB and is presumed to be the transposase. The large orfAB of this element encodes a 342-amino-acid protein that displays similarities with transposases encoded by bacterial insertion sequences belonging to the IS3 family. In L. sanfranciscensis type strain DSM 20451T multiple truncated IS elements were identified. Inverse PCR was used to analyze the target sites of four of these elements, but apart from their highly AT-rich character no sequence specificity has been identified so far. Moreover, no flanking direct repeats were identified. Multiple copies of IS153 were detected by hybridization in other strains of L. sanfranciscensis. The resulting hybridization patterns were shown to differentiate between organisms at strain level, in contrast to a probe targeted against the 16S rDNA. With a PCR-based approach, IS153 or highly similar sequences were detected in L. acidophilus, L. casei, L. malefermentans, L. plantarum, L. hilgardii, L. collinoides, L. farciminis, L. sakei, L. salivarius and L. reuteri, as well as in Enterococcus faecium, Pediococcus acidilactici and P. pentosaceus.
McManus, Hilary A; Sanchez, Daniel J; Karol, Kenneth G
2017-01-01
Comparative studies of chloroplast genomes (plastomes) across the Chlorophyceae are revealing dynamic patterns of size variation, gene content, and genome rearrangements. Phylogenomic analyses are improving resolution of relationships, and uncovering novel lineages as new plastomes continue to be characterized. To gain further insight into the evolution of the chlorophyte plastome and increase the number of representative plastomes for the Sphaeropleales, this study presents two fully sequenced plastomes from the green algal family Hydrodictyaceae (Sphaeropleales, Chlorophyceae), one from Hydrodictyon reticulatum and the other from Pediastrum duplex. Genomic DNA from Hydrodictyon reticulatum and Pediastrum duplex was subjected to Illumina paired-end sequencing and the complete plastomes were assembled for each. Plastome size and gene content were characterized and compared with other plastomes from the Sphaeropleales. Homology searches using BLASTX were used to characterize introns and open reading frames (orfs) ≥ 300 bp. A phylogenetic analysis of gene order across the Sphaeropleales was performed. The plastome of Hydrodictyon reticulatum is 225,641 bp and Pediastrum duplex is 232,554 bp. The plastome structure and gene order of H. reticulatum and P. duplex are more similar to each other than to other members of the Sphaeropleales. Numerous unique open reading frames are found in both plastomes and the plastome of P. duplex contains putative viral protein genes, not found in other Sphaeropleales plastomes. Gene order analyses support the monophyly of the Hydrodictyaceae and their sister relationship to the Neochloridaceae. The complete plastomes of Hydrodictyon reticulatum and Pediastrum duplex, representing the largest of the Sphaeropleales sequenced thus far, once again highlight the variability in size, architecture, gene order and content across the Chlorophyceae. Novel intron insertion sites and unique orfs indicate recent, independent invasions into each plastome, a hypothesis testable with an expanded plastome investigation within the Hydrodictyaceae.
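The abstract above filters open reading frames at a 300 bp threshold. As a rough illustration of what such a filter involves (not the authors' pipeline, which relied on BLASTX), the sketch below scans all six reading frames of a DNA string for ATG-to-stop ORFs. The function names, the ATG-only start rule, and the standard stop codons are assumptions made for the example; plastid annotation in practice may use other start codons and genetic-code tables.

```python
# Illustrative six-frame ORF scanner (hypothetical helper names, standard stop codons assumed).
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_len=300):
    """Return (strand, frame, start, end) for each ATG..stop ORF of at least min_len bp.

    Coordinates are 0-based and relative to the scanned strand; `end` is
    exclusive and the reported length includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start >= min_len:
                        orfs.append((strand, frame, start, i + 3))
                    start = None
    return orfs


# Toy sequence: 2 nt of padding, ATG, 100 alanine codons, a stop codon, 1 nt of padding.
toy = "CC" + "ATG" + "GCT" * 100 + "TAA" + "G"
hits = find_orfs(toy, min_len=300)
```

Raising `min_len` above the 306 bp length of the toy ORF returns an empty list, which mirrors how a length threshold discards short spurious frames.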
Flärdh, K; Axberg, T; Albertson, N H; Kjelleberg, S
1994-01-01
In order to evaluate the role of the stringent response in starvation adaptations of the marine Vibrio sp. strain S14, we have cloned the relA gene and generated relaxed mutants of this organism. The Vibrio relA gene was selected from a chromosomal DNA library by complementation of an Escherichia coli delta relA strain. The nucleotide sequence contains a 743-codon open reading frame that encodes a polypeptide that is identical in length and highly homologous to the E. coli RelA protein. The amino acid sequences are 64% identical, and they share some completely conserved regions. A delta relA::kan allele was generated by replacing 53% of the open reading frame with a kanamycin resistance gene. The Vibrio relA mutants displayed a relaxed control of RNA synthesis and failed to accumulate ppGpp during amino acid limitation. During carbon and energy starvation, a relA-dependent burst of ppGpp synthesis concomitant with carbon source depletion and growth arrest was observed. Also, in the absence of the relA gene, there was an accumulation of ppGpp during carbon starvation, but this was slower and smaller than that which occurred in the stringent strains, and it was preceded by a marked decrease in the [ATP]/[ADP] ratio. In both the wild-type and the relaxed strains, carbon source depletion caused an immediate decrease in the size of the GTP pool and a block of net RNA accumulation. The relA mutation did not affect long-term survival or the development of resistance against heat, ethanol, and oxidative stress during carbon starvation of Vibrio sp. strain S14. PMID:7928955
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kurilla, M.G.; Stone, H.O.; Keene, J.D.
The 3' end of the genomic RNA of Newcastle disease virus (NDV) has been sequenced and the leader RNA defined. Using hybridization to a 3'-end-labeled genome, leader RNA species from in vitro transcription reactions and from infected cell extracts were found to be 47 and 53 nucleotides long. In addition, the start site of the 3'-proximal mRNA was determined by sequence analysis of in vitro (beta-32P)GTP-labeled transcription products. The genomic sequence extending beyond the leader region demonstrated an open reading frame for at least 42 amino acids and probably represents the amino terminus of the nucleocapsid protein (NP). The terminal 8 nucleotides of the NDV genome were identical to those of measles virus and Sendai virus while the sequence of the distal half of the leader region was more similar to that of vesicular stomatitis virus. These data argue for strong evolutionary relatedness between the paramyxovirus and rhabdovirus groups.
Stoker, K; Reijnders, W N; Oltmann, L F; Stouthamer, A H
1989-01-01
To isolate genes from Escherichia coli which regulate the labile hydrogenase activity, a plasmid library was used to transform hydL mutants lacking the labile hydrogenase. A single type of gene, designated hydG, was isolated. This gene also partially restored the hydrogenase activity in hydF mutants (which are defective in all hydrogenase isoenzymes), although the low hydrogenase 1 and 2 levels were not induced. Therefore, hydG apparently regulates, specifically, the labile hydrogenase activity. Restoration of this latter activity in hydF mutants was accompanied by a proportional increase of the H2 uptake activity, suggesting a functional relationship. H2:fumarate oxidoreductase activity was not restored in complemented hydL mutants. These latter strains may therefore lack, in addition to the labile hydrogenase, a second component (provisionally designated component R), possibly an electron carrier coupling H2 oxidation to the anaerobic respiratory chain. Sequence analysis showed an open reading frame of 1,314 base pairs for hydG. It was preceded by a ribosome-binding site but apparently lacked a promoter. Minicell experiments revealed a single polypeptide of approximately 50 kilodaltons. Comparison of the predicted amino acid sequence with a protein sequence data base revealed strong homology to NtrC from Klebsiella pneumoniae, a DNA-binding transcriptional activator. The 411 base pairs upstream from pHG40 contained a second open reading frame overlapping hydG by four bases. The deduced amino acid sequence showed considerable homology with the C-terminal part of NtrB. This sequence was therefore assumed to be part of a second gene, encoding the NtrB-like component, and was designated hydH. The labile hydrogenase activity in E. coli is apparently regulated by a multicomponent system analogous to the NtrB-NtrC system. This conclusion is in agreement with the results of Birkmann et al. (A. Birkmann, R. G. Sawers, and A. Böck, Mol. Gen. Genet. 210:535-542, 1987), who demonstrated ntrA dependence for the labile hydrogenase activity. PMID:2666400
Au, Hilda H.; Cornilescu, Gabriel; Mouzakis, Kathryn D.; Ren, Qian; Burke, Jordan E.; Lee, Seonghoon; Butcher, Samuel E.; Jan, Eric
2015-01-01
The dicistrovirus intergenic region internal ribosome entry site (IRES) adopts a triple-pseudoknotted RNA structure and occupies the core ribosomal E, P, and A sites to directly recruit the ribosome and initiate translation at a non-AUG codon. A subset of dicistrovirus IRESs directs translation in the 0 and +1 frames to produce the viral structural proteins and a +1 overlapping open reading frame called ORFx, respectively. Here we show that specific mutations of two unpaired adenosines located at the core of the three-helical junction of the honey bee dicistrovirus Israeli acute paralysis virus (IAPV) IRES PKI domain can uncouple 0 and +1 frame translation, suggesting that the structure adopts distinct conformations that contribute to 0 or +1 frame translation. Using a reconstituted translation system, we show that ribosomes assembled on mutant IRESs that direct exclusive 0 or +1 frame translation lack reading frame fidelity. Finally, a nuclear magnetic resonance/small-angle X-ray scattering hybrid approach reveals that the PKI domain of the IAPV IRES adopts an RNA structure that resembles a complete tRNA. The tRNA shape-mimicry enables the viral IRES to gain access to the ribosome tRNA-binding sites and form intermolecular contacts with the ribosome that are necessary for initiating IRES translation in a specific reading frame. PMID:26554019
Detection of novel NF1 mutations and rapid mutation prescreening with Pyrosequencing.
Brinckmann, Anja; Mischung, Claudia; Bässmann, Ingelore; Kühnisch, Jirko; Schuelke, Markus; Tinschert, Sigrid; Nürnberg, Peter
2007-12-01
Neurofibromatosis type 1 (NF1) is caused by mutations in the neurofibromin (NF1) gene. Mutation analysis of NF1 is complicated by its large size, the lack of mutation hotspots, pseudogenes and frequent de novo mutations. Additionally, the search for NF1 mutations on the mRNA level is often hampered by nonsense-mediated mRNA decay (NMD) of the mutant allele. In this study we searched for mutations in a cohort of 38 patients and investigated the relationship between mutation type and allele-specific transcription from the wild-type versus mutant alleles. Quantification of relative mRNA transcript numbers was done by Pyrosequencing, a novel real-time sequencing method whose signals can be quantified very accurately. We identified 21 novel mutations comprising various mutation types. Pyrosequencing detected a definite relationship between allelic NF1 transcript imbalance due to NMD and mutation type in 24 of 29 patients who all carried frame-shift or nonsense mutations. NMD was absent in 5 patients with missense and silent mutations, as well as in 4 patients with splice-site mutations that did not disrupt the reading frame. Pyrosequencing was capable of detecting NMD even when the effects were only moderate. Diagnostic laboratories could thus exploit this effect for rapid prescreening for NF1 mutations as more than 60% of the mutations in this gene disrupt the reading frame and are prone to NMD.
Springfeld, Christoph; Darai, Gholamreza; Cattaneo, Roberto
2005-06-01
Rhabdoviruses are negative-stranded RNA viruses of the order Mononegavirales and have been isolated from vertebrates, insects, and plants. Members of the genus Lyssavirus cause the invariably fatal disease rabies, and a member of the genus Vesiculovirus, Chandipura virus, has recently been associated with acute encephalitis in children. We present here the complete genome sequence and transcription map of a rhabdovirus isolated from cultivated cells of hepatocellular carcinoma tissue from a moribund tree shrew. The negative-strand genome of tupaia rhabdovirus is composed of 11,440 nucleotides and encodes six genes that are separated by one or two intergenic nucleotides. In addition to the typical rhabdovirus genes in the order N-P-M-G-L, a gene encoding a small hydrophobic putative type I transmembrane protein of approximately 11 kDa was identified between the M and G genes, and the corresponding transcript was detected in infected cells. Similar to some Vesiculoviruses and many Paramyxovirinae, the P gene has a second overlapping reading frame that can be accessed by ribosomal choice and encodes a protein of 26 kDa, predicted to be the largest C protein of these virus families. Phylogenetic analyses of the tupaia rhabdovirus N and L genes show that the virus is distantly related to the Vesiculoviruses, Ephemeroviruses, and the recently characterized Flanders virus and Oita virus and further extends the sequence territory occupied by animal rhabdoviruses. PMID:15890917
Promoting the Avoidance of High-Calorie Snacks: Priming Autonomy Moderates Message Framing Effects
Pavey, Louisa; Churchill, Sue
2014-01-01
The beneficial effects of gain-framed vs. loss-framed messages promoting health protective behaviors have been found to be inconsistent, and consideration of potential moderating variables is essential if framed health promotion messages are to be effective. This research aimed to determine the influence of highlighting autonomy (choice and freedom) and heteronomy (coercion) on the avoidance of high-calorie snacks following reading gain-framed or loss-framed health messages. In Study 1 (N = 152) participants completed an autonomy, neutral, or heteronomy priming task, and read a gain-framed or loss-framed health message. In Study 2 (N = 242) participants read a gain-framed or loss-framed health message with embedded autonomy or heteronomy primes. In both studies, snacking intentions and behavior were recorded after seven days. In both studies, when autonomy was highlighted, the gain-framed message (compared to the loss-framed message) resulted in stronger intentions to avoid high-calorie snacks, and lower self-reported snack consumption after seven days. Study 2 demonstrated this effect occurred only for participants to whom the information was most relevant (BMI>25). The results suggest that messages promoting healthy dietary behavior may be more persuasive if the autonomy-supportive vs. coercive nature of the health information is matched to the message frame. Further research is needed to examine potential mediating processes. PMID:25078965
DOE Office of Scientific and Technical Information (OSTI.GOV)
Solovyev, V.V.; Salamov, A.A.; Lawrence, C.B.
1994-12-31
Discriminant analysis is applied to the problem of recognition of 5'-, internal and 3'-exons in human DNA sequences. Specific recognition functions were developed for revealing exons of particular types. The method is based on a splice site prediction algorithm that uses the linear Fisher discriminant to combine the information about significant triplet frequencies of various functional parts of splice site regions and preferences of oligonucleotides in protein coding and intron regions. The accuracy of our splice site recognition function is about 97%. A discriminant function for 5'-exon prediction includes hexanucleotide composition of upstream region, triplet composition around the ATG codon, ORF coding potential, donor splice site potential and composition of downstream intron region. For internal exon prediction, we combine in a discriminant function the characteristics describing the 5'-intron region, donor splice site, coding region, acceptor splice site and 3'-intron region for each open reading frame flanked by GT and AG base pairs. The accuracy of precise internal exon recognition on a test set of 451 exon and 246693 pseudoexon sequences is 77% with a specificity of 79% and a level of pseudoexon ORF prediction of 99.96%. The recognition quality computed at the level of individual nucleotides is 89% for exon sequences and 98% for intron sequences. A discriminant function for 3'-exon prediction includes octanucleotide composition of upstream intron region, triplet composition around the stop codon, ORF coding potential, acceptor splice site potential and hexanucleotide composition of downstream region. We unite these three discriminant functions in exon predicting program FEX (find exons). FEX exactly predicts 70% of 1016 exons from the test set of 181 complete genes with specificity 73%, and 89% of exons are exactly or partially predicted. On the average, 85% of nucleotides were predicted accurately with specificity 91%.
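The abstract above combines several sequence-derived scores through a linear Fisher discriminant. The following is a minimal two-class sketch of that general technique in plain Python, not a reconstruction of FEX: the feature names (a splice-site score and a coding-potential score), the toy values, and the midpoint threshold are all assumptions chosen for illustration.

```python
# Minimal two-class Fisher linear discriminant (illustrative; features and data are made up).

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter_matrix(rows, mean):
    """Sum of outer products of deviations from the class mean."""
    d = len(mean)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [x - m for x, m in zip(r, mean)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def fit_fisher(pos, neg):
    """Return (w, c): projection weights and a midpoint decision threshold."""
    mp, mn = mean_vector(pos), mean_vector(neg)
    sp, sn = scatter_matrix(pos, mp), scatter_matrix(neg, mn)
    sw = [[x + y for x, y in zip(rp, rn)] for rp, rn in zip(sp, sn)]  # within-class scatter
    w = solve(sw, [x - y for x, y in zip(mp, mn)])  # w = Sw^-1 (mean_pos - mean_neg)
    c = 0.5 * (dot(w, mp) + dot(w, mn))
    return w, c


def classify(w, c, x):
    """True when x projects on the positive ("exon") side of the threshold."""
    return dot(w, x) > c


# Hypothetical (splice_site_score, coding_potential) pairs for exons and pseudoexons.
exons = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.6), (1.0, 0.7)]
pseudoexons = [(0.2, 0.3), (0.1, 0.1), (0.3, 0.2), (0.2, 0.0)]
w, c = fit_fisher(exons, pseudoexons)
```

The midpoint threshold is only one possible choice; with classes as unbalanced as the 451 exons versus 246693 pseudoexons reported above, a threshold tuned on held-out data would be the more usual design.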
Characteristics of yak platelet derived growth factors-alpha gene and expression in brain tissues.
Huang, Zhenhua; Pan, Yangyang; Liu, Penggang; Yu, Sijiu; Cui, Yan
2017-05-29
Platelet derived growth factors (PDGFs) are key components of autocrine and paracrine signaling, both of which play important roles in mammalian developmental processes. PDGF expression levels also relate to oxygen levels. The characteristics of yak PDGFs, which are indigenous to hypoxic environments, have not been clearly described until the current study. We amplified the open reading frame encoding yak (Bos grunniens) platelet derived growth factor-a (PDGFA) from a yak skin tissue cDNA library by reverse transcriptase polymerase chain reaction (PCR) using specific primers and Sanger dideoxy sequencing. Expression of PDGFA mRNA in different portions of yak brain tissue (cerebrum, cerebellum, hippocampus, and spinal cord) was detected by quantitative real-time PCR (qRT-PCR). PDGFA protein expression levels and its location in different portions of the yak brain were evaluated by western blot and immunohistochemistry. We obtained a yak PDGFA 755 bp cDNA gene fragment containing a 636 bp open reading frame, encoding 211 amino acids (GenBank: KU851801). Phylogenetic analysis shows yak PDGFA to be well conserved, having 98.1% DNA sequence identity to homologous Bubalus bubalus and Bos taurus PDGFA genes. However, eight nucleotides in the yak DNA sequence and four amino acids in the yak protein sequence differ from the other two species. PDGFA is widely expressed in yak brain tissue, and furthermore, PDGFA expression in the cerebrum and cerebellum are higher than in the hippocampus and spinal cord (p > 0.05). PDGFA was observed by immunohistochemistry in glial cells of the cerebrum, cerebellum, and hippocampus, as well as in pyramidal cells of the cerebrum, and Purkinje cell bodies of the hippocampus, but not in glial cells of the spinal cord. The PDGFA gene is well conserved in the animal kingdom; however, the yak PDGFA gene has unique characteristics and brain expression patterns specific to this high elevation species.
Lambracht-Washington, Doris; Moore, Yuki F; Wonigeit, Kurt; Lindahl, Kirsten Fischer
2008-04-01
The M region at the telomeric end of the murine major histocompatibility complex (MHC) contains class I genes that are highly conserved in rat and mouse. We have sequenced a cosmid clone of the LEW rat strain (RT1 haplotype) containing three class I genes, RT1.M6-1, RT1.M4, and RT1.M5. The sequences of allelic genes of the BN strain (RT1n haplotype) were obtained either from cDNAs or genomic clones. For the coding parts of the genes few differences were found between the two RT1 haplotypes. In LEW, however, only RT1.M5 and RT1.M6 have open reading frames; whereas in BN all three genes were intact. In line with the findings in BN, transcription was found for all three rat genes in several tissues from strain Sprague Dawley. Protein expression in transfectants could be demonstrated for RT1.M6-1 using the monoclonal antibody OX18. By sequencing of transcripts obtained by RT-PCR, a second, transcribed M6 gene, RT1.M6-2, was discovered, which maps next to RT1.M6-1 outside of the region covered by the cosmid. In addition, alternatively spliced forms for RT1.M5 and RT1.M6 were detected. Of the orthologous mouse genes, H2-M4, H2-M5, and H2-M6, only H2-M5 has an open reading frame. Other important differences between the corresponding parts of the M region of the two species are insertion of long LINE repeats, duplication of RT1.M6, and the inversion of RT1.M5 in the rat. This demonstrates substantial evolutionary dynamics in this region despite conservation of the class I gene sequences themselves.
Fellner, Lea; Simon, Svenja; Scherling, Christian; Witting, Michael; Schober, Steffen; Polte, Christine; Schmitt-Kopplin, Philippe; Keim, Daniel A; Scherer, Siegfried; Neuhaus, Klaus
2015-12-18
Gene duplication is believed to be the classical way to form novel genes, but overprinting may be an important alternative. Overprinting allows entirely novel proteins to evolve de novo, i.e., formerly non-coding open reading frames within functional genes become expressed. Only three cases have been described for Escherichia coli. Here, a fourth example is presented. RNA sequencing revealed an open reading frame weakly transcribed in cow dung, coding for 101 residues and embedded completely in the -2 reading frame of citC in enterohemorrhagic E. coli. This gene is designated novel overlapping gene, nog1. The promoter region fused to gfp exhibits specific activities and 5' rapid amplification of cDNA ends indicated the transcriptional start 40-bp upstream of the start codon. nog1 was strand-specifically arrested in translation by a nonsense mutation silent in citC. This Nog1-mutant showed a phenotype in competitive growth against wild type in the presence of MgCl2. Small differences in metabolite concentrations were also found. Bioinformatic analyses propose Nog1 to be inner membrane-bound and to possess at least one membrane-spanning domain. A phylogenetic analysis suggests that the orphan gene nog1 arose by overprinting after Escherichia/Shigella separated from the other γ-proteobacteria. Since nog1 is of recent origin, non-essential, short, weakly expressed and only marginally involved in E. coli's central metabolism, we propose that this gene is in an initial stage of evolution. While we present specific experimental evidence for the existence of a fourth overlapping gene in enterohemorrhagic E. coli, we believe that this may be an initial finding only and overlapping genes in bacteria may be more common than is currently assumed by microbiologists.
Vengalil, Seena; Preethish-Kumar, Veeramani; Polavarapu, Kiran; Mahadevappa, Manjunath; Sekar, Deepha; Purushottam, Meera; Thomas, Priya Treesa; Nashi, Saraswathi; Nalini, Atchayaram
2017-01-01
Studies of cases of Duchenne muscular dystrophy (DMD) and Becker muscular dystrophy (BMD) confirmed by multiplex ligation-dependent probe amplification (MLPA) have determined the clinical characteristics, genotype, and relations between the reading frame and phenotype for different countries. This is the first such study from India. A retrospective genotype-phenotype analysis of 317 MLPA-confirmed patients with DMD or BMD who visited the neuromuscular clinic of a quaternary referral center in southern India. The 317 patients comprised 279 cases of DMD (88%), 32 of BMD (10.1%), and 6 of intermediate phenotype (1.9%). Deletions accounted for 91.8% of cases, with duplications causing the remaining 8.2%. There were 254 cases of DMD (91%) with deletions and 25 (9%) due to duplications, and 31 cases (96.8%) of BMD with deletions and 1 (3.2%) due to duplication. All six cases of intermediate type were due to deletions. The most-common mutation was a single-exon deletion. Deletions of six or fewer exons constituted 68.8% of cases. The deletion of exon 50 was the most common. The reading-frame rule held in 90% of DMD and 94% of BMD cases. A tendency toward a lower IQ and earlier wheelchair dependence was observed with distal exon deletions, though a significant correlation was not found. The reading-frame rule held in 90% to 94% of children, which is consistent with reports from other parts of the world. However, testing by MLPA is a limitation, and advanced sequencing methods including analysis of the structure of mutant dystrophin is needed for more-accurate assessments of the genotype-phenotype correlation.
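The reading-frame rule referred to above is commonly stated as follows: a deletion that removes a multiple of three nucleotides keeps the downstream reading frame and is expected to give the milder BMD phenotype, while any other deletion shifts the frame and is expected to give DMD. A small sketch of that check is given below. The exon lengths are invented placeholders rather than real dystrophin exon sizes, and the function names are hypothetical.

```python
# Reading-frame rule check for contiguous exon deletions (exon lengths below are placeholders).

def deleted_length(exon_lengths, first, last):
    """Total coding nucleotides removed when exons first..last (inclusive) are deleted."""
    return sum(exon_lengths[e] for e in range(first, last + 1))


def preserves_frame(exon_lengths, first, last):
    """True when the deletion removes a multiple of 3 nucleotides."""
    return deleted_length(exon_lengths, first, last) % 3 == 0


def predicted_phenotype(exon_lengths, first, last):
    """Phenotype predicted by the reading-frame rule alone."""
    return "BMD" if preserves_frame(exon_lengths, first, last) else "DMD"


# Hypothetical exon number -> length in bp.
toy_exons = {1: 120, 2: 100, 3: 95, 4: 81}

single = predicted_phenotype(toy_exons, 2, 2)  # 100 bp removed, frame shifted
double = predicted_phenotype(toy_exons, 2, 3)  # 195 bp removed, frame kept
```

As the abstract notes, the rule held in roughly 90% to 94% of cases, so a prediction of this kind is a first approximation and not a diagnosis.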
Hashimoto, Masayuki; Fukui, Mitsuru; Hayano, Kouichi; Hayatsu, Masahito
2002-01-01
Rhizobium sp. strain AC100, which is capable of degrading carbaryl (1-naphthyl-N-methylcarbamate), was isolated from soil treated with carbaryl. This bacterium hydrolyzed carbaryl to 1-naphthol and methylamine. Carbaryl hydrolase from the strain was purified to homogeneity, and its N-terminal sequence, molecular mass (82 kDa), and enzymatic properties were determined. The purified enzyme hydrolyzed 1-naphthyl acetate and 4-nitrophenyl acetate indicating that the enzyme is an esterase. We then cloned the carbaryl hydrolase gene (cehA) from the plasmid DNA of the strain and determined the nucleotide sequence of the 10-kb region containing cehA. No homologous sequences were found by a database homology search using the nucleotide and deduced amino acid sequences of the cehA gene. Six open reading frames including the cehA gene were found in the 10-kb region, and sequencing analysis shows that the cehA gene is flanked by two copies of insertion sequence-like sequence, suggesting that it makes part of a composite transposon. PMID:11872471
Xu, Li; Ding, Zhi-Shan; Zhou, Yun-Kai; Tao, Xue-Fen
2009-06-01
The aim was to obtain the full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene from Dysosma versipellis by RACE PCR and then to investigate the characteristics of the gene. The full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene was obtained by 3'-RACE and 5'-RACE from Dysosma versipellis. This is the first report of the full cDNA sequence of Secoisolariciresinol Dehydrogenase in Dysosma versipellis. The acquired gene was 991 bp in full length, including a 5' untranslated region of 42 bp and a 3' untranslated region of 112 bp with poly(A). The open reading frame (ORF) encodes 278 amino acids, with a molecular weight of 29253.3 Daltons and an isoelectric point of 6.328. The GenBank accession number of the nucleotide sequence is EU573789. Semi-quantitative RT-PCR analysis revealed that the Secoisolariciresinol Dehydrogenase gene was highly expressed in the stem. Alignment of the amino acid sequence of Secoisolariciresinol Dehydrogenase indicated that there may be significant amino acid sequence differences among species. The full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene from Dysosma versipellis was obtained.
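The lengths reported in the abstract above can be cross-checked with simple arithmetic: subtracting the two untranslated regions from the full cDNA length should leave an ORF whose length is a multiple of three, and the codon count minus the stop codon should match the stated number of residues. The sketch below does that check. It assumes that the 112 bp 3' UTR figure includes the poly(A) tail and that the ORF length includes the stop codon; neither is stated explicitly in the abstract, and the function names are hypothetical.

```python
# Consistency check of cDNA, UTR and ORF lengths (assumes the ORF length includes the stop codon).

def orf_length_bp(cdna_len, utr5_len, utr3_len):
    """ORF length implied by the full cDNA length and the two UTR lengths."""
    return cdna_len - utr5_len - utr3_len


def encoded_residues(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp nucleotides."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


orf_bp = orf_length_bp(991, 42, 112)  # figures taken from the abstract
residues = encoded_residues(orf_bp)
```

Under these assumptions the figures are mutually consistent: 991 - 42 - 112 = 837 bp, which is 279 codons, or 278 residues plus a stop codon.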
Analysis of the regulatory region of the protease III (ptr) gene of Escherichia coli K-12.
Claverie-Martin, F; Diaz-Torres, M R; Kushner, S R
1987-01-01
The ptr gene of Escherichia coli encodes protease III (Mr 110,000) and a 50-kDa polypeptide, both of which are found in the periplasmic space. The gene is physically located between the recC and recB loci on the E. coli chromosome. The nucleotide sequence of a 1167-bp EcoRV-ClaI fragment of chromosomal DNA containing the promoter region and 885 bp of the ptr coding sequence has been determined. S1 nuclease mapping analysis showed that the major 5' end of the ptr mRNA was localized 127 bp upstream from the ATG start codon. The open reading frame (ORF), preceded by a Shine-Dalgarno sequence, extends to the end of the sequenced DNA. Downstream from the -35 and -10 regions is a sequence that strongly fits the consensus sequence of known nitrogen-regulated promoters. A signal peptide of 23 amino acids residues is present at the N terminus of the derived amino acid sequence. The cleavage site as well as the ORF were confirmed by sequencing the N terminus of mature protease III.
Brain cDNA clone for human cholinesterase
DOE Office of Scientific and Technical Information (OSTI.GOV)
McTiernan, C.; Adkins, S.; Chatonnet, A.
1987-10-01
A cDNA library from human basal ganglia was screened with oligonucleotide probes corresponding to portions of the amino acid sequence of human serum cholinesterase. Five overlapping clones, representing 2.4 kilobases, were isolated. The sequenced cDNA contained 207 base pairs of coding sequence 5' to the amino terminus of the mature protein in which there were four ATG translation start sites in the same reading frame as the protein. Only the ATG coding for Met-(-28) lay within a favorable consensus sequence for functional initiators. There were 1722 base pairs of coding sequence corresponding to the protein found circulating in human serum. The amino acid sequence deduced from the cDNA exactly matched the 574 amino acid sequence of human serum cholinesterase, as previously determined by Edman degradation. Therefore, our clones represented cholinesterase rather than acetylcholinesterase. It was concluded that the amino acid sequences of cholinesterase from two different tissues, human brain and human serum, were identical. Hybridization of genomic DNA blots suggested that a single gene, or very few genes coded for cholinesterase.
ERIC Educational Resources Information Center
DiLella, Carol Ann
This paper presents "popcorn story frames"--holistic outlines that facilitate comprehension when reading and writing stories, useful for outlining stories read and for creating outlines for original student stories--that are particularly useful for elementary and intermediate school students. "Popcorn" pops in a horizontal…
Novel snake papillomavirus does not cluster with other non-mammalian papillomaviruses.
Lange, Christian E; Favrot, Claude; Ackermann, Mathias; Gull, Jessica; Vetsch, Elisabeth; Tobler, Kurt
2011-09-12
Papillomaviruses (PVs) are associated with the development of neoplasias and have been found in several different species, most of them in humans and other mammals. We identified, cloned and sequenced PV DNA from pigmented papilloma-like lesions of a diamond python (Morelia spilota spilota). This represents the first complete PV genome discovered in a Squamata host (MsPV1). It consists of 7048 nt and contains the characteristic open reading frames (ORFs) E6, E7, E1, E2, L1 and L2. The L1 ORF sequence showed the highest percentage of sequence identities to human PV5 (57.9%) and Caribbean manatee (Trichechus manatus) PV1 (55.4%), thus establishing a new clade. According to phylogenetic analysis, the MsPV1 genome clusters with PVs of mammalian rather than sauropsid hosts.
Molecular characterization of African orthobunyaviruses.
Yandoko, E Nakouné; Gribaldo, S; Finance, C; Le Faou, A; Rihn, B H
2007-06-01
The genus Orthobunyavirus is composed of segmented, negative-sense RNA viruses that are responsible for mild to severe human diseases. To date, no molecular studies of bunyaviruses in the genus Orthobunyavirus from central Africa have been reported, and their classification relies on serological testing. Four new primer pairs for RT-PCR amplification and sequencing of the complete genomic small (S) RNA segments of 10 orthobunyaviruses isolated from the Central African Republic and pertaining to five different serogroups have been designed and evaluated. Phylogenetic analysis showed that these 10 viruses belong to the Bunyamwera serogroup. The S segment sequences differ from those of the Bunyamwera virus reference strain by 5-15 % at the nucleotide level, and both overlapping reading frames, encoding the nucleocapsid (N) and non-structural (NS) proteins, were evident in sequenced genomes. This study should improve diagnosis and surveillance of African bunyaviruses.
Cloning of precursors for two MIH/VIH-related peptides in the prawn, Macrobrachium rosenbergii.
Yang, W J; Rao, K R
2001-11-30
Two cDNA clones (634 and 1366 bp) encoding MIH/VIH (molt-inhibiting hormone/vitellogenesis-inhibiting hormone)-related peptides were isolated and sequenced from a Macrobrachium rosenbergii eyestalk ganglia cDNA library. The clones contain a 360 and 339 bp open-reading frame, and their conceptually translated peptides consist of a 41 and 34 amino acid signal peptide, respectively, and a 78 amino acid residue mature peptide hormone. The amino acid sequences of the peptides exhibit higher identities with other known MIHs and VIH (44-69%) than with CHHs (28-33%). This is the first report describing the cloning and sequencing of two MIH/VIH-related peptides in a single crustacean species. Transcription of these mRNAs was detected in the eyestalk ganglia, but not in the thoracic ganglia, hepatopancreas, gut, gill, heart, or muscle.
[Cloning and sequence analysis of 55 K protein of egg drop syndrome virus].
Zhu, L; Jin, Q; Zeng, L
1999-06-30
To understand the genomic structure of egg drop syndrome virus (EDSV), nucleic acid was extracted by routine methods from the weakly virulent EDSV strain AA-2, isolated from diseased hens in China. A whole-genome library was constructed by Hind III digestion, and the strand encoding the 55 K gene, located in the Hind III-A segment, was sequenced and analyzed. The open reading frame is 1,014 nt long and encodes a polypeptide of 337 amino acids with a molecular weight of 38,200. Analysis of the amino acid sequence revealed 25.5%-32.4% homology to the 55 K protein of human adenovirus types 2, 12 and 40, canine adenovirus and group 1 fowl adenoviruses, and 46.4% homology to that of ovine adenovirus. The genomic structure of EDSV therefore shows some relationship to that of adenoviruses.
Novel snake papillomavirus does not cluster with other non-mammalian papillomaviruses
2011-01-01
Papillomaviruses (PVs) are associated with the development of neoplasias and have been found in several different species, most of them in humans and other mammals. We identified, cloned and sequenced PV DNA from pigmented papilloma-like lesions of a diamond python (Morelia spilota spilota). This represents the first complete PV genome discovered in a Squamata host (MsPV1). It consists of 7048 nt and contains the characteristic open reading frames (ORFs) E6, E7, E1, E2, L1 and L2. The L1 ORF sequence showed the highest percentage of sequence identities to human PV5 (57.9%) and Caribbean manatee (Trichechus manatus) PV1 (55.4%), thus establishing a new clade. According to phylogenetic analysis, the MsPV1 genome clusters with PVs of mammalian rather than sauropsid hosts. PMID:21910860
Generation and analysis of expressed sequence tags from the bone marrow of Chinese Sika deer.
Yao, Baojin; Zhao, Yu; Zhang, Mei; Li, Juan
2012-03-01
Sika deer is one of the best-known and highly valued animals of China. Despite its economic, cultural, and biological importance, there has not been a large-scale sequencing project for Sika deer to date. With the ultimate goal of sequencing the complete genome of this organism, we first established a bone marrow cDNA library for Sika deer and generated a total of 2,025 reads. After processing the sequences, 2,017 high-quality expressed sequence tags (ESTs) were obtained. These ESTs were assembled into 1,157 unigenes, including 238 contigs and 919 singletons. Comparative analyses indicated that 888 (76.75%) of the unigenes had significant matches to sequences in the non-redundant protein database. In addition to highly expressed genes, such as stearoyl-CoA desaturase, cytochrome c oxidase, adipocyte-type fatty acid-binding protein, adiponectin and thymosin beta-4, we also obtained vascular endothelial growth factor-A and heparin-binding growth-associated molecule, both of which are of great importance for angiogenesis research. There were 244 (21.09%) unigenes with no significant match to any sequence in current protein or nucleotide databases, and these sequences may represent genes with unknown function in Sika deer. Open reading frame analysis of the sequences was performed using the getorf program. In addition, the sequences were functionally classified using the gene ontology hierarchy, clusters of orthologous groups of proteins and Kyoto encyclopedia of genes and genomes databases. Analysis of ESTs described in this paper provides an important resource for the transcriptome exploration of Sika deer, and will also facilitate further studies on functional genomics, gene discovery and genome annotation of Sika deer.
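The open reading frame analysis step mentioned above can be illustrated with a small sketch. This is not the getorf program and does not reproduce its options or output; it is a minimal forward-strand scan, assuming an ORF runs from an ATG to the first in-frame stop codon. The function name `find_orfs` and the `min_codons` threshold are hypothetical, chosen only for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for ATG...stop ORFs on the forward strand.

    start is 0-based; end is exclusive and includes the stop codon.
    Illustrative only: the reverse strand is not scanned.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open an ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # close the ORF at the stop codon
    return orfs

print(find_orfs("CCATGAAATTTTAGGG"))  # [(2, 14, 2)]
```

In the toy sequence the only ORF is ATG AAA TTT TAG, which starts at offset 2 and therefore lies in frame 2.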
Schiex, Thomas; Gouzy, Jérôme; Moisan, Annick; de Oliveira, Yannick
2003-07-01
We describe FrameD, a program that predicts coding regions in prokaryotic and matured eukaryotic sequences. Initially targeted at gene prediction in bacterial GC rich genomes, the gene model used in FrameD also allows to predict genes in the presence of frameshifts and partially undetermined sequences which makes it also very suitable for gene prediction and frameshift correction in unfinished sequences such as EST and EST cluster sequences. Like recent eukaryotic gene prediction programs, FrameD also includes the ability to take into account protein similarity information both in its prediction and its graphical output. Its performances are evaluated on different bacterial genomes. The web site (http://genopole.toulouse.inra.fr/bioinfo/FrameD/FD) allows direct prediction, sequence correction and translation and the ability to learn new models for new organisms.
Schuster, W; Brennicke, A
1987-01-01
We describe an open reading frame (ORF) with high homology to reverse transcriptase in the mitochondrial genome of Oenothera. This ORF displays all the characteristics of an active plant mitochondrial gene with a possible ribosome binding site and 39% T in the third codon position. It is located between a sequence fragment from the plastid genome and one of nuclear origin downstream from the gene encoding subunit 5 of the NADH dehydrogenase. The nuclear derived sequence consists of 528 nucleotides from the small ribosomal RNA and contains an expansion segment unique to nuclear rRNAs. The plastid sequence contains part of the ribosomal protein S4 and the complete tRNA(Ser). The observation that only transcribed sequences have been found in more than one subcellular compartment in higher plants suggests that interorganellar transfer of genetic information may occur via RNA and subsequent local reverse transcription and genomic integration. PMID:14650433
Characterization of Austrian koi herpesvirus samples based on the ORF40 region.
Marek, A; Schachner, O; Bilic, I; Hess, M
2010-02-17
Using a PCR that amplifies a region of the thymidine kinase (TK) gene, an epidemic spread of koi herpesvirus (KHV) was determined in koi carps in Austria in 2007. A total of 15 virus samples from different locations in Austria were analyzed to determine their genetic relatedness following PCR and nucleic acid sequencing of the open reading frame 40 (ORF40) region of the KHV genome. ORF40-specific PCR amplification products that were obtained from tissue samples shared 100% nucleotide sequence identity with the published sequence of the Japanese strain of KHV. The ORF40 sequence of one isolate from the UK that was included in the present study was 100% identical with the published sequence of an Israeli strain of KHV. This is the first study that used a larger number of samples and a PCR method, which allowed distinguishing all 3 strains of KHV. The present investigation provides information on the epidemiology of KHV infections in Europe and describes a useful molecular tool for epidemiological studies.
Genetic analysis of duck circovirus in Pekin ducks from South Korea.
Cha, S-Y; Kang, M; Cho, J-G; Jang, H-K
2013-11-01
The genetic organization of the 24 duck circovirus (DuCV) strains detected in commercial Pekin ducks from South Korea between 2011 and 2012 is described in this study. Multiple sequence alignment and phylogenetic analyses were performed on the 24 viral genome sequences as well as on 45 genome sequences available from the GenBank database. Phylogenetic analyses based on the genomic and open reading frame 2/cap sequences demonstrated that all DuCV strains belonged to genotype 1 and were designated in a subcluster under genotype 1. Analysis of the capsid protein amino acid sequences of the 24 Korean DuCV strains showed 10 substitutions compared with that of other genotype 1 strains. Our analysis showed that genotype 1 is predominant and circulating in South Korea. These present results serve as incentive to add more data to the DuCV database and provide insight to conduct further intensive study on the geographic relationships among these virus strains.
Becker, Y; Asher, Y; Tabor, E; Davidson, I; Malkinson, M
1994-01-01
A DNA segment of the MDV-1 BamHI-D fragment was sequenced, and the open reading frames (ORFs) present in the 4556 nucleotide fragment were analyzed by computer programs. Computer analysis identified 19 putative ORFs in the sequence ranging from a coding capacity of 37 amino acids (aa) (ORF-1a) to 684aa (ORF-1). The special properties of four ORFs (1a, 1, 2, and 3) were investigated. Two adjacent ORFs, ORF-1a and ORF-1, were found by computer analysis to have the properties of two introns encoding a glycoprotein: ORF-1a encodes an aa sequence with the properties of a signal peptide, and ORF-1 encodes a polypeptide with a membrane anchor domain and putative N-glycosylation sites in the aa sequence. ORF-1a and ORF-1 were found to be transcribed in MDV-1-infected cells. Two RNA transcripts were detected: a precursor RNA and its spliced form. Both are transcribed from a promoter located 5' to ORF-1a, and splice donor and acceptor sites are used to splice the mRNA after cleavage of a 71-nucleotide sequence. This finding suggests that ORF-1a and ORF-1 are two introns of a new MDV-1 glycoprotein gene. The DNA sequence containing ORF-1 was transiently expressed in COS-1 cells, and the viral protein produced in these cells was found to react with anti-MDV serotype-1 Antigen B-specific monoclonal antibodies. These studies indicate that the protein encoded by ORF-1 has antigenic properties resembling Antigen B of MDV-1. A gene homologous to ORF-1 was detected in the genome of both MDV-2(SB1) and MDV-3(HVT), which serve as commercial vaccine strains. Two additional ORFs were noted in the 4556 nucleotide sequence: ORF-2, which encodes a 333 aa polypeptide initiating in the UL and terminating in the TRL prior to the putative origin of replication, and ORF-3, which encodes a 155 aa polypeptide that is partly homologous to the phosphoprotein pp38 encoded by the BamHI-H sequence.
The 65 N-terminal aa of the two gene products are identical, both being derived from the nucleotide sequences in the TRL and IRL, respectively. Additional homologous aa sequences are the hydrophobic aa domain in the middle of both proteins. The functions of ORF-2, ORF-3, and additional ORFs are under study.
Physics of Non-Inertial Reference Frames
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kamalov, Timur F.
2010-12-22
The physics of non-inertial reference frames is a generalization of Newton's laws to arbitrary reference frames. It is a system of general axioms for classical and quantum mechanics. The first, the Kinematics Principle, reads: the kinematic state of a body free of forces is conserved and is equal in absolute value to an invariant of the observer's reference frame. The second, the Dynamics Principle, extends Newton's second law to non-inertial reference frames and contains additional variables, namely higher derivatives of the coordinates. The Dynamics Principle reads: a force induces a change in the kinematic state of the body and is proportional to the rate of its change. This means that if the kinematic invariant of the reference frame is the n-th derivative with respect to time, then the dynamics of a body acted on by the force F is described by a differential equation of order 2n. The third, the Statics Principle, reads: the sum of all forces acting on a body at rest is equal to zero.
Li, Wei-Dong; Huang, Min; Lü, Wen-Gang; Chen, Xiao; Shen, Ming-Hui; Li, Xiang-Min; Wang, Rong-Xia; Ke, Cai-Huan
2015-01-01
The small abalone Haliotis diversicolor is an economically important mollusk that is widely cultivated in Southern China. Gonad precocity may affect the aquaculture of small abalone. Polyamines, which are small cationic molecules essential for cellular proliferation, may affect gonadal development. Ornithine decarboxylase (ODC) and antizyme (AZ) are essential elements of a feedback circuit that regulates cellular polyamines. This paper presents the molecular cloning and characterization of AZ from small abalone. Sequence analysis showed that the cDNA sequence of H. diversicolor AZ (HdiODCAZ) consisted of two overlapping open reading frames (ORFs) and conformed to the +1 frameshift property of the frame. Thin Layer chromatography (TLC) analysis suggested that the expressed protein encoded by +1 ORF2 was the functional AZ that targets ODC to 26S proteasome degradation. The result demonstrated that the expression level of AZ was higher than that of ODC in the ovary of small abalone. In addition, the expression profiles of ODC and AZ at the different development stages of the ovary indicated that these two genes might be involved in the gonadal development of small abalone.
Lü, Wen-Gang; Chen, Xiao; Shen, Ming-Hui; Li, Xiang-Min; Wang, Rong-Xia; Ke, Cai-Huan
2015-01-01
The small abalone Haliotis diversicolor is an economically important mollusk that is widely cultivated in Southern China. Gonad precocity may affect the aquaculture of small abalone. Polyamines, which are small cationic molecules essential for cellular proliferation, may affect gonadal development. Ornithine decarboxylase (ODC) and antizyme (AZ) are essential elements of a feedback circuit that regulates cellular polyamines. This paper presents the molecular cloning and characterization of AZ from small abalone. Sequence analysis showed that the cDNA sequence of H. diversicolor AZ (HdiODCAZ) consisted of two overlapping open reading frames (ORFs) and conformed to the +1 frameshift property of the frame. Thin Layer chromatography (TLC) analysis suggested that the expressed protein encoded by +1 ORF2 was the functional AZ that targets ODC to 26S proteasome degradation. The result demonstrated that the expression level of AZ was higher than that of ODC in the ovary of small abalone. In addition, the expression profiles of ODC and AZ at the different development stages of the ovary indicated that these two genes might be involved in the gonadal development of small abalone. PMID:26313647
DOE Office of Scientific and Technical Information (OSTI.GOV)
Firth, Andrew E., E-mail: a.firth@ucc.i; Blitvich, Bradley J., E-mail: blitvich@iastate.ed; Wills, Norma M., E-mail: nwills@genetics.utah.ed
2010-03-30
Flaviviruses have a positive-sense, single-stranded RNA genome of approximately 11 kb, encoding a large polyprotein that is cleaved to produce approximately 10 mature proteins. Cell fusing agent virus, Kamiti River virus, Culex flavivirus and several recently discovered flaviviruses have no known vertebrate host and apparently infect only insects. We present compelling bioinformatic evidence for a 253-295 codon overlapping gene (designated fifo) conserved throughout these insect-specific flaviviruses and immunofluorescent detection of its product. Fifo overlaps the NS2A/NS2B coding sequence in the -1/+2 reading frame and is most likely expressed as a trans-frame fusion protein via ribosomal frameshifting at a conserved GGAUUUY slippery heptanucleotide with 3'-adjacent RNA secondary structure (which stimulates efficient frameshifting in vitro). The discovery bears striking parallels to the recently discovered ribosomal frameshifting site in the NS2A coding sequence of the Japanese encephalitis serogroup of flaviviruses and suggests that programmed ribosomal frameshifting may be more widespread in flaviviruses than currently realized.
Encrypting Digital Camera with Automatic Encryption Key Deletion
NASA Technical Reports Server (NTRS)
Oakley, Ernest C. (Inventor)
2007-01-01
A digital video camera includes an image sensor capable of producing a frame of video data representing an image viewed by the sensor, an image memory for storing video data such as previously recorded frame data in a video frame location of the image memory, a read circuit for fetching the previously recorded frame data, an encryption circuit having an encryption key input connected to receive the previously recorded frame data from the read circuit as an encryption key, an un-encrypted data input connected to receive the frame of video data from the image sensor and an encrypted data output port, and a write circuit for writing a frame of encrypted video data received from the encrypted data output port of the encryption circuit to the memory and overwriting the video frame location storing the previously recorded frame data.
Ishida, Yasuko; McCallister, Chelsea; Nikolaidis, Nikolas; Tsangaras, Kyriakos; Helgen, Kristofer M; Greenwood, Alex D; Roca, Alfred L
2015-01-15
The koala retrovirus (KoRV), which is transitioning from an exogenous to an endogenous form, has been associated with high mortality in koalas. For other retroviruses, the envelope protein p15E has been considered a candidate for vaccine development. We therefore examined proviral sequence variation of KoRV p15E in a captive Queensland and three wild southern Australian koalas. We generated 163 sequences with intact open reading frames, which grouped into 39 distinct haplotypes. Sixteen distinct haplotypes comprising 139 of the sequences (85%) coded for the same polypeptide. Among the remaining 23 haplotypes, 22 were detected only once among the sequences, and each had 1 or 2 non-synonymous differences from the majority sequence. Several analyses suggested that p15E was under purifying selection. Important epitopes and domains were highly conserved across the p15E sequences and in previously reported exogenous KoRVs. Overall, these results support the potential use of p15E for KoRV vaccine development. Copyright © 2014 Elsevier Inc. All rights reserved.
Nucleotide sequence of Hungarian grapevine chrome mosaic nepovirus RNA1.
Le Gall, O; Candresse, T; Brault, V; Dunez, J
1989-01-01
The nucleotide sequence of the RNA1 of Hungarian grapevine chrome mosaic virus, a nepovirus very closely related to tomato black ring virus, has been determined from cDNA clones. It is 7212 nucleotides in length excluding the 3' terminal poly(A) tail and contains a large open reading frame extending from nucleotides 216 to 6971. The presumably encoded polyprotein is 2252 amino acids in length with a molecular weight of 250 kDa. The primary structure of the polyprotein was compared with that of other viral polyproteins, revealing the same general genetic organization as that of other picorna-like viruses (comoviruses, potyviruses and picornaviruses), except that an additional protein is suspected to occupy the N-terminus of the polyprotein. PMID:2798128
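The coordinates quoted in this abstract can be checked with a short calculation. The sketch below assumes the positions 216 and 6971 are 1-based and inclusive, which the abstract does not state explicitly; the helper name `orf_span` is hypothetical.

```python
def orf_span(start, end):
    """Length in nucleotides and codons of an ORF given 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return length, length // 3

length_nt, codons = orf_span(216, 6971)
print(length_nt, codons)  # 6756 2252
```

Under that assumption the span corresponds to 2252 codons, which agrees with the stated polyprotein length of 2252 amino acids.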
Cloning and characterization of an abalone (Haliotis discus hannai) actin gene
NASA Astrophysics Data System (ADS)
Ma, Hongming; Xu, Wei; Mai, Kangsen; Liufu, Zhiguo; Chen, Hong
2004-10-01
An actin-encoding gene was cloned by using RT-PCR, 3′ RACE and 5′ RACE from abalone Haliotis discus hannai. The full length of the gene is 1532 base pairs, which contains a long 3′ untranslated region of 307 base pairs and 79 base pairs of 5′ untranslated sequence. The open reading frame encodes 376 amino acid residues. Sequence comparison with those of human and other mollusks showed high conservation among species at the amino acid level. The identities were 96%, 97% and 96%, respectively, compared with Aplysia californica, Biomphalaria glabrata and Homo sapiens β-actin. It is also indicated that this actin is more similar to the human cytoplasmic actin (β-actin) than to human muscle actin.
Kondo, Hideki; Takemoto, Shogo; Maruyama, Kazuyuki; Chiba, Sotaro; Andika, Ida Bagus; Suzuki, Nobuhiro
2015-08-01
Cymbidium chlorotic mosaic virus (CyCMV), isolated from a spring orchid (Cymbidium goeringii), was characterized molecularly. CyCMV isometric virions comprise a single, positive-strand RNA genome of 4,083 nucleotides and 30-kDa coat protein. The virus genome contains five overlapping open reading frames with a genomic organization similar to that of sobemoviruses. BLAST searches and phylogenetic analysis revealed that CyCMV is most closely related to papaya lethal yellowing virus, a proposed dicot-infecting sobemovirus (58.8 % nucleotide sequence identity), but has a relatively distant relationship to monocot-infecting sobemoviruses, with only modest sequence identities. This suggests that CyCMV is a new monocot-infecting member of the floating genus Sobemovirus.
Complete nucleotide sequence of jasmine virus H, a new member of the family Tombusviridae.
Zhuo, Tao; Zhu, Li-Juan; Lu, Cheng-Cong; Jiang, Chao-Yang; Chen, Zi-Yin; Zhang, Guangzhi; Wang, Zong-Hua; Jovel, Juan; Han, Yan-Hong
2018-03-01
Jasmine virus H (JaVH) is a novel virus associated with symptoms of yellow mosaic on jasmine. The JaVH genome is 3,867 nt in length with five open reading frames (ORFs) encoding a 27-kDa protein (ORF 1), an 87-kDa replicase protein (ORF 2), two centrally located movement proteins (ORF 3 and 4), and a 37-kDa capsid protein (ORF 5). Based on genomic and phylogenetic analysis, JaVH is predicted to be a member of the genus Pelarspovirus in the family Tombusviridae.
2012-06-08
RT-PCR and sequencing of the Rsu1 open reading frame revealed that exon 8 is missing in the alternatively spliced Rsu1 RNA in human gliomas and...concentration of 75 nM. Cells were collected 72-96 hours post-transfection. 16 Western blotting Cell lysates were collected in RIPA buffer...cell lysates were collected and bound to the glutathione beads loaded with the GST-fusion of the Rac1/cdc42 binding domain of Pak. The bound Rac1
Blick, Robert J.; Revel, Andrew T.; Hansen, Eric J.
2008-01-01
Summary: FindGDPs is a program that uses a greedy algorithm to quickly identify a set of genome-directed primers that specifically anneal to all of the open reading frames in a genome and that do not exhibit full-length complementarity to the members of another user-supplied set of nucleotide sequences. Availability: The program code is distributed under the GNU General Public License at http://www8.utsouthwestern.edu/utsw/cda/dept131456/files/159331.html Contact: eric.hansen@utsouthwestern.edu PMID:15593406
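The greedy strategy named in this summary can be sketched as a generic set-cover loop. This is not the FindGDPs source code: the function name `greedy_primer_cover`, the `primer_hits` mapping and the toy data are all assumptions made for illustration, and the exclusion of primers complementary to a second sequence set is assumed to have been applied before this step.

```python
def greedy_primer_cover(primer_hits, orfs):
    """Greedily pick primers until every ORF is covered (or no primer helps).

    primer_hits: dict mapping a primer to the set of ORF ids it anneals to
                 (hypothetical input, assumed to be precomputed).
    Returns (chosen primers in pick order, ORFs left uncovered).
    """
    uncovered = set(orfs)
    chosen = []
    while uncovered:
        # Take the primer that covers the most still-uncovered ORFs;
        # iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(primer_hits), key=lambda p: len(primer_hits[p] & uncovered))
        gained = primer_hits[best] & uncovered
        if not gained:
            break  # remaining ORFs cannot be covered by any primer
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered

hits = {"p1": {"a", "b", "c"}, "p2": {"c", "d"}, "p3": {"d"}}
print(greedy_primer_cover(hits, ["a", "b", "c", "d"]))  # (['p1', 'p2'], set())
```

A greedy pick is fast but not guaranteed to give the smallest possible primer set, which is the usual trade-off for this kind of covering problem.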
Lip reading using neural networks
NASA Astrophysics Data System (ADS)
Kalbande, Dhananjay; Mishra, Akassh A.; Patil, Sanjivani; Nirgudkar, Sneha; Patel, Prashant
2011-10-01
Computerized lip reading, or speech reading, is concerned with the difficult task of converting a video signal of a speaking person into written text. Its applications include teaching deaf and mute people to speak and communicate effectively with other people, crime fighting, and speech recognition that is invariant to the acoustic environment. We convert video of the subject speaking vowels into images, and images are then selected manually for processing. Several factors, such as fast speech, bad pronunciation, poor illumination, movement of the face, and moustaches and beards, make lip reading difficult. Contour tracking methods and template matching are used to extract the lips from the face. The K Nearest Neighbor (KNN) algorithm is then used to classify the 'speaking' images and the 'silent' images. The sequence of images is then transformed into segments of utterances. A feature vector is calculated on each frame for all the segments and is stored in the database with a properly labeled class. Character recognition is performed using a modified KNN algorithm that assigns more weight to nearer neighbors. This paper reports the recognition of vowels using KNN algorithms.
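The idea of a KNN classifier that gives more weight to nearer neighbors can be sketched as follows. The abstract does not say which weighting the authors used, so the inverse-distance weight here is an assumption, as are the function name `weighted_knn` and the toy one-dimensional features.

```python
import math
from collections import defaultdict

def weighted_knn(train, query, k=3):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest neighbors.

    train: list of (feature_vector, label) pairs.
    The 1/d weighting is an illustrative choice, not the paper's stated method.
    """
    dists = sorted((math.dist(x, query), label) for x, label in train)
    votes = defaultdict(float)
    for d, label in dists[:k]:
        votes[label] += 1.0 / (d + 1e-9)  # small constant avoids division by zero
    return max(votes, key=votes.get)

train = [((0.0,), "a"), ((3.0,), "e"), ((3.2,), "e")]
print(weighted_knn(train, (0.5,), k=3))  # a
```

With k=3 an unweighted majority vote would return "e" (two of three neighbors), whereas the weighted vote returns "a" because the single "a" sample is much closer to the query.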
Yang, Tao; Jia, Quanzhang; Guo, Hong; Xu, Jianzhong; Bai, Yun; Yang, Kai; Luo, Fei; Zhang, Zehua; Hou, Tianyong
2012-06-01
To investigate the effects of genetic factors on idiopathic scoliosis (IS) and genetic modes through genetic epidemiological survey on IS in Chongqing City, China, and to determine whether SH3GL1, GADD45B, and FGF22 in the chromosome 19p13.3 are the pathogenic genes of IS through genetic sequence analysis. 214 nuclear families were investigated to analyse the age incidence, familial aggregation, and heritability. SH3GL1, GADD45B, and FGF22 were chosen as candidate genes for mutation screening in 56 IS patients of 214 families. The sequence alignment analysis was performed to determine mutations and predict the protein structure. The average age of onset of 10.8 years suggests that IS is an early-onset disease. Incidences of IS in first-, second-, third-degree relatives and the overall incidence in families (5.68%) were also significantly higher than that of the general population (1.04%). The U test indicated a significant difference, suggesting that IS has a familial aggregation. The heritability of first-degree relatives (77.68 ±10.39%), second-degree relatives (69.89 ±3.14%), and third-degree relatives (62.14 ±11.92%) illustrated that genetic factors play an important role in IS pathogenesis. The incidence of first-degree relatives (10.01%), second-degree relatives (2.55%) and third-degree relatives (1.76%) illustrated that IS is not in simple accord with monogenic Mendel's law but manifests as traits of multifactorial hereditary diseases. Sequence alignment of exons of SH3GL1, GADD45B, and FGF22 showed 17 base mutations, of which 16 mutations do not induce open reading frame (ORF) shift or amino acid changes whereas one mutation (C→T) in SH3GL1 results in formation of a termination codon, which alters the protein reading frame. Prediction analysis of protein sequence showed that the SH3GL1 mutant encoded a truncated protein, thus affecting the protein structure.
IS is a multifactorial genetic disease and SH3GL1 may be one of the pathogenic genes for IS.
Davis, John K.; Paoli, George C.; He, Zhongqi; Nadeau, Lloyd J.; Somerville, Charles C.; Spain, Jim C.
2000-01-01
Pseudomonas pseudoalcaligenes JS45 grows on nitrobenzene by a partially reductive pathway in which the intermediate hydroxylaminobenzene is enzymatically rearranged to 2-aminophenol by hydroxylaminobenzene mutase (HAB mutase). The properties of the enzyme, the reaction mechanism, and the evolutionary origin of the gene(s) encoding the enzyme are unknown. In this study, two open reading frames (habA and habB), each encoding an HAB mutase enzyme, were cloned from a P. pseudoalcaligenes JS45 genomic library and sequenced. The open reading frames encoding HabA and HabB are separated by 2.5 kb and are divergently transcribed. The deduced amino acid sequences of HabA and HabB are 44% identical. The HAB mutase specific activities in crude extracts of Escherichia coli clones synthesizing either HabA or HabB were similar to the specific activities of extracts of strain JS45 grown on nitrobenzene. HAB mutase activity in E. coli extracts containing HabB withstood heating at 85°C for 10 min, but extracts containing HabA were inactivated when they were heated at temperatures above 60°C. HAB mutase activity in extracts of P. pseudoalcaligenes JS45 grown on nitrobenzene exhibited intermediate temperature stability. Although both the habA gene and the habB gene conferred HAB mutase activity when they were separately cloned and expressed in E. coli, reverse transcriptase PCR analysis indicated that only habA is transcribed in P. pseudoalcaligenes JS45. A mutant strain derived from strain JS45 in which the habA gene was disrupted was unable to grow on nitrobenzene, which provided physiological evidence that HabA is involved in the degradation of nitrobenzene. A strain in which habB was disrupted grew on nitrobenzene. Gene Rv3078 of Mycobacterium tuberculosis H37Rv encodes a protein whose deduced amino acid sequence is 52% identical to the HabB amino acid sequence. E. coli containing M. tuberculosis gene Rv3078 cloned into pUC18 exhibited low levels of HAB mutase activity. 
Sequences that exhibit similarity to transposable element sequences are present between habA and habB, as well as downstream of habB, which suggests that horizontal gene transfer resulted in acquisition of one or both of the hab genes. PMID:10877793
Unitary circular code motifs in genomes of eukaryotes.
El Soufi, Karim; Michel, Christian J
A set X of 20 trinucleotides was identified in genes of bacteria, eukaryotes, plasmids and viruses, which has in average the highest occurrence in reading frame compared to its two shifted frames (Michel, 2015; Arquès and Michel, 1996). This set X has an interesting mathematical property as X is a circular code (Arquès and Michel, 1996). Thus, the motifs from this circular code X, called X motifs, have the property to always retrieve, synchronize and maintain the reading frame in genes. The origin of this circular code X in genes is an open problem since its discovery in 1996. Here, we first show that the unitary circular codes (UCC), i.e. sets of one word, allow to generate unitary circular code motifs (UCC motifs), i.e. a concatenation of the same motif (simple repeats) leading to low complexity DNA. Three classes of UCC motifs are studied here: repeated dinucleotides (D+ motifs), repeated trinucleotides (T+ motifs) and repeated tetranucleotides (T+ motifs). Thus, the D+, T+ and T+ motifs allow to retrieve, synchronize and maintain a frame modulo 2, modulo 3 and modulo 4, respectively, and their shifted frames (1 modulo 2; 1 and 2 modulo 3; 1, 2 and 3 modulo 4 according to the C2, C3 and C4 properties, respectively) in the DNA sequences. The statistical distribution of the D+, T+ and T+ motifs is analyzed in the genomes of eukaryotes. A UCC motif and its complementary UCC motif have the same distribution in the eukaryotic genomes. Furthermore, a UCC motif and its complementary UCC motif have increasing occurrences contrary to their number of hydrogen bonds, very significant with the T+ motifs. The longest D+, T+ and T+ motifs in the studied eukaryotic genomes are also given. Surprisingly, a scarcity of repeated trinucleotides (T+ motifs) in the large eukaryotic genomes is observed compared to the D+ and T+ motifs. This result has been investigated and may be explained by two outcomes.
Repeated trinucleotides (T+ motifs) are identified in the X motifs of low composition (cardinality less than 10) in the genomes of eukaryotes. Furthermore, identical trinucleotide pairs of the circular code X are preferentially used in the gene sequences of eukaryotes. These two results suggest that the unitary circular codes of trinucleotides may have been involved in the formation of the trinucleotide circular code X. Indeed, repeated trinucleotides in the X motifs in the genomes of eukaryotes may represent an intermediary evolution from repeated trinucleotides of cardinality 1 (T+ motifs) in the genomes of eukaryotes up to the X motifs of cardinality 20 in the gene sequences of eukaryotes. Copyright © 2017 Elsevier B.V. All rights reserved.
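The notion of a unitary circular code motif as a concatenation of one repeated word can be illustrated by searching a sequence for its longest tandem repeat of a given unit. This is only a sketch of simple-repeat detection, not the authors' statistical analysis; the function name `longest_repeat`, the `min_copies` threshold and the toy sequence are assumptions.

```python
import re

def longest_repeat(seq, unit, min_copies=2):
    """Return (start, copies) for the longest tandem run of `unit` in `seq`, or None.

    start is the 0-based offset of the run; start % len(unit) gives its frame
    modulo the unit length.
    """
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_copies))
    best = None
    for m in pattern.finditer(seq):
        copies = (m.end() - m.start()) // len(unit)
        if best is None or copies > best[1]:
            best = (m.start(), copies)
    return best

print(longest_repeat("TTACACACGGACACACACACT", "AC"))  # (10, 5)
```

In the toy sequence there are two runs of the dinucleotide AC (3 copies at offset 2 and 5 copies at offset 10), and the longer one is reported.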
The nucleotide sequence and genome organization of Plasmopara halstedii virus.
Heller-Dohmen, Marion; Göpfert, Jens C; Pfannstiel, Jens; Spring, Otmar
2011-03-17
Only very few viruses of Oomycetes have been studied in detail. Isometric virions were found in different isolates of the oomycete Plasmopara halstedii, the downy mildew pathogen of sunflower. However, complete nucleotide sequences and data on the genome organization were lacking. Viral RNA of different P. halstedii isolates was subjected to nucleotide sequencing and analysis of the viral genome. The N-terminal sequence of the viral coat protein was determined using Top-Down MALDI-TOF analysis. The complete nucleotide sequences of both single-stranded RNA segments (RNA1 and RNA2) were established. RNA1 consisted of 2793 nucleotides (nt) excluding its 3' poly(A) tract and a single open-reading frame (ORF1) of 2745 nt. ORF1 was framed by a 5' untranslated region (5' UTR) of 18 nt and a 3' untranslated region (3' UTR) of 30 nt. ORF1 contained motifs of RNA-dependent RNA polymerases (RdRp) and showed similarities to RdRp of Sclerophthora macrospora virus A (SmV A) and viruses within the Nodaviridae family. RNA2 consisted of 1526 nt excluding its 3' poly(A) tract and a second ORF (ORF2) of 1128 nt. ORF2 coded for the single viral coat protein (CP) and was framed by a 5' UTR of 164 nt and a 3' UTR of 234 nt. The deduced amino acid sequence of ORF2 was verified by nano-LC-ESI-MS/MS experiments. Top-Down MALDI-TOF analysis revealed the N-terminal sequence of the CP. The N-terminal sequence represented a region within ORF2 suggesting a proteolytic processing of the CP in vivo. The CP showed similarities to CP of SmV A and viruses within the Tombusviridae family. Fragments of RNA1 (ca. 1.9 kb) and RNA2 (ca. 1.4 kb) were used to analyze the nucleotide sequence variation of virions in different P. halstedii isolates. Viral sequence variation was 0.3% or less regardless of their host's pathotypes, the geographical origin and the sensitivity towards the fungicide metalaxyl. The results showed the presence of a single and new virus type in different P. halstedii isolates. 
Insignificant viral sequence variation indicated that the virus did not account for differences in pathogenicity of the oomycete P. halstedii.
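The segment layout reported above (5' UTR, ORF, 3' UTR, total length) can be checked for internal consistency with a few lines of Python. This is only a sketch: the dictionary layout and function name are hypothetical, and the residue count assumes that each reported ORF length includes one stop codon, which the abstract does not state.

```python
# Segment lengths in nucleotides, as reported in the abstract (poly(A) tract excluded).
segments = {
    "RNA1": {"utr5": 18, "orf": 2745, "utr3": 30, "total": 2793},
    "RNA2": {"utr5": 164, "orf": 1128, "utr3": 234, "total": 1526},
}

def check_segment(s):
    """Return (lengths add up, ORF is a whole number of codons, encoded residues)."""
    adds_up = s["utr5"] + s["orf"] + s["utr3"] == s["total"]
    whole_codons = s["orf"] % 3 == 0
    residues = s["orf"] // 3 - 1  # assumption: the ORF length includes one stop codon
    return adds_up, whole_codons, residues

for name, seg in segments.items():
    print(name, check_segment(seg))
```

Under these assumptions both segments are consistent: the three parts sum to the reported total and each ORF is a whole number of codons.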
Zhang, Chunxiao; Sheng, Chaolan; Wang, Wei; Hu, Hongbo; Peng, Huasong; Zhang, Xuehong
2015-01-01
Streptomyces lomondensis S015 synthesizes the broad-spectrum phenazine antibiotic lomofungin. Whole genome sequencing of this strain revealed a genomic locus consisting of 23 open reading frames that includes the core phenazine biosynthesis gene cluster lphzGFEDCB. lomo10, encoding a putative flavin-dependent monooxygenase, was also identified in this locus. Inactivation of lomo10 by in-frame partial deletion resulted in the biosynthesis of a new phenazine metabolite, 1-carbomethoxy-6-formyl-4,9-dihydroxy-phenazine, along with the absence of lomofungin. This result suggests that lomo10 is responsible for the hydroxylation of lomofungin at its C-7 position. This is the first description of a phenazine hydroxylation gene in Streptomyces, and the results of this study lay the foundation for further investigation of phenazine metabolite biosynthesis in Streptomyces. PMID:26305803
Draft Genome Sequencing and Comparative Analysis of Aspergillus sojae NBRC4239
Sato, Atsushi; Oshima, Kenshiro; Noguchi, Hideki; Ogawa, Masahiro; Takahashi, Tadashi; Oguma, Tetsuya; Koyama, Yasuji; Itoh, Takehiko; Hattori, Masahira; Hanya, Yoshiki
2011-01-01
We conducted genome sequencing of the filamentous fungus Aspergillus sojae NBRC4239 isolated from the koji used to prepare Japanese soy sauce. We used the 454 pyrosequencing technology and investigated the genome with respect to enzymes and secondary metabolites in comparison with other Aspergilli sequenced. Assembly of 454 reads generated a non-redundant sequence of 39.5-Mb possessing 13 033 putative genes and 65 scaffolds composed of 557 contigs. Of the 2847 open reading frames with Pfam domain scores of >150 found in A. sojae NBRC4239, 81.7% had a high degree of similarity with the genes of A. oryzae. Comparative analysis identified serine carboxypeptidase and aspartic protease genes unique to A. sojae NBRC4239. While A. oryzae possessed three copies of the α-amylase gene, A. sojae NBRC4239 possessed only a single copy. Comparison of 56 gene clusters for secondary metabolites between A. sojae NBRC4239 and A. oryzae revealed that 24 clusters were conserved, whereas 32 clusters differed between them, including a deletion of 18 508 bp containing mfs1, mao1, dmaT, and pks-nrps for cyclopiazonic acid (CPA) biosynthesis, which explains the lack of CPA production in A. sojae. The A. sojae NBRC4239 genome data will be useful to characterize functional features of the koji moulds used in Japanese industries. PMID:21659486
Massouras, Andreas; Decouttere, Frederik; Hens, Korneel; Deplancke, Bart
2010-07-01
High-throughput sequencing (HTS) is revolutionizing our ability to obtain cheap, fast and reliable sequence information. Many experimental approaches are expected to benefit from the incorporation of such sequencing features in their pipeline. Consequently, software tools that facilitate such an incorporation should be of great interest. In this context, we developed WebPrInSeS, a web server tool allowing automated full-length clone sequence identification and verification using HTS data. WebPrInSeS encompasses two separate software applications. The first is WebPrInSeS-C which performs automated sequence verification of user-defined open-reading frame (ORF) clone libraries. The second is WebPrInSeS-E, which identifies positive hits in cDNA or ORF-based library screening experiments such as yeast one- or two-hybrid assays. Both tools perform de novo assembly using HTS data from any of the three major sequencing platforms. Thus, WebPrInSeS provides a highly integrated, cost-effective and efficient way to sequence-verify or identify clones of interest. WebPrInSeS is available at http://webprinses.epfl.ch/ and is open to all users.
Massouras, Andreas; Decouttere, Frederik; Hens, Korneel; Deplancke, Bart
2010-01-01
High-throughput sequencing (HTS) is revolutionizing our ability to obtain cheap, fast and reliable sequence information. Many experimental approaches are expected to benefit from the incorporation of such sequencing features in their pipeline. Consequently, software tools that facilitate such an incorporation should be of great interest. In this context, we developed WebPrInSeS, a web server tool allowing automated full-length clone sequence identification and verification using HTS data. WebPrInSeS encompasses two separate software applications. The first is WebPrInSeS-C which performs automated sequence verification of user-defined open-reading frame (ORF) clone libraries. The second is WebPrInSeS-E, which identifies positive hits in cDNA or ORF-based library screening experiments such as yeast one- or two-hybrid assays. Both tools perform de novo assembly using HTS data from any of the three major sequencing platforms. Thus, WebPrInSeS provides a highly integrated, cost-effective and efficient way to sequence-verify or identify clones of interest. WebPrInSeS is available at http://webprinses.epfl.ch/ and is open to all users. PMID:20501601
Bodewes, R; Kik, M J L; Raj, V Stalin; Schapendonk, C M E; Haagmans, B L; Smits, S L; Osterhaus, A D M E
2013-06-01
Arenaviruses are bi-segmented negative-stranded RNA viruses, which were until recently only detected in rodents and humans. Now highly divergent arenaviruses have been identified in boid snakes with inclusion body disease (IBD). Here, we describe the identification of a new species and variants of the highly divergent arenaviruses, which were detected in tissues of captive boid snakes with IBD in The Netherlands by next-generation sequencing. Phylogenetic analysis of the complete sequence of the open reading frames of the four predicted proteins of one of the detected viruses revealed that this virus was most closely related to the recently identified Golden Gate virus, while considerable sequence differences were observed between the highly divergent arenaviruses detected in this study. These findings add to the recent identification of the highly divergent arenaviruses in boid snakes with IBD in the United States and indicate that these viruses also circulate among boid snakes in Europe.
Quan, Phenix-Lan; Junglen, Sandra; Tashmukhamedova, Alla; Conlan, Sean; Hutchison, Stephen K.; Kurth, Andreas; Ellerbrok, Heinz; Egholm, Michael; Briese, Thomas; Leendertz, Fabian H.; Ian Lipkin, W
2009-01-01
Characterization of arboviruses at the interface of pristine habitats and anthropogenic landscapes is crucial to comprehensive emergent disease surveillance and forecasting efforts. In the context of a surveillance campaign in and around a West African rainforest, particles morphologically consistent with rhabdoviruses were identified in cell cultures infected with homogenates of trapped mosquitoes. RNA recovered from these cultures was used to derive the first complete genome sequence of a rhabdovirus isolated from Culex decens mosquitoes in Côte d’Ivoire, tentatively named Moussa virus (MOUV). MOUV shows the classical genome organization of rhabdoviruses, with five open reading frames (ORF) in a linear order. However, sequences show only limited conservation (12–33% identity at amino acid level), and ORF2 and ORF3 have no significant similarity to sequences deposited in GenBank. Phylogenetic analysis indicates a potential new species with distant relationship to Tupaia and Tibrogargan virus. PMID:19804801
Chen, Tianbao; Gagliardo, Ron; Walker, Brian; Zhou, Mei; Shaw, Chris
2005-12-01
Phylloxin is a novel prototype antimicrobial peptide from the skin of Phyllomedusa bicolor. Here, we describe parallel identification and sequencing of phylloxin precursor transcript (mRNA) and partial gene structure (genomic DNA) from the same sample of lyophilized skin secretion using our recently-described cloning technique. The open-reading frame of the phylloxin precursor was identical in nucleotide sequence to that previously reported and alignment with the nucleotide sequence derived from genomic DNA indicated the presence of a 175 bp intron located in a near identical position to that found in the dermaseptins. The highly-conserved structural organization of skin secretion peptide genes in P. bicolor can thus be extended to include that encoding phylloxin (plx). These data further reinforce our assertion that application of the described methodology can provide robust genomic/transcriptomic/peptidomic data without the need for specimen sacrifice.
Ribosomal protein S14 transcripts are edited in Oenothera mitochondria.
Schuster, W; Unseld, M; Wissinger, B; Brennicke, A
1990-01-01
The gene encoding ribosomal protein S14 (rps14) in Oenothera mitochondria is located upstream of the cytochrome b gene (cob). Sequence analysis of independently derived cDNA clones covering the entire rps14 coding region shows two nucleotides edited from the genomic DNA to the mRNA derived sequences by C to U modifications. A third editing event occurs four nucleotides upstream of the AUG initiation codon and improves a potential ribosome binding site. A CGG codon specifying arginine in a position conserved in evolution between chloroplasts and E. coli as a UGG tryptophan codon is not edited in any of the cDNAs analysed. An inverted repeat 3' of an unidentified open reading frame is located upstream of the rps14 gene. The inverted repeat sequence is highly conserved at analogous regions in other Oenothera mitochondrial loci. PMID:2326162
Quan, Phenix-Lan; Junglen, Sandra; Tashmukhamedova, Alla; Conlan, Sean; Hutchison, Stephen K; Kurth, Andreas; Ellerbrok, Heinz; Egholm, Michael; Briese, Thomas; Leendertz, Fabian H; Lipkin, W Ian
2010-01-01
Characterization of arboviruses at the interface of pristine habitats and anthropogenic landscapes is crucial to comprehensive emergent disease surveillance and forecasting efforts. In the context of a surveillance campaign in and around a West African rainforest, particles morphologically consistent with rhabdoviruses were identified in cell cultures infected with homogenates of trapped mosquitoes. RNA recovered from these cultures was used to derive the first complete genome sequence of a rhabdovirus isolated from Culex decens mosquitoes in Côte d'Ivoire, tentatively named Moussa virus (MOUV). MOUV shows the classical genome organization of rhabdoviruses, with five open reading frames (ORF) in a linear order. However, sequences show only limited conservation (12-33% identity at amino acid level), and ORF2 and ORF3 have no significant similarity to sequences deposited in GenBank. Phylogenetic analysis indicates a potential new species with distant relationship to Tupaia and Tibrogargan virus.
Amino acid sequence of the Amur tiger prion protein.
Wu, Changde; Pang, Wanyong; Zhao, Deming
2006-10-01
Prion diseases are fatal neurodegenerative disorders in humans and animals associated with conformational conversion of a cellular prion protein (PrP(C)) into the pathologic isoform (PrP(Sc)). Various data indicate that the polymorphisms within the open reading frame (ORF) of PrP are associated with susceptibility and control the species barrier in prion diseases. In the present study, partial Prnp sequences from 25 Amur tigers (tPrnp) were cloned and screened for polymorphisms. Four single nucleotide polymorphisms (T423C, A501G, C511A, A610G) were found; the C511A and A610G nucleotide substitutions resulted in the amino acid changes Lysine171Glutamine and Alanine204Threonine, respectively. The tPrnp amino acid sequence is similar to those of the house cat (Felis catus) and sheep, but differs significantly from two other cat Prnp sequences that were previously deposited in GenBank.
Tsai, M H; Saier, M H
1995-06-01
Electron transfer flavoproteins (ETF) are alpha beta-heterodimers found in eukaryotic mitochondria and bacteria. We have identified currently sequenced protein members of the ETF-alpha and ETF-beta families. Members of these two families include (a) the ETF subunits of mammals and bacteria, (b) homologous pairs of proteins (FixB/FixA) that are essential for nitrogen fixation in some bacteria, and (c) a pair of carnitine-inducible proteins encoded by two open reading frames in Escherichia coli (YaaQ and YaaR). These three groups of proteins comprise three clusters on both the ETF-alpha and ETF-beta phylogenetic trees, separated from each other by comparable phylogenetic distances. This fact suggests that these two protein families evolved with similar overall rates of evolutionary divergence. Relative regions of sequence conservation are evaluated, and signature sequences for both families are derived.
Grebenok, R J; Galbraith, D W; Penna, D D
1997-08-01
We report the characterization of a higher-plant C-24 sterol methyltransferase by yeast complementation. A Zea mays endosperm expressed sequence tag (EST) was identified which, upon complete sequencing, showed 46% identity to the yeast C-24 methyltransferase gene (ERG6) and 75% and 37% amino acid identity to recently isolated higher-plant sterol methyltransferases from soybean and Arabidopsis, respectively. When placed under GALA regulation, the Z. mays cDNA functionally complemented the erg6 mutation, restoring ergosterol production and conferring resistance to cycloheximide. Complementation was both plasmid-dependent and galactose-inducible. The Z. mays cDNA clone contains an open reading frame encoding a 40 kDa protein containing motifs common to a large number of S-adenosyl-L-methionine methyltransferases (SMTs). Sequence comparisons and functional studies of the maize, soybean and Arabidopsis cDNAs indicate that two types of C-24 SMTs exist in higher plants.
Liu, Li-Jun; You, Xiao-Yan; Zheng, Huajun; Wang, Shengyue; Jiang, Cheng-Ying; Liu, Shuang-Jiang
2011-07-01
The genome of the metal sulfide-oxidizing, thermoacidophilic strain Metallosphaera cuprina Ar-4 has been completely sequenced and annotated. Originally isolated from a sulfuric hot spring, strain Ar-4 grows optimally at 65°C and a pH of 3.5. The M. cuprina genome has a 1,840,348-bp circular chromosome (2,029 open reading frames [ORFs]) and is 16% smaller than the previously sequenced Metallosphaera sedula genome. Compared to the M. sedula genome, there are no counterpart genes in the M. cuprina genome for about 480 ORFs in the M. sedula genome, of which 243 ORFs are annotated as hypothetical protein genes. Still, there are 233 ORFs uniquely occurring in M. cuprina. Genome annotation supports that M. cuprina lives a facultative life on CO(2) and organics and obtains energy from oxidation of sulfidic ores and reduced inorganic sulfuric compounds.
Johnson, K S; Wells, K; Bock, J V; Nene, V; Taylor, D W; Cordingley, J S
1989-08-01
We report the sequence of a cDNA clone encoding an 86-kDa polypeptide antigen (p86) from Schistosoma mansoni. Fusion proteins made in Escherichia coli are recognized by human infection sera. The reading frame of this antigen is highly homologous to those of the large heat-shock proteins of Saccharomyces cerevisiae (HSP90) and Drosophila melanogaster (HSP83). mRNA encoding p86 increases in response to heat shock of adult worms, as does HSP70. Comparisons of the sequences of HSP70 and HSP83 homologues show that these two families of heat-shock proteins are not significantly related except for the last four amino acid residues, which are Glu-Glu-Val-Asp in every case. This sequence is not found at the carboxy terminus of any other protein in the current databases.
Fu, Xiao-Zhe; Shi, Cun-Bin; Li, Ning-Qiu; Pan, Hou-Jun; Chang, Ou-Qin; Wu, Shu-Qin
2007-09-01
The major capsid protein (MCP) gene of lymphocystis disease virus isolated from Rachycentron canadum (LCDV-rc) was amplified and analysed. A 457 bp DNA core fragment was amplified with degenerate primers designed according to the conserved sequences of the MCP gene of iridoviruses, then the flanking sequences adjacent to the core region were amplified by inverse PCR, and the complete sequence was obtained by combining all of them. The open reading frame of the gene is 1380 bp in length, encoding a putative protein of 459 aa with molecular weight 51.12 kD and pI 6.87. A phylogenetic tree constructed from the MCP amino acid sequences of iridoviruses indicated that LCDV-rc is most homologous to the other lymphocystis viruses and that all of them constitute one branch. Accordingly, LCDV-rc is identified as a Lymphocystivirus.
Design and construction of 2A peptide-linked multicistronic vectors.
Szymczak-Workman, Andrea L; Vignali, Kate M; Vignali, Dario A A
2012-02-01
The need for reliable, multicistronic vectors for multigene delivery is at the forefront of biomedical technology. This article describes the design and construction of 2A peptide-linked multicistronic vectors, which can be used to express multiple proteins from a single open reading frame (ORF). The small 2A peptide sequences, when cloned between genes, allow for efficient, stoichiometric production of discrete protein products within a single vector through a novel "cleavage" event within the 2A peptide sequence. Expression of more than two genes using conventional approaches has several limitations, most notably imbalanced protein expression and large size. The use of 2A peptide sequences alleviates these concerns. They are small (18-22 amino acids) and have divergent amino-terminal sequences, which minimizes the chance for homologous recombination and allows for multiple, different 2A peptide sequences to be used within a single vector. Importantly, separation of genes placed between 2A peptide sequences is nearly 100%, which allows for stoichiometric and concordant expression of the genes, regardless of the order of placement within the vector.
Toxins of Prokaryotic Toxin-Antitoxin Systems with Sequence-Specific Endoribonuclease Activity
Masuda, Hisako; Inouye, Masayori
2017-01-01
Protein translation is the most common target of toxin-antitoxin system (TA) toxins. Sequence-specific endoribonucleases digest RNA in a sequence-specific manner, thereby blocking translation. While past studies mainly focused on the digestion of mRNA, recent analysis revealed that toxins can also digest tRNA, rRNA and tmRNA. Purified toxins can digest single-stranded portions of RNA containing recognition sequences in the absence of ribosome in vitro. However, increasing evidence suggests that in vivo digestion may occur in association with ribosomes. Despite the prevalence of recognition sequences in many mRNA, preferential digestion seems to occur at specific positions within mRNA and also in certain reading frames. In this review, a variety of tools utilized to study the nuclease activities of toxins over the past 15 years will be reviewed. A recent adaptation of an RNA-seq-based technique to analyze entire sets of cellular RNA will be introduced with an emphasis on its strength in identifying novel targets and redefining recognition sequences. The differences in biochemical properties and postulated physiological roles will also be discussed. PMID:28420090
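To make the idea of frame-dependent cleavage concrete, the sketch below lists every occurrence of a recognition motif in an mRNA together with its position relative to the codon grid. It is a minimal illustration only: the function name and toy sequence are hypothetical, and the motif ACA (commonly cited for E. coli MazF) is used purely as an example input and is not taken from this abstract.

```python
def motif_frames(mrna: str, motif: str, cds_start: int):
    """List (position, frame) for each occurrence of `motif`, overlapping ones included.

    frame is (position - cds_start) % 3, i.e. 0 when the motif starts on a codon boundary.
    """
    hits = []
    pos = mrna.find(motif)
    while pos != -1:
        hits.append((pos, (pos - cds_start) % 3))
        pos = mrna.find(motif, pos + 1)  # step by one base so overlapping sites are kept
    return hits

# Toy mRNA whose coding sequence starts at index 0.
print(motif_frames("AUGACAACACA", "ACA", 0))
```

Grouping the hits by `frame` would show whether candidate sites cluster in one reading frame, which is the kind of bias the review describes.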
Unit-length line-1 transcripts in human teratocarcinoma cells.
Skowronski, J; Fanning, T G; Singer, M F
1988-01-01
We have characterized the approximately 6.5-kilobase cytoplasmic poly(A)+ Line-1 (L1) RNA present in a human teratocarcinoma cell line, NTera2D1, by primer extension and by analysis of cloned cDNAs. The bulk of the RNA begins (5' end) at the residue previously identified as the 5' terminus of the longest known primate genomic L1 elements, presumed to represent "unit" length. Several of the cDNA clones are close to 6 kilobase pairs, that is, close to full length. The partial sequences of 18 cDNA clones and full sequence of one (5,975 base pairs) indicate that many different genomic L1 elements contribute transcripts to the 6.5-kilobase cytoplasmic poly(A)+ RNA in NTera2D1 cells because no 2 of the 19 cDNAs analyzed had identical sequences. The transcribed elements appear to represent a subset of the total genomic L1s, a subset that has a characteristic consensus sequence in the 3' noncoding region and a high degree of sequence conservation throughout. Two open reading frames (ORFs) of 1,122 (ORF1) and 3,852 (ORF2) bases, flanked by about 800 and 200 bases of sequence at the 5' and 3' ends, respectively, can be identified in the cDNAs. Both ORFs are in the same frame, and they are separated by 33 bases bracketed by two conserved in-frame stop codons. ORF 2 is interrupted by at least one randomly positioned stop codon in the majority of the cDNAs. The data support proposals suggesting that the human L1 family includes one or more functional genes as well as an extraordinarily large number of pseudogenes whose ORFs are broken by stop codons. The cDNA structures suggest that both genes and pseudogenes are transcribed. At least one of the cDNAs (cD11), which was sequenced in its entirety, could, in principle, represent an mRNA for production of the ORF1 polypeptide. The similarity of mammalian L1s to several recently described invertebrate movable elements defines a new widely distributed class of elements which we term class II retrotransposons. PMID:2454389
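The statement that both ORFs lie in the same frame follows from simple modular arithmetic on the reported lengths, which the sketch below spells out. The function names are hypothetical, and the ORF1 start coordinate of 800 is only a placeholder for the "about 800" bases of 5' flanking sequence; the result does not depend on it.

```python
def frame_of(start: int) -> int:
    """Reading frame (0, 1 or 2) of a feature whose first base is at 0-based `start`."""
    return start % 3

def same_frame(orf1_start: int, orf1_len: int, spacer_len: int) -> bool:
    """True when an ORF beginning `spacer_len` bases after ORF1 ends shares ORF1's frame."""
    orf2_start = orf1_start + orf1_len + spacer_len
    return frame_of(orf1_start) == frame_of(orf2_start)

# ORF1 of 1,122 bases followed by a 33-base spacer: 1,155 is a multiple of 3.
print(same_frame(800, 1122, 33))
```

Because 1,122 + 33 = 1,155 = 3 × 385, the second ORF necessarily starts in the same frame as the first, consistent with the abstract.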
Cocho, Germinal; Miramontes, Pedro; Mansilla, Ricardo; Li, Wentian
2014-12-01
We examine the relationship between exponential correlation functions and Markov models in a bacterial genome in detail. Despite the well known fact that Markov models generate sequences with correlation function that decays exponentially, simply constructed Markov models based on nearest-neighbor dimer (first-order), trimer (second-order), up to hexamer (fifth-order), and treating the DNA sequence as being homogeneous all fail to predict the value of exponential decay rate. Even reading-frame-specific Markov models (both first- and fifth-order) could not explain the fact that the exponential decay is very slow. Starting with the in-phase coding-DNA-sequence (CDS), we investigated correlation within a fixed-codon-position subsequence, and in artificially constructed sequences by packing CDSs with out-of-phase spacers, as well as altering CDS length distribution by imposing an upper limit. From these targeted analyses, we conclude that the correlation in the bacterial genomic sequence is mainly due to a mixing of heterogeneous statistics at different codon positions, and the decay of correlation is due to the possible out-of-phase between neighboring CDSs. There are also small contributions to the correlation from bases at the same codon position, as well as by non-coding sequences. These show that the seemingly simple exponential correlation functions in bacterial genome hide a complexity in correlation structure which is not suitable for a modeling by Markov chain in a homogeneous sequence. Other results include: use of the (absolute value) second largest eigenvalue to represent the 16 correlation functions and the prediction of a 10-11 base periodicity from the hexamer frequencies. Copyright © 2014 Elsevier Ltd. All rights reserved.
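The link between Markov models, exponentially decaying correlation and the second largest eigenvalue can be seen in the simplest possible case. The sketch below uses a two-state chain (for example purine versus pyrimidine), which is a deliberate simplification and not the four-letter, reading-frame-specific models analysed in the paper; the function names and transition probabilities are hypothetical. For such a chain the second eigenvalue is 1 - p01 - p10, and the autocovariance at lag k is proportional to its k-th power.

```python
def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def correlation(p01: float, p10: float, k: int) -> float:
    """Autocovariance at lag k of the indicator 'state is 0' for a stationary two-state chain."""
    P = [[1 - p01, p01], [p10, 1 - p10]]
    Pk = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(k):
        Pk = matmul2(Pk, P)
    pi0 = p10 / (p01 + p10)  # stationary probability of state 0
    return pi0 * (Pk[0][0] - pi0)

def closed_form(p01: float, p10: float, k: int) -> float:
    """Same quantity written with the second eigenvalue of P (the first eigenvalue is 1)."""
    lam = 1 - p01 - p10
    pi0 = p10 / (p01 + p10)
    return pi0 * (1 - pi0) * lam ** k

for lag in range(6):
    print(lag, correlation(0.3, 0.2, lag), closed_form(0.3, 0.2, lag))
```

The two columns agree at every lag, so the decay rate of the correlation is set by the second eigenvalue alone; the abstract's point is that this simple picture underestimates how slowly correlation decays in a real bacterial genome.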
Su, Zhipeng; Zhu, Jiawen; Xu, Zhuofei; Xiao, Ran; Zhou, Rui; Li, Lu; Chen, Huanchun
2016-01-01
Actinobacillus pleuropneumoniae is the causative agent of porcine contagious pleuropneumonia, a highly contagious respiratory disease of swine. Although the genome of A. pleuropneumoniae was sequenced several years ago, limited information is available on the genome-wide transcriptional analysis to accurately annotate the gene structures and regulatory elements. High-throughput RNA sequencing (RNA-seq) has been applied to study the transcriptional landscape of bacteria, which can efficiently and accurately identify gene expression regions and unknown transcriptional units, especially small non-coding RNAs (sRNAs), UTRs and regulatory regions. The aim of this study is to comprehensively analyze the transcriptome of A. pleuropneumoniae by RNA-seq in order to improve the existing genome annotation and promote our understanding of A. pleuropneumoniae gene structures and RNA-based regulation. In this study, we utilized RNA-seq to construct a single nucleotide resolution transcriptome map of A. pleuropneumoniae. More than 3.8 million high-quality reads (average length ~90 bp) from a cDNA library were generated and aligned to the reference genome. We identified 32 open reading frames encoding novel proteins that were mis-annotated in the previous genome annotations. The start sites for 35 genes based on the current genome annotation were corrected. Furthermore, 51 sRNAs in the A. pleuropneumoniae genome were discovered, of which 40 sRNAs had not been reported in previous studies. The transcriptome map also enabled visualization of 5'- and 3'-UTR regions, which contained 11 sRNAs. In addition, 351 operons covering 1230 genes throughout the whole genome were identified. The RNA-Seq based transcriptome map validated annotated genes and corrected annotations of open reading frames in the genome, and led to the identification of many functional elements (e.g. regions encoding novel proteins, non-coding sRNAs and operon structures). 
The transcriptional units described in this study provide a foundation for future studies concerning the gene functions and the transcriptional regulatory architectures of this pathogen. PMID:27018591
Onofre, Cláudia; Tomé, Filipa; Barbosa, Cristina; Silva, Ana Luísa
2015-01-01
The gene encoding human hemojuvelin (HJV) is one of the genes that, when mutated, can cause juvenile hemochromatosis, an early-onset inherited disorder associated with iron overload. The 5′ untranslated region of the human HJV mRNA has two upstream open reading frames (uORFs), with 28 and 19 codons formed by two upstream AUGs (uAUGs) sharing the same in-frame stop codon. Here we show that these uORFs decrease the translational efficiency of the downstream main ORF in HeLa and HepG2 cells. Indeed, ribosomal access to the main AUG is conditioned by the strong uAUG context, which results in the first uORF being translated most frequently. The reach of the main ORF is then achieved by ribosomes that resume scanning after uORF translation. Furthermore, the amino acid sequences of the uORF-encoded peptides also reinforce the translational repression of the main ORF. Interestingly, when iron levels increase, translational repression is relieved specifically in hepatic cells. The upregulation of protein levels occurs along with phosphorylation of the eukaryotic initiation factor 2α. Nevertheless, our results support a model in which the increasing recognition of the main AUG is mediated by a tissue-specific factor that promotes uORF bypass. These results support a tight HJV translational regulation involved in iron homeostasis. PMID:25666510
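The arrangement described above, two upstream AUGs in the same frame that share one stop codon, can be found in any 5' untranslated region with a short scan. The following Python sketch is illustrative only: the function name and the toy sequence are hypothetical (the real HJV 5' UTR is not reproduced here), and the codon count it reports excludes the stop codon, which may or may not match the counting convention behind the 28 and 19 codons given in the abstract.

```python
STOPS = {"TAA", "TAG", "TGA"}

def find_uorfs(utr5: str):
    """Return (atg_index, stop_index, codons) for every upstream ATG with an in-frame stop.

    `codons` counts the start codon and the sense codons before the stop;
    the stop codon itself is not counted.
    """
    utr5 = utr5.upper()
    hits = []
    for i in range(len(utr5) - 2):
        if utr5[i:i + 3] != "ATG":
            continue
        # Walk codon by codon from the ATG until the first in-frame stop.
        for j in range(i + 3, len(utr5) - 2, 3):
            if utr5[j:j + 3] in STOPS:
                hits.append((i, j, (j - i) // 3))
                break
    return hits

# Toy 5' UTR with two in-frame ATGs (indices 2 and 8) that end at the same TAA (index 14).
print(find_uorfs("GGATGAAAATGCCCTAAGG"))
```

Upstream ATGs that return the same `stop_index` are in the same frame and form nested uORFs, as reported for HJV.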
Roux, K H; Greenberg, A S; Greene, L; Strelets, L; Avila, D; McKinney, E C; Flajnik, M F
1998-09-29
We recently have identified an antigen receptor in sharks called NAR (new or nurse shark antigen receptor) that is secreted by splenocytes but does not associate with Ig light (L) chains. The NAR variable (V) region undergoes high levels of somatic mutation and is equally divergent from both Ig and T cell receptors (TCR). Here we show by electron microscopy that NAR V regions, unlike those of conventional Ig and TCR, do not form dimers but rather are independent, flexible domains. This unusual feature is analogous to bona fide camelid IgG in which modifications of Ig heavy chain V (VH) sequences prevent dimer formation with L chains. NAR also displays a uniquely flexible constant (C) region. Sequence analysis and modeling show that there are only two types of expressed NAR genes, each having different combinations of noncanonical cysteine (Cys) residues in the V domains that likely form disulfide bonds to stabilize the single antigen-recognition unit. In one NAR class, rearrangement events result in mature genes encoding an even number of Cys (two or four) in complementarity-determining region 3 (CDR3), which is analogous to Cys codon expression in an unusual human diversity (D) segment family. The NAR CDR3 Cys generally are encoded by preferred reading frames of rearranging D segments, providing a clear design for use of preferred reading frame in antigen receptor D regions. These unusual characteristics shared by NAR and unconventional mammalian Ig are most likely the result of convergent evolution at the molecular level.
Grant, Susan; Grant, William D; Cowan, Don A; Jones, Brian E; Ma, Yanhe; Ventosa, Antonio; Heaphy, Shaun
2006-01-01
Here we describe the application of metagenomic technologies to construct cDNA libraries from RNA isolated from environmental samples. RNAlater (Ambion) was shown to stabilize RNA in environmental samples for periods of at least 3 months at -20 degrees C. Protocols for library construction were established on total RNA extracted from Acanthamoeba polyphaga trophozoites. The methodology was then used on algal mats from geothermal hot springs in Tengchong county, Yunnan Province, People's Republic of China, and activated sludge from a sewage treatment plant in Leicestershire, United Kingdom. The Tengchong libraries were dominated by RNA from prokaryotes, reflecting the mainly prokaryote microbial composition. The majority of these clones resulted from rRNA; only a few appeared to be derived from mRNA. In contrast, many clones from the activated sludge library had significant similarity to eukaryote mRNA-encoded protein sequences. A library was also made using polyadenylated RNA isolated from total RNA from activated sludge; many more clones in this library were related to eukaryotic mRNA sequences and proteins. Open reading frames (ORFs) up to 378 amino acids in size could be identified. Some resembled known proteins over their full length, e.g., 36% match to cystatin, 49% match to ribosomal protein L32, 63% match to ribosomal protein S16, 70% to CPC2 protein. The methodology described here permits the polyadenylated transcriptome to be isolated from environmental samples with no knowledge of the identity of the microorganisms in the sample or the necessity to culture them. It has many uses, including the identification of novel eukaryotic ORFs encoding proteins and enzymes.
Hayashi, Toshiaki; Koshino, Hiroyuki; Malon, Michal; Hirota, Hiroshi; Kudo, Toshiaki
2014-01-01
Comamonas testosteroni TA441 degrades steroids via aromatization and meta-cleavage of the A ring, followed by hydrolysis, and produces 9,17-dioxo-1,2,3,4,10,19-hexanorandrostan-5-oic acid as an intermediate compound. Herein, we identify a new intermediate compound, 9α-hydroxy-17-oxo-1,2,3,4,10,19-hexanorandrostan-5-oic acid. Open reading frame 28 (ORF28)- and ORF30-encoded acyl coenzyme A (acyl-CoA) dehydrogenase was shown to convert the CoA ester of 9α-hydroxy-17-oxo-1,2,3,4,10,19-hexanorandrostan-5-oic acid to the CoA ester of 9α-hydroxy-17-oxo-1,2,3,4,10,19-hexanorandrost-6-en-5-oic acid. A homology search of the deduced amino acid sequences suggested that the ORF30-encoded protein is a member of the acyl-CoA dehydrogenase_fadE6_17_26 family, whereas the deduced amino acid sequence of ORF28 showed no significant similarity to specific acyl-CoA dehydrogenase family proteins. Possible steroid degradation gene clusters similar to the cluster of TA441 appear in bacterial genome analysis data. In these clusters, ORFs similar to ORFs 28 and 30 are often found side by side and ordered in the same manner as ORFs 28 and 30. PMID:25092028
Mariottini, P; Chomyn, A; Riley, M; Cottrell, B; Doolittle, R F; Attardi, G
1986-01-01
In previous work, antibodies prepared against chemically synthesized peptides predicted from the DNA sequence were used to identify the polypeptides encoded in three of the eight unassigned reading frames (URFs) of human mitochondrial DNA (mtDNA). In the present study, this approach has been extended to other human mtDNA URFs. In particular, antibodies directed against the NH2-terminal octapeptide of the putative URF2 product specifically precipitated component 11 of the HeLa cell mitochondrial translation products, the reaction being inhibited by the specific peptide. Similarly, antibodies directed against the COOH-terminal nonapeptide of the putative URF4 product reacted specifically with components 4 and 5, and antibodies against a COOH-terminal heptapeptide of the presumptive URF4L product reacted specifically with component 26. Antibodies against the NH2-terminal heptapeptide of the putative product of URF5 reacted with component 1, but only to a marginal extent; however, the results of a trypsin fingerprinting analysis of component 1 point strongly to this component as being the authentic product of URF5. The polypeptide assignments to the mtDNA URFs analyzed here are supported by the relative electrophoretic mobilities of proteins 11, 4-5, 26, and 1, which are those expected for the molecular weights predicted from the DNA sequence for the products of URF2, URF4, URF4L, and URF5, respectively. With the present assignment, seven of the eight human mtDNA URFs have been shown to be expressed in HeLa cells. PMID:3456601
Cohen, P T; Cohen, P
1989-06-15
Infection of Escherichia coli with phage lambda gt10 resulted in the appearance of a protein phosphatase with activity towards 32P-labelled casein. Activity reached a maximum near the point of cell lysis and declined thereafter. The phosphatase was stimulated 30-fold by Mn2+, while Mg2+ and Ca2+ were much less effective. Activity was unaffected by inhibitors 1 and 2, okadaic acid, calmodulin and trifluoperazine, distinguishing it from the major serine/threonine-specific protein phosphatases of eukaryotic cells. The lambda phosphatase was also capable of dephosphorylating other substrates in the presence of Mn2+, although activity towards 32P-labelled phosphorylase was 10-fold lower, and activity towards phosphorylase kinase and glycogen synthase 25- to 50-fold lower than with casein. No casein phosphatase activity was present in either uninfected cells, or in E. coli infected with phage lambda gt11. Since lambda gt11 lacks part of the open reading frame (orf) 221, previously shown to encode a protein with sequence similarity to protein phosphatase-1 and protein phosphatase-2A of mammalian cells [Cohen, Collins, Coulson, Berndt & da Cruz e Silva (1988) Gene 69, 131-134], the results indicate that ORF221 is the protein phosphatase detected in cells infected with lambda gt10. Comparison of the sequence of ORF221 with other mammalian protein phosphatases defines three highly conserved regions which are likely to be essential for function. The first of these is deleted in lambda gt11.
Motivating Reading Comprehension: Concept-Oriented Reading Instruction
ERIC Educational Resources Information Center
Guthrie, John T., Ed.; Wigfield, Allan, Ed.; Perencevich, Kathleen C., Ed.
2004-01-01
Concept Oriented Reading Instruction (CORI) is a unique, classroom-tested model of reading instruction that breaks new ground by explicitly showing how content knowledge, reading strategies, and motivational support all merge in successful reading instruction. A theoretical perspective (engagement in reading) frames the book and provides a…
The primary structure of the Saccharomyces cerevisiae gene for 3-phosphoglycerate kinase.
Hitzeman, R A; Hagie, F E; Hayflick, J S; Chen, C Y; Seeburg, P H; Derynck, R
1982-01-01
The DNA sequence of the gene for the yeast glycolytic enzyme, 3-phosphoglycerate kinase (PGK), has been obtained by sequencing part of a 3.1 kbp HindIII fragment obtained from the yeast genome. The structural gene sequence corresponds to a reading frame of 1251 bp coding for 416 amino acids with no intervening DNA sequences. The amino acid sequence is approximately 65 percent homologous with human and horse PGK protein sequences and is in general agreement with the published protein sequence for yeast PGK. As for other highly expressed structural genes in yeast, the coding sequence is highly codon biased with 95 percent of the amino acids coded for by a select 25 codons (out of 61 possible). Besides structural DNA sequence, 291 bp of 5'-flanking sequence and 286 bp of 3'-flanking sequence were determined. Transcription starts 36 nucleotides upstream from the translational start and stops 86-93 nucleotides downstream from the translational stop. These results suggest a non-polyadenylated mRNA length of 1373 to 1380 nucleotides, which is consistent with the observed length of 1500 nucleotides for polyadenylated PGK mRNA. A sequence TATATATAAA is found at 145 nucleotides upstream from the translational start. This sequence resembles the TATAAA box that is possibly associated with RNA polymerase II binding. PMID:6296791
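The transcript-length estimate in the PGK abstract follows from adding the untranslated stretches to the reading frame. A short sketch of that arithmetic, using only figures from the abstract (variable and function names are illustrative):

```python
# Figures taken from the abstract; names are illustrative.
ORF_BP = 1251              # reading frame length in base pairs
UPSTREAM_NT = 36           # transcription start, upstream of translational start
DOWNSTREAM_NT = (86, 93)   # transcription stop, downstream of translational stop


def mrna_length_range(orf_bp, upstream_nt, downstream_range):
    """Return (min, max) non-polyadenylated mRNA length in nucleotides."""
    low, high = downstream_range
    return (upstream_nt + orf_bp + low, upstream_nt + orf_bp + high)


def codon_count(orf_bp):
    """Number of complete codons in a reading frame of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("reading frame length is not a multiple of 3")
    return orf_bp // 3


print(mrna_length_range(ORF_BP, UPSTREAM_NT, DOWNSTREAM_NT))  # (1373, 1380)
print(codon_count(ORF_BP))                                    # 417
```

The 417 codons are consistent with 416 amino acids plus one stop codon, assuming the 1251 bp count includes the stop codon.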
Suzuki, Shun'ichi; Takenaka, Yasuhiro; Onishi, Norimasa; Yokozeki, Kenzo
2005-08-01
A DNA fragment from Microbacterium liquefaciens AJ 3912, containing the genes responsible for the conversion of 5-substituted-hydantoins to alpha-amino acids, was cloned in Escherichia coli and sequenced. Seven open reading frames (hyuP, hyuA, hyuH, hyuC, ORF1, ORF2, and ORF3) were identified on the 7.5 kb fragment. The deduced amino acid sequence encoded by the hyuA gene included the N-terminal amino acid sequence of the hydantoin racemase from M. liquefaciens AJ 3912. The hyuA, hyuH, and hyuC genes were heterologously expressed in E. coli; their presence corresponded with the detection of hydantoin racemase, hydantoinase, and N-carbamoyl alpha-amino acid amido hydrolase enzymatic activities respectively. The deduced amino acid sequences of hyuP were similar to those of the allantoin (5-ureido-hydantoin) permease from Saccharomyces cerevisiae, suggesting that hyuP protein might function as a hydantoin transporter.
Isolation and cloning of a metalloproteinase from king cobra snake venom.
Guo, Xiao-Xi; Zeng, Lin; Lee, Wen-Hui; Zhang, Yun; Jin, Yang
2007-06-01
A 50 kDa fibrinogenolytic protease, ohagin, from the venom of Ophiophagus hannah was isolated by a combination of gel filtration, ion-exchange and heparin affinity chromatography. Ohagin specifically degraded the alpha-chain of human fibrinogen and the proteolytic activity was completely abolished by EDTA, but not by PMSF, suggesting it is a metalloproteinase. It dose-dependently inhibited platelet aggregation induced by ADP, TMVA and stejnulxin. The full sequence of ohagin was deduced by cDNA cloning and confirmed by protein sequencing and peptide mass fingerprinting. The full-length cDNA sequence of ohagin encodes an open reading frame of 611 amino acids that includes signal peptide, proprotein and mature protein comprising metalloproteinase, disintegrin-like and cysteine-rich domains, suggesting it belongs to P-III class metalloproteinase. In addition, P-III class metalloproteinases from the venom glands of Naja atra, Bungarus multicinctus and Bungarus fasciatus were also cloned in this study. Sequence analysis and phylogenetic analysis indicated that metalloproteinases from elapid snake venoms form a new subgroup of P-III SVMPs.
Collart, F R; Osipiuk, J; Trent, J; Olsen, G J; Huberman, E
1996-10-03
We have cloned and characterized the gene encoding inosine monophosphate dehydrogenase (IMPDH) from Pyrococcus furiosus (Pf), a hyperthermophilic archaeon. Sequence analysis of the Pf gene indicated an open reading frame specifying a protein of 485 amino acids (aa) with a calculated M(r) of 52900. Canonical Archaea promoter elements, Box A and Box B, are located -49 and -17 nucleotides (nt), respectively, upstream of the putative start codon. The sequence of the putative active-site region conforms to the IMPDH signature motif and contains a putative active-site cysteine. Phylogenetic relationships derived by using all available IMPDH sequences are consistent with trees developed for other molecules; they do not precisely resolve the history of Pf IMPDH but indicate a close similarity to bacterial IMPDH proteins. The phylogenetic analysis indicates that a gene duplication occurred prior to the division between rodents and humans, accounting for the Type I and II isoforms identified in mice and humans.
A third genotype of the human parvovirus PARV4 in sub-Saharan Africa.
Simmonds, Peter; Douglas, Jill; Bestetti, Giovanna; Longhi, Erika; Antinori, Spinello; Parravicini, Carlo; Corbellino, Mario
2008-09-01
PARV4 is a recently discovered human parvovirus widely distributed in injecting drug users in the USA and Europe, particularly in those co-infected with human immunodeficiency virus (HIV). Like parvovirus B19, PARV4 persists in previously exposed individuals. In bone marrow and lymphoid tissue, PARV4 sequences were detected in two sub-Saharan African study subjects with AIDS but without a reported history of parenteral exposure and who were uninfected with hepatitis C virus. PARV4 variants infecting these subjects were phylogenetically distinct from genotypes 1 and 2 (formerly PARV5) that were reported previously. Analysis of near-complete genome sequences demonstrated that they should be classified as a third (equidistant) PARV4 genotype. The availability of a further near-complete genome sequence of this novel genotype facilitated identification of conserved novel open reading frames embedded in the ORF2 coding sequence; one encoded a putative protein with identifiable homology to SAT proteins of members of the genus Parvovirus.
The influence of visual and vestibular orientation cues in a clock reading task.
Davidenko, Nicolas; Cheong, Yeram; Waterman, Amanda; Smith, Jacob; Anderson, Barrett; Harmon, Sarah
2018-05-23
We investigated how performance in the real-life perceptual task of analog clock reading is influenced by the clock's orientation with respect to egocentric, gravitational, and visual-environmental reference frames. In Experiment 1, we designed a simple clock-reading task and found that observers' reaction time to correctly tell the time depends systematically on the clock's orientation. In Experiment 2, we dissociated egocentric from environmental reference frames by having participants sit upright or lie sideways while performing the task. We found that both reference frames substantially contribute to response times in this task. In Experiment 3, we placed upright or rotated participants in an upright or rotated immersive virtual environment, which allowed us to further dissociate vestibular from visual cues to the environmental reference frame. We found evidence of environmental reference frame effects only when visual and vestibular cues were aligned. We discuss the implications for the design of remote and head-mounted displays. Copyright © 2018 Elsevier Inc. All rights reserved.
USDA-ARS's Scientific Manuscript database
A synthetic Candida antarctica lipase B (CALB) gene open reading frame (ORF) for expression in yeast was produced using an automated PCR assembly and DNA purification protocol on an integrated robotic workcell. The lycotoxin-1 (Lyt-1) C3 variant gene ORF was added in-frame with the CALB ORF to pote...
Framing effects reveal discrete lexical-semantic and sublexical procedures in reading: an fMRI study
Danelli, Laura; Marelli, Marco; Berlingeri, Manuela; Tettamanti, Marco; Sberna, Maurizio; Paulesu, Eraldo; Luzzatti, Claudio
2015-01-01
According to the dual-route model, a printed string of letters can be processed by either a grapheme-to-phoneme conversion (GPC) route or a lexical-semantic route. Although meta-analyses of the imaging literature support the existence of distinct but interacting reading procedures, individual neuroimaging studies that explored neural correlates of reading yielded inconclusive results. We used a list-manipulation paradigm to provide a fresh empirical look at this issue and to isolate specific areas that underlie the two reading procedures. In a lexical condition, we embedded disyllabic Italian words (target stimuli) in lists of either loanwords or trisyllabic Italian words with unpredictable stress position. In a GPC condition, similar target stimuli were included within lists of pseudowords. The procedure was designed to induce participants to emphasize either the lexical-semantic or the GPC reading procedure, while controlling for possible linguistic confounds and keeping the reading task requirements stable across the two conditions. Thirty-three adults participated in the behavioral study, and 20 further adult participants were included in the fMRI study. At the behavioral level, we found sizeable effects of the framing manipulations that included slower voice onset times for stimuli in the pseudoword frames. At the functional anatomical level, the occipital and temporal regions, and the intraparietal sulcus were specifically activated when subjects were reading target words in a lexical frame. The inferior parietal and anterior fusiform cortex were specifically activated in the GPC condition. These patterns of activation represented a valid classifying model of fMRI images associated with target reading in both frames in the multi-voxel pattern analyses. Further activations were shared by the two procedures in the occipital and inferior parietal areas, in the premotor cortex, in the frontal regions and the left supplementary motor area. 
These regions are most likely involved in either early input or late output processes. PMID:26441712
Sun, Lingling; Che, Kui; Zhao, Zhenzhen; Liu, Song; Xing, Xiaoming; Luo, Bing
2015-09-04
NK/T cell lymphoma is an aggressive lymphoma almost always associated with EBV. BamHI-A rightward open reading frame 1 (BARF1) and BamHI-H rightward open reading frame 1 (BHRF1) are two EBV early genes, which may be involved in the oncogenicity of EBV. It has been found that V29A strains, a BARF1 mutant subtype, showed higher prevalence in NPC, which may suggest the association between this variation and nasopharyngeal carcinoma (NPC). To characterize the sequence variation patterns of the Epstein-Barr virus (EBV) early genes and to elucidate their association with NK/T cell lymphoma, we analyzed the sequences of BARF1 and BHRF1 in EBV-positive NK/T cell lymphoma samples from Northern China. In situ hybridization (ISH) performed for EBV-encoded small RNA1 (EBER1) with specific digoxigenin-labeled probes was used to select the EBV positive lymphoma samples. Nested-polymerase chain reaction (nested-PCR) and DNA sequence analysis were used to obtain the sequences of BARF1 and BHRF1. The polymorphisms of these two genes were classified according to the signature changes and compared with the known corresponding EBV gene variation data. Two major subtypes of BARF1 gene, designated as B95-8 and V29A subtype, were identified. B95-8 subtype was the dominant subtype. The V29A subtype had one consistent amino acid change at amino acid residue 29 (V → A). Compared with B95-8, AA change at 88 (L → V) of BHRF1 was found in the majority of the isolates, and AA79 (V → L) mutation in a few isolates. Functional domains of BARF1 and BHRF1 were highly conserved. The distributions of BARF1 and BHRF1 subtypes had no significant differences among different EBV-associated malignancies and healthy donors. The sequences of BARF1 and BHRF1 are highly conserved, which may help maintain the biological function of these two genes. There is no evidence that particular EBV substrains of BARF1 or BHRF1 are region-restricted or disease-specific.
Verification of 2A peptide cleavage.
Szymczak-Workman, Andrea L; Vignali, Kate M; Vignali, Dario A A
2012-02-01
The need for reliable, multicistronic vectors for multigene delivery is at the forefront of biomedical technology. It is now possible to express multiple proteins from a single open reading frame (ORF) using 2A peptide-linked multicistronic vectors. These small sequences, when cloned between genes, allow for efficient, stoichiometric production of discrete protein products within a single vector through a novel "cleavage" event within the 2A peptide sequence. The easiest and most effective way to assess 2A cleavage is to perform transient transfection of 293T cells (human embryonic kidney cells) followed by western blot analysis, as described in this protocol. 293T cells are easy to grow and can be efficiently transfected with a variety of vectors. Cleavage can be assessed by detection with antibodies against the target proteins or anti-2A serum.
Phonological Constraints on the Assembly of Skeletal Structure in Reading
ERIC Educational Resources Information Center
Marom, Michal; Berent, Iris
2010-01-01
Linguistic research suggests that certain skeletal frames (e.g., CVC) are preferred to others (e.g., VCC). We examine whether such preferences constrain reading in the Stroop task. We demonstrate that CCVC nonwords facilitate naming the color "black" (/blaek/, a CCVC frame) relative to CVC controls. Conversely, CCVC items inhibit "red" (a CVC…
Smith, A R; Boursnell, M E; Binns, M M; Brown, T D; Inglis, S C
1990-01-01
Nucleotide sequences from the third open reading frame of mRNA D (D3) of infectious bronchitis virus (IBV) were expressed in bacteria as part of a fusion protein with beta-galactosidase. Antiserum raised in rabbits against this fusion protein immunoprecipitated from IBV-infected chick kidney or Vero cells a polypeptide of 12.4K, the size expected for a D3-encoded product. The D3 polypeptide is apparently non-glycosylated, and appears to be associated with the membrane fraction of infected cells, as judged by cell fractionation and immunofluorescence.
Sommer, J M; Nguyen, T T; Wang, C C
1994-08-15
Import of proteins into the glycosomes of T. brucei resembles the peroxisomal protein import in that C-terminal SKL-like tripeptide sequences can function as targeting signals. Many of the glycosomal proteins do not, however, possess such C-terminal tripeptide signals. Among these, phosphoenolpyruvate carboxykinase (PEPCK (ATP)) was thought to be targeted to the glycosomes by an N-terminal or an internal targeting signal. A limited similarity to the N-terminal targeting signal of rat peroxisomal thiolase exists at the N-terminus of T. brucei PEPCK. However, we found that this peroxisomal targeting signal does not function for glycosomal protein import in T. brucei. Further studies of the PEPCK gene revealed that the C-terminus of the predicted protein does not correspond to the previously deduced protein sequence of 472 amino acids due to a -1 frame shift error in the original DNA sequence. Readjusting the reading frame of the sequence results in a predicted protein of 525 amino acids in length ending in a tripeptide serine-arginine-leucine (SRL), which is a potential targeting signal for import into the glycosomes. A fusion protein of firefly luciferase, without its own C-terminal SKL targeting signal, and T. brucei PEPCK is efficiently imported into the glycosomes when expressed in procyclic trypanosomes. Deletion of the C-terminal SRL tripeptide or the last 29 amino acids of PEPCK reduced the import only by about 50%, while a deletion of the last 47 amino acids completely abolished the import. These results suggest that T. brucei PEPCK may contain a second, internal glycosomal targeting signal upstream of the C-terminal SRL sequence.
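The effect of a single-base frame shift on a predicted C-terminus, as described in the abstract above, can be illustrated with a toy translation. This is a sketch under stated assumptions: the 18-nt sequence is invented for illustration (it is not the PEPCK sequence), the standard genetic code is assumed, and the sequencing error is modeled as one missing base.

```python
# Standard genetic code, codons enumerated in TCAG order (NCBI table 1 layout).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from position 0 until a stop codon or the last full codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical sequence whose correct reading frame ends in Ser-Arg-Leu (SRL).
correct = "ATGGCTTCACGTCTGTAA"
# Model a sequencing error as one missing base (index 4), shifting the frame.
with_error = correct[:4] + correct[5:]

print(translate(correct))     # MASRL
print(translate(with_error))  # MVHVC
```

Every codon downstream of the missing base is read differently, so the C-terminal SRL tripeptide disappears from the predicted protein until the frame is readjusted.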
The complete genome sequence of freesia mosaic virus and its relationship to other potyviruses.
Choi, H I; Lim, H R; Song, Y S; Kim, M J; Choi, S H; Song, Y S; Bae, S C; Ryu, K H
2010-07-01
We have completed the genomic sequence of a potyvirus, freesia mosaic virus (FreMV), and compared it to those of other known potyviruses. The full-length genome sequence of FreMV consists of 9,489 nucleotides. The large protein contains 3,077 amino acids, with an AUG start codon and UAA stop codon, containing one open reading frame typical of a potyvirus polyprotein. The polyprotein of FreMV-Kr gives rise to eleven proteins (P1, HC-pro, P3, PIPO, 6K1, CI, 6K2, VPg, NIa, NIb and CP), and putative cleavage sites of each protein were identified by sequence comparison to those of other known potyviruses. Phylogenetic analysis of the polyprotein revealed that FreMV-Kr was most closely related to PeMoV and was related to BtMV, BaRMV and PeLMV, which belong to the BCMV subgroup. This is the first information on the complete genome structure of FreMV, and the sequence information clearly supports the status of FreMV as a member of a distinct species in the genus Potyvirus.
Teng, Y; Liu, H; Lv, J Q; Fan, W H; Zhang, Q Y; Qin, Q W
2007-01-01
The complete genome of spring viraemia of carp virus (SVCV) strain A-1 isolated from cultured common carp (Cyprinus carpio) in China was sequenced and characterized. Reverse transcription-polymerase chain reaction (RT-PCR) derived clones were constructed and the DNA was sequenced. It showed that the entire genome of SVCV A-1 consists of 11,100 nucleotide base pairs, the predicted size of the viral RNA of rhabdoviruses. However, the additional insertions in bp 4633-4676 and bp 4684-4724 of SVCV A-1 were different from the other two published SVCV complete genomes. Five open reading frames (ORFs) of SVCV A-1 were identified and further confirmed by RT-PCR and DNA sequencing of their respective RT-PCR products. The 5 structural proteins encoded by the viral RNA were ordered 3'-N-P-M-G-L-5'. This is the first report of a complete genome sequence of SVCV isolated from cultured carp in China. Phylogenetic analysis indicates that SVCV A-1 is closely related to the members of the genus Vesiculovirus, family Rhabdoviridae.
Artificial Intelligence, DNA Mimicry, and Human Health.
Stefano, George B; Kream, Richard M
2017-08-14
The molecular evolution of genomic DNA across diverse plant and animal phyla involved dynamic registrations of sequence modifications to maintain existential homeostasis to increasingly complex patterns of environmental stressors. As an essential corollary, driver effects of positive evolutionary pressure are hypothesized to effect concerted modifications of genomic DNA sequences to meet expanded platforms of regulatory controls for successful implementation of advanced physiological requirements. It is also clearly apparent that preservation of updated registries of advantageous modifications of genomic DNA sequences requires coordinate expansion of convergent cellular proofreading/error correction mechanisms that are encoded by reciprocally modified genomic DNA. Computational expansion of operationally defined DNA memory extends to coordinate modification of coding and previously under-emphasized noncoding regions that now appear to represent essential reservoirs of untapped genetic information amenable to evolutionary driven recruitment into the realm of biologically active domains. Additionally, expansion of DNA memory potential via chemical modification and activation of noncoding sequences is targeted to vertical augmentation and integration of an expanded cadre of transcriptional and epigenetic regulatory factors affecting linear coding of protein amino acid sequences within open reading frames.
The complete DNA sequence of lymphocystis disease virus.
Tidona, C A; Darai, G
1997-04-14
Lymphocystis disease virus (LCDV) is the causative agent of lymphocystis disease, which has been reported to occur in over 100 different fish species worldwide. LCDV is a member of the family Iridoviridae and the type species of the genus Lymphocystivirus. The virions contain a single linear double-stranded DNA molecule, which is circularly permuted, terminally redundant, and heavily methylated at cytosines in CpG sequences. The complete nucleotide sequence of LCDV-1 (flounder isolate) was determined by automated cycle sequencing and primer walking. The genome of LCDV-1 is 102,653 bp in length and contains 195 open reading frames with coding capacities ranging from 40 to 1199 amino acids. Computer-assisted analyses of the deduced amino acid sequences led to the identification of several putative gene products with significant homologies to entries in protein data banks, such as the two major subunits of the viral DNA-dependent RNA polymerase, DNA polymerase, several protein kinases, two subunits of the ribonucleoside diphosphate reductase, DNA methyltransferase, the viral major capsid protein, insulin-like growth factor, and tumor necrosis factor receptor homolog.
Dinsmore, P K; Klaenhammer, T R
1997-05-01
A spontaneous mutant of the lactococcal phage phi31 that is insensitive to the phage defense mechanism AbiA was characterized in an effort to identify the phage factor(s) involved in sensitivity of phi31 to AbiA. A point mutation was localized in the genome of the AbiA-insensitive phage (phi31A) by heteroduplex analysis of a 9-kb region. The mutation (G to T) was within a 738-bp open reading frame (ORF245) and resulted in an arginine-to-leucine change in the predicted amino acid sequence of the protein. The mutant phi31A-ORF245 reduced the sensitivity of phi31 to AbiA when present in trans, indicating that the mutation in ORF245 is responsible for the AbiA insensitivity of phi31A. Transcription of ORF245 occurs early in the phage infection cycles of phi31 and phi31A and is unaffected by AbiA. Expansion of the phi31 sequence revealed ORF169 (immediately upstream of ORF245) and ORF71 (which ends 84 bp upstream of ORF169). Two inverted repeats lie within the 84-bp region between ORF71 and ORF169. Sequence analysis of an independently isolated AbiA-insensitive phage, phi31B, identified a mutation (G to A) in one of the inverted repeats. A 118-bp fragment from phi31, encompassing the 84-bp region between ORF71 and ORF169, eliminates AbiA activity against phi31 when present in trans, establishing a relationship between AbiA and this fragment. The study of this region of phage phi31 has identified an open reading frame (ORF245) and a 118-bp DNA fragment that interact with AbiA and are likely to be involved in the sensitivity of this phage to AbiA.
High-Resolution Analysis of Coronavirus Gene Expression by RNA Sequencing and Ribosome Profiling
Jones, Joshua D.; Chung, Betty Y.-W.; Siddell, Stuart G.; Brierley, Ian
2016-01-01
Members of the family Coronaviridae have the largest genomes of all RNA viruses, typically in the region of 30 kilobases. Several coronaviruses, such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and Middle East respiratory syndrome-related coronavirus (MERS-CoV), are of medical importance, with high mortality rates and, in the case of SARS-CoV, significant pandemic potential. Other coronaviruses, such as Porcine epidemic diarrhea virus and Avian coronavirus, are important livestock pathogens. Ribosome profiling is a technique which exploits the capacity of the translating ribosome to protect around 30 nucleotides of mRNA from ribonuclease digestion. Ribosome-protected mRNA fragments are purified, subjected to deep sequencing and mapped back to the transcriptome to give a global “snap-shot” of translation. Parallel RNA sequencing allows normalization by transcript abundance. Here we apply ribosome profiling to cells infected with Murine coronavirus, mouse hepatitis virus, strain A59 (MHV-A59), a model coronavirus in the same genus as SARS-CoV and MERS-CoV. The data obtained allowed us to study the kinetics of virus transcription and translation with exquisite precision. We studied the timecourse of positive and negative-sense genomic and subgenomic viral RNA production and the relative translation efficiencies of the different virus ORFs. Virus mRNAs were not found to be translated more efficiently than host mRNAs; rather, virus translation dominates host translation at later time points due to high levels of virus transcripts. Triplet phasing of the profiling data allowed precise determination of translated reading frames and revealed several translated short open reading frames upstream of, or embedded within, known virus protein-coding regions. Ribosome pause sites were identified in the virus replicase polyprotein pp1a ORF and investigated experimentally. 
Contrary to expectations, ribosomes were not found to pause at the ribosomal frameshift site. To our knowledge this is the first application of ribosome profiling to an RNA virus. PMID:26919232
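Triplet phasing, mentioned in the abstract above, rests on the fact that translating ribosomes advance one codon at a time, so the ends of protected fragments fall preferentially in one of the three frames. The sketch below shows the idea in its simplest form. The read positions are made-up toy data, the function name is hypothetical, and real analyses typically also apply a read-length-specific offset to locate the ribosomal P site, which is omitted here.

```python
from collections import Counter


def frame_fractions(read_starts, cds_start):
    """Fraction of reads whose 5' end lies in frame 0, 1, or 2 of the CDS.

    read_starts: 5' end coordinates of mapped ribosome-protected fragments.
    cds_start:   coordinate of the first base of the start codon.
    """
    counts = Counter((pos - cds_start) % 3 for pos in read_starts)
    total = sum(counts.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [counts.get(frame, 0) / total for frame in range(3)]


# Toy data: most 5' ends share a frame with the start codon at position 100.
toy_reads = [100, 103, 106, 109, 110, 112, 115, 118, 121, 122]
print(frame_fractions(toy_reads, cds_start=100))  # [0.8, 0.2, 0.0]
```

A strong bias toward one frame across a region is the kind of signal that can indicate which reading frame is being translated there; a roughly even split would give no such indication.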
Guan, Wei; Shao, Jonathan; Elbeaino, Toufic; Davis, Robert E.; Zhao, Tingchang; Huang, Qi
2015-01-01
Xylella fastidiosa causes bacterial leaf scorch in many landscape trees including elm, oak, sycamore and mulberry, but methods for specific identification of a particular tree host species-limited strain or differentiation of tree-specific strains are lacking. It is also unknown whether a particular landscape tree-infecting X. fastidiosa strain is capable of infecting multiple landscape tree species in an urban environment. We developed two PCR primers specific for mulberry-infecting strains of X. fastidiosa based on the nucleotide sequence of a unique open reading frame identified only in mulberry-infecting strains among all the North and South American strains of X. fastidiosa sequenced to date. PCR using the primers allowed for detection and identification of mulberry-infecting X. fastidiosa strains in cultures and in samples collected from naturally infected mulberry trees. In addition, no mixed infections with or non-specific detections of the mulberry-infecting strains of X. fastidiosa were found in naturally X. fastidiosa-infected oak, elm and sycamore trees growing in the same region where naturally infected mulberry trees were grown. This genotype-specific PCR assay will be valuable for disease diagnosis, studies of strain-specific infections in insects and plant hosts, and management of diseases caused by X. fastidiosa. Unexpectedly but interestingly, the unique open reading frame conserved in the mulberry-infecting strains in the U. S. was also identified in the recently sequenced olive-associated strain CoDiRO isolated in Italy. When the primer set was tested against naturally infected olive plant samples collected in Italy, it allowed for detection of olive-associated strains of X. fastidiosa in Italy. This PCR assay, therefore, will also be useful for detection and identification of the Italian group of X. fastidiosa strains to aid understanding of the occurrence, evolution and biology of this new group of X. fastidiosa strains. PMID:26061051
Silverman, Lee R.; Phipps, Andrew J.; Montgomery, Andrew; Ratner, Lee; Lairmore, Michael D.
2004-01-01
Human T-cell lymphotropic virus type 1 (HTLV-1) causes adult T-cell leukemia/lymphoma and exhibits high genetic stability in vivo. HTLV-1 contains four open reading frames (ORFs) in its pX region. ORF II encodes two proteins, p30II and p13II, both of which are incompletely characterized. p30II localizes to the nucleus or nucleolus and has distant homology to the transcription factors Oct-1, Pit-1, and POU-M1. In vitro studies have demonstrated that at low concentrations, p30II differentially regulates cellular and viral promoters through an interaction with CREB binding protein/p300. To determine the in vivo significance of p30II, we inoculated rabbits with cell lines expressing either a wild-type clone of HTLV-1 (ACH.1) or a clone containing a mutation in ORF II, which eliminated wild-type p30II expression (ACH.30.1). ACH.1-inoculated rabbits maintained higher HTLV-1-specific antibody titers than ACH.30.1-inoculated rabbits, and all ACH.1-inoculated rabbits were seropositive for HTLV-1, whereas only two of six ACH.30.1-inoculated rabbits were seropositive. Provirus could be consistently PCR amplified from peripheral blood mononuclear cell (PBMC) DNA in all ACH.1-inoculated rabbits but in only three of six ACH.30.1-inoculated rabbits. Quantitative competitive PCR indicated higher PBMC proviral loads in ACH.1-inoculated rabbits. Interestingly, sequencing of ORF II from PBMC of provirus-positive ACH.30.1-inoculated rabbits revealed a reversion to wild-type sequence with evidence of early coexistence of mutant and wild-type sequence. Our data provide evidence that HTLV-1 must maintain its key accessory genes to survive in vivo and that in vivo pressures select for maintenance of wild-type ORF II gene products during the early course of infection. PMID:15047799
Nedelcu, Aurora M.; Lee, Robert W.; Lemieux, Claude; Gray, Michael W.; Burger, Gertraud
2000-01-01
Two distinct mitochondrial genome types have been described among the green algal lineages investigated to date: a reduced–derived, Chlamydomonas-like type and an ancestral, Prototheca-like type. To determine if this unexpected dichotomy is real or is due to insufficient or biased sampling and to define trends in the evolution of the green algal mitochondrial genome, we sequenced and analyzed the mitochondrial DNA (mtDNA) of Scenedesmus obliquus. This genome is 42,919 bp in size and encodes 42 conserved genes (i.e., large and small subunit rRNA genes, 27 tRNA and 13 respiratory protein-coding genes), four additional free-standing open reading frames with no known homologs, and an intronic reading frame with endonuclease/maturase similarity. No 5S rRNA or ribosomal protein-coding genes have been identified in Scenedesmus mtDNA. The standard protein-coding genes feature a deviant genetic code characterized by the use of UAG (normally a stop codon) to specify leucine, and the unprecedented use of UCA (normally a serine codon) as a signal for termination of translation. The mitochondrial genome of Scenedesmus combines features of both green algal mitochondrial genome types: the presence of a more complex set of protein-coding and tRNA genes is shared with the ancestral type, whereas the lack of 5S rRNA and ribosomal protein-coding genes as well as the presence of fragmented and scrambled rRNA genes are shared with the reduced–derived type of mitochondrial genome organization. Furthermore, the gene content and the fragmentation pattern of the rRNA genes suggest that this genome represents an intermediate stage in the evolutionary process of mitochondrial genome streamlining in green algae. [The sequence data described in this paper have been submitted to the GenBank data library under accession no. AF204057.] PMID:10854413
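The two codon reassignments described in this abstract (UAG read as leucine, UCA read as a stop) can be sketched as a small change to a standard codon table. The table layout below follows the usual TCAG ordering of the standard genetic code; the names `STANDARD`, `SCENEDESMUS_MT`, and `translate`, and the toy sequence, are illustrative assumptions. Only the two deviations named in the abstract are applied.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Deviations reported in the abstract: UAG -> Leu, UCA -> stop.
SCENEDESMUS_MT = dict(STANDARD, TAG="L", TCA="*")

def translate(dna, table):
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = table[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

toy = "ATGTAGGCTTCAAAA"  # made-up sequence, not from Scenedesmus mtDNA
print(translate(toy, STANDARD))        # stops at TAG
print(translate(toy, SCENEDESMUS_MT))  # reads through TAG, stops at TCA
```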
Puthoff, D P; Neelam, A; Ehrenfried, M L; Scheffler, B E; Ballard, L; Song, Q; Campbell, K B; Cooper, B; Tucker, M L
2008-10-01
Hyphae, 2 to 8 days postinoculation (dpi), and haustoria, 5 dpi, were isolated from Uromyces appendiculatus infected bean leaves (Phaseolus vulgaris cv. Pinto 111) and a separate cDNA library prepared for each fungal preparation. Approximately 10,000 hyphae and 2,700 haustoria clones were sequenced from both the 5' and 3' ends. Assembly of all of the fungal sequences yielded 3,359 contigs and 927 singletons. The U. appendiculatus sequences were compared with sequence data for other rust fungi, Phakopsora pachyrhizi, Uromyces fabae, and Puccinia graminis. The U. appendiculatus haustoria library included a large number of genes with unknown cellular function; however, summation of sequences of known cellular function suggested that haustoria at 5 dpi had fewer transcripts linked to protein synthesis in favor of energy metabolism and nutrient uptake. In addition, open reading frames in the U. appendiculatus data set with an N-terminal signal peptide were identified and compared with other proteins putatively secreted from rust fungi. In this regard, a small family of putatively secreted RTP1-like proteins was identified in U. appendiculatus and P. graminis.
Generation of 2A-linked multicistronic cassettes by recombinant PCR.
Szymczak-Workman, Andrea L; Vignali, Kate M; Vignali, Dario A A
2012-02-01
The need for reliable, multicistronic vectors for multigene delivery is at the forefront of biomedical technology. It is now possible to express multiple proteins from a single open reading frame (ORF) using 2A peptide-linked multicistronic vectors. These small sequences, when cloned between genes, allow for efficient, stoichiometric production of discrete protein products within a single vector through a novel "cleavage" event within the 2A peptide sequence. Expression of more than two genes using conventional approaches has several limitations, most notably imbalanced protein expression and large size. The use of 2A peptide sequences alleviates these concerns. They are small (18-22 amino acids) and have divergent amino-terminal sequences, which minimizes the chance for homologous recombination and allows for multiple, different 2A peptide sequences to be used within a single vector. Importantly, separation of genes placed between 2A peptide sequences is nearly 100%, which allows for stoichiometric and concordant expression of the genes, regardless of the order of placement within the vector. This protocol describes the use of recombinant polymerase chain reaction (PCR) to connect multiple 2A-linked protein sequences. The final construct is subcloned into an expression vector.
Hinnant, Amanda; Oh, Hyun Jee; Caburnay, Charlene A; Kreuter, Matthew W
2011-12-01
News stories reporting race-specific health information commonly emphasize disparities between racial groups. But recent research suggests this focus on disparities has unintended effects on African American audiences, generating negative emotions and less interest in preventive behaviors (Nicholson RA, Kreuter MW, Lapka C et al. Unintended effects of emphasizing disparities in cancer communication to African-Americans. Cancer Epidemiol Biomarkers Prev 2008; 17: 2946-52). That study found that black adults are more interested in cancer screening after reading about the progress African Americans have made in fighting cancer than after reading stories emphasizing disparities between blacks and whites. This study builds on past findings by (i) examining how health journalists judge the newsworthiness of stories that report race-specific health information by emphasizing disparities versus progress and (ii) determining whether these judgments can be changed by informing journalists of audience reactions to disparity versus progress framing. In a double-blind randomized experiment, 175 health journalists read either a disparity- or progress-framed story on colon cancer, preceded by either an inoculation about audience effects of such framing or an unrelated (i.e., control) information stimulus. Journalists rated the disparity-frame story more favorably than the progress-frame story in every category of news values. However, the inoculation significantly increased positive reactions to the progress-frame story. Informing journalists of audience reactions to race-specific health information could influence how health news stories are framed.
Kinchington, P R; Vergnes, J P; Defechereux, P; Piette, J; Turse, S E
1994-01-01
Four of the 68 varicella-zoster virus (VZV) unique open reading frames (ORFs), i.e., ORFs 4, 61, 62, and 63, encode proteins that influence viral transcription and are considered to be positional homologs of herpes simplex virus type 1 (HSV-1) immediate-early (IE) proteins. In order to identify the elements that regulate transcription of VZV ORFs 4 and 63, the encoded mRNAs were mapped in detail. For ORF 4, a major 1.8-kb and a minor 3.0-kb polyadenylated [poly(A)+] RNA were identified, whereas ORF 63-specific probes recognized 1.3- and 1.9-kb poly(A)+ RNAs. Probes specific for sequences adjacent to the ORFs and mapping of the RNA 3' ends indicated that the ORF 4 RNAs were 3' coterminal, whereas the RNAs for ORF 63 represented two different termination sites. S1 nuclease mapping and primer extension analyses indicated a single transcription initiation site for ORF 4 at 38 bp upstream of the ORF start codon. For ORF 63, multiple transcriptional start sites at 87 to 95, 151 to 153, and (tentatively) 238 to 243 bp upstream of the ORF start codon were identified. TATA box motifs at good positional locations were found upstream of all mapped transcription initiation sites. However, no sequences resembling the TAATGARAT motif, which confers IE regulation upon HSV-1 IE genes, were found. The finding of the absence of this motif was supported through analyses of the regulatory sequences of ORFs 4 and 63 in transient transfection assays alongside those of ORFs 61 and 62. Sequences representing the promoters for ORFs 4, 61, and 63 were all stimulated by VZV infection but failed to be stimulated by coexpression with the HSV-1 transactivator Vmw65. In contrast, the promoter for ORF 62, which contains TAATGARAT motifs, was activated by VZV infection and coexpression with Vmw65. These results extend the transcriptional knowledge for VZV and suggest that ORFs 4 and 63 contain regulatory signals different from those of the ORF 62 and HSV-1 IE genes. PMID:8189496
IL26 gene inactivation in Equidae.
Shakhsi-Niaei, M; Drögemüller, M; Jagannathan, V; Gerber, V; Leeb, T
2013-12-01
Interleukin-26 (IL26) is a member of the IL10 cytokine family. The IL26 gene is located between two other well-known cytokines genes of this family encoding interferon-gamma (IFNG) and IL22 in an evolutionary conserved gene cluster. In contrast to humans and most other mammals, mice lack a functional Il26 gene. We analyzed the genome sequences of other vertebrates for the presence or absence of functional IL26 orthologs and found that the IL26 gene has also become inactivated in several equid species. We detected a one-base pair frameshift deletion in exon 2 of the IL26 gene in the domestic horse (Equus caballus), Przewalski horse (Equus przewalskii) and donkey (Equus asinus). The remnant IL26 gene in the horse is still transcribed and gives rise to at least five alternative transcripts. None of these transcripts share a conserved open reading frame with the human IL26 gene. A comparative analysis across diverse vertebrates revealed that the IL26 gene has also independently been inactivated in a few other mammals, including the African elephant and the European hedgehog. The IL26 gene thus appears to be highly variable, and the conserved open reading frame has been lost several times during mammalian evolution. © 2013 The Authors, Animal Genetics © 2013 Stichting International Foundation for Animal Genetics.
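The consequence of a one-base-pair deletion for the downstream reading frame can be shown with a toy example. The sequence below is made up and is not derived from the IL26 gene; `codons` and `delete_base` are hypothetical helper names used only for this sketch.

```python
def codons(seq):
    """Split a sequence into complete codons, starting at position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]

def delete_base(seq, index):
    """Return seq with the single base at the given 0-based index removed."""
    return seq[:index] + seq[index + 1:]

toy = "ATGGCCAAATGA"                 # ATG GCC AAA TGA (stop in frame)
print(codons(toy))
print(codons(delete_base(toy, 4)))   # every codon after the deletion changes
```

In the toy sequence the original in-frame stop codon is no longer read in frame after the deletion, which is the general reason a frameshift disrupts a conserved open reading frame.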
Donzé, O; Spahr, P F
1992-01-01
The Rous sarcoma virus (RSV) RNA leader sequence carries three open reading frames (uORFs) upstream of the AUG initiator of the gag gene. We studied, in vivo, the role of these uORFs by changing two or three nucleotides of the three AUGs or by deleting the first uORF. Our results show that (i) unlike most previously characterized uORFs, which decrease translation, the first uORF (AUG1) of RSV acts as an enhancer of translation, since absence of the first AUG decreased translation; AUG3 also modulates translation, probably by interfering with scanning ribosomes as described for other upstream ORFs, and mutation of AUG2 had no effect on translation. (ii) Mutation of each of the upstream AUGs lowered the infectivity of progeny virions. (iii) Unexpectedly, mutation of AUG1 and/or AUG3 dramatically reduced RNA packaging by 50- to 100-fold, unlike mutation of AUG2, which did not alter RNA packaging efficiency. Additional mutants in the vicinity of uORF1 and uORF3 were constructed in order to elucidate the mechanism by which uORFs affect RNA packaging: a translation model requiring uORFs 1 and 3, and involving ribosome pausing at AUG3, is discussed. PMID:1327749
Novel Genes Encoding Hexadecanoic Acid Δ6-Desaturase Activity in a Rhodococcus sp.
Araki, Hiroyuki; Hagihara, Hiroshi; Takigawa, Hirofumi; Tsujino, Yukiharu; Ozaki, Katsuya
2016-11-01
cis-6-Hexadecenoic acid, a major component of human sebaceous lipids, is involved in the defense mechanism against Staphylococcus aureus infection in healthy skin and closely related to atopic dermatitis. Previously, Koike et al. (Biosci Biotechnol Biochem 64:1064-1066, 2000) reported that a mutant strain of Rhodococcus sp. produced cis-6-hexadecenoate derivatives from palmitate alkyl esters. From the mutant Rhodococcus strain, we identified and sequenced two open reading frames present in an amplified 5.7-kb region; these open reading frames encoded tandemly repeated Δ6-desaturase-like genes, Rdes1 and Rdes2. A phylogenetic tree indicated that Rdes1 and Rdes2 were different from previously known Δ6-desaturase genes, and that they formed a new cluster. Rdes1 and Rdes2 were each introduced into vectors and then expressed separately in Escherichia coli, and the fatty acid composition of the transformed cells was analyzed by gas chromatography and mass spectrometry. The amount of cis-6-hexadecenoic acid was significantly higher in Rdes1- or Rdes2-transformed E. coli cells (twofold and threefold, respectively) than in vector-only control cells. These results showed that cis-6-hexadecenoic acid was produced in E. coli cells by the rhodococcal Δ6-desaturase-like proteins.
Sequence verification of synthetic DNA by assembly of sequencing reads
Wilson, Mandy L.; Cai, Yizhi; Hanlon, Regina; Taylor, Samantha; Chevreux, Bastien; Setubal, João C.; Tyler, Brett M.; Peccoud, Jean
2013-01-01
Gene synthesis attempts to assemble user-defined DNA sequences with base-level precision. Verifying the sequences of construction intermediates and the final product of a gene synthesis project is a critical part of the workflow, yet one that has received the least attention. Sequence validation is equally important for other kinds of curated clone collections. Ensuring that the physical sequence of a clone matches its published sequence is a common quality control step performed at least once over the course of a research project. GenoREAD is a web-based application that breaks the sequence verification process into two steps: the assembly of sequencing reads and the alignment of the resulting contig with a reference sequence. GenoREAD can determine if a clone matches its reference sequence. Its sophisticated reporting features help identify and troubleshoot problems that arise during the sequence verification process. GenoREAD has been experimentally validated on thousands of gene-sized constructs from an ORFeome project, and on longer sequences including whole plasmids and synthetic chromosomes. Comparing GenoREAD results with those from manual analysis of the sequencing data demonstrates that GenoREAD tends to be conservative in its diagnostic. GenoREAD is available at www.genoread.org. PMID:23042248
Li, Runsheng; Hsieh, Chia-Ling; Young, Amanda; Zhang, Zhihong; Ren, Xiaoliang; Zhao, Zhongying
2015-01-01
Most next-generation sequencing platforms permit acquisition of high-throughput DNA sequences, but the relatively short read length limits their use in genome assembly or finishing. Illumina has recently released a technology called Synthetic Long-Read Sequencing that can produce reads of unusual length, i.e., predominantly around 10 Kbp. However, a systematic assessment of their use in genome finishing and assembly is still lacking. We evaluate the promise and deficiencies of the long reads in these aspects using an isogenic C. elegans genome with no gaps. First, the reads are highly accurate and capable of recovering most types of repetitive sequences. However, the presence of tandem repetitive sequences prevents pre-assembly of long reads in the relevant genomic region. Second, the reads are able to reliably detect missing but not extra sequences in the C. elegans genome. Third, the reads of smaller size are more capable of recovering repetitive sequences than those of bigger size. Fourth, at least 40 Kbp of missing genomic sequence is recovered in the C. elegans genome using the long reads. Finally, an N50 contig size of at least 86 Kbp can be achieved with 24× reads but with substantial mis-assembly errors, highlighting a need for novel assembly algorithms for the long reads. PMID:26039588
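The N50 contig size quoted in this abstract is a standard assembly statistic. A minimal sketch of how it is commonly computed is given below; the function name `n50` and the toy contig lengths are illustrative and do not come from the paper.

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs supplied")

# Toy assembly: total length 32, so half is 16; the two longest contigs reach it.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because N50 depends only on contig lengths, a high value says nothing about correctness, which is consistent with the abstract's caveat about mis-assembly errors.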
Mouse mammary tumor virus-like gene sequences are present in lung patient specimens
2011-01-01
Background: Previous studies have reported the presence of Murine Mammary Tumor Virus (MMTV)-like gene sequences in human cancer tissue specimens. Here, we search for MMTV-like gene sequences in lung disease specimens, including carcinomas, from a Mexican population. This study was based on our previous study reporting that the INER51 lung cancer cell line, from a pleural effusion of a Mexican patient, contains MMTV-like env gene sequences. Results: The MMTV-like env gene sequences were detected in three out of 18 specimens studied, by PCR using a specific set of MMTV-like primers. The three identified MMTV-like gene sequences, which were designated INER6, HZ101, and HZ14, were 99%, 98%, and 97% homologous, respectively, to GenBank sequence accession number AY161347. The INER6 and HZ-101 samples were isolated from lung cancer specimens, and the HZ-14 was isolated from an acute inflammatory lung infiltrate sample. Two of the env sequences exhibited disruption of the reading frame due to mutations. Conclusion: In summary, we identified MMTV-like gene sequences in 2 out of 11 (18%) of the lung carcinomas and 1 out of 7 (14%) of the acute inflammatory lung infiltrate specimens studied from a Mexican population. PMID:21943279
Zahn-Zabal, M.; Lehmann, E.; Kohli, J.
1995-01-01
The M26 mutation in the ade6 gene of Schizosaccharomyces pombe creates a hot spot of meiotic recombination. A single base substitution, the M26 mutation is situated within the open reading frame, near the 5' end. It has previously been shown that the heptanucleotide sequence 5' ATGACGT 3', which includes the M26 mutation, is required for hot spot activity. The 510-bp ade6-delXB deletion encompasses the promoter and the first 23 bp of the open reading frame, ending 112 bp upstream of M26. Deletion of the promoter in cis to M26 abolishes hot spot activity, while deletion in trans to M26 has no effect. Homozygous deletion of the promoter also eliminates M26 hot spot activity, indicating that the heterology created through deletion of the promoter per se is not responsible for the loss of hot spot activity. Thus, DNA sequences other than the heptanucleotide 5' ATGACGT 3', which must be located at the 5' end of the ade6 gene, appear to be required for hot spot activity. While the M26 hotspot stimulates crossovers associated with M26 conversion, it does not affect the crossover frequency in the intervals adjacent to ade6. The flanking marker ura4-aim, a heterology created by insertion of the ura4(+) gene upstream of ade6, turned out to be a hot spot itself. It shows disparity of conversion with preferential loss of the insertion. The frequency of conversion at ura4-aim is reduced when the M26 hot spot is active 15 kb away, indicating competition for recombination factors by hot spots in close proximity. PMID:7498729
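Locating the heptanucleotide quoted in this abstract within a sequence can be sketched as a simple motif scan. Searching both strands is an assumption made here for illustration, and the names `find_motif` and `reverse_complement` as well as the toy sequences are hypothetical; none of this reproduces the authors' analysis.

```python
import re

M26_HEPTAMER = "ATGACGT"  # heptanucleotide quoted in the abstract

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_motif(seq, motif=M26_HEPTAMER):
    """Return sorted (0-based start, strand) pairs for every motif occurrence."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        # A lookahead is used so that overlapping occurrences are also reported.
        for m in re.finditer(f"(?={pattern})", seq):
            hits.append((m.start(), strand))
    return sorted(hits)

print(find_motif("GGATGACGTCC"))  # motif on the given strand
print(find_motif("ttacgtcatt"))   # motif on the opposite strand
```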
Sun, Fei; Du, Wenhua; Ma, Junhua; Gu, Mingjun; Wang, Jingnan; Zhu, Hongling; Song, Huaidong; Gao, Guanqi
2018-06-11
Neonatal diabetes mellitus is likely caused by monogenic mutations, several of which have been identified. INS mutations have a broad spectrum of clinical presentations, ranging from severe neonatal onset to mild adult onset, which suggests that the products of different mutant INS alleles behave differently and utilize distinct mechanisms to induce diabetes. In this study, a neonatal diabetes mellitus patient's INS gene was sequenced, and functional experiments were conducted. The neonatal diabetes mellitus patient's genomic DNA was extracted, and the patient's KCNJ11, ABCC8, and INS genes were sequenced. A novel mutation was identified in INS, and the open reading frame of this human mutant INS gene was inserted into the pMSCV-PIG plasmid. The constructed pMSCV-PIG plasmid was combined with VSV-g and Gag-pol and transfected into 293T cells to package the lentivirus. To stably overexpress the mutant gene, INS-1 cells were infected with the virus. The levels of insulin in the cell culture medium and cytoplasm were determined by ELISA and immunocytochemistry, respectively. A heterozygous mutation, c.125T>G (p.Val42Gly), was identified in a neonatal diabetes mellitus patient's INS gene. The human mutant INS open reading frame was overexpressed in INS-1 cells, and the mutant insulin was undetectable in the cell culture medium and cytoplasm. The novel heterozygous activating mutation c.125T>G (p.Val42Gly) impairs the synthesis of insulin by pancreatic beta cells, resulting in diabetes. © Georg Thieme Verlag KG Stuttgart · New York.
Bowling, Bethany V.; Schultheis, Patrick J.
2015-01-01
Saccharomyces cerevisiae was the first eukaryotic organism to be sequenced; however, little progress has been made in recent years in furthering our understanding of all open reading frames (ORFs). From October 2012 to May 2015 the number of verified ORFs has only risen from 75.31% to 78% while the number of uncharacterized ORFs has decreased from 12.8% to 11% (representing more than 700 genes still left in this category) [http://www.yeastgenome.org/genomesnapshot]. Course-based research has been shown to increase student learning while providing experience with real scientific investigation; however, implementation in large, multi-section courses presents many challenges. This study sought to test the feasibility and effectiveness of incorporating authentic research into a core genetics course with multiple instructors to increase student learning and progress our understanding of uncharacterized ORFs. We generated a module-based annotation toolkit and utilized easily accessible bioinformatics tools to predict gene function for uncharacterized ORFs within the Saccharomyces Genome Database (SGD). Students were each assigned an uncharacterized ORF which they annotated using contemporary comparative genomics methodologies including multiple sequence alignment, conserved domain identification, signal peptide prediction and cellular localization algorithms. Student learning outcomes were measured by quizzes, project reports and presentations, as well as a post-project questionnaire. Our results indicate the authentic research experience had positive impacts on students' perception of their learning and their confidence to conduct future research. Furthermore, we believe that creation of an online repository and adoption and/or adaptation of this project across multiple researchers and institutions could speed the process of gene function prediction. PMID:26460164
Grant, Susan; Grant, William D.; Cowan, Don A.; Jones, Brian E.; Ma, Yanhe; Ventosa, Antonio; Heaphy, Shaun
2006-01-01
Here we describe the application of metagenomic technologies to construct cDNA libraries from RNA isolated from environmental samples. RNAlater (Ambion) was shown to stabilize RNA in environmental samples for periods of at least 3 months at −20°C. Protocols for library construction were established on total RNA extracted from Acanthamoeba polyphaga trophozoites. The methodology was then used on algal mats from geothermal hot springs in Tengchong county, Yunnan Province, People's Republic of China, and activated sludge from a sewage treatment plant in Leicestershire, United Kingdom. The Tengchong libraries were dominated by RNA from prokaryotes, reflecting the mainly prokaryote microbial composition. The majority of these clones resulted from rRNA; only a few appeared to be derived from mRNA. In contrast, many clones from the activated sludge library had significant similarity to eukaryote mRNA-encoded protein sequences. A library was also made using polyadenylated RNA isolated from total RNA from activated sludge; many more clones in this library were related to eukaryotic mRNA sequences and proteins. Open reading frames (ORFs) up to 378 amino acids in size could be identified. Some resembled known proteins over their full length, e.g., 36% match to cystatin, 49% match to ribosomal protein L32, 63% match to ribosomal protein S16, 70% to CPC2 protein. The methodology described here permits the polyadenylated transcriptome to be isolated from environmental samples with no knowledge of the identity of the microorganisms in the sample or the necessity to culture them. It has many uses, including the identification of novel eukaryotic ORFs encoding proteins and enzymes. PMID:16391035
1994-01-01
The 40-S subunit of eukaryotic ribosomes binds to the capped 5'-end of mRNA and scans for the first AUG in a favorable sequence context to initiate translation. Most eukaryotic mRNAs therefore have a short 5'-untranslated region (5'-UTR) and no AUGs upstream of the translational start site; features that seem to assure efficient translation. However, approximately 5-10% of all eukaryotic mRNAs, particularly those encoding regulatory proteins, have complex leader sequences that seem to compromise translational initiation. The retinoic-acid-receptor-beta 2 (RAR beta 2) mRNA is such a transcript with a long (461 nucleotides) 5'-UTR that contains five, partially overlapping, upstream open reading frames (uORFs) that precede the major ORF. We have begun to investigate the function of this complex 5'-UTR in transgenic mice, by introducing mutations in the start/stop codons of the uORFs in RAR beta 2-lacZ reporter constructs. When we compared the expression patterns of mutant and wild-type constructs we found that these mutations affected expression of the downstream RAR beta 2-ORF, resulting in an altered regulation of RAR beta 2-lacZ expression in heart and brain. Other tissues were unaffected. RNA analysis of adult tissues demonstrated that the uORFs act at the level of translation; adult brains and hearts of transgenic mice carrying a construct with either the wild-type or a mutant UTR, had the same levels of mRNA, but only the mutant produced protein. Our study outlines an unexpected role for uORFs: control of tissue-specific and developmentally regulated gene expression. PMID:7962071
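The notion of upstream AUGs and uORFs preceding a main ORF can be made concrete with a small scanner. This is a sketch under stated assumptions: the position of the main start codon is supplied by the caller, the standard stop codons are used, and the function name `find_uorfs` and the toy sequences are hypothetical rather than taken from the RAR beta 2 leader.

```python
STOPS = {"TAA", "TAG", "TGA"}

def find_uorfs(mrna, main_start):
    """List uORFs whose AUG lies upstream of the main ORF start.

    Returns (start, end, overlaps_main) tuples with 0-based, end-exclusive
    coordinates; end is None when no in-frame stop codon is found.
    """
    seq = mrna.upper().replace("U", "T")
    uorfs = []
    for start in range(main_start):
        if seq[start:start + 3] != "ATG":
            continue
        end = None
        # Walk codon by codon in the frame set by this upstream AUG.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOPS:
                end = i + 3
                break
        overlaps = end is None or end > main_start
        uorfs.append((start, end, overlaps))
    return uorfs

# Toy mRNAs: a uORF that ends before the main AUG, and one that overlaps it.
print(find_uorfs("GGATGAAATAACCATGGCCTGA", main_start=13))
print(find_uorfs("ATGCATGGCTGA", main_start=4))
```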
Bowling, Bethany V; Schultheis, Patrick J; Strome, Erin D
2016-02-01
Saccharomyces cerevisiae was the first eukaryotic organism to be sequenced; however, little progress has been made in recent years in furthering our understanding of all open reading frames (ORFs). From October 2012 to May 2015 the number of verified ORFs had only risen from 75.31% to 78%, while the number of uncharacterized ORFs had decreased from 12.8% to 11% (representing > 700 genes still left in this category; http://www.yeastgenome.org/genomesnapshot). Course-based research has been shown to increase student learning while providing experience with real scientific investigation; however, implementation in large, multi-section courses presents many challenges. This study sought to test the feasibility and effectiveness of incorporating authentic research into a core genetics course, with multiple instructors, to increase student learning and progress our understanding of uncharacterized ORFs. We generated a module-based annotation toolkit and utilized easily accessible bioinformatics tools to predict gene function for uncharacterized ORFs within the Saccharomyces Genome Database (SGD). Students were each assigned an uncharacterized ORF, which they annotated using contemporary comparative genomics methodologies, including multiple sequence alignment, conserved domain identification, signal peptide prediction and cellular localization algorithms. Student learning outcomes were measured by quizzes, project reports and presentations, as well as a post-project questionnaire. Our results indicate that the authentic research experience had positive impacts on students' perception of their learning and their confidence to conduct future research. Furthermore, we believe that creation of an online repository and adoption and/or adaptation of this project across multiple researchers and institutions could speed the process of gene function prediction. Copyright © 2015 John Wiley & Sons, Ltd.
Xu, Dongxue; Sun, Lina; Liu, Shilin; Zhang, Libin; Yang, Hongsheng
2016-08-01
The heat shock response (HSR) is known for the elevated synthesis of heat shock proteins (HSPs) under heat stress, which is mediated primarily by heat shock factor 1 (HSF1). Heat shock factor binding protein 1 (HSBP1) and feedback control of heat shock protein 70 (HSP70) are major regulators of the activity of HSF1. We obtained full-length cDNA of the genes hsf1 and hsbp1 in the sea cucumber Apostichopus japonicus, which are the second available for echinoderms (after Strongylocentrotus purpuratus) and the first available for holothurians. The full-length cDNA of hsf1 was 2208 bp, containing a 1326 bp open reading frame encoding 441 amino acids. The full-length cDNA of hsbp1 was 2850 bp, containing a 225 bp open reading frame encoding 74 amino acids. The similarity of A. japonicus HSF1 to HSF1 of other species is low, whereas A. japonicus HSBP1 shares much higher identity with its counterparts. Phylogenetic trees showed that A. japonicus HSF1 and HSBP1 clustered with sequences from S. purpuratus and fell into clades distinct from sequences from mollusca, arthropoda and vertebrata. Analysis by real-time PCR showed that hsf1 and hsbp1 mRNA was expressed constitutively in all tissues examined. The expression of hsf1, hsbp1 and hsp70 in the intestine at 26°C was time-dependent. The results of this study might provide new insights into the regulation of the heat shock response in this species. Copyright © 2016. Published by Elsevier Inc.
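The ORF lengths and residue counts quoted in the abstract above are consistent with the usual convention that a reported ORF length includes the terminating stop codon. The following is only a minimal sketch of that arithmetic; the function name `residues_from_orf` and the `includes_stop` flag are hypothetical names introduced here for illustration and do not come from the cited work.

```python
def residues_from_orf(orf_length_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length.

    Assumption (not stated in the abstract): the reported ORF length
    counts the stop codon, which encodes no residue.
    """
    if orf_length_bp <= 0 or orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_length_bp // 3
    # Drop one codon for the stop signal when it is part of the length.
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    # Values quoted for A. japonicus hsf1 and hsbp1.
    print(residues_from_orf(1326))  # 441
    print(residues_from_orf(225))   # 74
```

Some abstracts in this listing report lengths without the stop codon, so the flag should be chosen per record rather than assumed globally.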
2008-06-01
proteins during embryogenesis, neurodevelopment and cancer. Part of their function is through the repression of CKIs, including p16. Some functions...protein 2 Other AUTS2 autism susceptibility candidate 2 C1orf24 chromosome 1 open reading frame 24 C20orf97 chromosome 20 open reading frame 97
2007-06-01
embryogenesis, neurodevelopment and cancer. Part of their function is through the repression of CKIs, including p16. Some functions have been attributed to...AUTS2 autism susceptibility candidate 2 C1orf24 chromosome 1 open reading frame 24 C20orf97 chromosome 20 open reading frame 97 DKFZP566B183
USDA-ARS's Scientific Manuscript database
The members of Capillovirus genus encode two overlapping open reading frames (ORFs): ORF1 encodes a large polyprotein containing the domains of replication-associated proteins plus a coat protein (CP), and ORF2 encodes a movement protein, located within ORF1 in a different reading frame. Organizatio...
Spielmann, A; Stutz, E
1983-10-25
The soybean chloroplast psb A gene (photosystem II thylakoid membrane protein of Mr 32 000, lysine-free) and the trn H gene (tRNAHisGUG), which both map in the large single copy region adjacent to one of the inverted repeat structures (IR1), have been sequenced including flanking regions. The structural part of the psb A gene shows 92% sequence homology with the corresponding genes of spinach and N. debneyi and also contains an open reading frame for 353 amino acids. The amino acid sequence of a potential primary translation product (calculated Mr, 38 904, no lysine) diverges from that of spinach and N. debneyi in only two positions in the C-terminal part. The trn H gene has the same polarity as the psb A gene and the coding region is located at the very end of the large single copy region. The deduced sequence of the soybean chloroplast tRNAHisGUG is identical with that of Zea mays chloroplasts. Both ends of the large single copy region were sequenced including a small segment of the adjacent IR1 and IR2.
Pollier, Jacob; González-Guzmán, Miguel; Ardiles-Diaz, Wilson; Geelen, Danny; Goossens, Alain
2011-01-01
cDNA-Amplified Fragment Length Polymorphism (cDNA-AFLP) is a commonly used technique for genome-wide expression analysis that does not require prior sequence knowledge. Typically, quantitative expression data and sequence information are obtained for a large number of differentially expressed gene tags. However, most of the gene tags do not correspond to full-length (FL) coding sequences, which is a prerequisite for subsequent functional analysis. A medium-throughput screening strategy, based on integration of polymerase chain reaction (PCR) and colony hybridization, was developed that allows in parallel screening of a cDNA library for FL clones corresponding to incomplete cDNAs. The method was applied to screen for the FL open reading frames of a selection of 163 cDNA-AFLP tags from three different medicinal plants, leading to the identification of 109 (67%) FL clones. Furthermore, the protocol allows for the use of multiple probes in a single hybridization event, thus significantly increasing the throughput when screening for rare transcripts. The presented strategy offers an efficient method for the conversion of incomplete expressed sequence tags (ESTs), such as cDNA-AFLP tags, to FL-coding sequences.
The complete nucleotide sequence of RNA beta from the type strain of barley stripe mosaic virus.
Gustafson, G; Armour, S L
1986-01-01
The complete nucleotide sequence of RNA beta from the type strain of barley stripe mosaic virus (BSMV) has been determined. The sequence is 3289 nucleotides in length and contains four open reading frames (ORFs) which code for proteins of Mr 22,147 (ORF1), Mr 58,098 (ORF2), Mr 17,378 (ORF3), and Mr 14,119 (ORF4). The predicted N-terminal amino acid sequence of the polypeptide encoded by the ORF nearest the 5'-end of the RNA (ORF1) is identical (after the initiator methionine) to the published N-terminal amino acid sequence of BSMV coat protein for 29 of the first 30 amino acids. ORF2 occupies the central portion of the coding region of RNA beta and ORF3 is located at the 3'-end. The ORF4 sequence overlaps the 3'-region of ORF2 and the 5'-region of ORF3 and differs in codon usage from the other three RNA beta ORFs. The coding region of RNA beta is followed by a poly(A) tract and a 238 nucleotide tRNA-like structure which are common to all three BSMV genomic RNAs. PMID:3754962
Molecular characterization of a novel luteovirus infecting apple by next-generation sequencing.
Shen, Pan; Tian, Xin; Zhang, Song; Ren, Fang; Li, Ping; Yu, Yun-Qi; Li, Ruhui; Zhou, Changyong; Cao, Mengji
2018-03-01
A new single-stranded positive-sense RNA virus, which shares the highest nucleotide (nt) sequence identity of 53.4% with the genome sequence of cherry-associated luteovirus South Korean isolate (ChALV-SK, genus Luteovirus), was discovered in this work. It is provisionally named apple-associated luteovirus (AaLV). The complete genome sequence of AaLV comprises 5,890 nt and contains eight open reading frames (ORFs), in a very similar arrangement that is typical of members of the genus Luteovirus. When compared with other members of the family Luteoviridae, ORF1 of AaLV was found to encompass another ORF, ORF1a, which encodes a putative 32.9-kDa protein. The ORF1-ORF2 region (RNA-dependent RNA polymerase, RdRP) showed the greatest amino acid (aa) sequence identity (59.7%) to that of cherry-associated luteovirus Czech Republic isolate (ChALV-CZ, genus Luteovirus). The results of genome sequence comparisons and phylogenetic analysis, suggest that AaLV should be a member of a novel species in the genus Luteovirus. To our knowledge, it is the sixth member of the genus Luteovirus reported to naturally infect rosaceous plants.
The primary structure of the thymidine kinase gene of fish lymphocystis disease virus.
Schnitzler, P; Handermann, M; Szépe, O; Darai, G
1991-06-01
The DNA nucleotide sequence of the thymidine kinase (TK) gene of fish lymphocystis disease virus (FLDV), which has been localized between the coordinates 0.678 and 0.688 of the viral genome, was determined. The analysis of the DNA nucleotide sequence located between the recognition sites of HindIII (0.669 map unit; nucleotide position 1) and AccI (nucleotide position 2032) revealed the presence of an open reading frame of 954 bp on the lower strand of this region between nucleotide positions 1868 (ATG) and 915 (TAA). It encodes a protein of 318 amino acid residues. The evolutionary relationships of the TK gene of FLDV to the other known TK genes were investigated using the method of progressive sequence alignment. These analyses revealed a high degree of diversity between the protein sequence of the FLDV TK gene and the amino acid composition of other TKs tested. However, significant conservation was detected at several regions of the FLDV TK protein when compared to the amino acid sequences of the TKs of African swine fever virus, fowlpox virus, Shope fibroma virus, and vaccinia virus and to the amino acid sequences of the cellular cytoplasmic TK of chicken, mouse, and man.
Self-guide framing and persuasion: responsibly increasing message processing to ideal levels.
Evans, Lisa M; Petty, Richard E
2003-03-01
The current research examines the effect that framing persuasive messages in terms of self-guides (ideal vs. ought) has on the attitudes and cognitive responses of individuals with chronic ideal versus ought self-guides. The strength of participants' ideal and ought self-guides and the magnitude of participants' ideal and ought self-discrepancies were measured using a computerized reaction time program. One week later, participants read a persuasive message about a fictional breakfast product, framed in terms of either ideals or oughts. Matching framing to stronger self-guide led to enhanced message processing activity, especially among individuals who were low in need for cognition. Individuals who read messages framed to match their stronger self-guides paid more attention to argument quality, as reflected in their attitudes and cognitive responses. Messages with self-guide framing that matched individuals' stronger self-discrepancies did not have this effect on processing.
Yocum, R R; Perkins, J B; Howitt, C L; Pero, J
1996-01-01
The metE gene, encoding S-adenosylmethionine synthetase (EC 2.5.1.6) from Bacillus subtilis, was cloned in two steps by normal and inverse PCR. The DNA sequence of the metE gene contains an open reading frame which encodes a 400-amino-acid sequence that is homologous to other known S-adenosylmethionine synthetases. The cloned gene complements the metE1 mutation and integrates at or near the chromosomal site of metE1. Expression of S-adenosylmethionine synthetase is reduced by only a factor of about 2 by exogenous methionine. Overproduction of S-adenosylmethionine synthetase from a strong constitutive promoter leads to methionine auxotrophy in B. subtilis, suggesting that S-adenosylmethionine is a corepressor of methionine biosynthesis in B. subtilis, as others have already shown for Escherichia coli. PMID:8755891
Qiu, T; Lu, R H; Zhang, J; Zhu, Z Y
2001-07-01
The complete nucleotide sequence of M6 gene of grass carp hemorrhage virus (GCHV) was determined. It is 2039 nucleotides in length and contains a single large open reading frame that could encode a protein of 648 amino acids with predicted molecular mass of 68.7 kDa. Amino acid sequence comparison revealed that the protein encoded by GCHV M6 is closely related to the protein mu1 of mammalian reovirus. The M6 gene, encoding the major outer-capsid protein, was expressed using the pET fusion protein vector in Escherichia coli and detected by Western blotting using chicken anti-GCHV immunoglobulin (IgY). The result indicates that the protein encoded by M6 may share a putative Asn-42-Pro-43 proteolytic cleavage site with mu1.
Complete genomic sequence of the Lactobacillus temperate phage LF1.
Yoon, Bo Hyun; Chang, Hyo Ihl
2011-10-01
Bacteriophage LF1, a newly isolated temperate phage from a mitomycin-C-induced lysate of wild-type Lactobacillus fermentum, was found to contain a double-stranded DNA of 42,606 base pairs (bp) with a G+C content of 45%. Bioinformatic analysis of the phage genome revealed 57 putative open reading frames (ORFs). The predicted protein products of the ORFs were determined and described. According to morphological analysis by transmission electron microscopy (TEM), LF1 has an isometric head and a non-contractile tail, indicating that it belongs to the family Siphoviridae. The temperate phage LF1 has a good genetic mosaic relationship with ΦPYB5 in the packaging module. To our knowledge, this is the first report of genomic sequencing and characterization of temperate phage LF1 from wild-type L. fermentum isolated from Kimchi in Korea.
Hinnant, Amanda; Oh, Hyun Jee; Caburnay, Charlene A.; Kreuter, Matthew W.
2011-01-01
News stories reporting race-specific health information commonly emphasize disparities between racial groups. But recent research suggests this focus on disparities has unintended effects on African American audiences, generating negative emotions and less interest in preventive behaviors (Nicholson RA, Kreuter MW, Lapka C et al. Unintended effects of emphasizing disparities in cancer communication to African-Americans. Cancer Epidemiol Biomarkers Prev 2008; 17: 2946–52). They found that black adults are more interested in cancer screening after reading about the progress African Americans have made in fighting cancer than after reading stories emphasizing disparities between blacks and whites. This study builds on past findings by (i) examining how health journalists judge the newsworthiness of stories that report race-specific health information by emphasizing disparities versus progress and (ii) determining whether these judgments can be changed by informing journalists of audience reactions to disparity versus progress framing. In a double-blind randomized experiment, 175 health journalists read either a disparity- or progress-framed story on colon cancer, preceded by either an inoculation about audience effects of such framing or an unrelated (i.e. control) information stimulus. Journalists rated the disparity-frame story more favorably than the progress-frame story in every category of news values. However, the inoculation significantly increased positive reactions to the progress-frame story. Informing journalists of audience reactions to race-specific health information could influence how health news stories are framed. PMID:21911844
Ustav, M; Stenlund, A
1991-02-01
Bovine papillomavirus (BPV) DNA is maintained as an episome with a constant copy number in transformed cells and is stably inherited. To study BPV replication we have developed a transient replication assay based on a highly efficient electroporation procedure. Using this assay we have determined that in the context of the viral genome two of the viral open reading frames, E1 and E2, are required for replication. Furthermore, we show that when produced from expression vectors in the absence of other viral gene products, the full-length E2 transactivator polypeptide and a 72 kd polypeptide encoded by the E1 open reading frame in its entirety are both necessary and sufficient for replication of BPV in C127 cells.
Yedavalli, Venkat R. K.; Chappey, Colombe; Ahmad, Nafees
1998-01-01
The vpr sequences from six human immunodeficiency virus type 1 (HIV-1)-infected mother-infant pairs following perinatal transmission were analyzed. We found that 153 of the 166 clones analyzed from uncultured peripheral blood mononuclear cell DNA samples had intact vpr open reading frames, a frequency of 92.17%. There was a low degree of heterogeneity of vpr genes within mothers, within infants, and between epidemiologically linked mother-infant pairs. The distances between vpr sequences were greater in epidemiologically unlinked individuals than in epidemiologically linked mother-infant pairs. Moreover, the infants’ sequences displayed patterns similar to those seen in their mothers. The functional domains essential for Vpr activity, including virion incorporation, nuclear import, and cell cycle arrest and differentiation, were highly conserved in most of the sequences. Phylogenetic analyses of the 166 mother-infant pair sequences and 195 other available vpr sequences from HIV databases formed distinct clusters for each mother-infant pair and for other vpr sequences and grouped the six mother-infant pairs’ sequences with subtype B sequences. A high degree of conservation of intact and functional vpr supports the notion that vpr plays an important role in HIV-1 infection and replication in mother-infant isolates that are involved in perinatal transmission. PMID:9658150
Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species
Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha
2011-01-01
Computational genomics is one of the important tools to understand the distribution of closely related genomes including simple sequence repeats (SSRs) in an organism, which gives valuable information regarding genetic variations. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human Brucella species which are involved in a range of pathological disorders. Computational analysis of the SSRs in Brucella indicates few deviations from expected random models. Statistical analysis also reveals that tri-nucleotide SSRs are overrepresented and tetra-nucleotide SSRs underrepresented in Brucella genomes. From the data, it can be suggested that overrepresented tri-nucleotide SSRs in genomic and coding regions might be responsible for the generation of functional variation of expressed proteins, which in turn may lead to different pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins of Brucella genomes. Abbreviations: SSRs - Simple Sequence Repeats; ORFs - Open Reading Frames. PMID:21738309
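As an illustration of the kind of SSR screening described in the abstract above, tandem repeats of a fixed unit length can be located with a back-referencing regular expression. This is a generic sketch, not the tool or thresholds used in the cited study; the function name `find_ssrs` and the choice of `min_repeats` are assumptions made here for illustration.

```python
import re
from typing import List, Tuple


def find_ssrs(seq: str, unit_len: int, min_repeats: int) -> List[Tuple[int, str, int]]:
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    Assumptions: only uppercase/lowercase A, C, G, T are considered, and
    the minimum repeat count is a user-chosen threshold (hypothetical;
    the abstract does not state one).
    """
    if unit_len < 1 or min_repeats < 2:
        raise ValueError("unit_len must be >= 1 and min_repeats >= 2")
    # Group 2 captures one unit; \2 requires it to recur in tandem.
    pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(2)
        # Skip units that are themselves repeats of a shorter motif
        # (e.g. "AAA" is a mononucleotide run, not a trinucleotide SSR).
        if any(unit == unit[:k] * (unit_len // k)
               for k in range(1, unit_len) if unit_len % k == 0):
            continue
        hits.append((match.start(), unit, len(match.group(1)) // unit_len))
    return hits


if __name__ == "__main__":
    print(find_ssrs("TTACGACGACGACGTT", unit_len=3, min_repeats=4))
```

Counting hits separately for `unit_len=3` and `unit_len=4` across coding and non-coding regions would give the raw numbers that a comparison against a random model needs; that statistical step is not sketched here.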
Galinier, Richard; van Beurden, Steven; Amilhat, Elsa; Castric, Jeannette; Schoehn, Guy; Verneau, Olivier; Fazio, Géraldine; Allienne, Jean-François; Engelsma, Marc; Sasal, Pierre; Faliex, Elisabeth
2012-06-01
Eel virus European X (EVEX) was first isolated from diseased European eel Anguilla anguilla in Japan at the end of the 1970s. The virus was tentatively classified into the Rhabdoviridae family on the basis of morphology and serological cross-reactivity. This family of viruses is organized into six genera and currently comprises approximately 200 members, many of which are still unassigned because of the lack of molecular data. This work presents the morphological, biochemical and genetic characterizations of EVEX, and proposes a taxonomic classification for this virus. We provide its complete genome sequence, plus a comprehensive sequence comparison between isolates from different geographical origins. The genome encodes the five classical structural proteins plus an overlapping open reading frame in the phosphoprotein gene, coding for a putative C protein. The phylogenetic relationship with other rhabdoviruses indicates that EVEX is most closely related to the Vesiculovirus genus and shares the highest identity with trout rhabdovirus 903/87. Copyright © 2012 Elsevier B.V. All rights reserved.
Sequence and Analysis of the Tomato JOINTLESS Locus
Mao, Long; Begum, Dilara; Goff, Stephen A.; Wing, Rod A.
2001-01-01
A 119-kb bacterial artificial chromosome from the JOINTLESS locus on the tomato (Lycopersicon esculentum) chromosome 11 contained 15 putative genes. Repetitive sequences in this region include one copia-like LTR retrotransposon, 13 simple sequence repeats, three copies of a novel type III foldback transposon, and four putative short DNA repeats. Database searches showed that the foldback transposon and the short DNA repeats seemed to be associated preferably with genes. The predicted tomato genes were compared with the complete Arabidopsis genome. Eleven out of 15 tomato open reading frames were found to be colinear with segments on five Arabidopsis bacterial artificial chromosome/P1-derived artificial chromosome clones. The synteny patterns, however, did not reveal duplicated segments in Arabidopsis, where over half of the genome is duplicated. Our analysis indicated that the microsynteny between the tomato and Arabidopsis genomes was still conserved at a very small scale but was complicated by the large number of gene families in the Arabidopsis genome. PMID:11457984
Tao, Yaqiong; Zeng, Bo; Xu, Liu; Yue, Bisong; Yang, Dong; Zou, Fangdong
2010-01-01
Interferon-gamma (IFN-gamma) is the only member of the type II IFNs and is vital in the regulation of immune and inflammatory responses. Herein we report the cloning, expression, and sequence analysis of IFN-gamma from the giant panda (Ailuropoda melanoleuca). The open reading frame of this gene is 501 base pairs in length and encodes a polypeptide consisting of 166 amino acids. All N-linked glycosylation sites and cysteine residues conserved among carnivores were found in the predicted amino acid sequence of the giant panda. Recombinant giant panda IFN-gamma with a V5 epitope and polyhistidine tag was expressed in HEK293 host cells and confirmed by Western blotting. Phylogenetic analysis of mammalian IFN-gamma-coding sequences indicated that the giant panda IFN-gamma was closest to that of carnivores, then to ungulates and dolphin, and shared a distant relationship with mouse and human. These results represent a first step into the study of IFN-gamma in the giant panda.
Yamamoto, S; Mutoh, N; Tsuzuki, D; Ikai, H; Nakao, H; Shinoda, S; Narimatsu, S; Miyoshi, S I
2000-05-01
L-2,4-diaminobutyrate decarboxylase (DABA DC) catalyzes the formation of 1,3-diaminopropane (DAP) from DABA. In the present study, the ddc gene encoding DABA DC from Enterobacter aerogenes ATCC 13048 was cloned and characterized. Determination of the nucleotide sequence revealed an open reading frame of 1470 bp encoding a 53659-Da protein of 490 amino acids, whose deduced NH2-terminal sequence was identical to that of purified DABA DC from E. aerogenes. The deduced amino acid sequence was highly similar to those of Acinetobacter baumannii and Haemophilus influenzae DABA DCs encoded by the ddc genes. The lysine-307 of the E. aerogenes DABA DC was identified as the pyridoxal 5'-phosphate binding residue by site-directed mutagenesis. Furthermore, PCR analysis revealed the distribution of E. aerogenes ddc homologs in some other species of Enterobacteriaceae. Such a relatively wide occurrence of the ddc homologs implies biological significance of DABA DC and its product DAP.
İnce, İkbal Agah; Pijlman, Gorben P; Vlak, Just M; van Oers, Monique M
2017-11-01
Previously, we observed that the transcripts of Invertebrate iridescent virus 6 (IIV6) are not polyadenylated, in line with the absence of canonical poly(A) motifs (AATAAA) downstream of the open reading frames (ORFs) in the genome. Here, we determined the 3' ends of the transcripts of fifty-four IIV6 virion protein genes in infected Drosophila Schneider 2 (S2) cells. By using ligation-based amplification of cDNA ends (LACE) it was shown that the IIV6 mRNAs often ended with a CAUUA motif. In silico analysis showed that the 3'-untranslated regions of IIV6 genes have the ability to form hairpin structures (22-56 nt in length) and that for about half of all IIV6 genes these 3' sequences contained complementary TAATG and CATTA motifs. We also show that a hairpin in the 3' flanking region with conserved sequence motifs is a conserved feature in invertebrate-infecting iridoviruses (genus Iridovirus and Chloriridovirus). Copyright © 2017 Elsevier Inc. All rights reserved.
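The TAATG and CATTA motifs mentioned in the abstract above are reverse complements of each other, which is what allows them to base-pair in a hairpin stem. The sketch below only demonstrates that relationship and a naive check for both motifs in a sequence; it is not the in silico folding analysis used in the cited study, and the helper names `reverse_complement` and `has_complementary_pair` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_complementary_pair(seq: str, motif: str = "TAATG") -> bool:
    """True if `motif` is followed downstream by its reverse complement.

    Naive illustration: it ignores loop length, stem stability and
    mismatches, all of which a real hairpin prediction would consider.
    """
    seq = seq.upper()
    motif = motif.upper()
    partner = reverse_complement(motif)
    first = seq.find(motif)
    # Search for the partner strictly after the end of the first motif.
    return first != -1 and seq.find(partner, first + len(motif)) != -1


if __name__ == "__main__":
    print(reverse_complement("TAATG"))  # CATTA
    print(has_complementary_pair("GGTAATGCCCCCATTAGG"))
```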
The genome of the Lactobacillus sanfranciscensis temperate phage EV3
2013-01-01
Background: Bacteriophage infection modulates microbial consortia, and transduction is one of the most important mechanisms involved in bacterial evolution. However, phage contamination brings food fermentations to a halt, causing economic setbacks. The number of phage genome sequences of lactic acid bacteria, especially of lactobacilli, is still limited. We analysed the genome of a temperate phage active on Lactobacillus sanfranciscensis, the predominant strain in type I sourdough fermentations. Results: Sequencing of the DNA of EV3 phage revealed a genome of 34,834 bp and a G + C content of 36.45%. Of the 43 open reading frames (ORFs) identified, all but eight shared homology with other phages of lactobacilli. A similar genomic organization and mosaic pattern of identities align EV3 with the closely related Lactobacillus vaginalis ATCC 49540 prophage. Four unknown ORFs that had no homologies in the databases or predicted functions were identified. Notably, EV3 encodes a putative dextranase. Conclusions: EV3 is the first L. sanfranciscensis phage that has been completely sequenced so far. PMID:24308641
Characterization of a chitinolytic enzyme from Serratia sp. KCK isolated from kimchi juice.
Kim, Hyun-Soo; Timmis, Kenneth N; Golyshin, Peter N
2007-07-01
The novel chitinolytic bacterium Serratia sp. KCK, which was isolated from kimchi juice, produced chitinase A. The gene coding for the chitinolytic enzyme was cloned on the basis of sequencing of internal peptides, homology search, and design of degenerate primers. The cloned open reading frame of chiA encodes a deduced polypeptide of 563 amino acid residues with a calculated molecular mass of 61 kDa, which appears to correspond to about 57 kDa once the signal sequence is excluded. The deduced amino acid sequence showed high similarity to those of bacterial chitinases classified as family 18 of glycosyl hydrolases. The chitinase A is an exochitinase and exhibits a greater pH range (5.0-10.0), thermostability with a temperature optimum of 40 degrees C, and substrate range than other Serratia chitinases thus far described. These results suggest that Serratia sp. KCK chitinase A has good potential for biotechnological applications.
Chernin, L S; De la Fuente, L; Sobolev, V; Haran, S; Vorgias, C E; Oppenheim, A B; Chet, I
1997-01-01
The gene chiA, which codes for endochitinase, was cloned from a soilborne Enterobacter agglomerans. Its complete sequence was determined, and the deduced amino acid sequence of the enzyme designated Chia_Entag yielded an open reading frame coding for 562 amino acids of a 61-kDa precursor protein with a putative leader peptide at its N terminus. The nucleotide and polypeptide sequences of Chia_Entag showed 86.8 and 87.7% identity with the corresponding gene and enzyme, Chia_Serma, of Serratia marcescens, respectively. Homology modeling of Chia_Entag's three-dimensional structure demonstrated that most amino acid substitutions are at solvent-accessible sites. Escherichia coli JM109 carrying the E. agglomerans chiA gene produced and secreted Chia_Entag. The antifungal activity of the secreted endochitinase was demonstrated in vitro by inhibition of Fusarium oxysporum spore germination. The transformed strain inhibited Rhizoctonia solani growth on plates and the root rot disease caused by this fungus in cotton seedlings under greenhouse conditions. PMID:9055404
Highlander, S K; Wickersham, E A; Garza, O; Weinstock, G M
1993-01-01
Multicopy and single-copy chromosomal fusions between the Pasteurella haemolytica leukotoxin regulatory region and the Escherichia coli beta-galactosidase gene have been constructed. These fusions were used as reporters to identify and isolate regulators of leukotoxin expression from a P. haemolytica cosmid library. A cosmid clone, which inhibited leukotoxin expression from multicopy and single-copy protein fusions, was isolated and found to contain the complete leukotoxin gene cluster plus additional upstream sequences. The locus responsible for inhibition of expression from leukotoxin-beta-galactosidase fusions was mapped within these upstream sequences, by transposon mutagenesis with Tn5, and its DNA sequence was determined. The inhibitory activity was found to be associated with a predicted 440-amino-acid reading frame (lapA) that lies within a four-gene arginine transport locus. LapA is predicted to be the nucleotide-binding component of this transport system and shares homology with the Clp family of proteases. PMID:8359916
Peoples, R J; Cisco, M J; Kaplan, P; Francke, U
1998-01-01
We have identified a novel gene (WBSCR9) within the common Williams-Beuren syndrome (WBS) deletion by interspecies sequence conservation. The WBSCR9 gene encodes a roughly 7-kb transcript with an open reading frame of 1483 amino acids and a predicted protein product size of 170.8 kDa. WBSCR9 is comprised of at least 20 exons extending over 60 kb. The transcript is expressed ubiquitously throughout development and is subject to alternative splicing. Functional motifs identified by sequence homology searches include a bromodomain; a PHD, or C4HC3, finger; several putative nuclear localization signals; four nuclear receptor binding motifs; a polyglutamate stretch and two PEST sequences. Bromodomains, PHD motifs and nuclear receptor binding motifs are cardinal features of proteins that are involved in chromatin remodeling and modulation of transcription. Haploinsufficiency for WBSCR9 gene products may contribute to the complex phenotype of WBS by interacting with tissue-specific regulatory factors during development.
Ling, Roger; Firth, Andrew E
2017-08-01
Programmed -1 ribosomal frameshifting is a mechanism of gene expression whereby specific signals within messenger RNAs direct a proportion of ribosomes to shift -1 nt and continue translating in the new reading frame. Such frameshifting normally depends on an RNA structure stimulator 3'-adjacent to a 'slippery' heptanucleotide shift site sequence. Recently we identified an unusual frameshifting mechanism in encephalomyocarditis virus, where the stimulator involves a trans-acting virus protein. Thus, in contrast to other examples of -1 frameshifting, the efficiency of frameshifting in encephalomyocarditis virus is best studied in the context of virus infection. Here we use metabolic labelling to analyse the frameshifting efficiency of wild-type and mutant viruses. Confirming previous results, frameshifting depends on a G_GUU_UUU shift site sequence and a 3'-adjacent stem-loop structure, but is not appreciably affected by the 'StopGo' sequence present ~30 nt upstream. At late timepoints, frameshifting was estimated to be 46-76 % efficient.
Fiore, Nicola; Fajardo, Thor V M; Prodan, Simona; Herranz, María Carmen; Aparicio, Frederic; Montealegre, Jaime; Elena, Santiago F; Pallás, Vicente; Sánchez-Navarro, Jesús
2008-01-01
Prunus necrotic ringspot virus (PNRSV) is distributed worldwide, but no molecular data have been previously reported from South American isolates. The nucleotide sequences corresponding to the movement (MP) and coat (CP) proteins of 23 isolates of PNRSV from Chile, Brazil, and Uruguay, and from different Prunus species, have been obtained. Phylogenetic analysis performed with full-length MP and CP sequences from all the PNRSV isolates confirmed the clustering of the isolates into the previously reported PV32-I, PV96-II and PE5-III phylogroups. No association was found between specific sequences and host, geographic origin or symptomatology. Comparative analysis showed that both MP and CP have phylogroup-specific amino acids and all of the motifs previously characterized for both proteins. The study of the distribution of synonymous and nonsynonymous changes along both open reading frames revealed that most amino acid sites are under the effect of negative purifying selection.
Simultaneous message framing and error detection
NASA Technical Reports Server (NTRS)
Frey, A. H., Jr.
1968-01-01
Circuitry simultaneously inserts message framing information and detects noise errors in binary code data transmissions. Separate message groups are framed without requiring both framing bits and error-checking bits, and predetermined message sequences are separated from other message sequences without being hampered by intervening noise.
Using message framing to promote acceptance of the human papillomavirus vaccine.
Gerend, Mary A; Shepherd, Janet E
2007-11-01
Use of message framing for encouraging vaccination, an increasingly common preventive health behavior, has received little empirical investigation. The authors examined the relative effectiveness of gain- versus loss-framed messages in promoting acceptance of a vaccine against human papillomavirus (HPV), a virus responsible for virtually all cases of cervical cancer. Undergraduate women (N = 121) were randomly assigned to read a booklet describing the benefits of receiving (gain-framed message) or the costs of not receiving (loss-framed message) a prophylactic HPV vaccine. After reading the booklet, participants indicated their intent to obtain the HPV vaccine; the outcome measure was a 5-item composite representing intentions to obtain the HPV vaccine. The effect of message framing on HPV vaccine acceptance was moderated by risky sexual behavior and approach-avoidance motivation. A loss-framed message led to greater HPV vaccination intentions than a gain-framed message, but only among participants who had multiple sexual partners and participants who infrequently used condoms. The loss-frame advantage was also observed among participants high in avoidance motivation. Findings highlight characteristics of the message recipient that may affect the success of framed messages promoting vaccine acceptance. This study has practical implications for the development of health communications promoting vaccination.
Rivera-Vega, L; Mittapalli, O
2010-08-01
Emerald ash borer (EAB, Agrilus planipennis), an exotic invasive pest, has killed millions of ash trees (Fraxinus spp.) in North America and continues to threaten the very survival of the entire Fraxinus genus. Despite its high-impact status, to date very little knowledge exists for this devastating insect pest at the molecular level. Mariner-like elements (MLEs) are transposable elements, which are ubiquitous in occurrence in insects and other invertebrates. Because of their low specificity and broad host range, they can be used for epitope-tagging, gene mapping, and in vitro mutagenesis. The majority of the known MLEs are inactive due to frameshifts and stop codons within the open reading frame (ORF). We report on the cloning and characterization of two MLEs in the A. planipennis genome (Apmar1 and Apmar2). Southern analysis indicated a very high copy number for Apmar1 and a moderate copy number for Apmar2. Phylogenetic analysis revealed that both elements belong to the irritans subfamily. Based on the high copy number for Apmar1, the full-length sequence was obtained using degenerate primers designed to the inverted terminal repeat (ITR) sequences of irritans MLEs. The recovered nucleotide sequence for Apmar1 consisted of 1,292 bases with perfect ITRs, and an ORF of 1,050 bases encoding a putative transposase of 349 amino acids. The deduced amino acid sequence of Apmar1 contained the conserved regions of mariner transposases including WVPHEL and YSPDLAP, and the D,D(34)D motif. Both Apmar1 and Apmar2 could represent useful genetic tools and provide insights into EAB adaptation.
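The ORF and protein lengths quoted for Apmar1 agree with each other if the 1,050-base ORF is taken to include the stop codon. The short Python check below shows the arithmetic; the function name and the `includes_stop` parameter are assumptions made for illustration.

```python
def protein_length(orf_nt, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_nt nucleotides."""
    if orf_nt <= 0 or orf_nt % 3:
        raise ValueError("ORF length must be a positive multiple of 3")
    n_codons = orf_nt // 3
    # The stop codon occupies one triplet but encodes no amino acid.
    return n_codons - 1 if includes_stop else n_codons


print(protein_length(1050))  # 1,050 bases -> 350 codons -> 349 residues
```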
Pankovics, Péter; Simmonds, Peter
2011-01-01
A novel positive-sense, single-stranded RNA (+ssRNA) virus (Halastavi árva RNA virus, HalV; JN000306) with di-cistronic genome organization was serendipitously identified in intestinal contents of freshwater carps (Cyprinus carpio) fished by line-fishing from fishpond “Lőrinte halastó” located in Veszprém County, Hungary. The complete nucleotide (nt) sequence of the genomic RNA is 9565 nt in length and contains two long - non-in-frame - open reading frames (ORFs), which are separated by an intergenic region. The ORF1 (replicase) is preceded by an untranslated sequence of 827 nt, while an untranslated region of 139 nt follows the ORF2 (capsid proteins). The deduced amino acid (aa) sequences of the ORFs showed only low (less than 32%) and partial similarity to the non-structural (2C-like helicase, 3C-like cysteine protease and 3D-like RNA-dependent RNA polymerase) and structural proteins (VP2/VP4/VP3) of virus families in Picornavirales, especially to those of viruses with a dicistronic genome. Halastavi árva RNA virus is present in intestinal contents of omnivorous freshwater carps but the origin and the host species of this virus remain unknown. The unique viral sequence and the actual position suggest that Halastavi árva RNA virus may be the first member of a new group of di-cistronic ssRNA viruses. Further studies are required to investigate the specific host species (and spectrum), ecology and role of Halastavi árva RNA virus in nature. PMID:22195010
Transposon Tn10 contains two structural genes with opposite polarity between tetA and IS10R.
Schollmeier, K; Hillen, W
1984-01-01
The nucleotide sequence of the central part of Tn10 has been determined from the rightmost HindIII site to IS10R. This sequence contains two open reading frames with opposite polarity. The in vivo transcription start points in this sequence have been determined by S1 mapping. These results define one minor and two major promoters. The transcription starts of the two major promoters are only 18 base pairs apart, and the transcripts show different polarity and overlap by 18 base pairs. The nucleotide sequence reveals two regions with palindromic symmetry which may serve as operators. Their possible involvement in the regulation of transcription of both genes is discussed. Taken together these results allow for a maximal coding capacity of 138 amino acids directed toward IS10R and 197 amino acids directed toward tetA. The possible function of these gene products is discussed. The accompanying article (Braus et al., J. Bacteriol. 160:504-509, 1984) presents evidence that these genes are expressed. PMID:6094471
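Finding open reading frames of opposite polarity means scanning both the given strand and its reverse complement. The Python sketch below shows one simple way to do that; it is not the procedure used in the study, and the function names, the `min_codons` parameter, and the toy sequence are hypothetical. It only reports ORFs that start with ATG and end at an in-frame stop codon.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_orfs(dna, min_codons=2):
    """List (strand, start, end, n_codons) for ATG..stop ORFs on both strands.

    Coordinates are 0-based, end-exclusive, and refer to the strand that
    was scanned (for "-", the reverse-complemented string). n_codons counts
    the coding triplets and excludes the stop codon.
    """
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            i = frame
            while i + 3 <= len(seq):
                if seq[i:i + 3] != "ATG":
                    i += 3
                    continue
                j = i
                while j + 3 <= len(seq) and seq[j:j + 3] not in STOPS:
                    j += 3
                if j + 3 > len(seq):
                    break  # no in-frame stop remains in this frame
                n_codons = (j - i) // 3
                if n_codons >= min_codons:
                    orfs.append((strand, i, j + 3, n_codons))
                i = j + 3  # resume scanning after the stop codon
    return orfs


# Toy sequence with one short ORF on each strand.
print(find_orfs("ATGAAATAATCAGGGCAT"))
```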
Vina-Rodriguez, Ariel; Schlosser, Josephine; Becher, Dietmar; Kaden, Volker; Groschup, Martin H; Eiden, Martin
2015-05-22
An increasing number of indigenous cases of hepatitis E caused by genotype 3 viruses (HEV-3) have been diagnosed all around the world, particularly in industrialized countries. Hepatitis E is a zoonotic disease and accumulating evidence indicates that domestic pigs and wild boars are the main reservoirs of HEV-3. A detailed analysis of HEV-3 subtypes could help to determine the interplay of human activity, the role of animals as reservoirs and cross species transmission. Although complete genome sequences are most appropriate for HEV subtype determination, in most cases only partial genomic sequences are available. We therefore carried out a subtype classification analysis, which uses regions from all three open reading frames of the genome. Using this approach, more than 1000 published HEV-3 isolates were subtyped. Newly recovered HEV partial sequences from hunted German wild boars were also included in this study. These sequences were assigned to genotype 3 and clustered within subtypes 3a and 3i and, unexpectedly, one of them within subtype 3b, the first non-human report of this subtype in Europe.
[Isolation and function of genes regulating aphB expression in Vibrio cholerae].
Chen, Haili; Zhu, Zhaoqin; Zhong, Zengtao; Zhu, Jun; Kan, Biao
2012-02-04
We identified genes that regulate the expression of aphB, the gene encoding a key virulence regulator in Vibrio cholerae O1 El Tor C6706(-). We constructed a transposon library in V. cholerae strain C6706 containing P(aphB)-luxCDABE and P(aphB)-lacZ transcriptional reporter plasmids. Using a chemiluminescence imaging system, we rapidly measured aphB promoter expression on a large scale. We then identified the transposon insertion sites by arbitrary PCR and sequencing. From approximately 40,000 transposon insertion mutants, we obtained two candidate mutants, T1 and T2, that displayed reduced aphB expression. Sequencing showed that the transposon was inserted in the vc1585 reading frame in the T1 mutant and at the end of the coding sequence of vc1602 in the T2 mutant. Using a genetic screen, we identified two genes that may be involved in regulating expression of the key virulence regulator AphB. This study provides a basis for further investigation of V. cholerae virulence gene regulatory cascades.
Komatsu, Ken; Yamashita, Kazuo; Sugawara, Kota; Verbeek, Martin; Fujita, Naoko; Hanada, Kaoru; Uehara-Ichiki, Tamaki; Fuji, Shin-Ichi
2017-02-01
Plantago asiatica mosaic virus (PlAMV) is a member of the genus Potexvirus and has an exceptionally wide host range. It causes severe damage to lilies. Here we report on the complete nucleotide sequences of two new Japanese PlAMV isolates, one from the eudicot weed Viola grypoceras (PlAMV-Vi), and the other from the eudicot shrub Nandina domestica Thunb. (PlAMV-NJ). Their genomes contain five open reading frames (ORFs), which is characteristic of potexviruses. Surprisingly, the isolates showed only 76.0-78.0 % sequence identity with each other and with other PlAMV isolates, including isolates from Japanese lily and American nandina. Amino acid alignments of the replicase coding region encoded by ORF1 showed that the regions between the methyltransferase and helicase domains were less conserved than other regions, with several insertions and/or deletions. Phylogenetic analyses of the full-length nucleotide sequences revealed a moderate correlation between phylogenetic clustering and the original host plants of the PlAMV isolates. This study revealed the presence of two highly divergent PlAMV isolates in Japan.
Pilloff, Marcela Gabriela; Bilen, Marcos Fabián; Belaich, Mariano Nicolás; Lozano, Mario Enrique; Ghiringhelli, Pablo Daniel
2003-01-01
The gp64 locus of Anticarsia gemmatalis multicapsid nucleopolyhedrovirus isolate Santa Fe (AgMNPV-SF) was characterised molecularly in our laboratory. To this end, we located and cloned an AgMNPV-SF genomic DNA fragment containing the gp64 gene and sequenced the complete gp64 locus. Nucleotide sequence analysis indicated that the AgMNPV gp64 gene consists of a 1500 nucleotide open reading frame (ORF), encoding a protein of 499 amino acids. Of the seven gp64 homologues identified to date, the AgMNPV gp64 ORF shared most sequence similarity with the gp64 gene of Orgyia pseudotsugata MNPV. The GP64 from AgMNPV is the smallest baculoviral envelope glycoprotein found to date, differing in 10 or more residues from the other group I nucleopolyhedroviruses. The biological activity of AgMNPV GP64 protein was assessed by cell fusion assays in UFL-AG-286 cells using the obtained recombinant plasmids. In the upstream and downstream regions, relative to the gp64 ORF, we found different conserved transcriptional and post-transcriptional regulatory elements, respectively.
Myohara, Maroko; Niva, Cintia Carla; Lee, Jae Min
2006-08-01
To identify genes specifically activated during annelid regeneration, suppression subtractive hybridization was performed with cDNAs from regenerating and intact Enchytraeus japonensis, a terrestrial oligochaete that can regenerate a complete organism from small body fragments within 4-5 days. Filter array screening subsequently revealed that about 38% of the forward-subtracted cDNA clones contained genes that were upregulated during regeneration. Two hundred seventy-nine of these clones were sequenced and found to contain 165 different sequences (79 known and 86 unknown). Nine clones were fully sequenced and four of these sequences were matched to known genes for glutamine synthetase, glucosidase 1, retinal protein 4, and phosphoribosylaminoimidazole carboxylase, respectively. The remaining five clones encoded an unknown open-reading frame. The expression levels of these genes were highest during blastema formation. Our present results, therefore, demonstrate the great potential of annelids as a new experimental subject for the exploration of unknown genes that play critical roles in animal regeneration.
Maurino, Fernanda; Dumón, Analía D; Llauger, Gabriela; Alemandri, Vanina; de Haro, Luis A; Mattio, M Fernanda; Del Vas, Mariana; Laguna, Irma Graciela; Giménez Pecci, María de la Paz
2018-01-01
A rhabdovirus infecting maize and wheat crops in Argentina was molecularly characterized. Through next-generation sequencing (NGS) of symptomatic leaf samples, the complete genome was obtained of two isolates of maize yellow striate virus (MYSV), a putative new rhabdovirus, differing by only 0.4% at the nucleotide level. The MYSV genome consists of 12,654 nucleotides for maize and wheat virus isolates, and shares 71% nucleotide sequence identity with the complete genome of barley yellow striate mosaic virus (BYSMV, NC028244). Ten open reading frames (ORFs) were predicted in the MYSV genome from the antigenomic strand and were compared with their BYSMV counterparts. The highest amino acid sequence identity of the MYSV and BYSMV proteins was 80% between the L proteins, and the lowest was 37% between the proteins 4. Phylogenetic analysis suggested that the MYSV isolates are new members of the genus Cytorhabdovirus, family Rhabdoviridae. Yellow striate, affecting maize and wheat crops in Argentina, is an emergent disease that presents a potential economic risk for these widely distributed crops.
Analysis of the complete genome of subgroup A' hepatitis B virus isolates from South Africa.
Kramvis, Anna; Weitzmann, Louise; Owiredu, William K B A; Kew, Michael C
2002-04-01
A phylogenetic analysis is presented of six complete and seven pre-S1/S2/S gene sequences of hepatitis B virus (HBV) isolates from South Africa. Five of the full-length sequences and all of the pre-S2/S sequences have been previously reported. Four of the six complete genomes and three of the five incomplete sequences clustered with subgroup A', a unique segment of genotype A of HBV previously identified in 60% of South African isolates using analysis of the pre-S2/S region alone. This separation was also evident when the polymerase open reading frame was analysed, but not on analysis of either the X or pre-core/core genes. Amino acids were identified in the pre-S1 and polymerase regions specific to subgroup A'. In common with genotype D, 10 of 11 genotype A South African isolates had an 11 amino acid deletion in the amino end of the pre-S1 region. This deletion is also found in hepadnaviruses from non-human primates.
A new ALF from Litopenaeus vannamei and its SNPs related to WSSV resistance
NASA Astrophysics Data System (ADS)
Liu, Jingwen; Yu, Yang; Li, Fuhua; Zhang, Xiaojun; Xiang, Jianhai
2014-11-01
Anti-lipopolysaccharide factors (ALFs) are basic components of the crustacean immune system that defend against a range of pathogens. The cDNA sequence of a new ALF, designated nLvALF2, with an open reading frame encoding 132 amino acids was cloned. Its deduced amino acid sequence contained the conserved functional domain of ALFs, the LPS binding domain (LBD). Its genomic sequence consisted of three exons and four introns. nLvALF2 was mainly expressed in the Oka organ and gills of shrimps. The transcriptional level of nLvALF2 increased significantly after white spot syndrome virus (WSSV) infection, suggesting its important roles in protecting shrimps from WSSV. Single nucleotide polymorphisms (SNPs) were found in the genomic sequence of nLvALF2, of which 38 were analyzed for associations with the susceptibility/resistance of shrimps to WSSV. The loci g.2422 A>G, g.2466 T>C, and g.2529 G>A were significantly associated with the resistance to WSSV (P<0.05). These SNP loci could be developed as markers for selection of WSSV-resistant varieties of Litopenaeus vannamei.
Xia, Xichao; Liu, Rongzhi; Li, Yi; Xue, Shipeng; Liu, Qingchun; Jiang, Xiao; Zhang, Wenjuan; Ding, Ke
2014-09-01
Hyaluronidase is a common component of scorpion venom and has been considered a "spreading factor" that promotes fast penetration of the venom in the anaphylactic reaction. In the current study, a novel full-length hyaluronidase, BmHYI, and three noncoding isoforms, BmHYII, BmHYIII and BmHYIV, were cloned using a combined strategy based on peptide sequencing and Rapid Amplification of cDNA Ends (RACE). BmHYI has 410 amino acid residues containing the catalytic, positional and five potential N-glycosylation sites. The deduced protein sequence of BmHYI shares significant identity with venom hyaluronidases from bees and snakes. The phylogenetic analysis showed early divergence and independent evolution of BmHYI from other hyaluronidases. An extraordinarily high level of sequence similarity was detected among the four sequences. However, BmHYII, BmHYIII and BmHYIV lacked a stop codon in the open reading frame and a poly(A) signal at the 3' end.
Nguyen, Thong T; Suryamohan, Kushal; Kuriakose, Boney; Janakiraman, Vasantharajan; Reichelt, Mike; Chaudhuri, Subhra; Guillory, Joseph; Divakaran, Neethu; Rabins, P E; Goel, Ridhi; Deka, Bhabesh; Sarkar, Suman; Ekka, Preety; Tsai, Yu-Chih; Vargas, Derek; Santhosh, Sam; Mohan, Sangeetha; Chin, Chen-Shan; Korlach, Jonas; Thomas, George; Babu, Azariah; Seshagiri, Somasekar
2018-06-12
We sequenced the Hyposidra talaca NPV (HytaNPV) double-stranded circular DNA genome using PacBio single molecule sequencing technology. We found that the HytaNPV genome is 139,089 bp long with a GC content of 39.6%. It encodes 141 open reading frames (ORFs) including the 37 baculovirus core genes, 25 genes conserved among lepidopteran baculoviruses, 72 genes known in baculovirus, and 7 genes unique to the HytaNPV genome. It is a group II alphabaculovirus that codes for the F protein and lacks the gp64 gene found in group I alphabaculoviruses. Using RNA-seq, we confirmed the expression of the ORFs identified in the HytaNPV genome. Phylogenetic analysis showed HytaNPV to be closest to BusuNPV, SujuNPV and EcobNPV, which infect other tea pests, Buzura suppressaria, Sucra jujuba, and Ectropis obliqua, respectively. We identified repeat elements and a conserved non-coding baculovirus element in the genome. Analysis of the putative promoter sequences identified motifs consistent with the temporal expression of the genes observed in the RNA-seq data.
Molecular cloning of a cDNA coding for GTP cyclohydrolase I from Dictyostelium discoideum.
Witter, K; Cahill, D J; Werner, T; Ziegler, I; Rödl, W; Bacher, A; Gütlich, M
1996-01-01
The GTP cyclohydrolase I (GTP-CH) gene of the cellular slime mould Dictyostelium discoideum has been cloned and sequenced. The 855 bp cDNA of this gene contains the open reading frame (ORF) encoding 232 amino acids with a predicted molecular mass of approx. 26 kDa. Southern blot analysis indicated the presence of a single gene for GTP-CH in Dictyostelium. PCR amplification of the ORF from chromosomal DNA and sequencing showed the existence of a 101 bp intron in the GTP-CH gene of Dictyostelium discoideum. The amino acid sequence has 47% and 49% positional identity to those of the human and yeast enzymes respectively. Most of the sequence variation between species is located in the N-terminal part of the protein. The overall identity with the E. coli protein is markedly lower. The enzyme was expressed in E. coli and purified as a 68 kDa fusion protein with the maltose-binding protein of E. coli. GTP-CH of Dictyostelium is heat-stable and showed maximal activity at 60 °C. The Km value for GTP is 50 µM. PMID:8870645
A Survey of Protein Structures from Archaeal Viruses
Dellas, Nikki; Lawrence, C. Martin; Young, Mark J.
2013-01-01
Viruses that infect the third domain of life, Archaea, are a newly emerging field of interest. To date, all characterized archaeal viruses infect archaea that thrive in extreme conditions, such as halophilic, hyperthermophilic, and methanogenic environments. Viruses in general, especially those replicating in extreme environments, contain highly mosaic genomes with open reading frames (ORFs) whose sequences are often dissimilar to all other known ORFs. It has been estimated that approximately 85% of virally encoded ORFs do not match known sequences in the nucleic acid databases, and this percentage is even higher for archaeal viruses (typically 90%–100%). This statistic suggests that either virus genomes represent a larger segment of sequence space and/or that viruses encode genes of novel fold and/or function. Because the overall three-dimensional fold of a protein evolves more slowly than its sequence, efforts have been geared toward structural characterization of proteins encoded by archaeal viruses in order to gain insight into their potential functions. In this short review, we provide multiple examples where structural characterization of archaeal viral proteins has indeed provided significant functional and evolutionary insight. PMID:25371334
Molecular Cloning and Sequence Analysis of a Phenylalanine Ammonia-Lyase Gene from Dendrobium
Cai, Yongping; Lin, Yi
2013-01-01
In this study, a phenylalanine ammonia-lyase (PAL) gene was cloned from Dendrobium candidum using homology cloning and RACE. The full-length PAL cDNA of D. candidum (designated Dc-PAL1, GenBank No. JQ765748) has 2,458 bp and contains a complete open reading frame (ORF) of 2,142 bp, which encodes 713 amino acid residues. The amino acid sequence of DcPAL1 has more than 80% sequence identity with the PAL proteins of other plants, as indicated by multiple alignments. The dominant sites and catalytic active sites, which are similar to those found in PAL proteins of Arabidopsis thaliana and Nicotiana tabacum, are also present in DcPAL1. Phylogenetic tree analysis revealed that DcPAL is more closely related to PALs from Orchidaceae plants than to those of other plants. The differential expression patterns of PAL in protocorm-like body, leaf, stem, and root suggest that the PAL gene performs multiple physiological functions in Dendrobium candidum. PMID:23638048
Tange, N; Jong-Young, L; Mikawa, N; Hirono, I; Aoki, T
1997-12-01
A cDNA clone of rainbow trout (Oncorhynchus mykiss) transferrin was obtained from a liver cDNA library. The 2537-bp cDNA sequence contained an open reading frame encoding 691 amino acids and the 5' and 3' noncoding regions. The amino acid sequences at the iron-binding sites and the two N-linked glycosylation sites, and the cysteine residues were consistent with known, conserved vertebrate transferrin cDNA sequences. A single N-linked glycosylation site was present on each of the N- and C-lobes. The deduced amino acid sequence of the rainbow trout transferrin cDNA had 92.9% identity with transferrin of coho salmon (Oncorhynchus kisutch); 85% with Atlantic salmon (Salmo salar); 67.3% with medaka (Oryzias latipes); 61.3% with Atlantic cod (Gadus morhua); and 59.7% with Japanese flounder (Paralichthys olivaceus). The long and accurate polymerase chain reaction (LA-PCR) was used to amplify approximately 6.5 kb of the transferrin gene from rainbow trout genomic DNA. Restriction fragment length polymorphisms (RFLPs) of the LA-PCR products revealed three digestion patterns in 22 samples.
Arricau, N; Hermant, D; Waxin, H; Popoff, M Y
1997-01-01
Analysis of the nucleotide sequence of a 4-kb DNA fragment located between the sip and iag loci on the Salmonella typhi chromosome revealed three open reading frames, termed sipF, ctpA and stpA. The 82-amino-acid (aa) sipF product showed extensive similarity to the IacP protein from S. typhimurium. The StpA protein (535 aa) exhibited significant similarity to both Yersinia enterocolitica YopE cytotoxin and YopH tyrosine phosphatase. The CtpA polypeptide (130 aa) might be the molecular chaperone of the StpA protein.
Bes, M T; Hernández, J A; Peleato, M L; Fillat, M F
2001-01-15
A gene coding for a Fur (ferric uptake regulation) protein from the cyanobacterium Anabaena PCC 7119 has been cloned and overexpressed in Escherichia coli. DNA sequence analysis confirmed the presence of a 151-amino-acid open reading frame that showed homology with the Fur proteins reported for the unicellular cyanobacteria Synechococcus 7942 and Synechocystis PCC 6803. Two putative Fur-binding sites were detected in the promoter regions of the fur gene from Anabaena. Partially purified recombinant Fur binds to the flavodoxin promoter as well as its own promoter. This suggests that the Fur gene is autoregulated in Anabaena.
Pecker, I; Avraham, K B; Gilbert, D J; Savitsky, K; Rotman, G; Harnik, R; Fukao, T; Schröck, E; Hirotsune, S; Tagle, D A; Collins, F S; Wynshaw-Boris, A; Ried, T; Copeland, N G; Jenkins, N A; Shiloh, Y; Ziv, Y
1996-07-01
Atm, the mouse homolog of the human ATM gene defective in ataxia-telangiectasia (A-T), has been identified. The entire coding sequence of the Atm transcript was cloned and found to contain an open reading frame encoding a protein of 3066 amino acids with 84% overall identity and 91% similarity to the human ATM protein. Variable levels of expression of Atm were observed in different tissues. Fluorescence in situ hybridization and linkage analysis located the Atm gene on mouse chromosome 9, band 9C, in a region homologous to the ATM region on human chromosome 11q22-q23.
Ribosome profiling reveals the what, when, where and how of protein synthesis.
Brar, Gloria A; Weissman, Jonathan S
2015-11-01
Ribosome profiling, which involves the deep sequencing of ribosome-protected mRNA fragments, is a powerful tool for globally monitoring translation in vivo. The method has facilitated discovery of the regulation of gene expression underlying diverse and complex biological processes, of important aspects of the mechanism of protein synthesis, and even of new proteins, by providing a systematic approach for experimental annotation of coding regions. Here, we introduce the methodology of ribosome profiling and discuss examples in which this approach has been a key factor in guiding biological discovery, including its prominent role in identifying thousands of novel translated short open reading frames and alternative translation products.
Two novel Alphaflexiviridae members revealed by deep sequencing of the Vanilla (Orchidaceae) virome.
Grisoni, Michel; Marais, Armelle; Filloux, Denis; Saison, Anne; Faure, Chantal; Julian, Charlotte; Theil, Sébastien; Contreras, Sandy; Teycheney, Pierre-Yves; Roumagnac, Philippe; Candresse, Thierry
2017-12-01
The genomes of two novel viruses were assembled from 454 pyrosequencing data obtained from vanilla leaves from La Réunion. Based on genome organization and homologies, one agent was unambiguously classified as a member of the genus Potexvirus and named vanilla virus X (VVX). The second one, vanilla latent virus (VLV), is phylogenetically close to three unclassified members of the family Alphaflexiviridae with similarity to allexiviruses, and despite the presence of an additional 8-kDa open reading frame, we propose to include VLV as a new member of the genus Allexivirus. Both VVX and VLV were mechanically transmitted to vanilla plants, resulting in asymptomatic infections.
The in vivo use of alternate 3'-splice sites in group I introns.
Sellem, C H; Belcour, L
1994-04-11
Alternative splicing of group I introns has been postulated as a possible mechanism that would ensure the translation of proteins encoded within intronic open reading frames that are discontinuous with the upstream exon and lack an initiation signal. Alternate splice sites were previously depicted according to secondary structures of several group I introns. We present here strong evidence that, in the case of Podospora anserina nad1-i4 and cox1-i7 mitochondrial introns, alternative splicing events do occur in vivo. Indeed, by PCR experiments we have detected molecules whose sequence is precisely that expected if the predicted alternate 3'-splice sites were used.
Moorman, Marjolein; van den Putte, Bas
2008-10-01
This study explores the combined effect of message framing, intention to quit smoking, and nicotine dependence on the persuasiveness of smoking cessation messages. Pre- and post-message measures of quit intention, attitude toward smoking cessation, and perceived behavioral control were taken in two separate waves from current cigarette smokers with varying levels of nicotine dependence (N=151). In the second wave, participants were randomly assigned to one of two groups. In the first group, participants read a smoking cessation message which emphasized the benefits of quitting (positive frame). In the second group participants read a message which emphasized the costs of not quitting (negative frame). Results show that smokers' intentions to quit smoking and their level of nicotine dependence jointly influence the persuasiveness of positive and negative message frames. When nicotine dependence and quitting intention are both high, a negative frame works best. Conversely, a positive frame is preferable when nicotine dependence or quitting intention is low. Smokers' level of processing is proposed as the underlying mechanism explaining the different effects of message frames.
Farnbacher, Michael J; Krause, Horst H; Hagel, Alexander F; Raithel, Martin; Neurath, Markus F; Schneider, Thomas
2014-03-01
OBJECTIVE. Colon capsule endoscopy (CCE) proved to be highly sensitive in detection of colorectal polyps (CP). A major limitation is the time-consuming video reading. The aim of this prospective, double-center study was to assess the theoretical time-saving potential of "QuickView" (QV) and its possible impact on reliability in the presentation of CP as compared to normal mode (NM). METHODS. During NM reading of 65 CCE videos (mean patient age 56 years), all frames showing CPs were collected and compared to the number of frames presented by QV at increasing QV settings (10, 20, ... 80%). Reliability of QV in presenting polyps <6 mm and ≥6 mm (significant polyp), and identifying patients for subsequent therapeutic colonoscopy, capsule egestion rate, cleansing level, and estimated time-saving potential were assessed. RESULTS. At a 30% QV setting, the QV video presented 89% of the significant polyps and 86% of any polyps with ≥1 frame (per-polyp analysis) identified in NM before. At a 10% QV setting, 98% of the 52 patients with significant polyps could be identified (per-patient analysis) by QV video analysis. Capsule excretion rate was 74% and colon cleanliness was adequate in 85%. QV's presentation rate correlates with the QV setting, the polyp size, and the number of frames per finding. CONCLUSIONS. Depending on its setting, the reliability of QV in presenting CP as compared to NM reading is notable. However, if no significant polyp is presented by QV, NM reading must be performed afterwards. The reduction of frames to be analyzed in QV might speed up identification of candidates for therapeutic colonoscopy.
Ustav, M; Stenlund, A
1991-01-01
Bovine papillomavirus (BPV) DNA is maintained as an episome with a constant copy number in transformed cells and is stably inherited. To study BPV replication we have developed a transient replication assay based on a highly efficient electroporation procedure. Using this assay we have determined that in the context of the viral genome two of the viral open reading frames, E1 and E2, are required for replication. Furthermore we show that, when produced from expression vectors in the absence of other viral gene products, the full-length E2 transactivator polypeptide and a 72 kd polypeptide encoded by the E1 open reading frame in its entirety are both necessary and sufficient for replication of BPV in C127 cells. PMID:1846806
AmpliVar: mutation detection in high-throughput sequence from amplicon-based libraries.
Hsu, Arthur L; Kondrashova, Olga; Lunke, Sebastian; Love, Clare J; Meldrum, Cliff; Marquis-Nicholson, Renate; Corboy, Greg; Pham, Kym; Wakefield, Matthew; Waring, Paul M; Taylor, Graham R
2015-04-01
Conventional means of identifying variants in high-throughput sequencing align each read against a reference sequence, and then call variants at each position. Here, we demonstrate an orthogonal means of identifying sequence variation by grouping the reads as amplicons prior to any alignment. We used AmpliVar to make key-value hashes of sequence reads and group reads as individual amplicons using a table of flanking sequences. Low-abundance reads were removed according to a selectable threshold, and reads above this threshold were aligned as groups, rather than as individual reads, permitting the use of sensitive alignment tools. We show that this approach is more sensitive, more specific, and more computationally efficient than comparable methods for the analysis of amplicon-based high-throughput sequencing data. The method can be extended to enable alignment-free confirmation of variants seen in hybridization capture target-enrichment data. © 2015 WILEY PERIODICALS, INC.
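The grouping step described in this abstract can be illustrated with a minimal sketch. It is not AmpliVar's own code: the function name, the exact prefix/suffix matching rule and the `min_count` parameter are assumptions chosen only to show reads being hashed, filtered by abundance and binned by flanking sequences before any alignment.

```python
from collections import Counter, defaultdict

def group_reads_by_amplicon(reads, flanks, min_count=2):
    """Bin unique reads into amplicons using flanking sequences.

    reads     -- iterable of read sequences (strings)
    flanks    -- dict: amplicon name -> (left_flank, right_flank); hypothetical table
    min_count -- unique reads seen fewer times than this are discarded
    """
    # Key-value hash: read sequence -> number of times it was observed.
    counts = Counter(reads)
    groups = defaultdict(dict)
    for seq, n in counts.items():
        if n < min_count:
            continue  # low-abundance read, removed by the threshold
        for name, (left, right) in flanks.items():
            # Assumption: a read belongs to an amplicon if it starts and
            # ends with that amplicon's flanking sequences exactly.
            if seq.startswith(left) and seq.endswith(right):
                groups[name][seq] = n
                break
    return dict(groups)
```

Each group then holds a small number of unique sequences with counts, so a slower, more sensitive aligner could be run once per unique sequence rather than once per read.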
ISRNA: an integrative online toolkit for short reads from high-throughput sequencing data.
Luo, Guan-Zheng; Yang, Wei; Ma, Ying-Ke; Wang, Xiu-Jie
2014-02-01
Integrative Short Reads NAvigator (ISRNA) is an online toolkit for analyzing high-throughput small RNA sequencing data. Besides the high-speed genome mapping function, ISRNA provides statistics for genomic location, length distribution and nucleotide composition bias analysis of sequence reads. Number of reads mapped to known microRNAs and other classes of short non-coding RNAs, coverage of short reads on genes, expression abundance of sequence reads as well as some other analysis functions are also supported. The versatile search functions enable users to select sequence reads according to their sub-sequences, expression abundance, genomic location, relationship to genes, etc. A specialized genome browser is integrated to visualize the genomic distribution of short reads. ISRNA also supports management and comparison among multiple datasets. ISRNA is implemented in Java/C++/Perl/MySQL and can be freely accessed at http://omicslab.genetics.ac.cn/ISRNA/.
Characterization and chromosomal mapping of the human TFG gene involved in thyroid carcinoma
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mencinger, M.; Panagopoulos, I.; Andreasson, P.
1997-05-01
Homology searches in the Expressed Sequence Tag Database were performed using SPYGQ-rich regions as query sequences to find genes encoding protein regions similar to the N-terminal parts of the sarcoma-associated EWS and FUS proteins. Clone 22911 (T74973), encoding a SPYGQ-rich region in its 5′ end, and several other clones that overlapped 22911 were selected. The combined data made it possible to assemble a full-length cDNA sequence. This cDNA sequence is 1677 bp, containing an initiation codon ATG, an open reading frame of 400 amino acids, a poly(A) signal, and a poly(A) tail. We found 100% identity between the 5′ part of the consensus sequence and the 598-bp-long sequence named TFG. The TFG sequence is fused to the 3′ end of NTRK1, generating the TRK-T3 fusion transcript found in papillary thyroid carcinoma. The cDNA therefore represents the full-length transcript of the TFG gene. TFG was localized to 3q11-q12 by fluorescence in situ hybridization. The 3′ and the 5′ ends of the TFG cDNA probe hybridized to a 2.2-kb band on Northern blot filters in all tissues examined. 28 refs., 5 figs., 1 tab.
VKCDB: voltage-gated K+ channel database updated and upgraded.
Gallin, Warren J; Boutet, Patrick A
2011-01-01
The Voltage-gated K(+) Channel DataBase (VKCDB) (http://vkcdb.biology.ualberta.ca) makes a comprehensive set of sequence data readily available for phylogenetic and comparative analysis. The current update contains 2063 entries for full-length or nearly full-length unique channel sequences from Bacteria (477), Archaea (18) and Eukaryotes (1568), an increase from 346 solely eukaryotic entries in the original release. In addition to protein sequences for channels, corresponding nucleotide sequences of the open reading frames corresponding to the amino acid sequences are now available and can be extracted in parallel with sets of protein sequences. Channels are categorized into subfamilies by phylogenetic analysis and by using hidden Markov model analyses. Although the raw database contains a number of fragmentary, duplicated, obsolete and non-channel sequences that were collected in early steps of data collection, the web interface will only return entries that have been validated as likely K(+) channels. The retrieval function of the web interface allows retrieval of entries that contain a substantial fraction of the core structural elements of VKCs, fragmentary entries, or both. The full database can be downloaded as either a MySQL dump or as an XML dump from the web site. We have now implemented automated updates at quarterly intervals.
RNA processing in Neurospora crassa mitochondria: use of transfer RNA sequences as signals.
Breitenberger, C A; Browning, K S; Alzner-DeWeerd, B; RajBhandary, U L
1985-01-01
We have used RNA gel transfer hybridization, S1 nuclease mapping and primer extension to analyze transcripts derived from several genes in Neurospora crassa mitochondria. The transcripts studied include those for cytochrome oxidase subunit III, 17S rRNA and an unidentified open reading frame. In all three cases, initial transcripts are long, include tRNA sequences, and are subsequently processed to generate the mature RNAs. We find that endpoints of the most abundant transcripts generally coincide with those of tRNA sequences. We therefore conclude that tRNA sequences in long transcripts act as primary signals for RNA processing in N. crassa mitochondria. The situation is somewhat analogous to that observed in mammalian mitochondrial systems. The difference, however, is that in mammalian mitochondria, noncoding spacers between tRNA, rRNA and protein genes are very short and in many cases non-existent, allowing no room for intergenic RNA processing signals whereas, in N. crassa mtDNA, intergenic non-coding sequences are usually several hundred nucleotides long and contain highly conserved GC-rich palindromic sequences. Since these GC-rich palindromic sequences are retained in the processed mature RNAs, we conclude that they do not serve as signals for RNA processing. PMID:2990893
The first genome sequence of a metatherian herpesvirus: Macropodid herpesvirus 1.
Vaz, Paola K; Mahony, Timothy J; Hartley, Carol A; Fowler, Elizabeth V; Ficorilli, Nino; Lee, Sang W; Gilkerson, James R; Browning, Glenn F; Devlin, Joanne M
2016-01-22
While many placental herpesvirus genomes have been fully sequenced, the complete genome of a marsupial herpesvirus has not been described. Here we present the first genome sequence of a metatherian herpesvirus, Macropodid herpesvirus 1 (MaHV-1). The MaHV-1 viral genome was sequenced using an Illumina MiSeq sequencer, de novo assembly was performed and the genome was annotated. The MaHV-1 genome was 140 kbp in length and clustered phylogenetically with the primate simplexviruses, sharing 67% nucleotide sequence identity with Human herpesviruses 1 and 2. The MaHV-1 genome contained 66 predicted open reading frames (ORFs) homologous to those in other herpesvirus genomes, but lacked homologues of UL3, UL4, UL56 and glycoprotein J. This is the first alphaherpesvirus genome that has been found to lack the UL3 and UL4 homologues. We identified six novel ORFs and confirmed their transcription by RT-PCR. This is the first genome sequence of a herpesvirus that infects metatherians, a taxonomically unique mammalian clade. Members of the Simplexvirus genus are remarkably conserved, so the absence of ORFs otherwise retained in eutherian and avian alphaherpesviruses contributes to our understanding of the Alphaherpesvirinae. Further study of metatherian herpesvirus genetics and pathogenesis provides a unique approach to understanding herpesvirus-mammalian interactions.
Shao, Zhiyong; Graf, Shannon; Chaga, Oleg Y; Lavrov, Dennis V
2006-10-15
The 16,937-nucleotide sequence of the linear mitochondrial DNA (mtDNA) molecule of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa) - the first mtDNA sequence from the class Scyphozoa and the first sequence of a linear mtDNA from Metazoa - has been determined. This sequence contains genes for 13 energy pathway proteins, small and large subunit rRNAs, and methionine and tryptophan tRNAs. In addition, two open reading frames of 324 and 969 base pairs in length have been found. The deduced amino-acid sequence of one of them, ORF969, displays extensive sequence similarity with the polymerase [but not the exonuclease] domain of family B DNA polymerases, and this ORF has been tentatively identified as dnab. This is the first report of dnab in animal mtDNA. The genes in A. aurita mtDNA are arranged in two clusters with opposite transcriptional polarities, with transcription proceeding toward the ends of the molecule. The determined sequences at the ends of the molecule are nearly identical but inverted and lack any obvious potential secondary structures or telomere-like repeat elements. The acquisition of mitochondrial genomic data for the second class of Cnidaria allows us to reconstruct characteristic features of mitochondrial evolution in this animal phylum.
Haplotype estimation using sequencing reads.
Delaneau, Olivier; Howie, Bryan; Cox, Anthony J; Zagury, Jean-François; Marchini, Jonathan
2013-10-03
High-throughput sequencing technologies produce short sequence reads that can contain phase information if they span two or more heterozygote genotypes. This information is not routinely used by current methods that infer haplotypes from genotype data. We have extended the SHAPEIT2 method to use phase-informative sequencing reads to improve phasing accuracy. Our model incorporates the read information in a probabilistic model through base quality scores within each read. The method is primarily designed for high-coverage sequence data or data sets that already have genotypes called. One important application is phasing of single samples sequenced at high coverage for use in medical sequencing and studies of rare diseases. Our method can also use existing panels of reference haplotypes. We tested the method by using a mother-father-child trio sequenced at high-coverage by Illumina together with the low-coverage sequence data from the 1000 Genomes Project (1000GP). We found that use of phase-informative reads increases the mean distance between switch errors by 22% from 274.4 kb to 328.6 kb. We also used male chromosome X haplotypes from the 1000GP samples to simulate sequencing reads with varying insert size, read length, and base error rate. When using short 100 bp paired-end reads, we found that using mixtures of insert sizes produced the best results. When using longer reads with high error rates (5-20 kb read with 4%-15% error per base), phasing performance was substantially improved. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
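The notion of a phase-informative read (one that spans two or more heterozygous genotypes) can be made concrete with a short sketch. This is not part of SHAPEIT2; the function name, the `(name, start, end)` read tuples and the inclusive coordinate convention are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right

def phase_informative_reads(reads, het_positions):
    """Return reads that overlap at least two heterozygous sites.

    reads         -- list of (name, start, end) with inclusive coordinates
    het_positions -- positions of heterozygous genotypes in one sample
    """
    hets = sorted(het_positions)
    informative = []
    for name, start, end in reads:
        # Binary search for the heterozygous sites inside [start, end].
        lo = bisect_left(hets, start)
        hi = bisect_right(hets, end)
        covered = hets[lo:hi]
        if len(covered) >= 2:
            informative.append((name, covered))
    return informative
```

A read covering a single heterozygous site says nothing about which alleles sit on the same chromosome, which is why only reads with two or more covered sites are kept here.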
Thomas, Sean; Martinez, L L Isadora Trejo; Westenberger, Scott J; Sturm, Nancy R
2007-05-24
The structurally complex network of minicircles and maxicircles comprising the mitochondrial DNA of kinetoplastids mirrors the complexity of the RNA editing process that is required for faithful expression of encrypted maxicircle genes. Although a few of the guide RNAs that direct this editing process have been discovered on maxicircles, guide RNAs are mostly found on the minicircles. The nuclear and maxicircle genomes have been sequenced and assembled for Trypanosoma cruzi, the causative agent of Chagas disease; however, the complement of 1.4-kb minicircles, carrying four guide RNA genes per molecule in this parasite, has been less thoroughly characterised. Fifty-four CL Brener and 53 Esmeraldo strain minicircle sequence reads were extracted from T. cruzi whole genome shotgun sequencing data. With these sequences and all published T. cruzi minicircle sequences, 108 unique guide RNAs from all known T. cruzi minicircle sequences and two guide RNAs from the CL Brener maxicircle were predicted using a local alignment algorithm and mapped onto predicted or experimentally determined sequences of edited maxicircle open reading frames. For half of the sequences no statistically significant guide RNA could be assigned. Likely positions of these unidentified gRNAs in T. cruzi minicircle sequences are estimated using a simple Hidden Markov Model. With the local alignment predictions as a standard, the HMM had an ~85% chance of correctly identifying at least 20 nucleotides of guide RNA from a given minicircle sequence. Inter-minicircle recombination was documented. Variable regions contain species-specific areas of distinct nucleotide preference. Two maxicircle guide RNA genes were found. The identification of new minicircle sequences and the further characterization of all published minicircles are presented, including the first observation of recombination between minicircles.
Extrapolation suggests a level of 4% recombinants in the population, supporting a relatively high recombination rate that may serve to minimize the persistence of gRNA pseudogenes. Characteristic nucleotide preferences observed within variable regions provide potential clues regarding the transcription and maturation of T. cruzi guide RNAs. Based on these preferences, a method of predicting T. cruzi guide RNAs using only primary minicircle sequence data was created.
Read clouds uncover variation in complex regions of the human genome
Bishara, Alex; Liu, Yuling; Weng, Ziming; Kashef-Haghighi, Dorna; Newburger, Daniel E.; West, Robert; Sidow, Arend; Batzoglou, Serafim
2015-01-01
Although an increasing amount of human genetic variation is being identified and recorded, determining variants within repeated sequences of the human genome remains a challenge. Most population and genome-wide association studies have therefore been unable to consider variation in these regions. Core to the problem is the lack of a sequencing technology that produces reads with sufficient length and accuracy to enable unique mapping. Here, we present a novel methodology of using read clouds, obtained by accurate short-read sequencing of DNA derived from long fragment libraries, to confidently align short reads within repeat regions and enable accurate variant discovery. Our novel algorithm, Random Field Aligner (RFA), captures the relationships among the short reads governed by the long read process via a Markov Random Field. We utilized a modified version of the Illumina TruSeq synthetic long-read protocol, which yielded shallow-sequenced read clouds. We test RFA through extensive simulations and apply it to discover variants on the NA12878 human sample, for which shallow TruSeq read cloud sequencing data are available, and on an invasive breast carcinoma genome that we sequenced using the same method. We demonstrate that RFA facilitates accurate recovery of variation in 155 Mb of the human genome, including 94% of 67 Mb of segmental duplication sequence and 96% of 11 Mb of transcribed sequence, that are currently hidden from short-read technologies. PMID:26286554
Hadji Sfaxi, Imen; Ezzine, Aymen; Coquet, Laurent; Cosette, Pascal; Jouenne, Thierry; Marzouki, M Nejib
2012-09-01
Superoxide dismutases (SODs; EC 1.15.1.1) are key enzymes in the protection of cells against oxidant agents. Thus, SODs play a major role in the protection of aerobic organisms against oxygen-mediated damage. Three SOD isoforms were previously identified by zymogram staining from Allium sativum bulbs. The purified Cu, Zn-SOD2 shows an antagonist effect to an anticancer drug and alleviates cytotoxicity inside the tumor cell lines B16F0 (mouse melanoma cells) and PAE (porcine aortic endothelial cells). To extend the characterization of Allium SODs and their corresponding genes, a proteomic approach was applied involving two-dimensional gel electrophoresis and LC-MS/MS analyses. From peptide sequence data obtained by mass spectrometry and sequence homologies, primers were defined and a cDNA fragment of 456 bp was amplified by RT-PCR. The cDNA nucleotide sequence analysis revealed an open reading frame coding for 152 residues. The deduced amino acid sequence showed high identity (82-87%) with sequences of Cu, Zn-SODs from other plant species. Molecular analysis was achieved by a protein 3D structural model.
Selection of homeotic proteins for binding to a human DNA replication origin.
de Stanchina, E; Gabellini, D; Norio, P; Giacca, M; Peverali, F A; Riva, S; Falaschi, A; Biamonti, G
2000-06-09
We have previously shown that a cell cycle-dependent nucleoprotein complex assembles in vivo on a 74 bp sequence within the human DNA replication origin associated to the Lamin B2 gene. Here, we report the identification, using a one-hybrid screen in yeast, of three proteins interacting with the 74 bp sequence. All of them, namely HOXA13, HOXC10 and HOXC13, are orthologues of the Abdominal-B gene of Drosophila melanogaster and are members of the homeogene family of developmental regulators. We describe the complete open reading frame sequence of HOXC10 and HOXC13 along with the structure of the HoxC13 gene. The specificity of binding of these two proteins to the Lamin B2 origin is confirmed by both band-shift and in vitro footprinting assays. In addition, the ability of HOXC10 and HOXC13 to increase the activity of a promoter containing the 74 bp sequence, as assayed by CAT-assay experiments, demonstrates a direct interaction of these homeoproteins with the origin sequence in mammalian cells. We also show that HOXC10 expression is cell-type-dependent and positively correlates with cell proliferation. Copyright 2000 Academic Press.
Nam, Bo-Hye; Seo, Jung-Kil; Lee, Min Jeong; Kim, Young-Ok; Kim, Dong-Gyun; An, Cheul Min; Park, Nam Gyu
2015-07-01
An antimicrobial peptide, ∼5 kDa in size, was isolated and purified in its active form from the mantle of the Pacific oyster Crassostrea gigas by C18 reversed-phase high-performance liquid chromatography. Matrix-assisted laser desorption ionisation time-of-flight analysis revealed 4656.4 Da of the purified and unreduced peptide. A comparison of the N-terminal amino acid sequence of oyster antimicrobial peptide with deduced amino acid sequences in our local expressed sequence tag (EST) database of C. gigas (unpublished data) revealed that the oyster antimicrobial peptide sequence entirely matched the deduced amino acid sequence of an EST clone (HM-8_A04), which was highly homologous with the β-thymosin of other species. The cDNA possessed a 126-bp open reading frame that encoded a protein of 41 amino acids. To confirm the antimicrobial activity of C. gigas β-thymosin, we overexpressed a recombinant β-thymosin (rcgTβ) using a pET22 expression plasmid in an Escherichia coli system. The antimicrobial activity of rcgTβ was evaluated and demonstrated using a bacterial growth inhibition test in both liquid and solid cultures. Copyright © 2015 Elsevier Ltd. All rights reserved.
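The relation between a reported open reading frame length and the number of encoded residues is simple arithmetic, sketched below. The function name and the `includes_stop` flag are illustrative, and whether a given abstract counts the stop codon in its ORF length is an assumption, not something the abstracts state.

```python
def residues_from_orf(length_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of length_bp nucleotides.

    includes_stop -- assumption: True if the stated length counts the
                     terminal stop codon, False if it does not.
    """
    if length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = length_bp // 3
    return codons - 1 if includes_stop else codons
```

Under these assumptions a 126-bp frame that includes its stop codon gives 42 codons and 41 residues, while a 456-bp coding fragment counted without a stop codon gives 152 residues, consistent with the figures quoted in the abstracts above.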
Wu, C J; Janssen, G R
1996-10-01
The Streptomyces vinaceus viomycin phosphotransferase (vph) mRNA contains an untranslated leader with a conventional Shine-Dalgarno homology. The vph leader was removed by ligation of the vph coding sequence to the transcriptional start site of a Streptomyces or an Escherichia coli promoter, such that transcription would initiate at the first position of the vph start codon. Analysis of mRNA demonstrated that transcription initiated primarily at the A of the vph AUG translational start codon in both Streptomyces lividans and E. coli; cells expressing the unleadered vph mRNA were resistant to viomycin indicating that the Shine-Dalgarno sequence, or other features contained within the leader, was not necessary for vph translation. Addition of four nucleotides (5'-AUGC-3') onto the 5' end of the unleadered vph mRNA resulted in translation initiation from the vph start codon and the AUG triplet contained within the added sequence. Translational fusions of vph sequence to a Tn5 neo reporter gene indicated that the first 16 codons of vph coding sequence were sufficient to specify the translational start site and reading frame for expression of neomycin resistance in both E. coli and S. lividans.
Yasuno, Rie; Wada, Hajime
1998-01-01
Lipoic acid is a coenzyme that is essential for the activity of enzyme complexes such as those of pyruvate dehydrogenase and glycine decarboxylase. We report here the isolation and characterization of LIP1 cDNA for lipoic acid synthase of Arabidopsis. The Arabidopsis LIP1 cDNA was isolated using an expressed sequence tag homologous to the lipoic acid synthase of Escherichia coli. This cDNA was shown to code for Arabidopsis lipoic acid synthase by its ability to complement a lipA mutant of E. coli defective in lipoic acid synthase. DNA-sequence analysis of the LIP1 cDNA revealed an open reading frame predicting a protein of 374 amino acids. Comparisons of the deduced amino acid sequence with those of E. coli and yeast lipoic acid synthase homologs showed a high degree of sequence similarity and the presence of a leader sequence presumably required for import into the mitochondria. Southern-hybridization analysis suggested that LIP1 is a single-copy gene in Arabidopsis. Western analysis with an antibody against lipoic acid synthase demonstrated that this enzyme is located in the mitochondrial compartment in Arabidopsis cells as a 43-kD polypeptide. PMID:9808738
Simon, J W; Slabas, A R
1998-09-18
The GenBank database was searched using the E. coli malonyl CoA:ACP transacylase (MCAT) sequence for plant protein/cDNA sequences corresponding to MCAT, a component of plant fatty acid synthetase (FAS), for which the plant cDNA has not been isolated. A 272-bp Zea mays EST sequence (GenBank accession number: AA030706) was identified which has strong homology to the E. coli MCAT. A PCR-derived cDNA probe from Zea mays was used to screen a Brassica napus (rape) cDNA library. This resulted in the isolation of a 1200-bp cDNA clone which encodes an open reading frame corresponding to a protein of 351 amino acids. The protein shows 47% homology to the E. coli MCAT amino acid sequence in the coding region for the mature protein. Expression of a plasmid (pMCATrap2) containing the plant cDNA sequence in Fab D89, an E. coli mutant in MCAT activity, restores growth, demonstrating functional complementation and direct function of the cloned cDNA. This is the first functional evidence supporting the identification of a plant cDNA for MCAT.
Blazier, J Chris; Ruhlman, Tracey A; Weng, Mao-Lun; Rehman, Sumaiyah K; Sabir, Jamal S M; Jansen, Robert K
2016-04-18
Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA.
Groves, Benjamin; Kuchina, Anna; Rosenberg, Alexander B.; Jojic, Nebojsa; Fields, Stanley; Seelig, Georg
2017-01-01
Our ability to predict protein expression from DNA sequence alone remains poor, reflecting our limited understanding of cis-regulatory grammar and hampering the design of engineered genes for synthetic biology applications. Here, we generate a model that predicts the protein expression of the 5′ untranslated region (UTR) of mRNAs in the yeast Saccharomyces cerevisiae. We constructed a library of half a million 50-nucleotide-long random 5′ UTRs and assayed their activity in a massively parallel growth selection experiment. The resulting data allow us to quantify the impact on protein expression of Kozak sequence composition, upstream open reading frames (uORFs), and secondary structure. We trained a convolutional neural network (CNN) on the random library and showed that it performs well at predicting the protein expression of both a held-out set of the random 5′ UTRs as well as native S. cerevisiae 5′ UTRs. The model additionally was used to computationally evolve highly active 5′ UTRs. We confirmed experimentally that the great majority of the evolved sequences led to higher protein expression rates than the starting sequences, demonstrating the predictive power of this model. PMID:29097404
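One of the sequence features named in this abstract, the upstream open reading frame, can be located with a simple scan. The sketch below is not the authors' pipeline: the function name, the use of the DNA alphabet and the rule that a uORF is any ATG followed by an in-frame stop inside the UTR are assumptions for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_uorfs(utr):
    """List (start, end) for every ATG in a 5' UTR given as DNA, 5'->3'.

    start -- 0-based index of the A in ATG
    end   -- index just past the first in-frame stop codon, or None if
             the frame stays open to the end of the UTR
    """
    utr = utr.upper()
    uorfs = []
    for start in range(len(utr) - 2):
        if utr[start:start + 3] != "ATG":
            continue
        end = None
        # Walk codon by codon in the frame set by this ATG.
        for i in range(start + 3, len(utr) - 2, 3):
            if utr[i:i + 3] in STOP_CODONS:
                end = i + 3
                break
        uorfs.append((start, end))
    return uorfs
```

An entry with `end` set to None marks an upstream ATG whose frame runs off the end of the UTR; whether it is in frame with the main coding sequence would need the downstream sequence, which this sketch does not take.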
The primitive code and repeats of base oligomers as the primordial protein-encoding sequence.
Ohno, S; Epplen, J T
1983-01-01
Even if the prebiotic self-replication of nucleic acids and the subsequent emergence of primitive, enzyme-independent tRNAs are accepted as plausible, the origin of life by spontaneous generation still appears improbable. This is because the just-emerged primitive translational machinery had to cope with base sequences that were not preselected for their coding potentials. Particularly if the primitive mitochondria-like code with four chain-terminating base triplets preceded the universal code, the translation of long, randomly generated, base sequences at this critical stage would have merely resulted in the production of short oligopeptides instead of long polypeptide chains. We present the base sequence of a mouse transcript containing tetranucleotide repeats conserved during evolution. Even if translated in accordance with the primitive mitochondria-like code, this transcript in its three reading frames can yield 245-, 246-, and 251-residue-long tetrapeptidic periodical polypeptides that are already acquiring longer periodicities. We contend that the first set of base sequences translated at the beginning of life were such oligonucleotide repeats. By quickly acquiring longer periodicities, their products must have soon gained characteristic secondary structures--alpha-helical or beta-sheet or both. PMID:6574491
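The argument that random sequences yield only short oligopeptides can be checked with a small calculation. The sketch assumes that all 64 triplets are equally likely and independent, which the abstract does not state; the function names are illustrative.

```python
from fractions import Fraction

def mean_codons_before_stop(n_stop, n_codons=64):
    """Expected number of sense codons read before the first stop codon.

    Assumes every triplet is drawn uniformly and independently, so the
    run length is geometric with stop probability p = n_stop / n_codons.
    """
    p = Fraction(n_stop, n_codons)
    return (1 - p) / p

def prob_open_run(n_stop, length, n_codons=64):
    """Probability that `length` consecutive random codons contain no stop."""
    return float((1 - Fraction(n_stop, n_codons)) ** length)
```

With four chain-terminating triplets the expected run is 15 sense codons, against 61/3 (about 20) for a code with three stops. Under the same assumptions the chance that a random frame stays open for 245 codons is on the order of 1e-7, which illustrates why the long periodic polypeptides described here stand out from random sequence.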
The Repeat Sequences and Elevated Substitution Rates of the Chloroplast accD Gene in Cupressophytes
Li, Jia; Su, Yingjuan; Wang, Ting
2018-01-01
The plastid accD gene encodes a subunit of the acetyl-CoA carboxylase (ACCase) enzyme. The accD gene is thought to have expanded in length in Cryptomeria japonica, Taiwania cryptomerioides, Cephalotaxus, Taxus chinensis, and Podocarpus lambertii, mainly because of tandemly repeated sequences. However, it is still unknown whether the accD gene length in other cupressophytes has expanded. Here, in order to investigate how widespread this phenomenon is, 18 cupressophyte accD sequences and their surrounding regions were sequenced and analyzed. Together with 39 GenBank sequences, our taxon sampling covered all the extant gymnosperm orders. The repetitive elements and substitution rates of accD among 57 gymnosperm species were analyzed. The results show that: (1) the reading frame length of the accD gene has also expanded in 18 cupressophyte species; (2) many repetitive elements were identified in the accD gene of cupressophyte lineages; (3) the synonymous and non-synonymous substitution rates of accD were accelerated in cupressophytes; and (4) accD was located at rearrangement endpoints. These results suggest that repetitive elements may mediate chloroplast genome rearrangement and accelerate substitution rates. PMID:29731764
DNA sequence analysis of the photosynthesis region of Rhodobacter sphaeroides 2.4.1.
Choudhary, M; Kaplan, S
2000-02-15
This paper describes the DNA sequence of the photosynthesis region of Rhodobacter sphaeroides 2.4.1 (T). The photosynthesis gene cluster is located within an approximately 73 kb Ase I genomic DNA fragment containing the puf, puhA, cycA and puc operons. A total of 65 open reading frames (ORFs) have been identified, of which 61 showed significant similarity to genes/proteins of other organisms while only four did not reveal any significant sequence similarity to any gene/protein sequences in the database. The data were compared with the corresponding genes/ORFs from a different strain of R. sphaeroides and Rhodobacter capsulatus, a close relative of R. sphaeroides. A detailed analysis of the gene organization in the photosynthesis region revealed a similar gene order in both species with some notable differences located to the pucBAC = cycA region. In addition, photosynthesis gene regulatory protein (PpsR, FNR, IHF) binding motifs in upstream sequences of a number of photosynthesis genes have been identified and shown to differ between these two species. The difference in gene organization relative to pucBAC and cycA suggests that this region originated independently of the photosynthesis gene cluster of R. sphaeroides.
LaPolla, R J; Mayne, K M; Davidson, N
1984-01-01
A mouse cDNA clone has been isolated that contains the complete coding region of a protein highly homologous to the delta subunit of the Torpedo acetylcholine receptor (AcChoR). The cDNA library was constructed in the vector lambda 10 from membrane-associated poly(A)+ RNA from BC3H-1 mouse cells. Surprisingly, the delta clone was selected by hybridization with cDNA encoding the gamma subunit of the Torpedo AcChoR. The nucleotide sequence of the mouse cDNA clone contains an open reading frame of 520 amino acids. This amino acid sequence exhibits 59% and 50% sequence homology to the Torpedo AcChoR delta and gamma subunits, respectively. However, the mouse nucleotide sequence has several stretches of high homology with the Torpedo gamma subunit cDNA, but not with delta. The mouse protein has the same general structural features as do the Torpedo subunits. It is encoded by a 3.3-kilobase mRNA. There is probably only one, but at most two, chromosomal genes coding for this or closely related sequences. PMID:6096870
Cloning of an avilamycin biosynthetic gene cluster from Streptomyces viridochromogenes Tü57.
Gaisser, S; Trefzer, A; Stockert, S; Kirschning, A; Bechthold, A
1997-01-01
A 65-kb region of DNA from Streptomyces viridochromogenes Tü57, containing genes encoding proteins involved in the biosynthesis of avilamycins, was isolated. The DNA sequence of a 6.4-kb fragment from this region revealed four open reading frames (ORF1 to ORF4), three of which are fully contained within the sequenced fragment. The deduced amino acid sequence of AviM, encoded by ORF2, shows 37% identity to a 6-methylsalicylic acid synthase from Penicillium patulum. Cultures of S. lividans TK24 and S. coelicolor CH999 containing plasmids with ORF2 on a 5.5-kb PstI fragment were able to produce orsellinic acid, an unreduced version of 6-methylsalicylic acid. The amino acid sequence encoded by ORF3 (AviD) is 62% identical to that of StrD, a dTDP-glucose synthase from S. griseus. The deduced amino acid sequence of AviE, encoded by ORF4, shows 55% identity to a dTDP-glucose dehydratase (StrE) from S. griseus. Gene insertional inactivation experiments of aviE abolished avilamycin production, indicating the involvement of aviE in the biosynthesis of avilamycins. PMID:9335272