terminal repeat sequences: Topics by Science.gov

Sample records for terminal repeat sequences

Variation, Repetition, and Choice

PubMed Central

Abreu-Rodrigues, Josele; Lattal, Kennon A; dos Santos, Cristiano V; Matos, Ricardo A

2005-01-01

Experiment 1 investigated the controlling properties of variability contingencies on choice between repeated and variable responding. Pigeons were exposed to concurrent-chains schedules with two alternatives. In the REPEAT alternative, reinforcers in the terminal link depended on a single sequence of four responses. In the VARY alternative, a response sequence in the terminal link was reinforced only if it differed from the n previous sequences (lag criterion). The REPEAT contingency generated low, constant levels of sequence variation whereas the VARY contingency produced levels of sequence variation that increased with the lag criterion. Preference for the REPEAT alternative tended to increase directly with the degree of variation required for reinforcement. Experiment 2 examined the potential confounding effects in Experiment 1 of immediacy of reinforcement by yoking the interreinforcer intervals in the REPEAT alternative to those in the VARY alternative. Again, preference for REPEAT was a function of the lag criterion. Choice between varying and repeating behavior is discussed with respect to obtained behavioral variability, probability of reinforcement, delay of reinforcement, and switching within a sequence. PMID:15828592
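The lag criterion described in this abstract can be illustrated with a minimal sketch. It assumes that every emitted sequence enters the comparison history whether or not it was reinforced; the abstract does not state this, and the function name `make_vary_checker` is hypothetical.

```python
from collections import deque


def make_vary_checker(lag):
    """Return a checker for a lag-n VARY contingency (illustrative only)."""
    # Holds the `lag` most recent sequences; older ones fall off automatically.
    recent = deque(maxlen=lag)

    def check(sequence):
        seq = tuple(sequence)
        # Reinforce only if the sequence differs from all of the last `lag` sequences.
        reinforced = seq not in recent
        # Assumption: the sequence is recorded regardless of the outcome.
        recent.append(seq)
        return reinforced

    return check


check = make_vary_checker(lag=2)
outcomes = [check(s) for s in ["LRLR", "LRLR", "RRLL", "LLLL", "LRLR"]]
print(outcomes)
```

With a lag of 2, the second `LRLR` is not reinforced because it repeats the immediately preceding sequence, while the final `LRLR` is reinforced because two different sequences have intervened.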
All gene-sized DNA molecules in four species of hypotrichs have the same terminal sequence and an unusual 3' terminus.

PubMed Central

Klobutcher, L A; Swanton, M T; Donini, P; Prescott, D M

1981-01-01

In hypotrichous ciliates, all of the macronuclear DNA is in the form of low molecular weight molecules with an average size of approximately 2200 base pairs. Total macronuclear DNA from four hypotrichs has been shown to have inverted terminal repeats by direct sequence analysis. In Oxytricha nova, Oxytricha sp., and Stylonychia pustulata, this terminal sequence may be written as 5'-C4A4C4A4C4 ... / 3'-G4T4G4T4G4T4G4T4G4 ... In Euplotes aediculatus, the sequence is similar but differs in the lengths of the duplex region (28 base pairs) and of the putative 3' extension (14 base pairs). Also in Euplotes, a second common sequence of 5 base pairs (5'-T-T-G-A-A / 3'-A-A-C-T-T) occurs internal to the terminal repeat and a 17-base-pair heterogeneous region: 5'-C4A4C4A4C4A4C4(X)17T-T-G-A-A ... / 3'-G2T4G4T4G4T4G4T4G4T4G4(X)17A-A-C-T-T ... The length of the terminal repeat sequence for O. nova was confirmed in cloned macronuclear DNA molecules. PMID:6265931
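As a reading aid for the compact notation in this abstract (C4A4 meaning four C residues followed by four A residues), the following sketch expands the Oxytricha terminal sequence and checks that the two strands pair as written. The helper names `expand` and `complement` are hypothetical, and the lower strand is assumed to be written 3' to 5', as the abstract's labels indicate.

```python
import re

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def expand(compact):
    """Expand run-length notation such as 'C4A4' into 'CCCCAAAA'."""
    return "".join(base * int(n) for base, n in re.findall(r"([ACGT])(\d+)", compact))


def complement(seq):
    """Base-by-base complement, without reversing the strand."""
    return "".join(COMPLEMENT[b] for b in seq)


top = expand("C4A4C4A4C4")             # strand written 5' to 3'
bottom = expand("G4T4G4T4G4T4G4T4G4")  # strand written 3' to 5'

duplex_length = len(top)               # paired region
overhang = bottom[duplex_length:]      # unpaired 3' extension of the lower strand

print(complement(top) == bottom[:duplex_length])
print(duplex_length, len(overhang))
```

Under these assumptions the Oxytricha terminus has a 20-base-pair duplex and a 16-base 3' extension, which is the baseline against which the abstract contrasts the Euplotes values of 28 and 14.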
Structure and stability of the ankyrin domain of the Drosophila Notch receptor.

PubMed

Zweifel, Mark E; Leahy, Daniel J; Hughson, Frederick M; Barrick, Doug

2003-11-01

The Notch receptor contains a conserved ankyrin repeat domain that is required for Notch-mediated signal transduction. The ankyrin domain of Drosophila Notch contains six ankyrin sequence repeats previously identified as closely matching the ankyrin repeat consensus sequence, and a putative seventh C-terminal sequence repeat that exhibits lower similarity to the consensus sequence. To better understand the role of the Notch ankyrin domain in Notch-mediated signaling and to examine how structure is distributed among the seven ankyrin sequence repeats, we have determined the crystal structure of this domain to 2.0 angstroms resolution. The seventh, C-terminal, ankyrin sequence repeat adopts a regular ankyrin fold, but the first, N-terminal ankyrin repeat, which contains a 15-residue insertion, appears to be largely disordered. The structure reveals a substantial interface between ankyrin polypeptides, showing a high degree of shape and charge complementarity, which may be related to homotypic interactions suggested from indirect studies. However, the Notch ankyrin domain remains largely monomeric in solution, demonstrating that this interface alone is not sufficient to promote tight association. Using the structure, we have classified reported mutations within the Notch ankyrin domain that are known to disrupt signaling into those that affect buried residues and those restricted to surface residues. We show that the buried substitutions greatly decrease protein stability, whereas the surface substitutions have only a marginal effect on stability. The surface substitutions are thus likely to interfere with Notch signaling by disrupting specific Notch-effector interactions and map the sites of these interactions.
Repetitive DNA and Plant Domestication: Variation in Copy Number and Proximity to Genes of LTR-Retrotransposons among Wild and Cultivated Sunflower (Helianthus annuus) Genotypes

PubMed Central

Mascagni, Flavia; Barghini, Elena; Giordani, Tommaso; Rieseberg, Loren H.; Cavallini, Andrea; Natali, Lucia

2015-01-01

The sunflower (Helianthus annuus) genome contains a very large proportion of transposable elements, especially long terminal repeat retrotransposons. However, knowledge on the retrotransposon-related variability within this species is still limited. We used next-generation sequencing (NGS) technologies to perform a quantitative and qualitative survey of intraspecific variation of the retrotransposon fraction of the genome across 15 genotypes—7 wild accessions and 8 cultivars—of H. annuus. By mapping the Illumina reads of the 15 genotypes onto a library of sunflower long terminal repeat retrotransposons, we observed considerable variability in redundancy among genotypes, at both superfamily and family levels. In another analysis, we mapped Illumina paired reads to two sets of sequences, that is, long terminal repeat retrotransposons and protein-encoding sequences, and evaluated the extent of retrotransposon proximity to genes in the sunflower genome by counting the number of paired reads in which one read mapped to a retrotransposon and the other to a gene. Large variability among genotypes was also ascertained for retrotransposon proximity to genes. Both long terminal repeat retrotransposon redundancy and proximity to genes varied among retrotransposon families and also between cultivated and wild genotypes. Such differences are discussed in relation to the possible role of long terminal repeat retrotransposons in the domestication of sunflower. PMID:26608057
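The proximity measure described in this abstract (counting read pairs in which one mate maps to a retrotransposon and the other to a gene) can be sketched as follows. This is not the authors' pipeline: the input format, a list of `(mate1_reference, mate2_reference)` name pairs, and all identifiers are assumptions made for illustration.

```python
def count_proximity_pairs(pairs, retrotransposon_ids, gene_ids):
    """Count read pairs with one mate on a retrotransposon and the other on a gene."""
    count = 0
    for mate1_ref, mate2_ref in pairs:
        te_then_gene = mate1_ref in retrotransposon_ids and mate2_ref in gene_ids
        gene_then_te = mate1_ref in gene_ids and mate2_ref in retrotransposon_ids
        if te_then_gene or gene_then_te:
            count += 1
    return count


# Hypothetical reference names, not taken from the study.
retrotransposons = {"LTR_gypsy_1", "LTR_copia_7"}
genes = {"gene_0001", "gene_0002"}
mapped_pairs = [
    ("LTR_gypsy_1", "gene_0001"),    # counted: retrotransposon + gene
    ("gene_0002", "LTR_copia_7"),    # counted: gene + retrotransposon
    ("LTR_gypsy_1", "LTR_copia_7"),  # not counted: both mates on retrotransposons
    ("gene_0001", "gene_0002"),      # not counted: both mates on genes
]
print(count_proximity_pairs(mapped_pairs, retrotransposons, genes))
```

Comparing this count across genotypes would require normalizing for sequencing depth, a step the sketch leaves out.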
Interstitial telomeric sequences in vertebrate chromosomes: Origin, function, instability and evolution.

PubMed

Bolzán, Alejandro D

2017-07-01

By definition, telomeric sequences are located at the very ends or terminal regions of chromosomes. However, several vertebrate species show blocks of (TTAGGG)n repeats present in non-terminal regions of chromosomes, the so-called interstitial telomeric sequences (ITSs), interstitial telomeric repeats or interstitial telomeric bands, which include those intrachromosomal telomeric-like repeats located near (pericentromeric ITSs) or within the centromere (centromeric ITSs) and those telomeric repeats located between the centromere and the telomere (i.e., truly interstitial telomeric sequences) of eukaryotic chromosomes. According to their sequence organization, localization and flanking sequences, ITSs can be classified into four types: 1) short ITSs, 2) subtelomeric ITSs, 3) fusion ITSs, and 4) heterochromatic ITSs. The first three types have been described mainly in the human genome, whereas heterochromatic ITSs have been found in several vertebrate species but not in humans. Several lines of evidence suggest that ITSs play a significant role in genome instability and evolution. This review aims to summarize our current knowledge about the origin, function, instability and evolution of these telomeric-like repeats in vertebrate chromosomes. Copyright © 2017 Elsevier B.V. All rights reserved.
Repetitive DNA and Plant Domestication: Variation in Copy Number and Proximity to Genes of LTR-Retrotransposons among Wild and Cultivated Sunflower (Helianthus annuus) Genotypes.

PubMed

Mascagni, Flavia; Barghini, Elena; Giordani, Tommaso; Rieseberg, Loren H; Cavallini, Andrea; Natali, Lucia

2015-11-24

The sunflower (Helianthus annuus) genome contains a very large proportion of transposable elements, especially long terminal repeat retrotransposons. However, knowledge on the retrotransposon-related variability within this species is still limited. We used next-generation sequencing (NGS) technologies to perform a quantitative and qualitative survey of intraspecific variation of the retrotransposon fraction of the genome across 15 genotypes--7 wild accessions and 8 cultivars--of H. annuus. By mapping the Illumina reads of the 15 genotypes onto a library of sunflower long terminal repeat retrotransposons, we observed considerable variability in redundancy among genotypes, at both superfamily and family levels. In another analysis, we mapped Illumina paired reads to two sets of sequences, that is, long terminal repeat retrotransposons and protein-encoding sequences, and evaluated the extent of retrotransposon proximity to genes in the sunflower genome by counting the number of paired reads in which one read mapped to a retrotransposon and the other to a gene. Large variability among genotypes was also ascertained for retrotransposon proximity to genes. Both long terminal repeat retrotransposon redundancy and proximity to genes varied among retrotransposon families and also between cultivated and wild genotypes. Such differences are discussed in relation to the possible role of long terminal repeat retrotransposons in the domestication of sunflower. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
An additional function of the rough endoplasmic reticulum protein complex prolyl 3-hydroxylase 1·cartilage-associated protein·cyclophilin B: the CXXXC motif reveals disulfide isomerase activity in vitro.

PubMed

Ishikawa, Yoshihiro; Bächinger, Hans Peter

2013-11-01

Collagen biosynthesis occurs in the rough endoplasmic reticulum, and many molecular chaperones and folding enzymes are involved in this process. The folding mechanism of type I procollagen has been well characterized, and protein disulfide isomerase (PDI) has been suggested as a key player in the formation of the correct disulfide bonds in the noncollagenous carboxyl-terminal and amino-terminal propeptides. Prolyl 3-hydroxylase 1 (P3H1) forms a hetero-trimeric complex with cartilage-associated protein and cyclophilin B (CypB). This complex is a multifunctional complex acting as a prolyl 3-hydroxylase, a peptidyl prolyl cis-trans isomerase, and a molecular chaperone. Two major domains are predicted from the primary sequence of P3H1: an amino-terminal domain and a carboxyl-terminal domain corresponding to the 2-oxoglutarate- and iron-dependent dioxygenase domains similar to the α-subunit of prolyl 4-hydroxylase and lysyl hydroxylases. The amino-terminal domain contains four CXXXC sequence repeats. The primary sequence of cartilage-associated protein is homologous to the amino-terminal domain of P3H1 and also contains four CXXXC sequence repeats. However, the function of the CXXXC sequence repeats is not known. Several publications have reported that short peptides containing a CXC or a CXXC sequence show oxido-reductase activity similar to PDI in vitro. We hypothesize that CXXXC motifs have oxido-reductase activity similar to the CXXC motif in PDI. We have tested the enzyme activities on model substrates in vitro using a GCRALCG peptide and the P3H1 complex. Our results suggest that this complex could function as a disulfide isomerase in the rough endoplasmic reticulum.
Evidence for an uncommon alpha-actinin protein in Trichomonas vaginalis.

PubMed

Bricheux, G; Coffe, G; Pradel, N; Brugerolle, G

1998-09-15

As part of our ongoing project of identification of actin-binding proteins implicated in the cell transition (flagellate to amoeboid/adherent) of Trichomonas vaginalis, we have characterized an alpha-actinin-related protein in this parasite. The protein (P100) has a molecular mass of 100 kDa and an isoelectric point of 5.5. A monoclonal antibody raised against this protein co-localizes with the actin network. P100 gene transcripts are co-expressed with actin throughout the cell cycle. Analysis of the deduced protein sequence reveals three domains: an N-terminal actin-binding region; a central region rich in alpha-helix; and a C-terminal domain with Ca(2+)-binding capacity. Whereas the N- and C-terminal regions are well-conserved as compared to other alpha-actinins, we observe in the central region an atypical distribution of residues in five repeats. The sequence of the repeats does not show any homology with the rod domain of the other alpha-actinins, except for the first repeat which shows some similarity. The four other repeats of T. vaginalis P100 appear to result from a duplication event which is not detectable in the other sequences.
Comparison of the carboxy-terminal DP-repeat region in the co-chaperones Hop and Hip

PubMed Central

Nelson, Gregory M.; Huffman, Holly; Smith, David F.

2003-01-01

Functional steroid receptor complexes are assembled and maintained by an ordered pathway of interactions involving multiple components of the cellular chaperone machinery. Two of these components, Hop and Hip, serve as co-chaperones to the major heat shock proteins (Hsps), Hsp70 and Hsp90, and participate in intermediate stages of receptor assembly. In an effort to better understand the functions of Hop and Hip in the assembly process, we focused on a region of similarity located near the C-terminus of each co-chaperone. Contained within this region is a repeated sequence motif we have termed the DP repeat. Earlier mutagenesis studies implicated the DP repeat of either Hop or Hip in Hsp70 binding and in normal assembly of the co-chaperones with progesterone receptor (PR) complexes. We report here that the DP repeat lies within a protease-resistant domain that extends to or is near the C-terminus of both co-chaperones. Point mutations in the DP repeats render the C-terminal regions hypersensitive to proteolysis. In addition, a Hop DP mutant displays altered proteolytic digestion patterns, which suggest that the DP-repeat region influences the folding of other Hop domains. Although the respective DP regions of Hop and Hip share sequence and structural similarities, they are not functionally interchangeable. Moreover, a double-point mutation within the second DP-repeat unit of Hop that converts this to the sequence found in Hip disrupts Hop function; however, the corresponding mutation in Hip does not alter its function. We conclude that the DP repeats are important structural elements within a C-terminal domain, which is important for Hop and Hip function. PMID:14627198
The repeating nucleotide sequence in the repetitive mitochondrial DNA from a "low-density" petite mutant of yeast.

PubMed Central

Van Kreijl, C F; Bos, J L

1977-01-01

The repeating nucleotide sequence of 68 base pairs in the mtDNA from an ethidium-induced cytoplasmic petite mutant of yeast has been determined. For sequence analysis, specifically primed and terminated RNA copies, obtained by in vitro transcription of the separated strands, were used. The sequence consists of 66 consecutive AT base pairs flanked by two GC pairs and comprises nearly all of the mutant mitochondrial genome. The sequence, moreover, also represents the first part of wild-type mtDNA sequenced so far. PMID:198740
Aggregation landscapes of Huntingtin exon 1 protein fragments and the critical repeat length for the onset of Huntington’s disease

PubMed Central

Chen, Mingchen; Wolynes, Peter G.

2017-01-01

Huntington’s disease (HD) is a neurodegenerative disease caused by an abnormal expansion in the polyglutamine (polyQ) tract of the Huntingtin (HTT) protein. The severity of the disease depends on the polyQ repeat length, arising only in patients with proteins having 36 repeats or more. Previous studies have shown that the aggregation of N-terminal fragments (encoded by HTT exon 1) underlies the disease pathology in mouse models and that the HTT exon 1 gene product can self-assemble into amyloid structures. Here, we provide detailed structural mechanisms for aggregation of several protein fragments encoded by HTT exon 1 by using the associative memory, water-mediated, structure and energy model (AWSEM) to construct their free energy landscapes. We find that the addition of the N-terminal 17-residue sequence (NT17) facilitates polyQ aggregation by encouraging the formation of prefibrillar oligomers, whereas adding the C-terminal polyproline sequence (P10) inhibits aggregation. The combination of both terminal additions in HTT exon 1 fragment leads to a complex aggregation mechanism with a basic core that resembles that found for the aggregation of pure polyQ repeats using AWSEM. At the extrapolated physiological concentration, although the grand canonical free energy profiles are uphill for HTT exon 1 fragments having 20 or 30 glutamines, the aggregation landscape for fragments with 40 repeats has become downhill. This computational prediction agrees with the critical length found for the onset of HD and suggests potential therapies based on blocking early binding events involving the terminal additions to the polyQ repeats. PMID:28400517
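The repeat-length threshold mentioned in this abstract (36 or more glutamines) lends itself to a small worked example. The sketch below measures the longest uninterrupted run of glutamine (Q) in a protein sequence and compares it with that threshold. It is only an illustration of the counting step: the function name is hypothetical, the test sequence is invented, and the aggregation modelling in the study is not reproduced.

```python
HD_THRESHOLD = 36  # repeat count at or above which the abstract says disease arises


def longest_polyq_run(protein):
    """Return the length of the longest uninterrupted run of 'Q' residues."""
    longest = current = 0
    for residue in protein.upper():
        current = current + 1 if residue == "Q" else 0
        longest = max(longest, current)
    return longest


# Invented fragments: a short flank, a polyQ stretch, and a polyproline tail.
short_fragment = "MKAF" + "Q" * 20 + "P" * 10
long_fragment = "MKAF" + "Q" * 40 + "P" * 10

for fragment in (short_fragment, long_fragment):
    run = longest_polyq_run(fragment)
    print(run, run >= HD_THRESHOLD)
```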
RNA circularization reveals terminal sequence heterogeneity in a double-stranded RNA virus.

PubMed

Widmer, G

1993-03-01

Double-stranded RNA viruses (dsRNA), termed LRV1, have been found in several strains of the protozoan parasite Leishmania. With the aim of constructing a full-length cDNA copy of the viral genome, including its terminal sequences, a protocol based on PCR amplification across the 3'-5' junction of circularized RNA was developed. This method proved to be applicable to dsRNA. It provided a relatively simple alternative to one-sided PCR, without loss of specificity inherent in the use of generic primers. LRV1 terminal nucleotide sequences obtained by this method showed a considerable variation in length, particularly at the 5' end of the positive strand, as well as the potential for forming 3' overhangs. The opposite genomic end terminates in 0, 1, or 2 TCA trinucleotide repeats. These results are compared with terminal sequences derived from one-sided PCR experiments.
Analysis of the genome sequence of the pathogenic Muscovy duck parvovirus strain YY reveals a 14-nucleotide-pair deletion in the inverted terminal repeats.

PubMed

Wang, Jianye; Huang, Yu; Zhou, Mingxu; Zhu, Guoqiang

2016-09-01

Genomic information about Muscovy duck parvovirus is still limited. In this study, the genome of the pathogenic MDPV strain YY was sequenced. The full-length genome of YY is 5075 nucleotides (nt) long, 57 nt shorter than that of strain FM. Sequence alignment indicates that the 5' and 3' inverted terminal repeats (ITR) of strain YY contain a 14-nucleotide-pair deletion in the stem of the palindromic hairpin structure in comparison to strain FM and FZ91-30. The deleted region contains one "E-box" site and one repeated motif with the sequence "TTCCGGT" or "ACCGGAA". Phylogenetic trees constructed based on the protein-coding genes concordantly showed that YY, together with nine other MDPV isolates from various places, clustered in a separate branch, distinct from the branch formed by goose parvovirus (GPV) strains. These results demonstrate that, despite the distinctive deletion, the YY strain still belongs to the classical MDPV group. Moreover, the deletion of ITR may contribute to the genome evolution of MDPV under immunization pressure.
Rapid construction of insulated genetic circuits via synthetic sequence-guided isothermal assembly

DOE Office of Scientific and Technical Information (OSTI.GOV)

Torella, JP; Boehm, CR; Lienert, F

2013-12-28

In vitro recombination methods have enabled one-step construction of large DNA sequences from multiple parts. Although synthetic biological circuits can in principle be assembled in the same fashion, they typically contain repeated sequence elements such as standard promoters and terminators that interfere with homologous recombination. Here we use a computational approach to design synthetic, biologically inactive unique nucleotide sequences (UNSes) that facilitate accurate ordered assembly. Importantly, our designed UNSes make it possible to assemble parts with repeated terminator and insulator sequences, and thereby create insulated functional genetic circuits in bacteria and mammalian cells. Using UNS-guided assembly to construct repeating promoter-gene-terminator parts, we systematically varied gene expression to optimize production of a deoxychromoviridans biosynthetic pathway in Escherichia coli. We then used this system to construct complex eukaryotic AND-logic gates for genomic integration into embryonic stem cells. Construction was performed by using a standardized series of UNS-bearing BioBrick-compatible vectors, which enable modular assembly and facilitate reuse of individual parts. UNS-guided isothermal assembly is broadly applicable to the construction and optimization of genetic circuits and particularly those requiring tight insulation, such as complex biosynthetic pathways, sensors, counters and logic gates.
The La-related protein 1-specific domain repurposes HEAT-like repeats to directly bind a 5'TOP sequence.

PubMed

Lahr, Roni M; Mack, Seshat M; Héroux, Annie; Blagden, Sarah P; Bousquet-Antonelli, Cécile; Deragon, Jean-Marc; Berman, Andrea J

2015-09-18

La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. These studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
The La-related protein 1-specific domain repurposes HEAT-like repeats to directly bind a 5'TOP sequence

DOE PAGES

Lahr, Roni M.; Mack, Seshat M.; Heroux, Annie; ...

2015-07-22

La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. Ultimately, these studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis.
Variation, Repetition, and Choice

ERIC Educational Resources Information Center

Abreu-Rodrigues, Josele; Lattal, Kennon A.; dos Santos, Cristiano V.; Matos, Ricardo A.

2005-01-01

Experiment 1 investigated the controlling properties of variability contingencies on choice between repeated and variable responding. Pigeons were exposed to concurrent-chains schedules with two alternatives. In the REPEAT alternative, reinforcers in the terminal link depended on a single sequence of four responses. In the VARY alternative, a…
Sequence of retrovirus provirus resembles that of bacterial transposable elements

NASA Astrophysics Data System (ADS)

Shimotohno, Kunitada; Mizutani, Satoshi; Temin, Howard M.

1980-06-01

The nucleotide sequences of the terminal regions of an infectious integrated retrovirus cloned in the modified λ phage cloning vector Charon 4A have been elucidated. There is a 569-base pair direct repeat at both ends of the viral DNA. The cell-virus junctions at each end consist of a 5-base pair direct repeat of cell DNA next to a 3-base pair inverted repeat of viral DNA. This structure resembles that of a transposable element and is consistent with the protovirus hypothesis that retroviruses evolved from the cell genome.
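The terminal structure reported in this abstract, the same sequence in the same orientation at both ends of the viral DNA, is a direct terminal repeat. The sketch below finds the longest such repeat in a linear sequence by comparing prefixes with suffixes. The function name and the toy sequence are assumptions for illustration; real proviral sequences would also need mismatch tolerance, which is omitted here.

```python
def terminal_direct_repeat_length(seq):
    """Length of the longest prefix that is repeated exactly as a non-overlapping suffix."""
    seq = seq.upper()
    # Start from the longest possible non-overlapping repeat and shrink.
    for k in range(len(seq) // 2, 0, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


# Toy example: a 4-base direct repeat ("ACGT") flanking an internal region.
toy = "ACGT" + "TTTT" + "ACGT"
print(terminal_direct_repeat_length(toy))
```

Applied to the cloned provirus described above, a function of this kind would be expected to report the 569-base-pair repeat, provided the two copies are identical.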
Terminal sequence importance of de novo proteins from binary-patterned library: stable artificial proteins with 11- or 12-amino acid alphabet.

PubMed

Okura, Hiromichi; Takahashi, Tsuyoshi; Mihara, Hisakazu

2012-06-01

Successful approaches of de novo protein design suggest a great potential to create novel structural folds and to understand natural rules of protein folding. For these purposes, smaller and simpler de novo proteins have been developed. Here, we constructed smaller proteins by removing the terminal sequences from stable de novo vTAJ proteins and compared stabilities between mutant and original proteins. vTAJ proteins were screened from an α3β3 binary-patterned library which was designed with polar/nonpolar periodicities of α-helix and β-sheet. vTAJ proteins have the additional terminal sequences due to the method of constructing the genetically repeated library sequences. By removing the parts of the sequences, we successfully obtained the stable smaller de novo protein mutants with fewer amino acid alphabets than the originals. However, these mutants showed differences in ANS-binding properties and in stabilities against denaturant and pH change. The terminal sequences, which were designed just as flexible linkers not as secondary structure units, sufficiently affected these physicochemical details. This study showed implications for adjusting protein stabilities by designing N- and C-terminal sequences.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zweifel, M.; Leahy, D.; Barrick, D.

Deltex is a cytosolic effector of Notch signaling thought to bind through its N-terminal domain to the Notch receptor. Here we report the structure of the Drosophila Deltex N-terminal domain, which contains two tandem WWE sequence repeats. The WWE repeats, which adopt a novel fold, are related by an approximate two-fold axis of rotation. Although the WWE repeats are structurally distinct, they interact extensively and form a deep cleft at their junction that appears well suited for ligand binding. The two repeats are thermodynamically coupled; this coupling is mediated in part by a conserved segment that is immediately C-terminal to the second WWE domain. We demonstrate that although the Deltex WWE tandem is monomeric in solution, it forms a heterodimer with the ankyrin domain of the Notch receptor. These results provide structural and functional insight into how Deltex modulates Notch signaling, and how WWE modules recognize targets for ubiquitination.
The central domain of bovine submaxillary mucin consists of over 50 tandem repeats of 329 amino acids. Chromosomal localization of the BSM1 gene and relations to ovine and porcine counterparts.

PubMed

Jiang, W; Gupta, D; Gallagher, D; Davis, S; Bhavanandan, V P

2000-04-01

We previously elucidated five distinct protein domains (I-V) for bovine submaxillary mucin, which is encoded by two genes, BSM1 and BSM2. Using Southern blot analysis, genomic cloning and sequencing of the BSM1 gene, we now show that the central domain (V) consists of approximately 55 tandem repeats of 329 amino acids and that domains III-V are encoded by a 58.4-kb exon, the largest exon known for all genes to date. The BSM1 gene was mapped by fluorescence in situ hybridization to the proximal half of chromosome 5 at bands q2.2-q2.3. The amino-acid sequences of six tandem repeats (two full and four partial) were found to have only 92-94% identities. We propose that the variability in the amino-acid sequences of the mucin tandem repeat is important for generating the combinatorial library of saccharides that are necessary for the protective function of mucins. The deduced peptide sequences of the central domain match those determined from the purified bovine submaxillary mucin and also show 68-94% identity to published peptide sequences of ovine submaxillary mucin. This indicates that the core protein of ovine submaxillary mucin is closely related to that of bovine submaxillary mucin and contains similar tandem repeats in the central domain. In contrast, the central domain of porcine submaxillary mucin is reported to consist of 81-amino-acid tandem repeats. However, both bovine submaxillary mucin and porcine submaxillary mucin contain similar N-terminal and C-terminal domains and the corresponding genes are in the conserved linkage regions of the respective genomes.
Multiple bidirectional initiations and terminations of transcription in the Marek's disease virus long repeat regions.

PubMed Central

Chen, X B; Velicer, L F

1991-01-01

Marek's disease is an oncogenic disease of chickens caused by a herpesvirus, Marek's disease virus (MDV). Serial in vitro passage of pathogenic MDV results in amplification of a 132-bp direct repeat in the MDV genome's TRL and IRL repeat regions and loss of tumorigenicity. This led to the hypothesis that upon such expansion, one or more tumor-inducing genes fail to be expressed. In this report a group of cDNAs mapping in the expanded regions were isolated from a pathogenic MDV strain in which the 132-bp direct repeat number was found to range between one and seven. Partial cDNA sequencing and S1 nuclease protection analysis revealed that the corresponding transcripts are either initiated or terminated within or near the expanded regions at multiple sites in both rightward and leftward directions. Furthermore, each 132-bp repeat contains one TATA box and two polyadenylation consensus sequences in each direction. These RNAs contain a partial copy or one or more full copies of the 132-bp direct repeat at either their 5' or 3' end. Northern (RNA) blot analysis showed that the majority of transcripts are 1.8 kb in size, while the minor species range in size from 0.67 to 3.1 kb. Together, these data raise the possibility that the 132-bp direct repeat, and indirectly its copy number, may be involved in the regulation of transcriptional initiation and termination and therefore in the generation of four groups of transcripts from the TRL and IRL, although this remains to be demonstrated. PMID:1850022
Nucleotide sequence of the Kaposi sarcoma-associated herpesvirus (HHV8)

PubMed Central

Russo, James J.; Bohenzky, Roy A.; Chien, Ming-Cheng; Chen, Jing; Yan, Ming; Maddalena, Dawn; Parry, J. Preston; Peruzzi, Daniela; Edelman, Isidore S.; Chang, Yuan; Moore, Patrick S.

1996-01-01

The genome of the Kaposi sarcoma-associated herpesvirus (KSHV or HHV8) was mapped with cosmid and phage genomic libraries from the BC-1 cell line. Its nucleotide sequence was determined except for a 3-kb region at the right end of the genome that was refractory to cloning. The BC-1 KSHV genome consists of a 140.5-kb-long unique coding region flanked by multiple G+C-rich 801-bp terminal repeat sequences. A genomic duplication that apparently arose in the parental tumor is present in this cell culture-derived strain. At least 81 ORFs, including 66 with homology to herpesvirus saimiri ORFs, and 5 internal repeat regions are present in the long unique region. The virus encodes homologs to complement-binding proteins, three cytokines (two macrophage inflammatory proteins and interleukin 6), dihydrofolate reductase, bcl-2, interferon regulatory factors, interleukin 8 receptor, neural cell adhesion molecule-like adhesin, and a D-type cyclin, as well as viral structural and metabolic proteins. Terminal repeat analysis of virus DNA from a KS lesion suggests a monoclonal expansion of KSHV in the KS tumor. PMID:8962146
Telomere extension by telomerase and ALT generates variant repeats by mechanistically distinct processes

PubMed Central

Lee, Michael; Hills, Mark; Conomos, Dimitri; Stutz, Michael D.; Dagg, Rebecca A.; Lau, Loretta M.S.; Reddel, Roger R.; Pickett, Hilda A.

2014-01-01

Telomeres are terminal repetitive DNA sequences on chromosomes, and are considered to comprise almost exclusively hexameric TTAGGG repeats. We have evaluated telomere sequence content in human cells using whole-genome sequencing followed by telomere read extraction in a panel of mortal cell strains and immortal cell lines. We identified a wide range of telomere variant repeats in human cells, and found evidence that variant repeats are generated by mechanistically distinct processes during telomerase- and ALT-mediated telomere lengthening. Telomerase-mediated telomere extension resulted in biased repeat synthesis of variant repeats that differed from the canonical sequence at positions 1 and 3, but not at positions 2, 4, 5 or 6. This indicates that telomerase is most likely an error-prone reverse transcriptase that misincorporates nucleotides at specific positions on the telomerase RNA template. In contrast, cell lines that use the ALT pathway contained a large range of variant repeats that varied greatly between lines. This is consistent with variant repeats spreading from proximal telomeric regions throughout telomeres in a stochastic manner by recombination-mediated templating of DNA synthesis. The presence of unexpectedly large numbers of variant repeats in cells utilizing either telomere maintenance mechanism suggests a conserved role for variant sequences at human telomeres. PMID:24225324
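The position-specific bias described in this abstract can be illustrated with a short sketch. It is not the authors' pipeline: the function names, the fixed reading frame starting at offset 0, and the 1-based position numbering relative to TTAGGG are all assumptions made for illustration only.

```python
from collections import Counter

CANONICAL = "TTAGGG"  # canonical human telomere hexamer named in the abstract


def variant_positions(hexamer, canonical=CANONICAL):
    """Return the 1-based positions at which `hexamer` differs from `canonical`."""
    if len(hexamer) != len(canonical):
        raise ValueError("hexamer and canonical repeat must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(hexamer.upper(), canonical)) if a != b]


def tally_variant_positions(read, canonical=CANONICAL):
    """Count single-mismatch variant repeats in `read` by mismatch position.

    Assumption: the read is already in frame, so it is split into consecutive,
    non-overlapping hexamers starting at offset 0. Repeats with zero mismatches
    or with more than one mismatch are ignored.
    """
    k = len(canonical)
    counts = Counter()
    for start in range(0, len(read) - k + 1, k):
        diffs = variant_positions(read[start:start + k], canonical)
        if len(diffs) == 1:
            counts[diffs[0]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical read made of five hexamers, three of them single-mismatch variants.
    example_read = "TTAGGG" + "TCAGGG" + "GTAGGG" + "TTAGGG" + "TTGGGG"
    print(tally_variant_positions(example_read))
```

A tally of this kind, applied to extracted telomere reads, is one simple way to see whether variants cluster at particular positions of the repeat; real reads would first need frame and strand detection, which this sketch omits.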
TALE-Like Effectors Are an Ancestral Feature of the Ralstonia solanacearum Species Complex and Converge in DNA Targeting Specificity.

PubMed

Schandry, Niklas; de Lange, Orlando; Prior, Philippe; Lahaye, Thomas

2016-01-01

Ralstonia solanacearum, a species complex of bacterial plant pathogens divided into four monophyletic phylotypes, causes plant diseases in tropical climates around the world. Some strains exhibit a broad host range on solanaceous hosts, while others are highly host-specific as for example some banana-pathogenic strains. Previous studies showed that transcription activator-like (TAL) effectors from Ralstonia, termed RipTALs, are capable of activating reporter genes in planta, if these are preceded by a matching effector binding element (EBE). RipTALs target DNA via their central repeat domain (CRD), where one repeat pairs with one DNA-base of the given EBE. The repeat variable diresidue dictates base repeat specificity in a predictable fashion, known as the TALE code. In this work, we analyze RipTALs across all phylotypes of the Ralstonia solanacearum species complex. We find that RipTALs are prevalent in phylotypes I and IV but absent from most phylotype III and II strains (10/12, 8/14, 1/24, and 1/5 strains contained a RipTAL, respectively). RipTALs originating from strains of the same phylotype show high levels of sequence similarity (>98%) in the N-terminal and C-terminal regions, while RipTALs isolated from different phylotypes show 47-91% sequence similarity in those regions, giving rise to four RipTAL classes. We show that, despite sequence divergence, the base preference for guanine, mediated by the N-terminal region, is conserved across RipTALs of all classes. Using the number and order of repeats found in the CRD, we functionally sub-classify RipTALs, introduce a new simple nomenclature, and predict matching EBEs for all seven distinct RipTALs identified. We experimentally study RipTAL EBEs and uncover that some RipTALs are able to target the EBEs of other RipTALs, referred to as cross-reactivity. In particular, RipTALs from strains with a broad host range on solanaceous hosts cross-react on each other's EBEs. 
Investigation of sequence divergence between RipTAL repeats allows for a reconstruction of repeat array biogenesis, for example through slipped strand mispairing or gene conversion. Using these studies we show how RipTALs of broad host range strains evolved convergently toward a shared target sequence. Finally, we discuss the differences between TALE-likes of plant pathogens in the context of disease ecology.
Simian immunodeficiency viruses from African green monkeys display unusual genetic diversity.

PubMed Central

Johnson, P R; Fomsgaard, A; Allan, J; Gravell, M; London, W T; Olmsted, R A; Hirsch, V M

1990-01-01

African green monkeys are asymptomatic carriers of simian immunodeficiency viruses (SIV), commonly called SIVagm. As many as 50% of African green monkeys in the wild may be SIV seropositive. This high seroprevalence rate and the potential for genetic variation of lentiviruses suggested to us that African green monkeys may harbor widely differing genotypes of SIVagm. To investigate this hypothesis, we determined the entire nucleotide sequence of an infectious proviral molecular clone of SIVagm (155-4) and partial sequences (long terminal repeat and Gag) of three other distinct SIVagm isolates (90, gri-1, and ver-1). Comparisons among the SIVagm isolates revealed extreme diversity at the nucleotide and amino acid levels. Long terminal repeat nucleotide sequences varied up to 35% and Gag protein sequences varied up to 30%. The variability among SIVagm isolates exceeded the variability among any other group of primate lentiviruses. Our data suggest that SIVagm has been in the African green monkey population for a long time and may be the oldest primate lentivirus group in existence. PMID:2304139
Two tandemly repeated telomere-associated sequences in Nicotiana plumbaginifolia.

PubMed

Chen, C M; Wang, C T; Wang, C J; Ho, C H; Kao, Y Y; Chen, C C

1997-12-01

Two tandemly repeated telomere-associated sequences, NP3R and NP4R, have been isolated from Nicotiana plumbaginifolia. The length of a repeating unit for NP3R and NP4R is 165 and 180 nucleotides respectively. The abundance of NP3R, NP4R and telomeric repeats is, respectively, 8.4 × 10^4, 6 × 10^3 and 1.5 × 10^6 copies per haploid genome of N. plumbaginifolia. Fluorescence in situ hybridization revealed that NP3R is located at the ends and/or in interstitial regions of all 10 chromosomes and NP4R on the terminal regions of three chromosomes in the haploid genome of N. plumbaginifolia. Sequence homology search revealed that not only are NP3R and NP4R homologous to HRS60 and GRS, respectively, two tandem repeats isolated from N. tabacum, but that NP3R and NP4R are also related to each other, suggesting that they originated from a common ancestral sequence. The role of these repeated sequences in chromosome healing is discussed based on the observation that two to three copies of a telomere-similar sequence were present in each repeating unit of NP3R and NP4R.
Identification of Novel Inverted Terminal Repeat (ITR) Deletions of Human Adenovirus (AD) From Infected Host: Virulent Ads Containing Mixed Populations of Genomic Sequences

DTIC Science & Technology

2006-11-01

terminal repetition of adenovirus type 4 DNA. Gene 18:329-334. 20. Van der Veen, J., and J. H. Dijkman. 1962. Association of type 21 adenovirus with acute respiratory illness in military recruits. Am J Hyg 76:149-159.
Molecular cloning and long terminal repeat sequences of human endogenous retrovirus genes related to types A and B retrovirus genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ono, M.

1986-06-01

By using a DNA fragment primarily encoding the reverse transcriptase (pol) region of the Syrian hamster intracisternal A particle (IAP; type A retrovirus) gene as a probe, human endogenous retrovirus genes, tentatively termed HERV-K genes, were cloned from a fetal human liver gene library. Typical HERV-K genes were 9.1 or 9.4 kilobases in length, having long terminal repeats (LTRs) of ca. 970 base pairs. Many structural features commonly observed on the retrovirus LTRs, such as the TATAA box, polyadenylation signal, and terminal inverted repeats, were present on each LTR, and a lysine (K) tRNA having a CUU anticodon was identified as a presumed primer tRNA. The HERV-K LTR, however, had little sequence homology to either the IAP LTR or other typical oncovirus LTRs. By filter hybridization, the number of HERV-K genes was estimated to be ca. 50 copies per haploid human genome. The cloned mouse mammary tumor virus (type B) gene was found to hybridize with both the HERV-K and IAP genes to essentially the same extent.
Adeno-associated virus inverted terminal repeats stimulate gene editing.

PubMed

Hirsch, M L

2015-02-01

Advancements in genome editing have relied on technologies to specifically damage DNA which, in turn, stimulates DNA repair including homologous recombination (HR). As off-target concerns complicate the therapeutic translation of site-specific DNA endonucleases, an alternative strategy to stimulate gene editing based on fragile DNA was investigated. To do this, an episomal gene-editing reporter was generated by a disruptive insertion of the adeno-associated virus (AAV) inverted terminal repeat (ITR) into the egfp gene. Compared with a non-structured DNA control sequence, the ITR induced DNA damage as evidenced by increased gamma-H2AX and Mre11 foci formation. As local DNA damage stimulates HR, ITR-mediated gene editing was investigated using DNA oligonucleotides as repair substrates. The AAV ITR stimulated gene editing >1000-fold in a replication-independent manner and was not biased by the polarity of the repair oligonucleotide. Analysis of additional human DNA sequences demonstrated stimulation of gene editing to varying degrees. In particular, inverted, yet not direct, Alu repeats induced gene editing, suggesting a role for DNA structure in the repair event. Collectively, the results demonstrate that inverted DNA repeats stimulate gene editing via double-strand break repair in an episomal context and allude to efficient gene editing of the human chromosome using fragile DNA sequences.
The open reading frames in the 3' long terminal repeats of several mouse mammary tumor virus integrants encode V beta 3-specific superantigens

PubMed Central

1992-01-01

Mice expressing the minor lymphocyte stimulation antigens, Mls-1a, -2a, or -3a, singly on the B10.BR background have been generated. Mls phenotypes correlate with the integration of mouse mammary tumor viruses (MTV) in the mouse genome. The open reading frames within the 3' long terminal repeats of the integrated MTVs 1, 3, 6, and 13 encode V beta 3-specific superantigens. Sequence data for these viral superantigens is presented, indicating that it is the COOH-terminal portion of the viral superantigen that interacts with the T cell receptor V beta element. PMID:1309854
Orthologs in Arabidopsis thaliana of the Hsp70 interacting protein Hip

PubMed Central

Webb, Mary Alice; Cavaletto, John M.; Klanrit, Preekamol; Thompson, Gary A.

2001-01-01

The Hsp70-interacting protein Hip binds to the adenosine triphosphatase domain of Hsp70, stabilizing it in the adenosine 5′-diphosphate–ligated conformation and promoting binding of target polypeptides. In mammalian cells, Hip is a component of the cytoplasmic chaperone heterocomplex that regulates signal transduction via interaction with hormone receptors and protein kinases. Analysis of the complete genome sequence of the model flowering plant Arabidopsis thaliana revealed 2 genes encoding Hip orthologs. The deduced sequence of AtHip-1 consists of 441 amino acid residues and is 42% identical to human Hip. AtHip-1 contains the same functional domains characterized in mammalian Hip, including an N-terminal dimerization domain, an acidic domain, 3 tetratricopeptide repeats flanked by a highly charged region, a series of degenerate GGMP repeats, and a C-terminal region similar to the Sti1/Hop/p60 protein. The deduced amino acid sequence of AtHip-2 consists of 380 amino acid residues. AtHip-2 consists of a truncated Hip-like domain that is 46% identical to human Hip, followed by a C-terminal domain related to thioredoxin. AtHip-2 is 63% identical to another Hip-thioredoxin protein recently identified in Vitis labrusca (grape). The truncated Hip domain in AtHip-2 includes the amino terminus, the acidic domain, and tetratricopeptide repeats with flanking charged region. Analyses of expressed sequence tag databases indicate that both AtHip-1 and AtHip-2 are expressed in A. thaliana and that orthologs of Hip are also expressed widely in other plants. The similarity between AtHip-1 and its mammalian orthologs is consistent with a similar role in plant cells. The sequence of AtHip-2 suggests the possibility of additional unique chaperone functions. PMID:11599566
The site-specific ribosomal DNA insertion element R1Bm belongs to a class of non-long-terminal-repeat retrotransposons.

PubMed Central

Xiong, Y; Eickbush, T H

1988-01-01

Two types of insertion elements, R1 and R2 (previously called type I and type II), are known to interrupt the 28S ribosomal genes of several insect species. In the silkmoth, Bombyx mori, each element occupies approximately 10% of the estimated 240 ribosomal DNA units, while at most only a few copies are located outside the ribosomal DNA units. We present here the complete nucleotide sequence of an R1 insertion from B. mori (R1Bm). This 5.1-kilobase element contains two overlapping open reading frames (ORFs) which together occupy 88% of its length. ORF1 is 461 amino acids in length and exhibits characteristics of retroviral gag genes. ORF2 is 1,051 amino acids in length and contains homology to reverse transcriptase-like enzymes. The analysis of 3' and 5' ends of independent isolates from the ribosomal locus supports the suggestion that R1 is still functioning as a transposable element. The precise location of the element within the genome implies that its transposition must occur with remarkable insertion sequence specificity. Comparison of the deduced amino acid sequences from six retrotransposons, R1 and R2 of B. mori, I factor and F element of Drosophila melanogaster, L1 of Mus domesticus, and Ingi of Trypanosoma brucei, reveals a relatively high level of sequence homology in the reverse transcriptase region. Like R1, these elements lack long terminal repeats. We have therefore named this class of related elements the non-long-terminal-repeat (non-LTR) retrotransposons. Images PMID:2447482
Functions of the 3′ and 5′ genome RNA regions of members of the genus Flavivirus

PubMed Central

Brinton, Margo A.; Basu, Mausumi

2015-01-01

The positive sense genomes of members of the genus Flavivirus in the family Flaviviridae are ~11 kb in length and have a 5′ type I cap but no 3′ poly A. The 5′ and 3′ terminal regions contain short conserved sequences that are proposed to be repeated remnants of an ancient sequence. However, the functions of most of these conserved sequences have not yet been determined. The terminal regions of the genome also contain multiple conserved RNA structures. Functional data for many of these structures has been obtained. Three sets of complementary 3′ and 5′ terminal region sequences, some of which are located in conserved RNA structures, interact to form a panhandle structure that is required for initiation of minus strand RNA synthesis with the 5′ terminal structure functioning as the promoter. How the switch from the terminal RNA structure base pairing to the long distance RNA-RNA interaction is triggered and regulated is not well understood but evidence suggests involvement of a cell protein binding to three sites on the 3′ terminal RNA structures and a cis-acting metastable 3′ RNA element in the 3′ terminal structure. Cell proteins may also be involved in facilitating exponential replication of nascent genomic RNA within replication vesicles at later times of the infection cycle. Other conserved RNA structures and/or sequences in the 5′ and 3′ terminal regions have been proposed to regulate genome translation. Additional functions of the 5′ and 3′ terminal sequences have also been reported. PMID:25683510
Grasshopper, a long terminal repeat (LTR) retroelement in the phytopathogenic fungus Magnaporthe grisea.

PubMed

Dobinson, K F; Harris, R E; Hamer, J E

1993-01-01

The fungal phytopathogen Magnaporthe grisea parasitizes a wide variety of gramineous hosts. In the course of investigating the genetic relationship between pathogen genotype and host specificity we identified a retroelement that is present in some strains of M. grisea that infect finger millet and goosegrass (members of the plant genus Eleusine). The element, designated grasshopper (grh), is present in multiple copies and dispersed throughout the genome. DNA sequence analysis showed that grasshopper contains 198 base pair direct, long terminal repeats (LTRs) with features characteristic of retroviral and retrotransposon LTRs. Within the element we identified an open reading frame with sequences homologous to the reverse transcriptase, RNaseH, and integrase domains of retroelement pol genes. Comparison of the open reading frame with sequences from other retroelements showed that grh is related to the gypsy family of retrotransposons. Comparisons of the distribution of the grasshopper element with other dispersed repeated DNA sequences in M. grisea indicated that grasshopper was present in a broadly dispersed subgroup of Eleusine pathogens, suggesting that the element was acquired subsequent to the evolution of this host-specific form. We present arguments that the amplification of different retroelements within populations of M. grisea is a consequence of the clonal organization of the fungal populations.
The human immunodeficiency virus type 1 long terminal repeat specifies two different transcription complexes, only one of which is regulated by Tat.

PubMed Central

Lu, X; Welsh, T M; Peterlin, B M

1993-01-01

The human immunodeficiency virus type 1 long terminal repeat sets up two different transcription complexes, which have been called processive and nonprocessive complexes. By mutating and substituting cis-acting sequences, we mapped elements of the human immunodeficiency virus long terminal repeat that are responsible for creating each transcription complex. Whereas processive complexes are efficiently assembled by upstream promoter elements in the absence of the TATA box, nonprocessive complexes absolutely require the TATA box. Moreover, the TATA box alone can set up these nonprocessive complexes, and nonprocessive but not processive complexes are trans activated by Tat. Finally, a strong DNA-binding site between the TATA box and trans-activation-responsive region interferes with either the assembly or movement of these nonprocessive complexes and diminishes the effects of Tat. Thus, Tat affects a critical step in the formation of elongation-competent transcription complexes. Images PMID:8445708
Dynamic probability of reinforcement for cooperation: Random game termination in the centipede game.

PubMed

Krockow, Eva M; Colman, Andrew M; Pulford, Briony D

2018-03-01

Experimental games have previously been used to study principles of human interaction. Many such games are characterized by iterated or repeated designs that model dynamic relationships, including reciprocal cooperation. To enable the study of infinite game repetitions and to avoid endgame effects of lower cooperation toward the final game round, investigators have introduced random termination rules. This study extends previous research that has focused narrowly on repeated Prisoner's Dilemma games by conducting a controlled experiment of two-player, random termination Centipede games involving probabilistic reinforcement and characterized by the longest decision sequences reported in the empirical literature to date (24 decision nodes). Specifically, we assessed mean exit points and cooperation rates, and compared the effects of four different termination rules: no random game termination, random game termination with constant termination probability, random game termination with increasing termination probability, and random game termination with decreasing termination probability. We found that although mean exit points were lower for games with shorter expected game lengths, the subjects' cooperativeness was significantly reduced only in the most extreme condition with decreasing computer termination probability and an expected game length of two decision nodes. © 2018 Society for the Experimental Analysis of Behavior.
Adenovirus sequences required for replication in vivo.

PubMed Central

Wang, K; Pearson, G D

1985-01-01

We have studied the in vivo replication properties of plasmids carrying deletion mutations within cloned adenovirus terminal sequences. Deletion mapping located the adenovirus DNA replication origin entirely within the first 67 bp of the adenovirus inverted terminal repeat. This region could be further subdivided into two functional domains: a minimal replication origin and an adjacent auxiliary region which boosted the efficiency of replication by more than 100-fold. The minimal origin occupies the first 18 to 21 bp and includes sequences conserved between all adenovirus serotypes. The adjacent auxiliary region extends past nucleotide 36 but not past nucleotide 67 and contains the binding site for nuclear factor I. Images PMID:2991857
Synthetic signal sequences that enable efficient secretory protein production in the yeast Kluyveromyces marxianus.

PubMed

Yarimizu, Tohru; Nakamura, Mikiko; Hoshida, Hisashi; Akada, Rinji

2015-02-14

Targeting of cellular proteins to the extracellular environment is directed by a secretory signal sequence located at the N-terminus of a secretory protein. These signal sequences usually contain an N-terminal basic amino acid followed by a stretch containing hydrophobic residues, although no consensus signal sequence has been identified. In this study, simple modeling of signal sequences was attempted using Gaussia princeps secretory luciferase (GLuc) in the yeast Kluyveromyces marxianus, which allowed comprehensive recombinant gene construction to substitute synthetic signal sequences. Mutational analysis of the GLuc signal sequence revealed that the GLuc hydrophobic peptide length was the lower limit for effective secretion and that the N-terminal basic residue was indispensable. Deletion of the 16th Glu caused enhanced levels of secreted protein, suggesting that this hydrophilic residue defined the boundary of a hydrophobic peptide stretch. Consequently, we redesigned this domain as a repeat of a single hydrophobic amino acid between the N-terminal Lys and C-terminal Glu. Stretches consisting of Phe, Leu, Ile, or Met were effective for secretion but the number of residues affected secretory activity. A stretch containing sixteen consecutive methionine residues (M16) showed the highest activity; the M16 sequence was therefore utilized for the secretory production of human leukemia inhibitory factor protein in yeast, resulting in enhanced secreted protein yield. We present a new concept for the provision of secretory signal sequence ability in the yeast K. marxianus, determined by the number of residues of a single hydrophobic residue located between N-terminal basic and C-terminal acidic amino acid boundaries.

Antigenic Diversity of the Plasmodium vivax Circumsporozoite Protein in Parasite Isolates of Western Colombia

PubMed Central

Hernández-Martínez, Miguel Ángel; Escalante, Ananías A.; Arévalo-Herrera, Myriam; Herrera, Sócrates

2011-01-01

Circumsporozoite (CS) protein is a malaria antigen involved in sporozoite invasion of hepatocytes, and thus considered to have good vaccine potential. We evaluated the polymorphism of the Plasmodium vivax CS gene in 24 parasite isolates collected from malaria-endemic areas of Colombia. We sequenced 27 alleles, most of which (25/27) corresponded to the VK247 genotype and the remainder to the VK210 type. All VK247 alleles presented a mutation (Gly → Asn) at position 28 in the N-terminal region, whereas the C-terminal presented three insertions: the ANKKAGDAG, which is common in all VK247 isolates; 12 alleles presented the insertion GAGGQAAGGNAANKKAGDAG; and 5 alleles presented the insertion GGNAGGNA. Both repeat regions were polymorphic in gene sequence and size. Sequences coding for B-, T-CD4+, and T-CD8+ cell epitopes were found to be conserved. This study confirms the high polymorphism of the repeat domain and the highly conserved nature of the flanking regions. PMID:21292878
Characterization of gene encoding amylopullulanase from plant-originated lactic acid bacterium, Lactobacillus plantarum L137.

PubMed

Kim, Jong-Hyun; Sunako, Michihiro; Ono, Hisayo; Murooka, Yoshikatsu; Fukusaki, Eiichiro; Yamashita, Mitsuo

2008-11-01

A starch-hydrolyzing lactic acid bacterium, Lactobacillus plantarum L137, was isolated from traditional fermented food made from fish and rice in the Philippines. A gene (apuA) encoding an amylolytic enzyme from Lactobacillus plantarum L137 was cloned, and its nucleotide sequence was determined. The apuA gene consisted of an open reading frame of 6171 bp encoding a protein of 2056 amino acids, the molecular mass of which was calculated to be 215,625 Da. The catalytic domains of amylase and pullulanase were located in the same region within the middle of the N-terminal region. The deduced amino acid sequence revealed four highly conserved regions that are common among amylolytic enzymes. In the N-terminal region, a six-amino-acid sequence (Asp-Ala/Thr-Ala-Asn-Ser-Thr) is repeated 39 times, and a three-amino-acid sequence (Gln-Pro-Thr) is repeated 50 times in the C-terminal region. The apuA gene was subcloned in L. plantarum NCL21, which is a plasmid-cured derivative of the wild-type L137 strain and has no amylopullulanase activity, and the gene was overexpressed under the control of its own promoter. The ApuA enzyme from this recombinant L. plantarum NCL21 harboring apuA gene was purified. The enzyme has both alpha-amylase and pullulanase activities. The N-terminal sequence of the purified enzyme showed that the signal peptide was cleaved at Ala(36) and the molecular mass of the mature extracellular enzyme is 211,537 Da. The major reaction products from soluble starch were maltotriose (G3) and maltotetraose (G4). Only maltotriose (G3) was produced from pullulan. From these results, we concluded that ApuA is an amylolytic enzyme belonging to the amylopullulanase family.
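The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is only a consistency check on the reported numbers; the helper name `orf_length_bp` and the assumption that the stated open reading frame length includes one stop codon are illustrative, not taken from the paper.

```python
def orf_length_bp(n_residues, include_stop=True):
    """Length in base pairs of an ORF encoding `n_residues` amino acids.

    Each residue is encoded by one 3-bp codon; one 3-bp stop codon is
    optionally counted (an assumption about how the ORF length was reported).
    """
    return 3 * n_residues + (3 if include_stop else 0)


# Values reported in the abstract.
REPORTED_ORF_BP = 6171
REPORTED_RESIDUES = 2056
PRECURSOR_MASS_DA = 215_625
MATURE_MASS_DA = 211_537

if __name__ == "__main__":
    # 2056 codons plus one stop codon reproduce the reported ORF length.
    print(orf_length_bp(REPORTED_RESIDUES) == REPORTED_ORF_BP)
    # Mass difference between the precursor and the mature enzyme,
    # attributed in the abstract to cleavage of the signal peptide.
    print(PRECURSOR_MASS_DA - MATURE_MASS_DA)
```

Under that assumption, the 2056-residue protein and the 6171-bp ORF agree exactly, and the precursor and mature masses differ by 4088 Da.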
Leucine-Rich Repeat Kinase 2 Binds to Neuronal Vesicles through Protein Interactions Mediated by Its C-Terminal WD40 Domain

PubMed Central

Piccoli, Giovanni; Onofri, Franco; Cirnaru, Maria Daniela; Kaiser, Christoph J. O.; Jagtap, Pravinkumar; Kastenmüller, Andreas; Pischedda, Francesca; Marte, Antonella; von Zweydorf, Felix; Vogt, Andreas; Giesert, Florian; Pan, Lifeng; Antonucci, Flavia; Kiel, Christina; Zhang, Mingjie; Weinkauf, Sevil; Sattler, Michael; Sala, Carlo; Matteoli, Michela; Ueffing, Marius

2014-01-01

Mutations in the leucine-rich repeat kinase 2 gene (LRRK2) are associated with familial and sporadic Parkinson's disease (PD). LRRK2 is a complex protein that consists of multiple domains, including predicted C-terminal WD40 repeats. In this study, we analyzed functional and molecular features conferred by the WD40 domain. Electron microscopic analysis of the purified LRRK2 C-terminal domain revealed doughnut-shaped particles, providing experimental evidence for its WD40 fold. We demonstrate that LRRK2 WD40 binds and sequesters synaptic vesicles via interaction with vesicle-associated proteins. In fact, a domain-based pulldown approach combined with mass spectrometric analysis identified LRRK2 as being part of a highly specific protein network involved in synaptic vesicle trafficking. In addition, we found that a C-terminal sequence variant associated with an increased risk of developing PD, G2385R, correlates with a reduced binding affinity of LRRK2 WD40 to synaptic vesicles. Our data demonstrate a critical role of the WD40 domain within LRRK2 function. PMID:24687852
Resistance to Change and Preference for Variable versus Fixed Response Sequences

ERIC Educational Resources Information Center

Arantes, Joana; Berg, Mark E.; Le, Dien; Grace, Randolph C.

2012-01-01

In Experiment 1, 4 pigeons were trained on a multiple chain schedule in which the initial link was a variable-interval (VI) 20-s schedule signalled by a red or green center key, and terminal links required four responses made to the left (L) and/or right (R) keys. In the REPEAT component, signalled by red keylights, only LRLR terminal-link…
Not so bad after all: retroviruses and long terminal repeat retrotransposons as a source of new genes in vertebrates.

PubMed

Naville, M; Warren, I A; Haftek-Terreau, Z; Chalopin, D; Brunet, F; Levin, P; Galiana, D; Volff, J-N

2016-04-01

Viruses and transposable elements, once considered as purely junk and selfish sequences, have repeatedly been used as a source of novel protein-coding genes during the evolution of most eukaryotic lineages, a phenomenon called 'molecular domestication'. This is exemplified perfectly in mammals and other vertebrates, where many genes derived from long terminal repeat (LTR) retroelements (retroviruses and LTR retrotransposons) have been identified through comparative genomics and functional analyses. In particular, genes derived from gag structural protein and envelope (env) genes, as well as from the integrase-coding and protease-coding sequences, have been identified in humans and other vertebrates. Retroelement-derived genes are involved in many important biological processes including placenta formation, cognitive functions in the brain and immunity against retroelements, as well as in cell proliferation, apoptosis and cancer. These observations support an important role of retroelement-derived genes in the evolution and diversification of the vertebrate lineage. Copyright © 2016 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
Ten tandem repeats of β-hCG 109-118 enhance immunogenicity and anti-tumor effects of β-hCG C-terminal peptide carried by mycobacterial heat-shock protein HSP65

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang Yankai; Yan Rong; He Yi

2006-07-14

The β-subunit of human chorionic gonadotropin (β-hCG) is secreted by many kinds of tumors and it has been used as an ideal target antigen to develop vaccines against tumors. In view of the low immunogenicity of this self-peptide, we designed a method based on isocaudamer technique to repeat tandemly the 10-residue sequence X of β-hCG (109-118), then 10 tandemly repeated copies of the 10-residue sequence combined with β-hCG C-terminal 37 peptides were fused to mycobacterial heat-shock protein 65 to construct a fusion protein HSP65-X10-βhCGCTP37 as an immunogen. In this study, we examined the effect of the tandem repeats of this 10-residue sequence in eliciting an immune response by comparing the immunogenicity and anti-tumor effects of the two immunogens, HSP65-X10-βhCGCTP37 and HSP65-βhCGCTP37 (without the 10 tandem repeats). Immunization of mice with the fusion protein HSP65-X10-βhCGCTP37 elicited much higher levels of specific anti-β-hCG antibodies and more effectively inhibited the growth of Lewis lung carcinoma (LLC) in vivo than with HSP65-βhCGCTP37, which suggests that HSP65-X10-βhCGCTP37 may be an effective protein vaccine for the treatment of β-hCG-dependent tumors and that multiple tandem repeats of a certain epitope are an efficient method to overcome the low immunogenicity of self-peptide antigens.
Analysis of the primary structure of the long terminal repeat and the gag and pol genes of the human spumaretrovirus.

PubMed Central

Maurer, B; Bannert, H; Darai, G; Flügel, R M

1988-01-01

The nucleotide sequence of the human spumaretrovirus (HSRV) genome was determined. The 5' long terminal repeat region was analyzed by strong stop cDNA synthesis and S1 nuclease mapping. The length of the RU5 region was determined and found to be 346 nucleotides long. The 5' long terminal repeat is 1,123 base pairs long and is bound by an 18-base-pair primer-binding site complementary to the 3' end of mammalian lysine-1,2-specific tRNA. Open reading frames for gag and pol genes were identified. Surprisingly, the HSRV gag protein does not contain the cysteine motif of the nucleic acid-binding proteins found in and typical of all other retroviral gag proteins; instead the HSRV gag gene encodes a strongly basic protein reminiscent of those of hepatitis B virus and retrotransposons. The carboxy-terminal part of the HSRV gag gene products encodes a protease domain. The pol gene overlaps the gag gene and is postulated to be synthesized as a gag/pol precursor via translational frameshifting analogous to that of Rous sarcoma virus, with 7 nucleotides immediately upstream of the termination codons of gag conserved between the two viral genomes. The HSRV pol gene is 2,730 nucleotides long, and its deduced protein sequence is readily subdivided into three well-conserved domains, the reverse transcriptase, the RNase H, and the integrase. Although the degree of homology of the HSRV reverse transcriptase domain is highest to that of murine leukemia virus, the HSRV genomic organization is more similar to that of human and simian immunodeficiency viruses. The data justify classifying the spumaretroviruses as a third subfamily of Retroviridae. PMID:2451755
Prediction of Transcriptional Terminators in Bacillus subtilis and Related Species

PubMed Central

de Hoon, Michiel J. L.; Makita, Yuko; Nakai, Kenta; Miyano, Satoru

2005-01-01

In prokaryotes, genes belonging to the same operon are transcribed in a single mRNA molecule. Transcription starts as the RNA polymerase binds to the promoter and continues until it reaches a transcriptional terminator. Some terminators rely on the presence of the Rho protein, whereas others function independently of Rho. Such Rho-independent terminators consist of an inverted repeat followed by a stretch of thymine residues, allowing us to predict their presence directly from the DNA sequence. Unlike in Escherichia coli, the Rho protein is dispensable in Bacillus subtilis, suggesting a limited role for Rho-dependent termination in this organism and possibly in other Firmicutes. We analyzed 463 experimentally known terminating sequences in B. subtilis and found a decision rule to distinguish Rho-independent transcriptional terminators from non-terminating sequences. The decision rule allowed us to find the boundaries of operons in B. subtilis with a sensitivity and specificity of about 94%. Using the same decision rule, we found an average sensitivity of 94% for 57 bacteria belonging to the Firmicutes phylum, and a considerably lower sensitivity for other bacteria. Our analysis shows that Rho-independent termination is dominant for Firmicutes in general, and that the properties of the transcriptional terminators are conserved. Terminator prediction can be used to reliably predict the operon structure in these organisms, even in the absence of experimentally known operons. Genome-wide predictions of Rho-independent terminators for the 57 Firmicutes are available in the Supporting Information section. PMID:16110342
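The structural description in the record above (an inverted repeat followed by a stretch of thymine residues) can be illustrated with a minimal Python sketch. This is not the decision rule derived by de Hoon et al.; the function name `find_terminator_candidates` and all thresholds (stem length, loop range, T count) are hypothetical values chosen only for illustration, and the scan accepts perfect stems only.

```python
# Illustrative scan for hairpin-plus-T-tract candidates on one DNA strand.
# All parameter defaults are assumptions, not values from the cited study.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_terminator_candidates(dna, stem=6, min_loop=3, max_loop=8,
                               min_t=4, window=8):
    """Return (start, stem_sequence, loop_length) for each perfect inverted
    repeat whose downstream window contains at least min_t thymines."""
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - 2 * stem - min_loop + 1):
        left = dna[i:i + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop              # start of the right arm
            right = dna[j:j + stem]
            if len(right) < stem:            # ran off the end of the sequence
                break
            if right == target:
                tail = dna[j + stem:j + stem + window]
                if tail.count("T") >= min_t:
                    hits.append((i, left, loop))
    return hits


if __name__ == "__main__":
    # Toy sequence: stem GCCGCC, loop TTCG, stem GGCGGC, then a T-tract.
    toy = "AAAA" + "GCCGCC" + "TTCG" + "GGCGGC" + "TTTTTTTT" + "ACGT"
    for hit in find_terminator_candidates(toy):
        print(hit)
```

A realistic predictor would also score stem stability and tolerate mismatches; the sketch only shows why such terminators are recognizable directly from sequence.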
Molecular structure and chromosome distribution of three repetitive DNA families in Anemone hortensis L. (Ranunculaceae).

PubMed

Mlinarec, Jelena; Chester, Mike; Siljak-Yakovlev, Sonja; Papes, Drazena; Leitch, Andrew R; Besendorfer, Visnja

2009-01-01

The structure, abundance and location of repetitive DNA sequences on chromosomes can characterize the nature of higher plant genomes. Here we report on three new repeat DNA families isolated from Anemone hortensis L.; (i) AhTR1, a family of satellite DNA (stDNA) composed of a 554-561 bp long EcoRV monomer; (ii) AhTR2, a stDNA family composed of a 743 bp long HindIII monomer and; (iii) AhDR, a repeat family composed of a 945 bp long HindIII fragment that exhibits some sequence similarity to Ty3/gypsy-like retroelements. Fluorescence in-situ hybridization (FISH) to metaphase chromosomes of A. hortensis (2n = 16) revealed that both AhTR1 and AhTR2 sequences co-localized with DAPI-positive AT-rich heterochromatic regions. AhTR1 sequences occur at intercalary DAPI bands while AhTR2 sequences occur at 8-10 terminally located heterochromatic blocks. In contrast AhDR sequences are dispersed over all chromosomes as expected of a Ty3/gypsy-like element. AhTR2 and AhTR1 repeat families include polyA- and polyT-tracks, AT/TA-motifs and a pentanucleotide sequence (CAAAA) that may have consequences for chromatin packing and sequence homogeneity. AhTR2 repeats also contain TTTAGGG motifs and degenerate variants. We suggest that they arose by interspersion of telomeric repeats with subtelomeric repeats, before hybrid unit(s) amplified through the heterochromatic domain. The three repetitive DNA families together occupy approximately 10% of the A. hortensis genome. Comparative analyses of eight Anemone species revealed that the divergence of the A. hortensis genome was accompanied by considerable modification and/or amplification of repeats.
Complete Genome Sequence of a Naturally Occurring Simian Foamy Virus Isolate from Rhesus Macaque (SFVmmu_K3T).

PubMed

Nandakumar, Subhiksha; Bae, Eunhae H; Khan, Arifa S

2017-08-17

The full-length genome sequence of a simian foamy virus (SFVmmu_K3T), isolated from a rhesus macaque (Macaca mulatta), was obtained using high-throughput sequencing. SFVmmu_K3T consisted of 12,983 bp and had a genomic organization similar to that of other SFVs, with long terminal repeats (LTRs) and open reading frames for Gag, Pol, Env, Tas, and Bet.
Molecular cloning and sequence analysis of the gene coding for the 57kDa soluble antigen of the salmonid fish pathogen Renibacterium salmoninarum

USGS Publications Warehouse

Chien, Maw-Sheng; Gilbert, Teresa L.; Huang, Chienjin; Landolt, Marsha L.; O'Hara, Patrick J.; Winton, James R.

1992-01-01

The complete sequence coding for the 57-kDa major soluble antigen of the salmonid fish pathogen, Renibacterium salmoninarum, was determined. The gene contained an open reading frame of 1671 nucleotides coding for a protein of 557 amino acids with a calculated Mr value of 57190. The first 26 amino acids constituted a signal peptide. The deduced sequence for amino acid residues 27–61 was in agreement with the 35 N-terminal amino acid residues determined by microsequencing, suggesting the protein is synthesized as a 557-amino acid precursor and processed to produce a mature protein of Mr 54505. Two regions of the protein contained imperfect direct repeats. The first region contained two copies of an 81-residue repeat; the second contained five copies of an unrelated 25-residue repeat. Also, a perfect inverted repeat (including three in-frame UAA stop codons) was observed at the carboxyl-terminus of the gene.
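The lengths quoted in this record are mutually consistent, which a short Python check makes explicit. The variable names are illustrative, and the mature-protein residue count is derived here from the stated figures rather than quoted from the abstract.

```python
# Consistency check of the figures quoted in the abstract above.
orf_nucleotides = 1671        # stated length of the open reading frame
precursor_residues = 557      # stated length of the precursor protein
signal_residues = 26          # stated length of the signal peptide

# Three nucleotides per codon: the stated frame length covers the
# 557 coding codons exactly.
codons, remainder = divmod(orf_nucleotides, 3)

# Residues 27-61 inclusive correspond to the 35 microsequenced residues.
microsequenced = len(range(27, 61 + 1))

# Derived (not quoted): residues remaining after signal-peptide removal.
mature_residues = precursor_residues - signal_residues

print(codons, remainder, microsequenced, mature_residues)
```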
The sequence of camelpox virus shows it is most closely related to variola virus, the cause of smallpox.

PubMed

Gubser, Caroline; Smith, Geoffrey L

2002-04-01

Camelpox virus (CMPV) and variola virus (VAR) are orthopoxviruses (OPVs) that share several biological features and cause high mortality and morbidity in their single host species. The sequence of a virulent CMPV strain was determined; it is 202182 bp long, with inverted terminal repeats (ITRs) of 6045 bp and has 206 predicted open reading frames (ORFs). As for other poxviruses, the genes are tightly packed with little non-coding sequence. Most genes within 25 kb of each terminus are transcribed outwards towards the terminus, whereas genes within the centre of the genome are transcribed from either DNA strand. The central region of the genome contains genes that are highly conserved in other OPVs and 87 of these are conserved in all sequenced chordopoxviruses. In contrast, genes towards either terminus are more variable and encode proteins involved in host range, virulence or immunomodulation. In some cases, these are broken versions of genes found in other OPVs. The relationship of CMPV to other OPVs was analysed by comparisons of DNA and predicted protein sequences, repeats within the ITRs and arrangement of ORFs within the terminal regions. Each comparison gave the same conclusion: CMPV is the closest known virus to variola virus, the cause of smallpox.
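Inverted terminal repeats such as those reported for CMPV mean that one genome end reads as the reverse complement of the other. The Python sketch below measures the length of a perfect ITR in a linear sequence; the function name `itr_length` is hypothetical, and real genome analyses would normally use alignment so that mismatches and indels within the ITRs are tolerated.

```python
# Minimal sketch: length of a perfect inverted terminal repeat (ITR).
# Assumes an uppercase/lowercase ACGT string with no ambiguity codes.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def itr_length(genome: str) -> int:
    """Count leading bases that match the reverse complement of the
    trailing bases, capped at half the genome length."""
    genome = genome.upper()
    rc = reverse_complement(genome)
    limit = len(genome) // 2
    n = 0
    while n < limit and genome[n] == rc[n]:
        n += 1
    return n


if __name__ == "__main__":
    left = "ACGGTCA"
    toy_genome = left + "GGATTAG" + reverse_complement(left)
    print(itr_length(toy_genome))
```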
Large diversity of the piggyBac-like elements in the genome of Tribolium castaneum

PubMed Central

Wang, Jianjun; Du, Yuzhou; Wang, Suzhi; Brown, Sue; Park, Yoonseong

2011-01-01

The piggyBac transposable element, originally discovered in the cabbage looper, Trichoplusia ni, has been widely used in insect transgenesis including the red flour beetle Tribolium castaneum. We surveyed piggyBac-like (PLE) sequences in the genome of Tribolium castaneum by homology searches using as queries the diverse PLE sequences that have been described previously. The search yielded a total of 32 piggyBac-like elements (TcPLEs) which were classified into 14 distinct groups. Most of the TcPLEs contain defective functional motifs in that they are lacking inverted terminal repeats or have disrupted open reading frames. Only one single copy of TcPLE1 appears to be intact with imperfect 16 bp inverted terminal repeats flanking an open reading frame encoding a transposase of 571 amino acid residues. Many copies of TcPLEs were found to be inserted into or close to other transposon-like sequences. This large diversity of TcPLEs with generally low copy numbers suggests multiple invasions of the TcPLEs over a long evolutionary time without extensive multiplications or occurrence of rapid loss of TcPLEs copies. PMID:18342253
Structure of genes and an insertion element in the methane producing archaebacterium Methanobrevibacter smithii.

PubMed

Hamilton, P T; Reeve, J N

1985-01-01

DNA fragments cloned from the methanogenic archaebacterium Methanobrevibacter smithii which complement mutations in the purE and proC genes of E. coli have been sequenced. Sequence analyses, transposon mutagenesis and expression in E. coli minicells indicate that purE and proC complementations result from the synthesis of M. smithii polypeptides with molecular weights of 36,697 and 27,836 respectively. The encoding genes appear to be located in operons. The M. smithii genome contains 69% A/T basepairs (bp) which is reflected in unusual codon usages and intergenic regions containing approximately 85% A/T bp. An insertion element, designated ISM1, was found within the cloned M. smithii DNA located adjacent to the proC complementing region. ISM1 is 1381 bp in length, has 29 bp terminal inverted repeat sequences and contains one major ORF encoded in 87% of the ISM1 sequence. ISM1 is mobile, present in approximately 10 copies per genome and integration duplicates 8 bp at the site of insertion. The duplicated sequences show homology with sequences within the 29 bp terminal repeat sequence of ISM1. Comparison of our data with sequences from halophilic archaebacteria suggests that 5'GAANTTTCA and 5'TTTTAATATAAA may be consensus promoter sequences for archaebacteria. These sequences closely resemble the consensus sequences which precede Drosophila heat-shock genes (Pelham 1982; Davidson et al. 1983). Methanogens appear to employ the eubacterial system of mRNA: 16SrRNA hybridization to ensure initiation of translation; the consensus ribosome binding sequence is 5'AGGTGA.
Identification and Characterization of Multiple Spidroin 1 Genes Encoding Major Ampullate Silk Proteins in Nephila clavipes

PubMed Central

Gaines, William A.; Marcotte, William R.

2010-01-01

Spider dragline silk is primarily composed of proteins called major ampullate spidroins (MaSp) that consist of a large repeat array flanked by non-repetitive N- and C-terminal domains. Until recently, there has been little evidence for more than one gene encoding each of the two major spidroin silk proteins, MaSp1 and MaSp2. Here, we report the deduced N-terminal domain sequences for two distinct MaSp1 genes from Nephila clavipes (MaSp1A and MaSp1B) and for MaSp2. All three MaSp genes are co-expressed in the major ampullate gland. A search of the GenBank database also revealed two distinct MaSp1 C-terminal domain sequences. Sequencing confirmed that both MaSp1 genes are present in all seven Nephila clavipes spiders examined. The presence of nucleotide polymorphisms in these genes confirmed that MaSp1A and MaSp1B are distinct genetic loci and not merely alleles of the same gene. We have experimentally determined the transcription start sites for all three MaSp genes and established preliminary pairing between the two MaSp1 N- and C-terminal domains. Phylogenetic analysis of these new sequences and other published MaSp N- and C-terminal domain sequences illustrated that duplications of MaSp genes may be widespread among spider species. PMID:18828837
The short interspersed repetitive element of Trypanosoma cruzi, SIRE, is part of VIPER, an unusual retroelement related to long terminal repeat retrotransposons

PubMed Central

Vázquez, Martín; Ben-Dov, Claudia; Lorenzi, Hernan; Moore, Troy; Schijman, Alejandro; Levin, Mariano J.

2000-01-01

The short interspersed repetitive element (SIRE) of Trypanosoma cruzi was first detected when comparing the sequences of loci that encode the TcP2β genes. It is present in about 1,500–3,000 copies per genome, depending on the strain, and it is distributed in all chromosomes. An initial analysis of SIRE sequences from 21 genomic fragments allowed us to derive a consensus nucleotide sequence and structure for the element, consisting of three regions (I, II, and III) each harboring distinctive features. Analysis of 158 transcribed SIREs demonstrates that the consensus is highly conserved. The sequences of 51 cDNAs show that SIRE is included in the 3′ end of several mRNAs, always transcribed from the sense strand, contributing the polyadenylation site in 63% of the cases. This study led to the characterization of VIPER (vestigial interposed retroelement), a 2,326-bp-long unusual retroelement. VIPER's 5′ end is formed by the first 182 bp of SIRE, whereas its 3′ end is formed by the last 220 bp of the element. Both SIRE moieties are connected by a 1,924-bp-long fragment that carries a unique ORF encoding a complete reverse transcriptase-RNase H gene whose 15 C-terminal amino acids derive from codons specified by SIRE's region II. The amino acid sequence of VIPER's reverse transcriptase-RNase H shares significant homology to that of long terminal repeat retrotransposons. The fact that SIRE and VIPER sequences are found only in the T. cruzi genome may be of relevance for studies concerning the evolution and the genome flexibility of this protozoan parasite. PMID:10688909
Identification of multiple binding sites for the THAP domain of the Galileo transposase in the long terminal inverted-repeats.

PubMed

Marzo, Mar; Liu, Danxu; Ruiz, Alfredo; Chalmers, Ronald

2013-08-01

Galileo is a DNA transposon responsible for the generation of several chromosomal inversions in Drosophila. In contrast to other members of the P-element superfamily, it has unusually long terminal inverted-repeats (TIRs) that resemble those of Foldback elements. To investigate the function of the long TIRs we derived consensus and ancestral sequences for the Galileo transposase in three species of Drosophilids. Following gene synthesis, we expressed and purified their constituent THAP domains and tested their binding activity towards the respective Galileo TIRs. DNase I footprinting located the most proximal DNA binding site about 70 bp from the transposon end. Using this sequence we identified further binding sites in the tandem repeats that are found within the long TIRs. This suggests that the synaptic complex between Galileo ends may be a complicated structure containing higher-order multimers of the transposase. We also attempted to reconstitute Galileo transposition in Drosophila embryos but no events were detected. Thus, although the limited numbers of Galileo copies in each genome were sufficient to provide functional consensus sequences for the THAP domains, they do not specify a fully active transposase. Since the THAP recognition sequence is short, and will occur many times in a large genome, it seems likely that the multiple binding sites within the long, internally repetitive, TIRs of Galileo and other Foldback-like elements may provide the transposase with its binding specificity. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
Characterization of proviruses cloned from mink cell focus-forming virus-infected cellular DNA.

PubMed Central

Khan, A S; Repaske, R; Garon, C F; Chan, H W; Rowe, W P; Martin, M A

1982-01-01

Two proviruses were cloned from EcoRI-digested DNA extracted from mink cells chronically infected with AKR mink cell focus-forming (MCF) 247 murine leukemia virus (MuLV), using a lambda phage host vector system. One cloned MuLV DNA fragment (designated MCF 1) contained sequences extending 6.8 kilobases from an EcoRI restriction site in the 5' long terminal repeat (LTR) to an EcoRI site located in the envelope (env) region and was indistinguishable by restriction endonuclease mapping for 5.1 kilobases (except for the EcoRI site in the LTR) from the 5' end of AKR ecotropic proviral DNA. The DNA segment extending from 5.1 to 6.8 kilobases contained several restriction sites that were not present in the AKR ecotropic provirus. A 0.5-kilobase DNA segment located at the 3' end of MCF 1 DNA contained sequences which hybridized to a xenotropic env-specific DNA probe but not to labeled ecotropic env-specific DNA. This dual character of MCF 1 proviral DNA was also confirmed by analyzing heteroduplex molecules by electron microscopy. The second cloned proviral DNA (designated MCF 2) was a 6.9-kilobase EcoRI DNA fragment which contained LTR sequences at each end and a 2.0-kilobase deletion encompassing most of the env region. The MCF 2 proviral DNA proved to be a useful reagent for detecting LTRs electron microscopically due to the presence of nonoverlapping, terminally located LTR sequences which effected its circularization with DNAs containing homologous LTR sequences. Nucleotide sequence analysis demonstrated the presence of a 104-base-pair direct repeat in the LTR of MCF 2 DNA. In contrast, only a single copy of the reiterated component of the direct repeat was present in MCF 1 DNA. PMID:6281459
Spectroscopic insights into quadruplexes of five-repeat telomere DNA sequences upon G-block damage.

PubMed

Dvořáková, Zuzana; Vorlíčková, Michaela; Renčiuk, Daniel

2017-11-01

The DNA lesions, resulting from oxidative damage, were shown to destabilize human telomere four-repeat quadruplex and to alter its structure. Long telomere DNA, as a repetitive sequence, offers, however, other mechanisms of dealing with the lesion: extrusion of the damaged repeat into loop or shifting the quadruplex position by one repeat. Using circular dichroism and UV absorption spectroscopy and polyacrylamide electrophoresis, we studied consequences of lesions at different positions of the model five-repeat human telomere DNA sequences on the structure and stability of their quadruplexes in sodium and in potassium. The repeats affected by lesion are preferentially positioned as terminal overhangs of the core quadruplex structurally similar to the four-repeat one. Forced affecting of the inner repeats leads to presence of variety of more parallel folds in potassium. In sodium the designed models form mixture of two dominant antiparallel quadruplexes whose population varies with the position of the affected repeat. The shapes of quadruplex CD spectra, namely the height of dominant peaks, significantly correlate with melting temperatures. Lesion in one guanine tract of a more than four repeats long human telomere DNA sequence may cause re-positioning of its quadruplex arrangement associated with a shift of the structure to less common quadruplex conformations. The type of the quadruplex depends on the loop position and external conditions. The telomere DNA quadruplexes are quite resistant to the effect of point mutations due to the telomere DNA repetitive nature, although their structure and, consequently, function might be altered. Copyright © 2017. Published by Elsevier B.V.
Genome of Horsepox Virus

PubMed Central

Tulman, E. R.; Delhon, G.; Afonso, C. L.; Lu, Z.; Zsak, L.; Sandybaev, N. T.; Kerembekova, U. Z.; Zaitsev, V. L.; Kutish, G. F.; Rock, D. L.

2006-01-01

Here we present the genomic sequence of horsepox virus (HSPV) isolate MNR-76, an orthopoxvirus (OPV) isolated in 1976 from diseased Mongolian horses. The 212-kbp genome contained 7.5-kbp inverted terminal repeats and lacked extensive terminal tandem repetition. HSPV contained 236 open reading frames (ORFs) with similarity to those in other OPVs, with those in the central 100-kbp region most conserved relative to other OPVs. Phylogenetic analysis of the conserved region indicated that HSPV is closely related to sequenced isolates of vaccinia virus (VACV) and rabbitpox virus, clearly grouping together these VACV-like viruses. Fifty-four HSPV ORFs likely represented fragments of 25 orthologous OPV genes, including in the central region the only known fragmented form of an OPV ribonucleotide reductase large subunit gene. In terminal genomic regions, HSPV lacked full-length homologues of genes variably fragmented in other VACV-like viruses but was unique in fragmentation of the homologue of VACV strain Copenhagen B6R, a gene intact in other known VACV-like viruses. Notably, HSPV contained in terminal genomic regions 17 kbp of OPV-like sequence absent in known VACV-like viruses, including fragments of genes intact in other OPVs and approximately 1.4 kb of sequence present only in cowpox virus (CPXV). HSPV also contained seven full-length genes fragmented or missing in other VACV-like viruses, including intact homologues of the CPXV strain GRI-90 D2L/I4R CrmB and D13L CD30-like tumor necrosis factor receptors, D3L/I3R and C1L ankyrin repeat proteins, B19R kelch-like protein, D7L BTB/POZ domain protein, and B22R variola virus B22R-like protein. These results indicated that HSPV contains unique genomic features likely contributing to a unique virulence/host range phenotype. They also indicated that while closely related to known VACV-like viruses, HSPV contains additional, potentially ancestral sequences absent in other VACV-like viruses. PMID:16940536

Cassandra retrotransposons carry independently transcribed 5S RNA

PubMed Central

Kalendar, Ruslan; Tanskanen, Jaakko; Chang, Wei; Antonius, Kristiina; Sela, Hanan; Peleg, Ofer; Schulman, Alan H.

2008-01-01

We report a group of TRIMs (terminal-repeat retrotransposons in miniature), which are small nonautonomous retrotransposons. These elements, named Cassandra, universally carry conserved 5S RNA sequences and associated RNA polymerase (pol) III promoters and terminators in their long terminal repeats (LTRs). They were found in all vascular plants investigated. Uniquely for LTR retrotransposons, Cassandra produces noncapped, polyadenylated transcripts from the 5S pol III promoter. Capped, read-through transcripts containing Cassandra sequences can also be detected in RNA and in EST databases. The predicted Cassandra RNA 5S secondary structures resemble those for cellular 5S rRNA, with high information content specifically in the pol III promoter region. Genic integration sites are common for Cassandra, an unusual feature for abundant retrotransposons. The 5S in each LTR produces a tandem 5S arrangement with an inter-5S spacing resembling that of cellular 5S. The distribution of 5S genes is very variable in flowering plants and may be partially explained by Cassandra activity. Cassandra thus appears both to have adapted a ubiquitous cellular gene for ribosomal RNA for use as a promoter and to parasitize an as-yet-unidentified group of retrotransposons for the proteins needed in its lifecycle. PMID:18408163
Deletion of internal structured repeats increases the stability of a leucine-rich repeat protein, YopM

PubMed Central

Barrick, Doug

2011-01-01

Mapping the stability distributions of proteins in their native folded states provides a critical link between structure, thermodynamics, and function. Linear repeat proteins have proven more amenable to this kind of mapping than globular proteins. C-terminal deletion studies of YopM, a large, linear leucine-rich repeat (LRR) protein, show that stability is distributed quite heterogeneously, yet a high level of cooperativity is maintained [1]. Key components of this distribution are three interfaces that strongly stabilize adjacent sequences, thereby maintaining structural integrity and promoting cooperativity. To better understand the distribution of interaction energy around these critical interfaces, we studied internal (rather than terminal) deletions of three LRRs in this region, including one of these stabilizing interfaces. Contrary to our expectation that deletion of structured repeats should be destabilizing, we find that internal deletion of folded repeats can actually stabilize the native state, suggesting that these repeats are destabilizing, although paradoxically, they are folded in the native state. We identified two residues within this destabilizing segment that deviate from the consensus sequence at a position that normally forms a stacked leucine ladder in the hydrophobic core. Replacement of these nonconsensus residues with leucine is stabilizing. This stability enhancement can be reproduced in the context of nonnative interfaces, but it requires an extended hydrophobic core. Our results demonstrate that different LRRs vary widely in their contribution to stability, and that this variation is context-dependent. These two factors are likely to determine the types of rearrangements that lead to folded, functional proteins, and in turn, are likely to restrict the pathways available for the evolution of linear repeat proteins. PMID:21764506
The 28S–18S rDNA intergenic spacer from Crithidia fasciculata: repeated sequences, length heterogeneity, putative processing sites and potential interactions between U3 small nucleolar RNA and the ribosomal RNA precursor

PubMed Central

Schnare, Murray N.; Collings, James C.; Spencer, David F.; Gray, Michael W.

2000-01-01

In Crithidia fasciculata, the ribosomal RNA (rRNA) gene repeats range in size from ∼11 to 12 kb. This length heterogeneity is localized to a region of the intergenic spacer (IGS) that contains tandemly repeated copies of a 19mer sequence. The IGS also contains four copies of an ∼55 nt repeat that has an internal inverted repeat and is also present in the IGS of Leishmania species. We have mapped the C.fasciculata transcription initiation site as well as two other reverse transcriptase stop sites that may be analogous to the A0 and A′ pre-rRNA processing sites within the 5′ external transcribed spacer (ETS) of other eukaryotes. Features that could influence processing at these sites include two stretches of conserved primary sequence and three secondary structure elements present in the 5′ ETS. We also characterized the C.fasciculata U3 snoRNA, which has the potential for base-pairing with pre-rRNA sequences. Finally, we demonstrate that biosynthesis of large subunit rRNA in both C.fasciculata and Trypanosoma brucei involves 3′-terminal addition of three A residues that are not present in the corresponding DNA sequences. PMID:10982863
Molecular identification and characterization of clustered regularly interspaced short palindromic repeats (CRISPRs) in a urease-positive thermophilic Campylobacter sp. (UPTC).

PubMed

Tasaki, E; Hirayama, J; Tazumi, A; Hayashi, K; Hara, Y; Ueno, H; Moore, J E; Millar, B C; Matsuda, M

2012-02-01

A novel clustered regularly interspaced short palindromic repeats (CRISPRs) locus [7,500 base pairs (bp) in length] occurred in the urease-positive thermophilic Campylobacter (UPTC) Japanese isolate, CF89-12. The 7,500 bp locus consisted of the 5'-methylaminomethyl-2-thiouridylate methyltransferase gene, putative (P) CRISPR associated (p-Cas), putative open reading frames, Cas1 and Cas2, leader sequence region (146 bp), 12 CRISPRs consensus sequence repeats (each 36 bp) separated by a non-repetitive unique spacer region of similar length (26-31 bp) and the phosphatidyl glycerophosphatase A gene. When the CRISPRs loci in the UPTC CF89-12 and five C. jejuni isolates were compared with one another, all six isolates contained p-Cas, Cas1 and Cas2 within the loci. Four to 12 CRISPRs consensus sequence repeats separated by a non-repetitive unique spacer region occurred in the six isolates, and the nucleotide sequences of those repeats showed approximately 92-100% similarity with each other. However, no sequence similarity occurred in the unique spacer regions among these isolates. The putative σ(70) transcriptional promoter and the hypothetical ρ-independent terminator structures for the CRISPRs and Cas were detected. No in vivo transcription of p-Cas, Cas1 and Cas2 was confirmed in the UPTC cells.
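The repeat-spacer organization described in this record (identical repeats separated by unique spacers) can be sketched in Python as follows. The function name `extract_spacers` and the toy sequences are hypothetical; the sketch assumes the consensus repeat is already known and matches exactly, whereas the record reports repeats that are only about 92-100% similar, which would require approximate matching.

```python
# Minimal sketch: pull the spacers out of a CRISPR-like array, given an
# exactly matching consensus repeat. Toy sequences are invented.
def extract_spacers(locus: str, repeat: str) -> list:
    """Return the sequences lying between consecutive, non-overlapping
    exact occurrences of `repeat` in `locus`."""
    positions = []
    start = locus.find(repeat)
    while start != -1:
        positions.append(start)
        start = locus.find(repeat, start + len(repeat))
    return [locus[a + len(repeat):b]
            for a, b in zip(positions, positions[1:])]


if __name__ == "__main__":
    repeat = "GTTTTAGAGCTA"                      # hypothetical repeat
    spacers = ["ACGTACGTAC", "TTGACCATGCA"]      # hypothetical spacers
    locus = "CC" + repeat + spacers[0] + repeat + spacers[1] + repeat + "GG"
    found = extract_spacers(locus, repeat)
    print(found, [len(s) for s in found])
```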
Evolution of genes and repeats in the Nimrod superfamily.

PubMed

Somogyi, Kálmán; Sipos, Botond; Pénzes, Zsolt; Kurucz, Eva; Zsámboki, János; Hultmark, Dan; Andó, István

2008-11-01

The recently identified Nimrod superfamily is characterized by the presence of a special type of EGF repeat, the NIM repeat, located right after a typical CCXGY/W amino acid motif. On the basis of structural features, nimrod genes can be divided into three types. The proteins encoded by Draper-type genes have an EMI domain at the N-terminal part and only one copy of the NIM motif, followed by a variable number of EGF-like repeats. The products of Nimrod B-type and Nimrod C-type genes (including the eater gene) have different kinds of N-terminal domains, and lack EGF-like repeats but contain a variable number of NIM repeats. Draper and Nimrod C-type (but not Nimrod B-type) proteins carry a transmembrane domain. Several members of the superfamily were claimed to function as receptors in phagocytosis and/or binding of bacteria, which indicates an important role in the cellular immunity and the elimination of apoptotic cells. In this paper, the evolution of the Nimrod superfamily is studied with various methods on the level of genes and repeats. A hypothesis is presented in which the NIM repeat, along with the EMI domain, emerged by structural reorganizations at the end of an EGF-like repeat chain, suggesting a mechanism for the formation of novel types of repeats. The analyses revealed diverse evolutionary patterns in the sequences containing multiple NIM repeats. Although the Nimrod B and Nimrod C proteins show characteristics of independent evolution, many internal NIM repeats in Eater sequences seem to have undergone concerted evolution. An analysis of the nimrod genes has been performed using phylogenetic and other methods and an evolutionary scenario of the origin and diversification of the Nimrod superfamily is proposed. Our study presents an intriguing example of how the evolution of multigene families may contribute to the complexity of the innate immune response.
Efficient production of artificially designed gelatins with a Bacillus brevis system.

PubMed

Kajino, T; Takahashi, H; Hirai, M; Yamada, Y

2000-01-01

Artificially designed gelatins comprising tandemly repeated 30-amino-acid peptide units derived from human alphaI collagen were successfully produced with a Bacillus brevis system. The DNA encoding the peptide unit was synthesized by taking into consideration the codon usage of the host cells, but no clones having a tandemly repeated gene were obtained through the above-mentioned strategy. Minirepeat genes could be selected in vivo from a mixture of every possible sequence encoding an artificial gelatin by randomly ligating the mixed sequence unit and transforming it into Escherichia coli. Larger repeat genes constructed by connecting minirepeat genes obtained by in vivo selection were also stable in the expression host cells. Gelatins derived from the eight-unit and six-unit repeat genes were extracellularly produced at the level of 0.5 g/liter and easily purified by ammonium sulfate fractionation and anion-exchange chromatography. The purified artificial gelatins had the predicted N-terminal sequences and amino acid compositions and a sol-gel property similar to that of the native gelatin. These results suggest that the selection of a repeat unit sequence stable in an expression host is a shortcut for the efficient production of repetitive proteins and that it can conveniently be achieved by the in vivo selection method. This study revealed the possible industrial application of artificially designed repetitive proteins.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Follis, Kathryn E.; York, Joanne; Nunberg, Jack H.

The fusion subunit of the SARS-CoV S glycoprotein contains two regions of hydrophobic heptad-repeat amino acid sequences that have been shown in biophysical studies to form a six-helix bundle structure typical of the fusion-active core found in Class I viral fusion proteins. Here, we have applied serine-scanning mutagenesis to the C-terminal-most heptad-repeat region in the SARS-CoV S glycoprotein to investigate the functional role of this region in membrane fusion. We show that hydrophobic sidechains at a and d positions only within the short helical segment of the C-terminal heptad-repeat region (I1161, I1165, L1168, A1172, and L1175) are critical for cell-cell fusion. Serine mutations at outlying heptad-repeat residues that form an extended chain in the core structure (V1158, L1179, and L1182) do not affect fusogenicity. Our study provides genetic evidence for the important role of α-helical packing in promoting S glycoprotein-mediated membrane fusion.
Hygromycin B phosphotransferase as a selectable marker for DNA transfer experiments with higher eucaryotic cells.

PubMed

Blochlinger, K; Diggelmann, H

1984-12-01

The DNA coding sequence for the hygromycin B phosphotransferase gene was placed under the control of the regulatory sequences of a cloned long terminal repeat of Moloney sarcoma virus. This construction allowed direct selection for hygromycin B resistance after transfection of eucaryotic cell lines not naturally resistant to this antibiotic, thus providing another dominant marker for DNA transfer in eucaryotic cells.
Hygromycin B phosphotransferase as a selectable marker for DNA transfer experiments with higher eucaryotic cells.

PubMed Central

Blochlinger, K; Diggelmann, H

1984-01-01

The DNA coding sequence for the hygromycin B phosphotransferase gene was placed under the control of the regulatory sequences of a cloned long terminal repeat of Moloney sarcoma virus. This construction allowed direct selection for hygromycin B resistance after transfection of eucaryotic cell lines not naturally resistant to this antibiotic, thus providing another dominant marker for DNA transfer in eucaryotic cells. PMID:6098829
Osteoblast-specific factor 2: cloning of a putative bone adhesion protein with homology with the insect protein fasciclin I.

PubMed Central

Takeshita, S; Kikuno, R; Tezuka, K; Amann, E

1993-01-01

A cDNA library prepared from the mouse osteoblastic cell line MC3T3-E1 was screened for the presence of specifically expressed genes by employing a combined subtraction hybridization/differential screening approach. A cDNA was identified and sequenced which encodes a protein designated osteoblast-specific factor 2 (OSF-2) comprising 811 amino acids. OSF-2 has a typical signal sequence, followed by a cysteine-rich domain, a fourfold repeated domain and a C-terminal domain. The protein lacks a typical transmembrane region. The fourfold repeated domain of OSF-2 shows homology with the insect protein fasciclin I. RNA analyses revealed that OSF-2 is expressed in bone and to a lesser extent in lung, but not in other tissues. Mouse OSF-2 cDNA was subsequently used as a probe to clone the human counterpart. Mouse and human OSF-2 show a high amino acid sequence conservation except for the signal sequence and two regions in the C-terminal domain in which 'in-frame' insertions or deletions are observed, implying alternative splicing events. On the basis of the amino acid sequence homology with fasciclin I, we suggest that OSF-2 functions as a homophilic adhesion molecule in bone formation. PMID:8363580
Linkage map of the fragments of herpesvirus papio DNA.

PubMed Central

Lee, Y S; Tanaka, A; Lau, R Y; Nonoyama, M; Rabin, H

1981-01-01

Herpesvirus papio (HVP), an Epstein-Barr-like virus, causes lymphoblastoid disease in baboons. The physical map of HVP DNA was constructed for the fragments produced by cleavage of HVP DNA with restriction endonucleases EcoRI, HindIII, SalI, and PvuI, which produced 12, 12, 10, and 4 fragments, respectively. The total molecular size of HVP DNA was calculated as close to 110 megadaltons. The following methods were used for construction of the map: (i) fragments near the ends of HVP DNA were identified by treating viral DNA with lambda exonuclease before restriction enzyme digestion; (ii) fragments containing nucleotide sequences in common with fragments from the second enzyme digest of HVP DNA were examined by Southern blot hybridization; and (iii) the location of some fragments was determined by isolating individual fragments from agarose gels and redigesting the isolated fragments with a second restriction enzyme. Terminal heterogeneity and internal repeats were found to be unique features of the HVP DNA molecule. One to five repeats of 0.8 megadaltons were found at both terminal ends. Although the repeats of both ends shared a certain degree of homology, it was not determined whether they were identical repeats. The internal repeat sequence of HVP DNA was found in the EcoRI-C region, which extended from 8.4 to 23 megadaltons from the left end of the molecule. The average number of the repeats was calculated to be seven, and the molecular size was determined to be 1.8 megadaltons. Similar unique features have been reported in EBV DNA (D. Given and E. Kieff, J. Virol. 28:524-542, 1978). PMID:6261015
Molecular cloning of crustins from the hemocytes of Brazilian penaeid shrimps.

PubMed

Rosa, Rafael Diego; Bandeira, Paula Terra; Barracco, Margherita Anna

2007-09-01

Crustins are antimicrobial peptides initially identified in the hemocytes of the crab Carcinus maenas (11.5-kDa peptide or carcinin) and recently also recognized in penaeid shrimps and other crustacean species. The aim of this study was to identify sequences encoding crustins from the hemocytes of four Brazilian penaeid species: Farfantepenaeus paulensis, Farfantepenaeus subtilis, Farfantepenaeus brasiliensis and Litopenaeus schmitti. Using primers based on consensus nucleotide alignment of crustins from different crustaceans, cDNA sequences coding for crustins in all indigenous penaeid species were amplified. The four crustin sequences obtained encoded peptides containing a hydrophobic N-terminal region rich in glycine repeats and a C-terminal part with 12 cysteine residues and a conserved whey acidic protein domain. All crustin sequences obtained showed high amino acid similarity to each other and to crustins from litopenaeid shrimps (76-98%). This is the first report of crustins in native Brazilian penaeid shrimps.
Insights on genome size evolution from a miniature inverted repeat transposon driving a satellite DNA.

PubMed

Scalvenzi, Thibault; Pollet, Nicolas

2014-12-01

The genome size in eukaryotes does not correlate well with the number of genes they contain. We can observe this so-called C-value paradox in amphibian species. By analyzing an amphibian genome we asked how repetitive DNA can impact genome size and architecture. We describe here our discovery of a Tc1/mariner miniature inverted-repeat transposon family present in Xenopus frogs. These transposons named miDNA4 are unique since they contain a satellite DNA motif. We found that miDNA4 measured 331 bp, contained 25 bp long inverted terminal repeat sequences and a sequence motif of 119 bp present as a unique copy or as an array of 2-47 copies. We characterized the structure, dynamics, impact and evolution of the miDNA4 family and its satellite DNA in Xenopus frog genomes. This led us to propose a model for the evolution of these two repeated sequences and how they can synergize to increase genome size. Copyright © 2014 Elsevier Inc. All rights reserved.
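The inverted terminal repeats mentioned in this abstract have a simple sequence-level meaning: one end of the element reads as the reverse complement of the other end. The following is a minimal Python sketch of that check, not the authors' method. The function names and the toy element are hypothetical, the miDNA4 sequence itself is not used, and only perfect (mismatch-free) repeats are counted.

```python
# Hypothetical helper names; a sketch of perfect inverted-terminal-repeat detection.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tir_length(element):
    """Length of the perfect terminal inverted repeat of `element`.

    The first k bases of the reverse complement of the whole element are the
    reverse complement of its last k bases, so the repeat length is the length
    of the common prefix of the element and its reverse complement, capped at
    half the element so the two arms cannot overlap.
    """
    element = element.upper()
    rc = reverse_complement(element)
    limit = len(element) // 2
    k = 0
    while k < limit and element[k] == rc[k]:
        k += 1
    return k


if __name__ == "__main__":
    # Toy element (made up for illustration): 9-base arm, spacer, inverted arm.
    arm = "GATTACAGG"
    toy = arm + "AAAAACCCCC" + reverse_complement(arm)
    print(tir_length(toy))
```

Real elements usually carry imperfect repeats, so a practical version would tolerate a few mismatches; that extension is left out here to keep the idea visible.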
Two new miniature inverted-repeat transposable elements in the genome of the clam Donax trunculus.

PubMed

Šatović, Eva; Plohl, Miroslav

2017-10-01

Repetitive sequences are important components of eukaryotic genomes that drive their evolution. Among them are different types of mobile elements that share the ability to spread throughout the genome and form interspersed repeats. To broaden the generally scarce knowledge on bivalves at the genome level, in the clam Donax trunculus we described two new non-autonomous DNA transposons, miniature inverted-repeat transposable elements (MITEs), named DTC M1 and DTC M2. Like other MITEs, they are characterized by their small size, their A + T richness, and the presence of terminal inverted repeats (TIRs). DTC M1 and DTC M2 are 261 and 286 bp long, respectively, and in addition to TIRs, both of them contain a long imperfect palindrome sequence in their central parts. These elements are present in complete and truncated versions within the genome of the clam D. trunculus. The two new MITEs share only structural similarity, but lack any nucleotide sequence similarity to each other. In a search for related elements in databases, blast search revealed within the Crassostrea gigas genome a larger element sharing sequence similarity only to DTC M1 in its TIR sequences. The lack of sequence similarity with any previously published mobile elements indicates that DTC M1 and DTC M2 elements may be unique to D. trunculus.
Complete genome sequence of the english isolate of rat cytomegalovirus (Murid herpesvirus 8).

PubMed

Ettinger, Jakob; Geyer, Henriette; Nitsche, Andreas; Zimmermann, Albert; Brune, Wolfram; Sandford, Gordon R; Hayward, Gary S; Voigt, Sebastian

2012-12-01

The complete genome of the English isolate of rat cytomegalovirus (RCMV-E) was determined. RCMV-E has a 202,946-bp genome with noninverting repeats but without terminal repeats. Thus, it differs significantly in size and genomic arrangement from closely related rodent cytomegaloviruses (CMVs). To account for the differences between the rat CMV isolates of Maastricht and England, RCMV-E was classified as Murid herpesvirus 8 by the International Committee on Taxonomy of Viruses.
Chromosome ends: different sequences may provide conserved functions.

PubMed

Louis, Edward J; Vershinin, Alexander V

2005-07-01

The structures of specific chromosome regions, centromeres and telomeres, present a number of puzzles. As functions performed by these regions are ubiquitous and essential, their DNA, proteins and chromatin structure are expected to be conserved. Recent studies of centromeric DNA from human, Drosophila and plant species have demonstrated that a hidden universal centromere-specific sequence is highly unlikely. The DNA of telomeres is more conserved consisting of a tandemly repeated 6-8 bp Arabidopsis-like sequence in a majority of organisms as diverse as protozoan, fungi, mammals and plants. However, there are alternatives to short DNA repeats at the ends of chromosomes and for telomere elongation by telomerase. Here we focus on the similarities and diversity that exist among the structural elements, DNA sequences and proteins, that make up terminal domains (telomeres and subtelomeres), and how organisms use these in different ways to fulfil the functions of end-replication and end-protection. Copyright (c) 2005 Wiley Periodicals, Inc.
Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie

2009-11-20

RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.
Complete covalent structure of statherin, a tyrosine-rich acidic peptide which inhibits calcium phosphate precipitation from human parotid saliva.

PubMed

Schlesinger, D H; Hay, D I

1977-03-10

The complete amino acid sequence of human salivary statherin, a peptide which strongly inhibits precipitation from supersaturated calcium phosphate solutions, and therefore stabilizes supersaturated saliva, has been determined. The NH2-terminal half of this Mr=5380 (43 amino acids) polypeptide was determined by automated Edman degradations (liquid phase) on native statherin. The peptide was digested separately with trypsin, chymotrypsin, and Staphylococcus aureus protease, and the resulting peptides were purified by gel filtration. Manual Edman degradations on purified peptide fragments yielded peptides that completed the amino acid sequence through the penultimate COOH-terminal residue. These analyses, together with carboxypeptidase digestion of native statherin and of peptide fragments of statherin, established the complete sequence of the molecule. The 2 serine residues (positions 2 and 3) in statherin were identified as phosphoserine. The amino acid sequence of human salivary statherin is striking in a number of ways. The NH2-terminal one-third is highly polar and includes three polar dipeptides: H2PO3-Ser-Ser-H2PO3-, -Arg-Arg-, and -Glu-Glu-. The COOH-terminal two-thirds of the molecule is hydrophobic, containing several repeating dipeptides: four of -Gln-Pro-, three of -Tyr-Gln-, two of -Gly-Tyr-, two of -Gln-Tyr-, and two of the tetrapeptide sequence -Pro-Tyr-Gln-Pro-. Unusual cleavage sites in the statherin sequence obtained with chymotrypsin and S. aureus protease were also noted.
The core of tau-paired helical filaments studied by scanning transmission electron microscopy and limited proteolysis.

PubMed

von Bergen, Martin; Barghorn, Stefan; Müller, Shirley A; Pickhardt, Marcus; Biernat, Jacek; Mandelkow, Eva-Maria; Davies, Peter; Aebi, Ueli; Mandelkow, Eckhard

2006-05-23

In Alzheimer's disease and frontotemporal dementias the microtubule-associated protein tau forms intracellular paired helical filaments (PHFs). The filaments formed in vivo consist mainly of full-length molecules of the six different isoforms present in adult brain. The substructure of the PHF core is still elusive. Here we applied scanning transmission electron microscopy (STEM) and limited proteolysis to probe the mass distribution of PHFs and their surface exposure. Tau filaments assembled from the three repeat domain have a mass per length (MPL) of approximately 60 kDa/nm and filaments from full-length tau (htau40DeltaK280 mutant) have approximately 160 kDa/nm, compared with approximately 130 kDa/nm for PHFs from Alzheimer's brain. Polyanionic cofactors such as heparin accelerate assembly but are not incorporated into PHFs. Limited proteolysis combined with N-terminal sequencing and mass spectrometry of fragments reveals a protease-sensitive N-terminal half and semiresistant PHF core starting in the first repeat and reaching to the C-terminus of tau. Continued proteolysis leads to a fragment starting at the end of the first repeat and ending in the fourth repeat. PHFs from tau isoforms with four repeats revealed an additional cleavage site within the middle of the second repeat. Probing the PHFs with antibodies detecting epitopes either over longer stretches in the C-terminal half of tau or in the fourth repeat revealed that they grow in a polar manner. These data describe the physical parameters of the PHFs and enabled us to build a model of the molecular arrangement within the filamentous structures.
The primitive code and repeats of base oligomers as the primordial protein-encoding sequence.

PubMed Central

Ohno, S; Epplen, J T

1983-01-01

Even if the prebiotic self-replication of nucleic acids and the subsequent emergence of primitive, enzyme-independent tRNAs are accepted as plausible, the origin of life by spontaneous generation still appears improbable. This is because the just-emerged primitive translational machinery had to cope with base sequences that were not preselected for their coding potentials. Particularly if the primitive mitochondria-like code with four chain-terminating base triplets preceded the universal code, the translation of long, randomly generated, base sequences at this critical stage would have merely resulted in the production of short oligopeptides instead of long polypeptide chains. We present the base sequence of a mouse transcript containing tetranucleotide repeats conserved during evolution. Even if translated in accordance with the primitive mitochondria-like code, this transcript in its three reading frames can yield 245-, 246-, and 251-residue-long tetrapeptidic periodical polypeptides that are already acquiring longer periodicities. We contend that the first set of base sequences translated at the beginning of life were such oligonucleotide repeats. By quickly acquiring longer periodicities, their products must have soon gained characteristic secondary structures--alpha-helical or beta-sheet or both. PMID:6574491
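The periodicity argument in this abstract can be checked with a little arithmetic: because 4 and 3 share no common factor, a tetranucleotide repeat read in triplets returns to the same phase every 12 bases, that is, every 4 codons, and each reading frame cycles through the same four codons. The Python sketch below illustrates this under stated assumptions. The abstract does not list its four chain-terminating triplets, so the four stop codons of the present-day vertebrate mitochondrial code (TAA, TAG, AGA, AGG) are used as a stand-in, and the repeat units and function names are hypothetical.

```python
from math import gcd

# Assumption: stand-in for the "four chain-terminating base triplets";
# these are the stops of the modern vertebrate mitochondrial code.
STOPS = frozenset({"TAA", "TAG", "AGA", "AGG"})


def codon_period(unit):
    """Number of codons after which the codon pattern of a repeat recurs."""
    return len(unit) // gcd(len(unit), 3)


def frame_codons(unit, frame, n_codons):
    """First `n_codons` codons of an endless tandem repeat of `unit`,
    read from offset `frame` (0, 1 or 2)."""
    needed = frame + 3 * n_codons
    seq = unit.upper() * (needed // len(unit) + 1)
    return [seq[i:i + 3] for i in range(frame, needed, 3)]


def frame_is_open(unit, frame):
    """True if one full period of the frame contains no stop codon, so the
    frame stays open however long the repeat array grows."""
    return not any(c in STOPS for c in frame_codons(unit, frame, codon_period(unit)))


if __name__ == "__main__":
    for unit in ("ACGT", "GATA"):  # arbitrary illustrative units
        for frame in range(3):
            print(unit, frame, frame_codons(unit, frame, 4), frame_is_open(unit, frame))
```

One consequence visible in the output is that the three frames of a tetranucleotide repeat contain the same set of four codons in shifted order, so they are either all open or all closed; this is consistent with the abstract's report of long periodic polypeptides from all three frames.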

DNA motifs determining the accuracy of repeat duplication during CRISPR adaptation in Haloarcula hispanica

PubMed Central

Wang, Rui; Li, Ming; Gong, Luyao; Hu, Songnian; Xiang, Hua

2016-01-01

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) acquire new spacers to generate adaptive immunity in prokaryotes. During spacer integration, the leader-preceded repeat is always accurately duplicated, leading to speculations of a repeat-length ruler. Here in Haloarcula hispanica, we demonstrate that the accurate duplication of its 30-bp repeat requires two conserved mid-repeat motifs, AACCC and GTGGG. The AACCC motif was essential and needed to be ∼10 bp downstream from the leader-repeat junction site, where duplication consistently started. Interestingly, repeat duplication terminated sequence-independently and usually with a specific distance from the GTGGG motif, which seemingly served as an anchor site for a molecular ruler. Accordingly, altering the spacing between the two motifs led to an aberrant duplication size (29, 31, 32 or 33 bp). We propose the adaptation complex may recognize these mid-repeat elements to enable measuring the repeat DNA for spacer integration. PMID:27085805
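The two quantities this abstract turns on, the distance of the AACCC motif from the leader-repeat junction and the spacing between AACCC and GTGGG, are easy to measure on a candidate repeat. The Python sketch below shows one way to do so. The 30-bp sequence is a made-up placeholder that merely contains both motifs, not the Haloarcula hispanica repeat, and the function names are hypothetical.

```python
# Placeholder 30-bp repeat invented for illustration only.
TOY_REPEAT = "ACGTACGTACAACCCTTGTGGGACGTACGT"


def motif_starts(repeat, motifs):
    """Map each motif to the 0-based index of its first occurrence (-1 if absent).
    Index 0 is taken to be the first base after the leader-repeat junction."""
    repeat = repeat.upper()
    return {m: repeat.find(m) for m in motifs}


def motif_gap(repeat, first="AACCC", second="GTGGG"):
    """Number of bases between the end of `first` and the start of `second`,
    or None if either motif is missing."""
    starts = motif_starts(repeat, (first, second))
    if min(starts.values()) < 0:
        return None
    return starts[second] - (starts[first] + len(first))


if __name__ == "__main__":
    print(len(TOY_REPEAT), motif_starts(TOY_REPEAT, ("AACCC", "GTGGG")), motif_gap(TOY_REPEAT))
    # Inserting one base between the motifs lengthens both the gap and the repeat,
    # mimicking the kind of spacing alteration described in the abstract.
    shifted = TOY_REPEAT[:15] + "A" + TOY_REPEAT[15:]
    print(len(shifted), motif_gap(shifted))
```

The sketch only measures distances; it makes no claim about how the adaptation complex reads them.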
Mutation of Hip’s Carboxy-Terminal Region Inhibits a Transitional Stage of Progesterone Receptor Assembly

PubMed Central

Prapapanich, Viravan; Chen, Shiying; Smith, David F.

1998-01-01

Steroid receptor complexes are assembled through an ordered, multistep pathway involving multiple components of the cytoplasmic chaperone machinery. Two of these components are Hsp70-binding proteins, Hip and Hop, that have some limited homology in their C-terminal regions, outside the sequences mapped for Hsp70 binding. Within this region of Hip is a DPEV sequence that occurs twice; in Hop, one DPEV sequence plus a partial second sequence occurs. In an effort to better understand Hip function as it relates to assembly of progesterone receptor complexes, the DPEV region of Hip was targeted for mutations. Each DPEV sequence was mutated to an APAV sequence, singly or in combination. The combined mutation, APAV2, was further combined with a deletion of Hip’s tetratricopeptide repeat region that is required for Hsp70 binding or with a deletion of Hip’s GGMP repeat. An additional mutant was prepared by truncation of Hip’s DPEV-containing C terminus. By comparing interactions of various Hip forms with Hsp70, it was determined that mutation of the DPEV sequences created a dominant inhibitory form of Hip. The mutant Hip-Hsp70 complex was not prevented from interacting with progesterone receptor, but the mutant caused a dose-dependent inhibition of receptor assembly with Hsp90. The behavior of the Hip mutant is consistent with a model in which Hip and Hop are required to facilitate the transition from an early receptor complex with Hsp70 into later complexes containing Hsp90. PMID:9447991
Structure determination of a peptide model of the repeated helical domain in Samia cynthia ricini silk fibroin before spinning by a combination of advanced solid-state NMR methods.

PubMed

Nakazawa, Yasumoto; Asakura, Tetsuo

2003-06-18

Fibrous proteins, unlike globular proteins, contain repetitive amino acid sequences, giving rise to very regular secondary protein structures. Silk fibroin from a wild silkworm, Samia cynthia ricini, consists of about 100 repeats of alternating polyalanine (poly-Ala) regions of 12-13 residues in length and Gly-rich regions. In this paper, the precise structure of the model peptide, GGAGGGYGGDGG(A)(12)GGAGDGYGAG, which is a typical repeated sequence of the silk fibroin, was determined using a combination of three kinds of solid-state NMR studies: a quantitative use of (13)C CP/MAS NMR chemical shift with conformation-dependent (13)C chemical shift contour plots, 2D spin diffusion (13)C solid-state NMR under off magic angle spinning, and rotational echo double resonance (REDOR). The structure of the model peptide corresponding to the silk fibroin structure before spinning was determined. The torsion angles of the central Ala residue, Ala(19), in the poly-Ala region were determined to be (phi, psi) = (-59 degrees, -48 degrees), which are values typically associated with alpha-helical structures. However, the torsion angles of the Gly(25) residue adjacent to the C-terminal side of the poly-Ala chain were determined to be (phi, psi) = (-66 degrees, -22 degrees) and those of Gly(12) and Ala(13) residues at the N-terminal end of the poly-Ala chain to be (phi, psi) = (-70 degrees, -30 degrees). In addition, REDOR experiments indicate that the torsion angles of the two C-terminal Ala residues, Ala(23) and Ala(24), are (phi, psi) = (-66 degrees, -22 degrees) and those of the two N-terminal Ala residues, Ala(13) and Ala(14), are (phi, psi) = (-70 degrees, -30 degrees). Thus, the local structure of the N-terminal and C-terminal residues, and also of the neighboring residues of the alpha-helical poly-Ala chain in the model peptide, is more strongly wound than that found in typical alpha-helix structures.
Repeated extragenic sequences in prokaryotic genomes: a proposal for the origin and dynamics of the RUP element in Streptococcus pneumoniae.

PubMed

Oggioni, M R; Claverys, J P

1999-10-01

A survey of all Streptococcus pneumoniae GenBank/EMBL DNA sequence entries and of the public domain sequence (representing more than 90% of the genome) of an S. pneumoniae type 4 strain allowed identification of 108 copies of a 107-bp-long highly repeated intergenic element called RUP (for repeat unit of pneumococcus). Several features of the element, revealed in this study, led to the proposal that RUP is an insertion sequence (IS)-derivative that could still be mobile. Among these features are: (1) a highly significant homology between the terminal inverted repeats (IRs) of RUPs and of IS630-Spn1, a new putative IS of S. pneumoniae; and (2) insertion at a TA dinucleotide, a characteristic target of several members of the IS630 family. Trans-mobilization of RUP is therefore proposed to be mediated by the transposase of IS630-Spn1. To account for the observation that RUPs are distributed among four subtypes which exhibit different degrees of sequence homogeneity, a scenario is invoked based on successive stages of RUP mobility and non-mobility, depending on whether an active transposase is present or absent. In the latter situation, an active transposase could be reintroduced into the species through natural transformation. Examination of sequences flanking RUP revealed a preferential association with ISs. It also provided evidence that RUPs promote sequence rearrangements, thereby contributing to genome flexibility. The possibility that RUP preferentially targets transforming DNA of foreign origin and subsequently favours disruption/rearrangement of exogenous sequences is discussed.
Carboxy-terminal sequence variation of LMP1 gene in Epstein-Barr-virus-associated mononucleosis and tumors from Serbian patients.

PubMed

Banko, Ana; Lazarevic, Ivana; Cupic, Maja; Stevanovic, Goran; Boricic, Ivan; Jovanovic, Tanja

2012-04-01

Seven strains of Epstein-Barr virus (EBV) are defined based on C-terminal sequence variations of the latent membrane protein 1 (LMP1). Some strains, especially those with a 30-bp deletion, are thought to be related to tumorigenic activity and geographical localization. The aims of the study were to determine the prevalence of different LMP1 strains and to investigate sequence variation in the C-terminal region of LMP1 in Serbian isolates. This study included 53 EBV-DNA-positive plasma and tissue block samples from patients with mononucleosis syndrome, renal transplantation, and tumors, mostly nasopharyngeal carcinoma. The sequence of the 506-bp fragment of LMP1 C terminus was used for phylogenetic analyses and identification of LMP1 strains, deletions, and mutations. The majority of isolates were non-deleted (66%), and the rest had 30-bp, rare 69-bp, or yet unknown 27-bp deletions, which were not related to malignant or non-malignant isolate origin. However, the majority of 69-bp deletion isolates were derived from patients with nasopharyngeal carcinoma. Less than five 33-bp repeats were found in the majority of non-deleted isolates (68.6%), whereas most 69-bp deletion isolates (75%) had five or six repeats. Serbian isolates were assigned to four LMP1 strains: B95-8 (32.1%), China 1 (24.5%), North Carolina (NC; 18.9%), and Mediterranean (Med; 24.5%). In NC isolates, three new mutations unique for this strain were identified. EBV EBNA2 genotypes 1 and 2 were both found, with dominance of genotype 1 (90.7%). This study demonstrated noticeable geographical-associated characteristics in the LMP1 C terminus of investigated isolates. Copyright © 2012 Wiley Periodicals, Inc.
Complete sequence of Tvv1, a family of Ty 1 copia-like retrotransposons of Vitis vinifera L., reconstituted by chromosome walking.

PubMed

Pelsy, F.; Merdinoglu, D.

2002-09-01

A chromosome-walking strategy was used to sequence and characterize retrotransposons in the grapevine genome. The reconstitution of a family of retroelements, named Tvv1, was achieved by six successive steps. These elements share a single, highly conserved open reading frame 4,153 nucleotides-long, putatively encoding the gag, pro, int, rt and rh proteins. Comparison of the Tvv1 open reading frame coding potential with those of drosophila copia and tobacco Tnt1, revealed that Tvv1 is closely related to Ty 1 copia-like retrotransposons. A highly variable untranslated leader region, upstream of the open reading frame, allowed us to differentiate Tvv1 variants, which represent a family of at least 28 copies, in varying sizes. This internal region is flanked by two long terminal repeats in direct orientation, sized between 149 and 157 bp. Among elements theoretically sized from 4,970 to 5,550 bp, we describe the full-length sequence of a reference element Tvv1-1, 5,343 nucleotides-long. The full-length sequence of Tvv1-1 compared to pea PDR1 shows a 53.3% identity. In addition, both elements contain long terminal repeats of nearly the same size in which the U5 region could be entirely absent. Therefore, we assume that Tvv1 and PDR1 could constitute a particular class of short LTRs retroelements.
DNA sequence transfer between two high-cysteine chorion gene families in the silkmoth Bombyx mori.

PubMed Central

Iatrou, K; Tsitilou, S G; Kafatos, F C

1984-01-01

We have previously shown that one type of high-cysteine silkmoth chorion protein (Hc-A) has evolved from the A family of chorion proteins by radical modifications of the NH2-terminal and COOH-terminal polypeptide arms: most of the arm sequences have been deleted, while short cysteine- and glycine-containing repeats have expanded into long arrays. Strikingly similar modifications of the arms have led to the evolution of a second type of high-cysteine protein (Hc-B) from the B family of chorion proteins. It appears that the parallel evolution of these high-cysteine-encoding gene families has not been entirely independent: examination of 3' untranslated regions shows evidence of information transfer between the two families. PMID:6589605
Gene-for-genes interactions between cotton R genes and Xanthomonas campestris pv. malvacearum avr genes.

PubMed

De Feyter, R; Yang, Y; Gabriel, D W

1993-01-01

Six plasmid-borne avirulence (avr) genes were previously cloned from strain XcmH of the cotton pathogen, Xanthomonas campestris pv. malvacearum. We have now localized all six avr genes on the cloned fragments by subcloning and Tn5-gusA insertional mutagenesis. None of these avr genes appeared to exhibit exclusively gene-for-gene patterns of interactions with cotton R genes, and avrB4 was demonstrated to confer avr gene-for-R genes (plural) avirulence to X. c. pv. malvacearum on congenic cotton lines carrying either of two different resistance loci, B1 or B4. Furthermore, the B1 locus appeared to confer R gene-for-avr genes resistance to cotton against isogenic X. c. pv. malvacearum strains carrying any one of three avr genes: avrB4, avrb6, or avrB102. Restriction enzyme, Southern blot hybridization, and DNA sequence analyses showed that the XcmH avr genes are all highly similar to each other, to avrBs3 and avrBsP from the pepper pathogen X. c. pv. vesicatoria, and to the host-specific virulence gene pthA from the citrus pathogen X. citri. The XcmH avr genes differed primarily in the multiplicity of a tandemly repeated 102-base pair motif within the central portions of the genes, repeated from 14 to 23 times in members of this gene family. The complete nucleotide sequence of avrb6 revealed that it is 97% identical in DNA sequence to avrB4, avrBs3, avrBsP, and pthA and that 62-bp inverted terminal repeats mark the boundaries of homology between avrb6 and all members of this Xanthomonas virulence/avirulence gene family sequenced to date. The terminal 38 bp of both inverted repeats are highly similar to the 38-bp consensus terminal sequence of the Tn3 family of transposons. Up to 11 members of the avr gene family appear to be present in North American strains of X. c. pv. malvacearum, including XcmH. 
The high level of homology observed among these avr genes and their presence in multiple copies may explain the gene-for-genes interactions and also the observed high frequencies (10^-3 to 10^-4 per locus) of X. c. pv. malvacearum race change mutations. Five spontaneous race change mutants of XcmH suffered avr locus deletions, strongly indicating intergenic recombination as the primary mechanism for generating new races in X. c. pv. malvacearum.
Genetic determinants of mate recognition in Brachionus manjavacas (Rotifera)

PubMed Central

Snell, Terry W; Shearer, Tonya L; Smith, Hilary A; Kubanek, Julia; Gribble, Kristin E; Welch, David B Mark

2009-01-01

Background: Mate choice is of central importance to most animals, influencing population structure, speciation, and ultimately the survival of a species. Mating behavior of male brachionid rotifers is triggered by the product of a chemosensory gene, a glycoprotein on the body surface of females called the mate recognition pheromone. The mate recognition pheromone has been biochemically characterized, but little was known about the gene(s). We describe the isolation and characterization of the mate recognition pheromone gene through protein purification, N-terminal amino acid sequence determination, identification of the mate recognition pheromone gene from a cDNA library, sequencing, and RNAi knockdown to confirm the functional role of the mate recognition pheromone gene in rotifer mating. Results: A 29 kD protein capable of eliciting rotifer male circling was isolated by high-performance liquid chromatography. Two transcript types containing the N-terminal sequence were identified in a cDNA library; further characterization by screening a genomic library and by polymerase chain reaction revealed two genes belonging to each type. Each gene begins with a signal peptide region followed by nearly perfect repeats of an 87 to 92 codon motif with no codons between repeats and the final motif prematurely terminated by the stop codon. The two Type A genes contain four and seven repeats and the two Type B genes contain three and five repeats, respectively. Only the Type B gene with three repeats encodes a peptide with a molecular weight of 29 kD. Each repeat of the Type B gene products contains three asparagines as potential sites for N-glycosylation; there are no asparagines in the Type A genes. RNAi with Type A double-stranded RNA did not result in less circling than in the phosphate-buffered saline control, but transfection with Type B double-stranded RNA significantly reduced male circling by 17%. 
The very low divergence between repeat units, even at synonymous positions, suggests that the repeats are kept nearly identical through a process of concerted evolution. Information-rich molecules like surface glycoproteins are well adapted for chemical communication and aquatic animals may have evolved signaling systems based on these compounds, whereas insects use cuticular hydrocarbons. Conclusion Owing to its critical role in mating, the mate recognition pheromone gene will be a useful molecular marker for exploring the mechanisms and rates of selection and the evolution of reproductive isolation and speciation using rotifers as a model system. The phylogenetic variation in the mate recognition pheromone gene can now be studied in conjunction with the large amount of ecological and population genetic data being gathered for the Brachionus plicatilis species complex to understand better the evolutionary drivers of cryptic speciation. PMID:19740420
Genetic determinants of mate recognition in Brachionus manjavacas (Rotifera).

PubMed

Snell, Terry W; Shearer, Tonya L; Smith, Hilary A; Kubanek, Julia; Gribble, Kristin E; Welch, David B Mark

2009-09-09

Mate choice is of central importance to most animals, influencing population structure, speciation, and ultimately the survival of a species. Mating behavior of male brachionid rotifers is triggered by the product of a chemosensory gene, a glycoprotein on the body surface of females called the mate recognition pheromone. The mate recognition pheromone has been biochemically characterized, but little was known about the gene(s). We describe the isolation and characterization of the mate recognition pheromone gene through protein purification, N-terminal amino acid sequence determination, identification of the mate recognition pheromone gene from a cDNA library, sequencing, and RNAi knockdown to confirm the functional role of the mate recognition pheromone gene in rotifer mating. A 29 kD protein capable of eliciting rotifer male circling was isolated by high-performance liquid chromatography. Two transcript types containing the N-terminal sequence were identified in a cDNA library; further characterization by screening a genomic library and by polymerase chain reaction revealed two genes belonging to each type. Each gene begins with a signal peptide region followed by nearly perfect repeats of an 87 to 92 codon motif with no codons between repeats and the final motif prematurely terminated by the stop codon. The two Type A genes contain four and seven repeats and the two Type B genes contain three and five repeats, respectively. Only the Type B gene with three repeats encodes a peptide with a molecular weight of 29 kD. Each repeat of the Type B gene products contains three asparagines as potential sites for N-glycosylation; there are no asparagines in the Type A genes. RNAi with Type A double-stranded RNA did not result in less circling than in the phosphate-buffered saline control, but transfection with Type B double-stranded RNA significantly reduced male circling by 17%. 
The very low divergence between repeat units, even at synonymous positions, suggests that the repeats are kept nearly identical through a process of concerted evolution. Information-rich molecules like surface glycoproteins are well adapted for chemical communication and aquatic animals may have evolved signaling systems based on these compounds, whereas insects use cuticular hydrocarbons. Owing to its critical role in mating, the mate recognition pheromone gene will be a useful molecular marker for exploring the mechanisms and rates of selection and the evolution of reproductive isolation and speciation using rotifers as a model system. The phylogenetic variation in the mate recognition pheromone gene can now be studied in conjunction with the large amount of ecological and population genetic data being gathered for the Brachionus plicatilis species complex to understand better the evolutionary drivers of cryptic speciation.
Genetic diversity of Danthonia spicata (L.) Beauv. Based on genomic simple sequence repeat markers

USDA-ARS's Scientific Manuscript database

Danthonia spicata, commonly known as poverty oatgrass, is a perennial bunch-type grass native to North America. D. spicata has dimorphic seed heads; the hypothesis is that terminal seed heads allow some level of outcrossing and axial seed heads are only self-fertilized. However, there is no genetic ...
Discovery and analysis of an active long terminal repeat-retrotransposable element in Aspergillus oryzae.

PubMed

Jie Jin, Feng; Hara, Seiichi; Sato, Atsushi; Koyama, Yasuji

2014-01-01

Wild-type Aspergillus oryzae RIB40 contains two copies of the AO090005001597 gene. We previously constructed an A. oryzae RIB40 strain, RKuAF8B, with multiple chromosomal deletions, in which the AO090005001597 copy number was found to be increased significantly. Sequence analysis indicated that AO090005001597 is part of a putative 6,000-bp retrotransposable element, flanked by two long terminal repeats (LTRs) of 669 bp, with characteristics of retroviruses and retrotransposons, and thus designated AoLTR (A. oryzae LTR-retrotransposable element). AoLTR comprised putative reverse transcriptase, RNase H, and integrase domains. The deduced amino acid sequence alignment of AoLTR showed 94% overall identity with AFLAV, an A. flavus Tf1/sushi retrotransposon. Quantitative real-time RT-PCR showed that AoLTR gene expression was significantly increased in RKuAF8B, in accordance with the increased copy number. Inverse PCR indicated that the full-length retrotransposable element was randomly integrated into multiple genomic locations. However, no obvious phenotypic changes were associated with the increased AoLTR gene copy number.
Structure and possible function of a G-quadruplex in the long terminal repeat of the proviral HIV-1 genome

PubMed Central

De Nicola, Beatrice; Lech, Christopher J.; Heddi, Brahim; Regmi, Sagar; Frasson, Ilaria; Perrone, Rosalba; Richter, Sara N.; Phan, Anh Tuân

2016-01-01

The long terminal repeat (LTR) of the proviral human immunodeficiency virus (HIV)-1 genome is integral to virus transcription and host cell infection. The guanine-rich U3 region within the LTR promoter, previously shown to form G-quadruplex structures, represents an attractive target to inhibit HIV transcription and replication. In this work, we report the structure of a biologically relevant G-quadruplex within the LTR promoter region of HIV-1. The guanine-rich sequence designated LTR-IV forms a well-defined structure in physiological cationic solution. The nuclear magnetic resonance (NMR) structure of this sequence reveals a parallel-stranded G-quadruplex containing a single-nucleotide thymine bulge, which participates in a conserved stacking interaction with a neighboring single-nucleotide adenine loop. Transcription analysis in an HIV-1 replication-competent cell indicates that the LTR-IV region may act as a modulator of G-quadruplex formation in the LTR promoter. Consequently, the LTR-IV G-quadruplex structure presented within this work could represent a valuable target for the design of HIV therapeutics. PMID:27298260
Long Terminal Repeat Retrotransposon Content in Eight Diploid Sunflower Species Inferred from Next-Generation Sequence Data

PubMed Central

Tetreault, Hannah M.; Ungerer, Mark C.

2016-01-01

The most abundant transposable elements (TEs) in plant genomes are Class I long terminal repeat (LTR) retrotransposons represented by superfamilies gypsy and copia. Amplification of these superfamilies directly impacts genome structure and contributes to differential patterns of genome size evolution among plant lineages. Utilizing short-read Illumina data and sequence information from a panel of Helianthus annuus (sunflower) full-length gypsy and copia elements, we explore the contribution of these sequences to genome size variation among eight diploid Helianthus species and an outgroup taxon, Phoebanthus tenuifolius. We also explore transcriptional dynamics of these elements in both leaf and bud tissue via RT-PCR. We demonstrate that most LTR retrotransposon sublineages (i.e., families) display patterns of similar genomic abundance across species. A small number of LTR retrotransposon sublineages exhibit lineage-specific amplification, particularly in the genomes of species with larger estimated nuclear DNA content. RT-PCR assays reveal that some LTR retrotransposon sublineages are transcriptionally active across all species and tissue types, whereas others display species-specific and tissue-specific expression. The species with the largest estimated genome size, H. agrestis, has experienced amplification of LTR retrotransposon sublineages, some of which have proliferated independently in other lineages in the Helianthus phylogeny. PMID:27233667
Repeats of base oligomers as the primordial coding sequences of the primeval earth and their vestiges in modern genes.

PubMed

Ohno, S

1984-01-01

Three outstanding properties uniquely qualify repeats of base oligomers as the primordial coding sequences of all polypeptide chains. First, when compared with randomly generated base sequences in general, they are more likely to have long open reading frames. Second, periodical polypeptide chains specified by such repeats are more likely to assume either alpha-helical or beta-sheet secondary structures than are polypeptide chains of random sequence. Third, provided that the number of bases in the oligomeric unit is not a multiple of 3, these internally repetitious coding sequences are impervious to randomly sustained base substitutions, deletions, and insertions. This is because the recurring periodicity of their polypeptide chains is given by three consecutive copies of the oligomeric unit translated in three different reading frames. Accordingly, when one reading frame is open, the other two are automatically open as well, all three being capable of coding for polypeptide chains of identical periodicity. Under this circumstance, a frame shift due to the deletion or insertion of a number of bases that is not a multiple of 3 fails to alter the down-stream amino acid sequence, and even a base change causing premature chain-termination can silence only one of the three potential coding units. Newly arisen coding sequences in modern organisms are oligomeric repeats, and most of the older genes retain various vestiges of their original internal repetitions. Some of the genes (e.g., oncogenes) have even inherited the property of being impervious to randomly sustained base changes.
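The third property described above can be illustrated with a small sketch. Nothing in it comes from the abstract itself: the 4-base unit GCAT, the helper names, and the use of the standard genetic code are assumptions chosen for illustration. It translates a tandem repeat in all three forward frames and shows that, when the unit length is not a multiple of 3, each frame yields the same periodic peptide up to a cyclic shift.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, frame: int = 0) -> str:
    """Translate dna from the given frame offset, ignoring a trailing partial codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3))


def frame_peptides(unit: str, copies: int = 6) -> list[str]:
    """Translate a tandem repeat of `unit` in all three forward reading frames."""
    dna = unit * copies
    return [translate(dna, frame) for frame in range(3)]


def is_rotation(a: str, b: str) -> bool:
    """True if b is a cyclic rotation of a."""
    return len(a) == len(b) and b in a + a


if __name__ == "__main__":
    unit = "GCAT"  # hypothetical 4-base unit (length not a multiple of 3)
    for frame, peptide in enumerate(frame_peptides(unit)):
        print(frame, peptide)
```

With a unit whose length is a multiple of 3 (for example the hypothetical 6-base unit GCATGC), the three frames generally encode different peptides, which is the contrast the abstract draws.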
Mitochondrial genome of the tomato clownfish Amphiprion frenatus (Pomacentridae, Amphiprioninae).

PubMed

Ye, Le; Hu, Jing; Wu, Kaichang; Wang, Yu; Li, Jianlong

2016-01-01

The complete mitochondrial (mt) genome of the tomato clownfish Amphiprion frenatus was obtained in this study. The circular mtDNA molecule was 16,774 bp in size and the overall nucleotide composition of the H-strand was 29.72% A, 25.81% T, 15.38% G and 29.09% C, with an A + T bias. The complete mitogenome encoded 13 protein-coding genes, 2 rRNAs, 22 tRNAs and a control region (D-loop), with the gene arrangement and translation direction basically identical to other typical vertebrate mitogenomes. The D-loop included a termination-associated sequence (TAS), a central conserved domain (CCD) and a conserved sequence block (CSB), and was composed of 6 complete contiguous tandem repeat units and an imperfect tandem repeat unit.
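The A + T bias can be checked directly from the reported percentages. The short sketch below uses only the four values given in the abstract; the function names are hypothetical.

```python
# Base composition of the H-strand as reported in the abstract (percent).
composition = {"A": 29.72, "T": 25.81, "G": 15.38, "C": 29.09}


def at_percent(comp: dict[str, float]) -> float:
    """Combined A + T percentage, rounded to two decimals."""
    return round(comp["A"] + comp["T"], 2)


def gc_percent(comp: dict[str, float]) -> float:
    """Combined G + C percentage, rounded to two decimals."""
    return round(comp["G"] + comp["C"], 2)


if __name__ == "__main__":
    print("A+T:", at_percent(composition))  # above 50 indicates an A + T bias
    print("G+C:", gc_percent(composition))
```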
“One code to find them all”: a perl tool to conveniently parse RepeatMasker output files

PubMed Central

2014-01-01

Background: Of the different bioinformatic methods used to recover transposable elements (TEs) in genome sequences, one of the most commonly used procedures is the homology-based method proposed by the RepeatMasker program. RepeatMasker generates several output files, including the .out file, which provides annotations for all detected repeats in a query sequence. However, a remaining challenge consists of identifying the different copies of TEs that correspond to the identified hits. This step is essential for any evolutionary/comparative analysis of the different copies within a family. Different possibilities can lead to multiple hits corresponding to a unique copy of an element, such as the presence of large deletions/insertions or undetermined bases, and distinct consensus corresponding to a single full-length sequence (like for long terminal repeat (LTR)-retrotransposons). These possibilities must be taken into account to determine the exact number of TE copies. Results: We have developed a Perl tool that parses the RepeatMasker .out file to better determine the number and positions of TE copies in the query sequence, in addition to computing quantitative information for the different families. To determine the accuracy of the program, we tested it on several RepeatMasker .out files corresponding to two organisms (Drosophila melanogaster and Homo sapiens) for which the TE content has already been largely described and which present great differences in genome size, TE content, and TE families. Conclusions: Our tool provides access to detailed information concerning the TE content in a genome at the family level from the .out file of RepeatMasker. This information includes the exact position and orientation of each copy, its proportion in the query sequence, and its quality compared to the reference element.
In addition, our tool allows a user to directly retrieve the sequence of each copy and obtain the same detailed information at the family level when a local library with incomplete TE class/subclass information was used with RepeatMasker. We hope that this tool will be helpful for people working on the distribution and evolution of TEs within genomes.
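To make the starting point of such parsing concrete, here is a minimal Python sketch (not the authors' Perl tool) that reads RepeatMasker .out text and sums annotated base pairs per class/family. It assumes the usual whitespace-separated layout in which data rows begin with an integer score and columns 5 to 11 hold query name, begin, end, (left), strand, repeat name, and class/family. The sample rows and all names in them are made up for illustration. It counts raw hits only; it does not attempt the merging of multiple hits into single copies that the abstract describes.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    begin: int
    end: int
    strand: str   # "+" or "C" (complement)
    repeat: str
    family: str   # repeat class/family column


def parse_out(text: str) -> list[Hit]:
    """Parse RepeatMasker .out text, skipping header and blank lines."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: data rows start with an integer score; headers do not.
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        hits.append(Hit(fields[4], int(fields[5]), int(fields[6]),
                        fields[8], fields[9], fields[10]))
    return hits


def bp_per_family(hits: list[Hit]) -> dict[str, int]:
    """Sum annotated query base pairs per class/family (raw hits, no merging)."""
    totals: dict[str, int] = defaultdict(int)
    for h in hits:
        totals[h.family] += h.end - h.begin + 1
    return dict(totals)


# Hypothetical sample rows; the last two mimic one LTR element reported as
# separate internal and LTR consensus hits.
SAMPLE = """\
   SW  perc perc perc  query     position in query    matching  repeat       position in repeat
score  div. del. ins.  sequence  begin  end  (left)   repeat    class/family begin end (left) ID

 1320 15.6  6.2  0.0  seq1  6563  6781  (22462) C  DNA1_X      DNA/hAT    (0)  336  103  1
  500 10.0  0.0  0.0  seq1  7000  7499  (21744) +  Gypsy1_I    LTR/Gypsy  1  500  (4500)  2
  300 12.0  0.0  0.0  seq1  9000  9199  (20044) +  Gypsy1_LTR  LTR/Gypsy  1  200  (0)  3
"""

if __name__ == "__main__":
    print(bp_per_family(parse_out(SAMPLE)))
```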
Functional Angucycline-Like Antibiotic Gene Cluster in the Terminal Inverted Repeats of the Streptomyces ambofaciens Linear Chromosome

PubMed Central

Pang, Xiuhua; Aigle, Bertrand; Girardet, Jean-Michel; Mangenot, Sophie; Pernodet, Jean-Luc; Decaris, Bernard; Leblond, Pierre

2004-01-01

Streptomyces ambofaciens has an 8-Mb linear chromosome ending in 200-kb terminal inverted repeats. Analysis of the F6 cosmid overlapping the terminal inverted repeats revealed a locus similar to type II polyketide synthase (PKS) gene clusters. Sequence analysis identified 26 open reading frames, including genes encoding the β-ketoacyl synthase (KS), chain length factor (CLF), and acyl carrier protein (ACP) that make up the minimal PKS. These KS, CLF, and ACP subunits are highly homologous to minimal PKS subunits involved in the biosynthesis of angucycline antibiotics. The genes encoding the KS and ACP subunits are transcribed constitutively but show a remarkable increase in expression after entering transition phase. Five genes, including those encoding the minimal PKS, were replaced by resistance markers to generate single and double mutants (replacement in one and both terminal inverted repeats). Double mutants were unable to produce either diffusible orange pigment or antibacterial activity against Bacillus subtilis. Single mutants showed an intermediate phenotype, suggesting that each copy of the cluster was functional. Transformation of double mutants with a conjugative and integrative form of F6 partially restored both phenotypes. The pigmented and antibacterial compounds were shown to be two distinct molecules produced from the same biosynthetic pathway. High-pressure liquid chromatography analysis of culture extracts from wild-type and double mutants revealed a peak with an associated bioactivity that was absent from the mutants. Two additional genes encoding KS and CLF were present in the cluster. However, disruption of the second KS gene had no effect on either pigment or antibiotic production. PMID:14742212
Molecular cloning of the pheromone biosynthesis-activating neuropeptide in Helicoverpa zea.

PubMed Central

Davis, M T; Vakharia, V N; Henry, J; Kempe, T G; Raina, A K

1992-01-01

Pheromone biosynthesis-activating neuropeptide (PBAN) regulates sex pheromone biosynthesis in female Helicoverpa (Heliothis) zea. Two oligonucleotide probes representing two overlapping amino acid regions of PBAN were used to screen 2.5 x 10(5) recombinant plaques, and a positive recombinant clone was isolated. Sequence analysis of the isolated clone showed that the PBAN gene is interrupted after the codon encoding amino acid 14 by a 0.63-kilobase (kb) intron. Preceding the PBAN amino acid sequence is a 10-amino acid sequence containing a pentapeptide Phe-Thr-Pro-Arg-Leu, which is followed by a Gly-Arg-Arg processing site. Immediately after the PBAN amino acid sequence is a Gly-Arg processing site and a short stretch of 10 amino acids. This 10-amino acid sequence contains a repeat of the PBAN C-terminal pentapeptide Phe-Ser-Pro-Arg-Leu and is terminated by another Gly-Arg processing site. It is suggested that the PBAN gene in H. zea might carry, besides PBAN, a 7- and an 8-residue amidated peptide, which share with PBAN the core C-terminal pentapeptide Phe-(Ser or Thr)-Pro-Arg-Leu-NH2. The C-terminal pentapeptide sequence of PBAN represents the minimum sequence required for pheromonotropic activity in H. zea and also bears a high degree of homology to the pyrokinin family of insect peptides with myotropic activity. It is possible that the putative heptapeptide and octapeptide might be new members of the pyrokinin family, with pheromonotropic and/or myotropic activities. Thus, the PBAN gene products, besides affecting sexual behavior, might have broad influence on many biological processes in H. zea. PMID:1729680
Novel Structure of Ty3 Reverse Transcriptase | Center for Cancer Research

Cancer.gov

Retrotransposons are mobile genetic elements that self amplify via a single-stranded RNA intermediate, which is converted to double-stranded DNA by an encoded reverse transcriptase (RT) with both DNA polymerase (pol) and ribonuclease H (RNase) activities. Categorized by whether they contain flanking long terminal repeat (LTR) sequences, retrotransposons play a critical role in

Identification of a non-LTR retrotransposon from the gypsy moth

Treesearch

K.J. Garner; J.M. Slavicek

1999-01-01

A family of highly repetitive elements, named LDT1, has been identified in the gypsy moth, Lymantria dispar. The complete element is 5.4 kb in length and lacks long terminal repeats. The element contains two open reading frames with a significant amino acid sequence similarity to several non-LTR retrotransposons. The first open reading frame contains...
Comparative molecular cytogenetics of major repetitive sequence families of three Dendrobium species (Orchidaceae) from Bangladesh

PubMed Central

Begum, Rabeya; Alam, Sheikh Shamimul; Menzel, Gerhard; Schmidt, Thomas

2009-01-01

Background and Aims: Dendrobium species show tremendous morphological diversity and have broad geographical distribution. As repetitive sequence analysis is a useful tool to investigate the evolution of chromosomes and genomes, the aim of the present study was the characterization of repetitive sequences from Dendrobium moschatum for comparative molecular and cytogenetic studies in the related species Dendrobium aphyllum, Dendrobium aggregatum and representatives from other orchid genera. Methods: In order to isolate highly repetitive sequences, a c0t-1 DNA plasmid library was established. Repeats were sequenced and used as probes for Southern hybridization. Sequence divergence was analysed using bioinformatic tools. Repetitive sequences were localized along orchid chromosomes by fluorescence in situ hybridization (FISH). Key Results: Characterization of the c0t-1 library resulted in the detection of repetitive sequences including the (GA)n dinucleotide DmoO11, numerous Arabidopsis-like telomeric repeats and the highly amplified dispersed repeat DmoF14. The DmoF14 repeat is conserved in six Dendrobium species but diversified in representative species of three other orchid genera. FISH analyses showed the genome-wide distribution of DmoF14 in D. moschatum, D. aphyllum and D. aggregatum. Hybridization with the telomeric repeats demonstrated Arabidopsis-like telomeres at the chromosome ends of Dendrobium species. However, FISH using the telomeric probe revealed two pairs of chromosomes with strong intercalary signals in D. aphyllum. FISH showed the terminal position of 5S and 18S–5·8S–25S rRNA genes and a characteristic number of rDNA sites in the three Dendrobium species. Conclusions: The repeated sequences isolated from D. moschatum c0t-1 DNA constitute major DNA families of the D. moschatum, D. aphyllum and D. aggregatum genomes with DmoF14 representing an ancient component of orchid genomes.
Large intercalary telomere-like arrays suggest chromosomal rearrangements in D. aphyllum while the number and localization of rRNA genes as well as the species-specific distribution pattern of an abundant microsatellite reflect the genomic diversity of the three Dendrobium species. PMID:19635741
Concerted evolution of the tandem array encoding primate U2 snRNA occurs in situ, without changing the cytological context of the RNU2 locus.

PubMed Central

Pavelitz, T; Rusché, L; Matera, A G; Scharf, J M; Weiner, A M

1995-01-01

In primates, the tandemly repeated genes encoding U2 small nuclear RNA evolve concertedly, i.e. the sequence of the U2 repeat unit is essentially homogeneous within each species but differs somewhat between species. Using chromosome painting and the NGFR gene as an outside marker, we show that the U2 tandem array (RNU2) has remained at the same chromosomal locus (equivalent to human 17q21) through multiple speciation events over > 35 million years leading to the Old World monkey and hominoid lineages. The data suggest that the U2 tandem repeat, once established in the primate lineage, contained sequence elements favoring perpetuation and concerted evolution of the array in situ, despite a pericentric inversion in chimpanzee, a reciprocal translocation in gorilla and a paracentric inversion in orang utan. Comparison of the 11 kb U2 repeat unit found in baboon and other Old World monkeys with the 6 kb U2 repeat unit in humans and other hominids revealed that an ancestral U2 repeat unit was expanded by insertion of a 5 kb retrovirus bearing 1 kb long terminal repeats (LTRs). Subsequent excision of the provirus by homologous recombination between the LTRs generated a 6 kb U2 repeat unit containing a solo LTR. Remarkably, both junctions between the human U2 tandem array and flanking chromosomal DNA at 17q21 fall within the solo LTR sequence, suggesting a role for the LTR in the origin or maintenance of the primate U2 array. PMID:7828589
Sequence of contactin, a 130-kD glycoprotein concentrated in areas of interneuronal contact, defines a new member of the immunoglobulin supergene family in the nervous system

PubMed Central

1988-01-01

The primary amino acid sequence of contactin, a neuronal cell surface glycoprotein of 130 kD that is isolated in association with components of the cytoskeleton (Ranscht, B., D. J. Moss, and C. Thomas. 1984. J. Cell Biol. 99:1803-1813), was deduced from the nucleotide sequence of cDNA clones and is reported here. The cDNA sequence contains an open reading frame for a 1,071-amino acid transmembrane protein with 962 extracellular and 89 cytoplasmic amino acids. In its extracellular portion, the polypeptide features six type 1 and two type 2 repeats. The six amino-terminal type 1 repeats (I-VI) each consist of 81-99 amino acids and contain two cysteine residues that are in the right context to form globular domains as described for molecules with immunoglobulin structure. Within the proposed globular region, contactin shares 31% identical amino acids with the neural cell adhesion molecule NCAM. The two type 2 repeats (I-II) are each composed of 100 amino acids and lack cysteine residues. They are 20-31% identical to fibronectin type III repeats. Both the structural similarity of contactin to molecules of the immunoglobulin supergene family, in particular the amino acid sequence resemblance to NCAM, and its relationship to fibronectin indicate that contactin could be involved in some aspect of cellular adhesion. This suggestion is further strengthened by its localization in neuropil containing axon fascicles and synapses. PMID:3049624
The Peculiar Landscape of Repetitive Sequences in the Olive (Olea europaea L.) Genome

PubMed Central

Barghini, Elena; Natali, Lucia; Cossu, Rosa Maria; Giordani, Tommaso; Pindo, Massimo; Cattonaro, Federica; Scalabrin, Simone; Velasco, Riccardo; Morgante, Michele; Cavallini, Andrea

2014-01-01

Analyzing genome structure in different species provides insight into the evolution of plant genome size. Olive (Olea europaea L.) has a medium-sized haploid genome of 1.4 Gb, whose structure is largely uncharacterized, despite the growing importance of this tree as an oil crop. Next-generation sequencing technologies and different computational procedures have been used to study the composition of the olive genome and its repetitive fraction. A total of 2.03 and 2.3 genome equivalents of Illumina and 454 reads from genomic DNA, respectively, were assembled following different procedures, which produced more than 200,000 differently redundant contigs, with a mean length greater than 1,000 nt. Mapping Illumina reads onto the assembled sequences was used to estimate their redundancy. The genome data set was subdivided into highly and medium redundant and nonredundant contigs. By combining identification and mapping of repeated sequences, it was established that tandem repeats represent a very large portion of the olive genome (∼31% of the whole genome), consisting of six main families of different length, two of which were first discovered in these experiments. The other large redundant class in the olive genome is represented by transposable elements (especially long terminal repeat-retrotransposons). On the whole, the results of our analyses show the peculiar landscape of the olive genome, related to the massive amplification of tandem repeats, more than that reported for any other sequenced plant genome. PMID:24671744
The peculiar landscape of repetitive sequences in the olive (Olea europaea L.) genome.

PubMed

Barghini, Elena; Natali, Lucia; Cossu, Rosa Maria; Giordani, Tommaso; Pindo, Massimo; Cattonaro, Federica; Scalabrin, Simone; Velasco, Riccardo; Morgante, Michele; Cavallini, Andrea

2014-04-01

Analyzing genome structure in different species provides insight into the evolution of plant genome size. Olive (Olea europaea L.) has a medium-sized haploid genome of 1.4 Gb, whose structure is largely uncharacterized, despite the growing importance of this tree as an oil crop. Next-generation sequencing technologies and different computational procedures have been used to study the composition of the olive genome and its repetitive fraction. A total of 2.03 and 2.3 genome equivalents of Illumina and 454 reads from genomic DNA, respectively, were assembled following different procedures, which produced more than 200,000 differently redundant contigs, with a mean length greater than 1,000 nt. Mapping Illumina reads onto the assembled sequences was used to estimate their redundancy. The genome data set was subdivided into highly and medium redundant and nonredundant contigs. By combining identification and mapping of repeated sequences, it was established that tandem repeats represent a very large portion of the olive genome (∼31% of the whole genome), consisting of six main families of different length, two of which were first discovered in these experiments. The other large redundant class in the olive genome is represented by transposable elements (especially long terminal repeat-retrotransposons). On the whole, the results of our analyses show the peculiar landscape of the olive genome, related to the massive amplification of tandem repeats, more than that reported for any other sequenced plant genome.
Genome Dynamics and Evolution of the Mla (Powdery Mildew) Resistance Locus in Barley

PubMed Central

Wei, Fusheng; Wing, Rod A.; Wise, Roger P.

2002-01-01

Genes that confer defense against pathogens often are clustered in the genome and evolve via diverse mechanisms. To evaluate the organization and content of a major defense gene complex in cereals, we determined the complete sequence of a 261-kb BAC contig from barley cv Morex that spans the Mla (powdery mildew) resistance locus. Among the 32 predicted genes on this contig, 15 are associated with plant defense responses; 6 of these are associated with defense responses to powdery mildew disease but function in different signaling pathways. The Mla region is organized as three gene-rich islands separated by two nested complexes of transposable elements and a 45-kb gene-poor region. A heterochromatic-like region is positioned directly proximal to Mla and is composed of a gene-poor core with 17 families of diverse tandem repeats that overlap a hypermethylated, but transcriptionally active, gene-dense island. Paleontology analysis of long terminal repeat retrotransposons indicates that the present Mla region evolved over a period of >7 million years through a variety of duplication, inversion, and transposon-insertion events. Sequence-based recombination estimates indicate that R genes positioned adjacent to nested long terminal repeat retrotransposons, such as Mla, do not favor recombination as a means of diversification. We present a model for the evolution of the Mla region that encompasses several emerging features of large cereal genomes. PMID:12172030
Emergence of a new human adenovirus type 4 (Ad4) genotype: identification of a novel inverted terminal repeated (ITR) sequence from majority of Ad4 isolates from US military recruits.

PubMed

Houng, Huo-Shu H; Clavio, Sarah; Graham, Katherine; Kuschner, Robert; Sun, Wellington; Russell, Kevin L; Binn, Leonard N

2006-04-01

Ad4 is the principal etiological agent of acute respiratory disease (ARD) in the US military. The novel 208-bp inverted terminal repeat (ITR) sequence discovered in a recent Ad4 Jax78 field isolate was totally distinct from the analogous 116-bp ITR of the Ad4 prototype. The objective was to investigate the origin and distribution of the novel Ad4 ITR sequence in ARD infections, using direct sequencing of ligated Ad ITR termini. The new Ad4 ITR was highly homologous with the ITRs of human Ad subgroup B. The left post-ITR region of Ad4 Jax78 was found to be highly homologous to the corresponding region of subgroup B Ads: 81% for Ad11 and 98% for Ad3 and Ad7. The right post-ITR region of Ad4 Jax78 contained a truncated classic ITR of the Ad4 prototype. The Ad4 Jax78 ITR most likely evolved from the Ad4 prototype by substitution of the Ad4 prototype ITR with the subgroup B Ad ITR. The ITR-based PCR assays developed from this study can be used to distinguish the new Ad4 genotype from the classical Ad4 prototype. The new Ad4 genotype was first detected in 1976 in Georgia, USA, and is the main causative agent of ARD infections in the US military population.
Identification of multiple binding sites for the THAP domain of the Galileo transposase in the long terminal inverted-repeats

PubMed Central

Marzo, Mar; Liu, Danxu; Ruiz, Alfredo; Chalmers, Ronald

2013-01-01

Galileo is a DNA transposon responsible for the generation of several chromosomal inversions in Drosophila. In contrast to other members of the P-element superfamily, it has unusually long terminal inverted-repeats (TIRs) that resemble those of Foldback elements. To investigate the function of the long TIRs we derived consensus and ancestral sequences for the Galileo transposase in three species of Drosophilids. Following gene synthesis, we expressed and purified their constituent THAP domains and tested their binding activity towards the respective Galileo TIRs. DNase I footprinting located the most proximal DNA binding site about 70 bp from the transposon end. Using this sequence we identified further binding sites in the tandem repeats that are found within the long TIRs. This suggests that the synaptic complex between Galileo ends may be a complicated structure containing higher-order multimers of the transposase. We also attempted to reconstitute Galileo transposition in Drosophila embryos but no events were detected. Thus, although the limited numbers of Galileo copies in each genome were sufficient to provide functional consensus sequences for the THAP domains, they do not specify a fully active transposase. Since the THAP recognition sequence is short, and will occur many times in a large genome, it seems likely that the multiple binding sites within the long, internally repetitive, TIRs of Galileo and other Foldback-like elements may provide the transposase with its binding specificity. PMID:23648487
Stress-induced rearrangement of Fusarium retrotransposon sequences.

PubMed

Anaya, N; Roncero, M I

1996-11-27

Rearrangement of the Fusarium oxysporum retrotransposon skippy was induced by growth in the presence of potassium chlorate. Three fungal strains, one sensitive to chlorate (Co60) and two resistant to chlorate and deficient for nitrate reductase (Co65 and Co94), were studied by Southern analysis of their genomic DNA. Polymorphism was detected in their hybridization banding pattern, relative to the wild type grown in the absence of chlorate, using various enzymes with or without restriction sites within the retrotransposon. Results were consistent with the assumption that three different events had occurred in strain Co60: genomic amplification of skippy yielding tandem arrays of the element, generation of new skippy sequences, and deletion of skippy sequences. Amplification of Co60 genomic DNA using the polymerase chain reaction and divergent primers derived from the retrotransposon generated a new band, corresponding to one long terminal repeat plus flanking sequences, that was not present in the wild-type strain. Molecular analysis of nitrate reductase-deficient mutants showed that generation and deletion of skippy sequences, but not genomic amplification in tandem repeats, had occurred in their genomes.
From NGS assembly challenges to instability of fungal mitochondrial genomes: A case study in genome complexity.

PubMed

Misas, Elizabeth; Muñoz, José Fernando; Gallo, Juan Esteban; McEwen, Juan Guillermo; Clay, Oliver Keatinge

2016-04-01

The presence of repetitive or non-unique DNA persisting over sizable regions of a eukaryotic genome can hinder the genome's successful de novo assembly from short reads: ambiguities in assigning genome locations to the non-unique subsequences can result in premature termination of contigs and thus overfragmented assemblies. Fungal mitochondrial (mtDNA) genomes are compact (typically less than 100 kb), yet often contain short non-unique sequences that can be shown to impede their successful de novo assembly in silico. Such repeats can also confuse processes in the cell in vivo. A well-studied example is ectopic (out-of-register, illegitimate) recombination associated with repeat pairs, which can lead to deletion of functionally important genes that are located between the repeats. Repeats that remain conserved over micro- or macroevolutionary timescales despite such risks may indicate functionally or structurally (e.g., for replication) important regions. This principle could form the basis of a mining strategy for accelerating discovery of function in genome sequences. We present here our screening of a sample of 11 fully sequenced fungal mitochondrial genomes by observing where exact k-mer repeats occurred several times; initial analyses motivated us to focus on 17-mers occurring more than three times. Based on the diverse repeats we observe, we propose that such screening may serve as an efficient expedient for gaining a rapid but representative first insight into the repeat landscapes of sparsely characterized mitochondrial chromosomes. Our matching of the flagged repeats to previously reported regions of interest supports the idea that systems of persisting, non-trivial repeats in genomes can often highlight features meriting further attention. Copyright © 2016 Elsevier Ltd. All rights reserved.
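The k-mer screen described in this abstract can be illustrated with a minimal Python sketch. It counts exact 17-mers on the forward strand only and reports those seen more than three times. The function name `repeated_kmers`, the toy sequence, and the forward-strand-only choice are assumptions made for illustration; this is not the authors' pipeline.

```python
from collections import Counter


def repeated_kmers(seq, k=17, min_count=4):
    """Return {kmer: count} for exact k-mers occurring at least min_count times.

    min_count=4 corresponds to "more than three times" in the abstract.
    Only the forward strand is scanned (an assumption of this sketch).
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


# Hypothetical toy sequence: one 17-nt unit placed four times, separated by
# short spacers, so that only the unit itself reaches four occurrences.
unit = "ACGTTGCATGCCATGAA"
toy = "TTT".join([unit] * 4) + "GGGCCC"

hits = repeated_kmers(toy)
print(hits)
```

On a real mitochondrial genome the same function would be called on the full sequence string; both `k` and `min_count` are exposed because the abstract presents 17 and "more than three" as choices motivated by initial analyses rather than fixed constants.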
NMR assignments of the N-terminal domain of Nephila clavipes spidroin 1

PubMed Central

Parnham, Stuart; Gaines, William A.; Duggan, Brendan M.; Marcotte, William R.

2011-01-01

The building blocks of spider dragline silk are two fibrous proteins secreted from the major ampullate gland named spidroins 1 and 2 (MaSp1, MaSp2). These proteins consist of a large central domain composed of approximately 100 tandem copies of a 35–40 amino acid repeat sequence. Non-repetitive N- and C-terminal domains, of which the C-terminal domain has been implicated in the transition from soluble to insoluble states during spinning, flank the repetitive core. The N-terminal domain until recently has been largely unknown due to difficulties in cloning and expression. Here, we report nearly complete assignment for all 1H, 13C, and 15N resonances in the 14 kDa N-terminal domain of major ampullate spidroin 1 (MaSp1-N) of the golden orb-web spider Nephila clavipes. PMID:21152998
Quantitative analysis of TALE-DNA interactions suggests polarity effects.

PubMed

Meckler, Joshua F; Bhakta, Mital S; Kim, Moon-Soo; Ovadia, Robert; Habrian, Chris H; Zykovich, Artem; Yu, Abigail; Lockwood, Sarah H; Morbitzer, Robert; Elsäesser, Janett; Lahaye, Thomas; Segal, David J; Baldwin, Enoch P

2013-04-01

Transcription activator-like effectors (TALEs) have revolutionized the field of genome engineering. We present here a systematic assessment of TALE DNA recognition, using quantitative electrophoretic mobility shift assays and reporter gene activation assays. Within TALE proteins, tandem 34-amino acid repeats recognize one base pair each and direct sequence-specific DNA binding through repeat variable di-residues (RVDs). We found that RVD choice can affect affinity by four orders of magnitude, with the relative RVD contribution in the order NG > HD ≈ NN > NI > NK. The NN repeat preferred the base G over A, whereas the NK repeat bound G with 10^3-fold lower affinity. We compared AvrBs3, a naturally occurring TALE that recognizes its target using some atypical RVD-base combinations, with a designed TALE that precisely matches 'standard' RVDs with the target bases. This comparison revealed unexpected differences in sensitivity to substitutions of the invariant 5'-T. Another surprising observation was that base mismatches at the 5' end of the target site had more disruptive effects on affinity than those at the 3' end, particularly in designed TALEs. These results provide evidence that TALE-DNA recognition exhibits a hitherto undescribed polarity effect, in which the N-terminal repeats contribute more to affinity than C-terminal ones.
Evolution of Transcription Activator-Like Effectors in Xanthomonas oryzae

PubMed Central

Erkes, Annett; Reschke, Maik; Boch, Jens

2017-01-01

Transcription activator-like effectors (TALEs) are secreted by plant–pathogenic Xanthomonas bacteria into plant cells where they act as transcriptional activators and, hence, are major drivers in reprogramming the plant for the benefit of the pathogen. TALEs possess a highly repetitive DNA-binding domain of typically 34 amino acid (AA) tandem repeats, where AA 12 and 13, termed repeat variable di-residue (RVD), determine target specificity. Different Xanthomonas strains possess different repertoires of TALEs. Here, we study the evolution of TALEs from the level of RVDs determining target specificity down to the level of DNA sequence with focus on rice-pathogenic Xanthomonas oryzae pv. oryzae (Xoo) and Xanthomonas oryzae pv. oryzicola (Xoc) strains. We observe that codon pairs coding for individual RVDs are conserved to a similar degree as the flanking repeat sequence. We find strong indications that TALEs may evolve 1) by base substitutions in codon pairs coding for RVDs, 2) by recombination of N-terminal or C-terminal regions of existing TALEs, or 3) by deletion of individual TALE repeats, and we propose possible mechanisms. We find indications that the reassortment of TALE genes in clusters is mediated by an integron-like mechanism in Xoc. We finally study the effect of the presence/absence and evolutionary modifications of TALEs on transcriptional activation of putative target genes in rice, and find that even single RVD swaps may lead to considerable differences in activation. This correlation allowed a refined prediction of TALE targets, which is the crucial step to decipher their virulence activity. PMID:28637323
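The statement that residues 12 and 13 of each 34-residue repeat form the RVD can be made concrete with a short Python sketch. It assumes an idealized repeat region made only of full-length, in-register 34-residue repeats; the function `extract_rvds` and the toy repeat template are hypothetical and are not taken from the study.

```python
REPEAT_LEN = 34  # typical TALE repeat length given in the abstract


def extract_rvds(repeat_region, repeat_len=REPEAT_LEN):
    """Split a repeat region into fixed-length repeats and return residues 12-13.

    Assumes full-length, in-register repeats; trailing residues that do not
    fill a complete repeat are ignored.
    """
    rvds = []
    for start in range(0, len(repeat_region) - repeat_len + 1, repeat_len):
        repeat = repeat_region[start:start + repeat_len]
        rvds.append(repeat[11:13])  # 1-based positions 12 and 13
    return rvds


def toy_repeat(rvd):
    """Build an illustrative 34-residue repeat carrying the given RVD."""
    return "LTPEQVVAIAS" + rvd + "GGKQALETVQRLLPVLCQAHG"


region = "".join(toy_repeat(r) for r in ["HD", "NG", "NN", "NI"])
print(extract_rvds(region))
```

Natural TALEs also contain repeats of other lengths and a truncated final repeat, so a real analysis would need repeat boundaries from an alignment rather than fixed-width slicing.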
Nucleotide sequence of a cluster of early and late genes in a conserved segment of the vaccinia virus genome.

PubMed Central

Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E

1985-01-01

The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopox viruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames, most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A and T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815
Comparative molecular cytogenetic analyses of a major tandemly repeated DNA family and retrotransposon sequences in cultivated jute Corchorus species (Malvaceae).

PubMed

Begum, Rabeya; Zakrzewski, Falk; Menzel, Gerhard; Weber, Beatrice; Alam, Sheikh Shamimul; Schmidt, Thomas

2013-07-01

The cultivated jute species Corchorus olitorius and Corchorus capsularis are important fibre crops. The analysis of repetitive DNA sequences, comprising a major part of plant genomes, has not been carried out in jute but is useful to investigate the long-range organization of chromosomes. The aim of this study was the identification of repetitive DNA sequences to facilitate comparative molecular and cytogenetic studies of two jute cultivars and to develop a fluorescent in situ hybridization (FISH) karyotype for chromosome identification. A plasmid library was generated from C. olitorius and C. capsularis with genomic restriction fragments of 100-500 bp, which was complemented by targeted cloning of satellite DNA by PCR. The diversity of the repetitive DNA families was analysed comparatively. The genomic abundance and chromosomal localization of different repeat classes were investigated by Southern analysis and FISH, respectively. The cytosine methylation of satellite arrays was studied by immunolabelling. Major satellite repeats and retrotransposons have been identified from C. olitorius and C. capsularis. The satellite family CoSat I forms two undermethylated species-specific subfamilies, while the long terminal repeat (LTR) retrotransposons CoRetro I and CoRetro II show similarity to the Metaviridae of plant retroelements. FISH karyotypes were developed by multicolour FISH using these repetitive DNA sequences in combination with 5S and 18S-5·8S-25S rRNA genes, which enable the unequivocal chromosome discrimination in both jute species. The analysis of the structure and diversity of the repeated DNA is crucial for genome sequence annotation. The reference karyotypes will be useful for breeding of jute and provide the basis for karyotyping homeologous chromosomes of wild jute species to reveal the genetic and evolutionary relationship between cultivated and wild Corchorus species.
The Function of Neuroendocrine Cells in Prostate Cancer

DTIC Science & Technology

2013-04-01

integration site. We then performed deep sequencing and aligned reads to the genome. Our analysis revealed that both histological phenotypes are derived from...lentiviral integration site analysis. (B) Laser capture microdissection was performed on individual glands containing both squamous and...lentiviral integration site analysis. LTR: long terminal repeat (viral DNA), PCR: polymerase chain reaction. (D) Venn diagrams depict shared lentiviral
Genome-wide Annotation and Comparative Analysis of Long Terminal Repeat Retrotransposons between Pear Species of P. bretschneideri and P. communis

PubMed Central

Yin, Hao; Du, Jianchang; Wu, Jun; Wei, Shuwei; Xu, Yingxiu; Tao, Shutian; Wu, Juyou; Zhang, Shaoling

2015-01-01

Recent sequencing of the Oriental pear (P. bretschneideri Rehd.) genome and the availability of the draft genome sequence of Occidental pear (P. communis L.) have provided a good opportunity to characterize the abundance, distribution, timing, and evolution of long terminal repeat retrotransposons (LTR-RTs) in these two important fruit plants. Here, a total of 7247 LTR-RTs, which can be classified into 148 families, have been identified in the assembled Oriental pear genome. Unlike in other plant genomes, approximately 90% of these elements were found to be randomly distributed along the pear chromosomes. Further analysis revealed that the amplification timeframe of elements varies dramatically among families, super-families and lineages, and that the Copia-like elements have had the highest activity in the most recent 0.5 million years (Mys). The data also showed that the two genomes evolved at similar evolutionary rates after their split from the common ancestor ~0.77–1.66 million years ago (Mya). Overall, the data provided here will be a valuable resource for further investigating the impact of transposable elements on gene structure, expression, and epigenetic modification in the pear genomes. PMID:26631625
Structure and possible function of a G-quadruplex in the long terminal repeat of the proviral HIV-1 genome.

PubMed

De Nicola, Beatrice; Lech, Christopher J; Heddi, Brahim; Regmi, Sagar; Frasson, Ilaria; Perrone, Rosalba; Richter, Sara N; Phan, Anh Tuân

2016-07-27

The long terminal repeat (LTR) of the proviral human immunodeficiency virus (HIV)-1 genome is integral to virus transcription and host cell infection. The guanine-rich U3 region within the LTR promoter, previously shown to form G-quadruplex structures, represents an attractive target to inhibit HIV transcription and replication. In this work, we report the structure of a biologically relevant G-quadruplex within the LTR promoter region of HIV-1. The guanine-rich sequence designated LTR-IV forms a well-defined structure in physiological cationic solution. The nuclear magnetic resonance (NMR) structure of this sequence reveals a parallel-stranded G-quadruplex containing a single-nucleotide thymine bulge, which participates in a conserved stacking interaction with a neighboring single-nucleotide adenine loop. Transcription analysis in a HIV-1 replication competent cell indicates that the LTR-IV region may act as a modulator of G-quadruplex formation in the LTR promoter. Consequently, the LTR-IV G-quadruplex structure presented within this work could represent a valuable target for the design of HIV therapeutics. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
ACCA phosphopeptide recognition by the BRCT repeats of BRCA1.

PubMed

Ray, Hind; Moreau, Karen; Dizin, Eva; Callebaut, Isabelle; Venezia, Nicole Dalla

2006-06-16

The tumour suppressor gene BRCA1 encodes a 220 kDa protein that participates in multiple cellular processes. The BRCA1 protein contains a tandem of two BRCT repeats at its carboxy-terminal region. The majority of disease-associated BRCA1 mutations affect this region, giving the BRCT repeats a central role in the BRCA1 tumour suppressor function. The BRCT repeats have been shown to mediate phospho-dependent protein-protein interactions. They recognize phosphorylated peptides using a recognition groove that spans both BRCT repeats. We previously identified an interaction between the tandem of BRCA1 BRCT repeats and ACCA, which was disrupted by germ line BRCA1 mutations that affect the BRCT repeats. We recently showed that BRCA1 modulates ACCA activity through its phospho-dependent binding to ACCA. To delineate the region of ACCA that is crucial for the regulation of its activity by BRCA1, we searched for potential phosphorylation sites in the ACCA sequence that might be recognized by the BRCA1 BRCT repeats. Using sequence analysis and structure modelling, we proposed the Ser1263 residue as the most favourable candidate, among six residues, for recognition by the BRCA1 BRCT repeats. Using experimental approaches, such as GST pull-down assay with Bosc cells, we clearly showed that phosphorylation of only Ser1263 was essential for the interaction of ACCA with the BRCT repeats. We finally demonstrated, by immunoprecipitation of ACCA in cells, that the whole BRCA1 protein interacts with ACCA when phosphorylated on Ser1263.

Expression of connective tissue growth factor (CTGF/CCN2) in breast cancer cells is associated with increased migration and angiogenesis.

PubMed

Chien, Wenwen; O'Kelly, James; Lu, Daning; Leiter, Amanda; Sohn, Julia; Yin, Dong; Karlan, Beth; Vadgama, Jay; Lyons, Karen M; Koeffler, H Phillip

2011-06-01

Connective tissue growth factor (CTGF/CCN2) belongs to the CCN family of matricellular proteins, comprising Cyr61, CTGF, NovH and WISP1-3. The CCN proteins contain an N-terminal signal peptide followed by four conserved domains sharing sequence similarities with the insulin-like growth factor binding proteins, von Willebrand factor type C repeat, thrombospondin type 1 repeat, and a C-terminal growth factor cysteine knot domain. To investigate the role of CCN2 in breast cancer, we transfected MCF-7 cells with full-length CCN2, and with four mutant constructs in which one of the domains had been deleted. MCF-7 cells stably expressing full-length CCN2 demonstrated reduced cell proliferation, increased migration in Boyden chamber assays and promoted angiogenesis in chorioallantoic membrane assays compared to control cells. Deletion of the C-terminal cysteine knot domain, but not of any other domain-deleted mutants, abolished activities mediated by full-length CCN2. We have dissected the role of CCN2 in breast tumorigenesis on a structural basis.
Nucleotide sequence of soybean chloroplast DNA regions which contain the psb A and trn H genes and cover the ends of the large single copy region and one end of the inverted repeats.

PubMed

Spielmann, A; Stutz, E

1983-10-25

The soybean chloroplast psb A gene (photosystem II thylakoid membrane protein of Mr 32 000, lysine-free) and the trn H gene (tRNAHisGUG), which both map in the large single copy region adjacent to one of the inverted repeat structures (IR1), have been sequenced including flanking regions. The psb A gene shows in its structural part 92% sequence homology with the corresponding genes of spinach and N. debneyi and also contains an open reading frame for 353 amino acids. The amino acid sequence of a potential primary translation product (calculated Mr, 38 904, no lysine) diverges from that of spinach and N. debneyi in only two positions in the C-terminal part. The trn H gene has the same polarity as the psb A gene and the coding region is located at the very end of the large single copy region. The deduced sequence of the soybean chloroplast tRNAHisGUG is identical with that of Zea mays chloroplasts. Both ends of the large single copy region were sequenced including a small segment of the adjacent IR1 and IR2.
Topological frustration in βα-repeat proteins: sequence diversity modulates the conserved folding mechanisms of α/β/α sandwich proteins

PubMed Central

Hills, Ronald D.; Kathuria, Sagar V.; Wallace, Louise A.; Day, Iain J.; Brooks, Charles L.; Matthews, C. Robert

2010-01-01

The thermodynamic hypothesis of Anfinsen postulates that structures and stabilities of globular proteins are determined by their amino acid sequences. Chain topology, however, is known to influence the folding reaction, in that motifs with a preponderance of local interactions typically fold more rapidly than those with a larger fraction of non-local interactions. Together, the topology and sequence can modulate the energy landscape and influence the rate at which the protein folds to the native conformation. To explore the relationship of sequence and topology in the folding of βα–repeat proteins, which are dominated by local interactions, a combined experimental and simulation analysis was performed on two members of the flavodoxin-like, α/β/α sandwich fold. Spo0F and the N-terminal receiver domain of NtrC (NT-NtrC) have similar topologies but low sequence identity, enabling a test of the effects of sequence on folding. Experimental results demonstrated that both response-regulator proteins fold via parallel channels through highly structured sub-millisecond intermediates before accessing their cis prolyl peptide bond-containing native conformations. Global analysis of the experimental results preferentially places these intermediates off the productive folding pathway. Sequence-sensitive Gō-model simulations conclude that frustration in the folding in Spo0F, corresponding to the appearance of the off-pathway intermediate, reflects competition for intra-subdomain van der Waals contacts between its N- and C-terminal subdomains. The extent of transient, premature structure appears to correlate with the number of isoleucine, leucine and valine (ILV) side-chains that form a large sequence-local cluster involving the central β-sheet and helices α2, α3 and α4. The failure to detect the off-pathway species in the simulations of NT-NtrC may reflect the reduced number of ILV side-chains in its corresponding hydrophobic cluster. The location of the hydrophobic clusters in the structure may also be related to the differing functional properties of these response regulators. Comparison with the results of previous experimental and simulation analyses on the homologous CheY argues that prematurely-folded unproductive intermediates are a common property of the βα-repeat motif. PMID:20226790
Rsp5 WW domains interact directly with the carboxyl-terminal domain of RNA polymerase II.

PubMed

Chang, A; Cheang, S; Espanel, X; Sudol, M

2000-07-07

RSP5 is an essential gene in Saccharomyces cerevisiae and was recently shown to form a physical and functional complex with RNA polymerase II (RNA pol II). The amino-terminal half of Rsp5 consists of four domains: a C2 domain, which binds membrane phospholipids; and three WW domains, which are protein interaction modules that bind proline-rich ligands. The carboxyl-terminal half of Rsp5 contains a HECT (homologous to E6-AP carboxyl terminus) domain that catalytically ligates ubiquitin to proteins and functionally classifies Rsp5 as an E3 ubiquitin-protein ligase. The C2 and WW domains are presumed to act as membrane localization and substrate recognition modules, respectively. We report that the second (and possibly third) Rsp5 WW domain mediates binding to the carboxyl-terminal domain (CTD) of the RNA pol II large subunit. The CTD comprises a heptamer (YSPTSPS) repeated 26 times and a PXY core that is critical for interaction with a specific group of WW domains. An analysis of synthetic peptides revealed a minimal CTD sequence that is sufficient to bind to the second Rsp5 WW domain (Rsp5 WW2) in vitro and in yeast two-hybrid assays. Furthermore, we found that specific "imperfect" CTD repeats can form a complex with Rsp5 WW2. In addition, we have shown that phosphorylation of this minimal CTD sequence on serine, threonine and tyrosine residues acts as a negative regulator of the Rsp5 WW2-CTD interaction. In view of the recent data pertaining to phosphorylation-driven interactions between the RNA pol II CTD and the WW domain of Ess1/Pin1, we suggest that CTD dephosphorylation may be a prerequisite for targeted RNA pol II degradation.
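The notion of consensus versus "imperfect" CTD repeats can be sketched in a few lines of Python. The example assumes an in-register array of heptads with no insertions or deletions; the function `imperfect_repeats` and the toy CTD fragment are hypothetical and do not reproduce the actual yeast CTD sequence.

```python
CONSENSUS = "YSPTSPS"  # heptad quoted in the abstract


def imperfect_repeats(ctd, consensus=CONSENSUS):
    """List heptads that deviate from the consensus.

    Returns tuples of (1-based repeat index, heptad, 1-based mismatch positions).
    Assumes the input starts in register and contains no indels.
    """
    k = len(consensus)
    deviants = []
    for start in range(0, len(ctd) - k + 1, k):
        heptad = ctd[start:start + k]
        mismatches = [pos + 1 for pos, (a, b) in enumerate(zip(heptad, consensus))
                      if a != b]
        if mismatches:
            deviants.append((start // k + 1, heptad, mismatches))
    return deviants


# Toy fragment: six heptads, the fourth carrying a substitution at position 7.
toy_ctd = "YSPTSPS" * 3 + "YSPTSPK" + "YSPTSPS" * 2
print(imperfect_repeats(toy_ctd))
```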
Evidence that a sequence similar to TAR is important for induction of the JC virus late promoter by human immunodeficiency virus type 1 Tat.

PubMed Central

Chowdhury, M; Taylor, J P; Chang, C F; Rappaport, J; Khalili, K

1992-01-01

A specific RNA sequence located in the leader of all human immunodeficiency virus type 1 (HIV-1) mRNAs, termed the transactivation response element, or TAR, is a primary target for induction of HIV-1 long terminal repeat activity by the HIV-1-derived trans-regulatory protein, Tat. Human neurotropic virus, JC virus (JCV), a causative agent of the degenerative demyelinating disease progressive multifocal leukoencephalopathy, contains sequences in the 5' end of the late RNA species with an extensive homology to HIV-1 TAR. In this study, we examined the possible role of the JCV-derived TAR-homologous sequence in Tat-mediated activation of the JCV late promoter (Tada et al., Proc. Natl. Acad. Sci. USA 87:3479-3483, 1990). Results from site-directed mutagenesis revealed that critical G residues required for the function of HIV-1 TAR that are conserved in the JCV TAR homolog play an important role in Tat activation of the JCV promoter. In addition, in vivo competition studies suggest that shared regulatory components mediate Tat activation of the JCV late and HIV-1 long terminal repeat promoters. Furthermore, we showed that the JCV-derived TAR sequence behaves in the same way as HIV-1 TAR in response to two distinct Tat mutants, one that has no ability to bind to HIV-1 TAR and another that lacks transcriptional activity on a responsive promoter. These results suggest that the TAR homolog of the JCV late promoter is responsive to HIV-1 Tat induction and thus may participate in the overall activation of the JCV late promoter mediated by this transactivator. PMID:1331525
Heterochromatin and molecular characterization of DsmarMITE transposable element in the beetle Dichotomius schiffleri (Coleoptera: Scarabaeidae).

PubMed

Xavier, Crislaine; Cabral-de-Mello, Diogo Cavalcanti; de Moura, Rita Cássia

2014-12-01

Cytogenetic studies of the Neotropical beetle genus Dichotomius (Scarabaeinae, Coleoptera) have shown dynamism for centromeric constitutive heterochromatin sequences. In the present work we studied the chromosomes and isolated repetitive sequences of Dichotomius schiffleri aiming to contribute to the understanding of coleopteran genome/chromosomal organization. Dichotomius schiffleri presented a conserved karyotype and heterochromatin distribution in comparison to other species of the genus with 2n = 18, biarmed chromosomes, and pericentromeric C-positive blocks. Similarly to heterochromatin distributional patterns, the highly and moderately repetitive DNA fraction (C0t-1 DNA) was detected in pericentromeric areas, contrasting with the euchromatic mapping of an isolated TE (named DsmarMITE). After structural analyses, the DsmarMITE was classified as a non-autonomous element of the type miniature inverted-repeat transposable element (MITE) with terminal inverted repeats similar to Mariner elements of insects from different orders. The euchromatic distribution for DsmarMITE indicates that it does not play a part in the dynamics of constitutive heterochromatin sequences.
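A terminal inverted repeat, the feature used above to classify the element as a MITE, means that one end of the element reads as the reverse complement of the other. The following Python sketch measures the length of a perfect TIR by comparing bases inward from both ends. The function `perfect_tir_length` and the toy element are hypothetical, and real TIRs are often imperfect, so this is an illustration of the definition rather than a detection method.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def perfect_tir_length(element):
    """Return the length of the longest perfect terminal inverted repeat.

    Walks inward from both ends while the base on the left equals the
    complement of the mirrored base on the right. Mismatches and gaps are
    not tolerated (a simplification of this sketch).
    """
    element = element.upper()
    n = 0
    while n < len(element) // 2 and element[n] == COMPLEMENT.get(element[-1 - n]):
        n += 1
    return n


# Toy element: an 8-nt left end, an internal region, and the reverse
# complement of the left end (TAGCACTG) as the right end.
toy_element = "CAGTGCTA" + "GGATTCAG" + "TAGCACTG"
print(perfect_tir_length(toy_element))
```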
Chromatin tethering effects of hNopp140 are involved in the spatial organization of nucleolus and the rRNA gene transcription

PubMed Central

Tsai, Yi-Tzang; Lin, Chen-I; Chen, Hung-Kai; Lee, Kuo-Ming; Hsu, Chia-Yi; Yang, Shun-Jen

2008-01-01

The short arms of five human acrocentric chromosomes contain ribosomal gene (rDNA) clusters where numerous mini-nucleoli arise at the exit of mitosis. These small nucleoli tend to coalesce into one or a few large nucleoli during interphase by unknown mechanisms. Here, we demonstrate that the N- and C-terminal domains of a nucleolar protein, hNopp140, bound respectively to α-satellite arrays and rDNA clusters of acrocentric chromosomes for nucleolar formation. The central acidic-and-basic repeated domain of hNopp140, possessing a weak self-self interacting ability, was indispensable for hNopp140 to build up a nucleolar round-shaped structure. The N- or the C-terminally truncated hNopp140 caused nucleolar segregation and was able to alter locations of the rDNA transcription, as mediated by detaching the rDNA repeats from the acrocentric α-satellite arrays. Interestingly, an hNopp140 mutant, made by joining the N- and C-terminal domains but excluding the entire central repeated region, induced nucleolar disruption and global chromatin condensation. Furthermore, RNAi knockdown of hNopp140 resulted in dispersion of the rDNA and acrocentric α-satellite sequences away from the nucleolus, accompanied by rDNA transcriptional silencing. Our findings indicate that hNopp140, a scaffold protein, is involved in nucleolar assembly, fusion, and maintenance. PMID:18253863
First Insights into the Large Genome of Epimedium sagittatum (Sieb. et Zucc) Maxim, a Chinese Traditional Medicinal Plant

PubMed Central

Liu, Di; Zeng, Shao-Hua; Chen, Jian-Jun; Zhang, Yan-Jun; Xiao, Gong; Zhu, Lin-Yao; Wang, Ying

2013-01-01

Epimedium sagittatum (Sieb. et Zucc) Maxim is a member of the Berberidaceae family of basal eudicot plants; it is widely distributed in China and has long been used there as a traditional medicinal plant for its therapeutic effects on many diseases. Recent data show that E. sagittatum has a relatively large genome, with a haploid genome size of ~4496 Mbp, divided into a small number of only 12 diploid chromosomes (2n = 2x = 12). However, little is known about Epimedium genome structure and composition. Here we present the analysis of 691 kb of high-quality genomic sequence derived from 672 randomly selected plasmid clones of E. sagittatum genomic DNA, representing ~0.0154% of the genome. The sampled sequences comprised at least 78.41% repetitive DNA elements and 2.51% confirmed annotated gene sequences, with a total GC content of 39%. Retrotransposons represented the major class of transposable element (TE) repeats identified (65.37% of all TE repeats), particularly LTR (Long Terminal Repeat) retrotransposons (52.27% of all TE repeats). Chromosome analysis and Fluorescence in situ Hybridization of Gypsy-Ty3 retrotransposons were performed to survey the E. sagittatum genome at the cytological level. Our data provide the first insights into the composition and structure of the E. sagittatum genome, and will facilitate the functional genomic analysis of this valuable medicinal plant. PMID:23807511
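The sampled fraction quoted in this abstract follows directly from the two sizes it reports. As a quick check in Python, assuming 1 kb = 1,000 bp and 1 Mbp = 1,000,000 bp (the variable names are illustrative only):

```python
sampled_bp = 691_000          # 691 kb of sampled sequence
genome_bp = 4_496_000_000     # ~4496 Mbp haploid genome size

# Percentage of the haploid genome covered by the sample.
percent = 100 * sampled_bp / genome_bp
print(f"{percent:.4f}%")      # prints 0.0154%
```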
Overview of Research Transition Products

NASA Technical Reports Server (NTRS)

Robinson, John

2014-01-01

Demonstrate increased, more consistent use of Performance-Based Navigation (PBN). Accelerate transfer of NASA scheduling and spacing technologies for inclusion in late mid-term NAS. During high-fidelity human-in-the-loop simulations of Terminal Sequencing and Spacing, air traffic controllers have significantly improved their use of PBN procedures during busy traffic periods without increased workload. Executed an aggressive, short timeframe development schedule. Developed TSS prototype based upon FAA operational systems. Conducted multiple joint FAA/NASA human-in-the-loop simulations. Performed repeated incremental deliveries of tech transfer material to non-traditional RTT stakeholders. Will continue to participate in later phases of FAA acquisition process. ATD-1 transferred Terminal Sequencing and Spacing (TSS) technologies to the FAA. TSS enables routine use of underutilized advanced avionics and PBN procedures. Potential benefits to airlines operating at initial TSS sites estimated to be $300-400M/year. FAA is planning for an initial capability in the NAS in 2018.
Ureaplasma antigenic variation beyond MBA phase variation: DNA inversions generating chimeric structures and switching in expression of the MBA N-terminal paralogue UU172

PubMed Central

Zimmerman, Carl-Ulrich R; Rosengarten, Renate; Spergser, Joachim

2011-01-01

Phase variation of the major ureaplasma surface membrane protein, the multiple-banded antigen (MBA), with its counterpart, the UU376 protein, was recently discussed as a result of DNA inversion occurring at specific inverted repeats. Two inverted repeats similar to the ones within the mba locus were found in the genome of Ureaplasma parvum serovar 3; one within the MBA N-terminal paralogue UU172 and another in the adjacent intergenic spacer region. In this report, we demonstrate at both the genomic and protein levels that DNA inversion at these inverted repeats leads to alternating expression between UU172 and the neighbouring conserved hypothetical ORF UU171. Sequence analysis of this phase-variable ‘UU172 element’ from both U. parvum and U. urealyticum strains revealed that it is highly conserved among both species and that it also includes the orthologue of UU144. A third inverted repeat region in UU144 is proposed to serve as an additional potential inversion site from which chimeric genes can evolve. Our results indicate that site-specific recombination events in the genome of U. parvum serovar 3 are dynamic and frequent, leading to a broad spectrum of antigenic variation by which the organism may evade host immune responses. PMID:21255110
Structural Basis of Egg Coat-Sperm Recognition at Fertilization.

PubMed

Raj, Isha; Sadat Al Hosseini, Hamed; Dioguardi, Elisa; Nishimura, Kaoru; Han, Ling; Villa, Alessandra; de Sanctis, Daniele; Jovine, Luca

2017-06-15

Recognition between sperm and the egg surface marks the beginning of life in all sexually reproducing organisms. This fundamental biological event depends on the species-specific interaction between rapidly evolving counterpart molecules on the gametes. We report biochemical, crystallographic, and mutational studies of domain repeats 1-3 of invertebrate egg coat protein VERL and their interaction with cognate sperm protein lysin. VERL repeats fold like the functionally essential N-terminal repeat of mammalian sperm receptor ZP2, whose structure is also described here. Whereas sequence-divergent repeat 1 does not bind lysin, repeat 3 binds it non-species specifically via a high-affinity, largely hydrophobic interface. Due to its intermediate binding affinity, repeat 2 selectively interacts with lysin from the same species. Exposure of a highly positively charged surface of VERL-bound lysin suggests that complex formation both disrupts the organization of egg coat filaments and triggers their electrostatic repulsion, thereby opening a hole for sperm penetration and fusion.
Suppression Analysis Reveals a Functional Difference between the Serines in Positions Two and Five in the Consensus Sequence of the C-Terminal Domain of Yeast RNA Polymerase II

PubMed Central

Yuryev, A.; Corden, J. L.

1996-01-01

The largest subunit of RNA polymerase II contains a repetitive C-terminal domain (CTD) consisting of tandem repeats of the consensus sequence Tyr(1)Ser(2)Pro(3)Thr(4)Ser(5)Pro(6)Ser(7). Substitution of nonphosphorylatable amino acids at positions two or five of the Saccharomyces cerevisiae CTD is lethal. We developed a selection system for isolating suppressors of this lethal phenotype and cloned a gene, SCA1 (suppressor of CTD alanine), which complements recessive suppressors of lethal multiple-substitution mutations. A partial deletion of SCA1 (sca1Δ::hisG) suppresses alanine or glutamate substitutions at position two of the consensus CTD sequence, and a lethal CTD truncation mutation, but SCA1 deletion does not suppress alanine or glutamate substitutions at position five. SCA1 is identical to SRB9, a suppressor of a cold-sensitive CTD truncation mutation. Strains carrying dominant SRB mutations have the same suppression properties as a sca1Δ::hisG strain. These results reveal a functional difference between positions two and five of the consensus CTD heptapeptide repeat. The ability of SCA1 and SRB mutant alleles to suppress CTD truncation mutations suggests that substitutions at position two, but not at position five, cause a defect in RNA polymerase II function similar to that introduced by CTD truncation. PMID:8725217
The NnCenH3 protein and centromeric DNA sequence profiles of Nelumbo nucifera Gaertn. (sacred lotus) reveal the DNA structures and dynamics of centromeres in basal eudicots.

PubMed

Zhu, Zhixuan; Gui, Songtao; Jin, Jing; Yi, Rong; Wu, Zhihua; Qian, Qian; Ding, Yi

2016-09-01

Centromeres on eukaryotic chromosomes consist of large arrays of DNA repeats that undergo very rapid evolution. Nelumbo nucifera Gaertn. (sacred lotus) is a phylogenetic relict and an aquatic perennial basal eudicot. Studies concerning the centromeres of this basal eudicot species could provide ancient evolutionary perspectives. In this study, we characterized the centromeric marker protein NnCenH3 (sacred lotus centromere-specific histone H3 variant), and used a chromatin immunoprecipitation (ChIP)-based technique to recover the NnCenH3 nucleosome-associated sequences of sacred lotus. The properties of the centromere-binding protein and DNA sequences revealed notable divergence between sacred lotus and other flowering plants, including the following factors: (i) an NnCenH3 alternative splicing variant comprising only a partial centromere-targeting domain, (ii) active genes with low transcription levels in the NnCenH3 nucleosomal regions, and (iii) the prevalence of the Ty1/copia class of long terminal repeat (LTR) retrotransposons in the centromeres of sacred lotus chromosomes. In addition, the dynamic natures of the centromeric region showed that some of the centromeric repeat DNA sequences originated from telomeric repeats, and a pair of centromeres on the dicentric chromosome 1 was inactive in the metaphase cells of sacred lotus. Our characterization of the properties of centromeric DNA structure within the sacred lotus genome describes a centromeric profile in ancient basal eudicots and might provide evidence of the origins and evolution of centromeres. Furthermore, the identification of centromeric DNA sequences is of great significance for the assembly of the sacred lotus genome.
An interferon regulatory factor binding site in the U5 region of the bovine leukemia virus long terminal repeat stimulates Tax-independent gene expression.

PubMed

Kiermer, V; Van Lint, C; Briclet, D; Vanhulle, C; Kettmann, R; Verdin, E; Burny, A; Droogmans, L

1998-07-01

Bovine leukemia virus (BLV) replication is controlled by both cis- and trans-acting elements. The virus-encoded transactivator, Tax, is necessary for efficient transcription from the BLV promoter, although it is not present during the early stages of infection. Therefore, sequences that control Tax-independent transcription must play an important role in the initiation of viral gene expression. This study demonstrates that the R-U5 sequence of BLV stimulates Tax-independent reporter gene expression directed by the BLV promoter. R-U5 was also stimulatory when inserted immediately downstream from the transcription initiation site of a heterologous promoter. Progressive deletion analysis of this region revealed that a 46-bp element corresponding to the 5' half of U5 is principally responsible for the stimulation. This element exhibited enhancer activity when inserted upstream or downstream from the herpes simplex virus thymidine kinase promoter. This enhancer contains a binding site for the interferon regulatory factors IRF-1 and IRF-2. A 3-bp mutation that destroys the IRF recognition site caused a twofold decrease in Tax-independent BLV long terminal repeat-driven gene expression. These observations suggest that the IRF binding site in the U5 region of BLV plays a role in the initiation of virus replication.
Isolation and molecular characterization of dTnp1, a mobile and defective transposable element of Nicotiana plumbaginifolia.

PubMed

Meyer, C; Pouteau, S; Rouzé, P; Caboche, M

1994-01-01

By Northern blot analysis of nitrate reductase-deficient mutants of Nicotiana plumbaginifolia, we identified a mutant (mutant D65), obtained after gamma-ray irradiation of protoplasts, which contained an insertion sequence in the nitrate reductase (NR) mRNA. This insertion sequence was localized by polymerase chain reaction (PCR) in the first exon of NR and was also shown to be present in the NR gene. The mutant gene contained a 565 bp insertion sequence that exhibits the sequence characteristics of a transposable element, which was thus named dTnp1. The dTnp1 element has 14 bp terminal inverted repeats and is flanked by an 8-bp target site duplication generated upon transposition. These inverted repeats have significant sequence homology with those of other transposable elements. Judging by its size and the absence of a long open reading frame, dTnp1 appears to represent a defective, although mobile, transposable element. The octamer motif TTTAGGCC was found several times in direct orientation near the 5' and 3' ends of dTnp1 together with a perfect palindrome located after the 5' inverted repeat. Southern blot analysis using an internal probe of dTnp1 suggested that this element occurs as a single copy in the genome of N. plumbaginifolia. It is also present in N. tabacum, but absent in tomato or petunia. The dTnp1 element is therefore of potential use for gene tagging in Nicotiana species.
Nucleotide sequence of soybean chloroplast DNA regions which contain the psb A and trn H genes and cover the ends of the large single copy region and one end of the inverted repeats.

PubMed Central

Spielmann, A; Stutz, E

1983-01-01

The soybean chloroplast psb A gene (photosystem II thylakoid membrane protein of Mr 32 000, lysine-free) and the trn H gene (tRNA-His(GUG)), which both map in the large single copy region adjacent to one of the inverted repeat structures (IR1), have been sequenced including flanking regions. The structural part of the psb A gene shows 92% sequence homology with the corresponding genes of spinach and N. debneyi and also contains an open reading frame for 353 amino acids. The amino acid sequence of a potential primary translation product (calculated Mr, 38 904, no lysine) diverges from that of spinach and N. debneyi in only two positions in the C-terminal part. The trn H gene has the same polarity as the psb A gene and the coding region is located at the very end of the large single copy region. The deduced sequence of the soybean chloroplast tRNA-His(GUG) is identical with that of Zea mays chloroplasts. Both ends of the large single copy region were sequenced including a small segment of the adjacent IR1 and IR2. PMID:6314279
Length Variation, Heteroplasmy and Sequence Divergence in the Mitochondrial DNA of Four Species of Sturgeon (Acipenser)

PubMed Central

Brown, J. R.; Beckenbach, K.; Beckenbach, A. T.; Smith, M. J.

1996-01-01

The extent of mtDNA length variation and heteroplasmy as well as DNA sequences of the control region and two tRNA genes were determined for four North American sturgeon species: Acipenser transmontanus, A. medirostris, A. fulvescens and A. oxyrhynchus. Across the Continental Divide, a division in the occurrence of length variation and heteroplasmy was observed that was concordant with species biogeography as well as with phylogenies inferred from restriction fragment length polymorphisms (RFLP) of whole mtDNA and pairwise comparisons of unique sequences of the control region. In all species, mtDNA length variation was due to repeated arrays of 78-82-bp sequences each containing a D-loop strand synthesis termination associated sequence (TAS). Individual repeats showed greater sequence conservation within individuals and species rather than between species, which is suggestive of concerted evolution. Differences in the frequencies of multiple copy genomes and heteroplasmy among the four species may be ascribed to differences in the rates of recurrent mutation. A mechanism that may offset the high rate of mutation for increased copy number is suggested on the basis that an increase in the number of functional TAS motifs might reduce the frequency of successfully initiated H-strand replications. PMID:8852850
Exploring the genome of the salt-marsh Spartina maritima (Poaceae, Chloridoideae) through BAC end sequence analysis.

PubMed

Ferreira de Carvalho, J; Chelaifa, H; Boutte, J; Poulain, J; Couloux, A; Wincker, P; Bellec, A; Fourment, J; Bergès, H; Salmon, A; Ainouche, M

2013-12-01

Spartina species play an important ecological role on salt marshes. Spartina maritima is an Old-World species distributed along the European and North-African Atlantic coasts. This hexaploid species (2n = 6x = 60, 2C = 3,700 Mb) hybridized with different Spartina species introduced from the American coasts, which resulted in the formation of new invasive hybrids and allopolyploids. Thus, S. maritima is of evolutionary and ecological interest. However, genomic information is severely lacking in this genus. In an effort to develop genomic resources, we analysed 40,641 high-quality bacterial artificial chromosome-end sequences (BESs), representing 26.7 Mb of the S. maritima genome. BESs were searched for sequence homology against known databases. A fraction of 16.91% of the BESs represents known repeats including a majority of long terminal repeat (LTR) retrotransposons (13.67%). Non-LTR retrotransposons represent 0.75%, DNA transposons 0.99%, whereas small RNA, simple repeats and low-complexity sequences account for 1.38% of the analysed BESs. In addition, 4,285 simple sequence repeats were detected. Using the coding sequence database of Sorghum bicolor, 6,809 BESs showed homology, accounting for 17.1% of all BESs. Comparative genomics with related genera reveals that the microsynteny is better conserved with S. bicolor compared to other sequenced Poaceae, where 37.6% of the paired matching BESs are correctly orientated on the chromosomes. We did not observe large macrosyntenic rearrangements using the mapping strategy employed. However, some regions appeared to have experienced rearrangements when comparing Spartina to Sorghum and to Oryza. This work represents the first overview of the S. maritima genome regarding the respective coding and repetitive components. The syntenic relationships with other grass genomes examined here help clarify evolution in Poaceae, S. maritima being a part of the poorly-known Chloridoideae sub-family.
The Ma Gene for Complete-Spectrum Resistance to Meloidogyne Species in Prunus Is a TNL with a Huge Repeated C-Terminal Post-LRR Region

PubMed Central

Claverie, Michel; Dirlewanger, Elisabeth; Bosselut, Nathalie; Van Ghelder, Cyril; Voisin, Roger; Kleinhentz, Marc; Lafargue, Bernard; Abad, Pierre; Rosso, Marie-Noëlle; Chalhoub, Boulos; Esmenjaud, Daniel

2011-01-01

Root-knot nematode (RKN) Meloidogyne species are major polyphagous pests of most crops worldwide, and cultivars with durable resistance are urgently needed because of nematicide bans. The Ma gene from the Myrobalan plum (Prunus cerasifera) confers complete-spectrum, heat-stable, and high-level resistance to RKN, which is remarkable in comparison with the Mi-1 gene from tomato (Solanum lycopersicum), the sole RKN resistance gene cloned. We report here the positional cloning and the functional validation of the Ma locus present at the heterozygous state in the P.2175 accession. High-resolution mapping totaling over 3,000 segregants reduced the Ma locus interval to a 32-kb cluster of three Toll/Interleukin1 Receptor-Nucleotide Binding Site-Leucine-Rich Repeat (LRR) genes (TNL1–TNL3), including a pseudogene (TNL2) and a truncated gene (TNL3). The sole complete gene in this interval (TNL1) was validated as Ma, as it conferred the same complete-spectrum and high-level resistance (as in P.2175) using its genomic sequence and native promoter region in Agrobacterium rhizogenes-transformed hairy roots and composite plants. The full-length cDNA (2,048 amino acids) of Ma is the longest of all Resistance genes cloned to date. Its TNL structure is completed by a huge post-LRR (PL) sequence (1,088 amino acids) comprising five repeated carboxyl-terminal PL exons with two conserved motifs. The amino-terminal region (213 amino acids) of the LRR exon is conserved between alleles and contrasts with the high interallelic polymorphisms of its distal region (111 amino acids) and of PL domains. The Ma gene highlights the importance of these uncharacterized PL domains, which may be involved in pathogen recognition through the decoy hypothesis or in nuclear signaling. PMID:21482634
Single Molecule Analysis of Replicated DNA Reveals the Usage of Multiple KSHV Genome Regions for Latent Replication

PubMed Central

Verma, Subhash C.; Lu, Jie; Cai, Qiliang; Kosiyatrakul, Settapong; McDowell, Maria E.; Schildkraut, Carl L.; Robertson, Erle S.

2011-01-01

Kaposi's sarcoma associated herpesvirus (KSHV), an etiologic agent of Kaposi's sarcoma, Body Cavity Based Lymphoma and Multicentric Castleman's Disease, establishes lifelong latency in infected cells. The KSHV genome tethers to the host chromosome with the help of a latency associated nuclear antigen (LANA). Additionally, LANA supports replication of the latent origins within the terminal repeats by recruiting cellular factors. Our previous studies identified and characterized another latent origin, which supported the replication of plasmids ex-vivo without LANA expression in trans. Therefore identification of an additional origin site prompted us to analyze the entire KSHV genome for replication initiation sites using single molecule analysis of replicated DNA (SMARD). Our results showed that replication of DNA can initiate throughout the KSHV genome and the usage of these regions is not conserved in two different KSHV strains investigated. SMARD also showed that the utilization of multiple replication initiation sites occurs across large regions of the genome rather than a specified sequence. The replication origin of the terminal repeats showed only a slight preference for their usage indicating that LANA dependent origin at the terminal repeats (TR) plays only a limited role in genome duplication. Furthermore, we performed chromatin immunoprecipitation for ORC2 and MCM3, which are part of the pre-replication initiation complex to determine the genomic sites where these proteins accumulate, to provide further characterization of potential replication initiation sites on the KSHV genome. The ChIP data confirmed accumulation of these pre-RC proteins at multiple genomic sites in a cell cycle dependent manner. Our data also show that both the frequency and the sites of replication initiation vary within the two KSHV genomes studied here, suggesting that initiation of replication is likely to be affected by the genomic context rather than the DNA sequences. PMID:22072974
Discordant expression and variable numbers of neighboring GGA- and GAA-rich triplet repeats in the 3' untranslated regions of two groups of messenger RNAs encoded by the rat polymeric immunoglobulin receptor gene.

PubMed Central

Koch, K S; Gleiberman, A S; Aoki, T; Leffert, H L; Feren, A; Jones, A L; Fodor, E J

1995-01-01

An unusual S1-nuclease sensitive microsatellite (STMS) has been found in the single copy, rat polymeric immunoglobulin receptor gene (PIGR) terminal exon. In Fisher rats, elements within or beyond the STMS are expressed variably in the 3' untranslated regions (3'UTRs) of two 'Groups' of PIGR-encoded hepatic mRNAs (pIg-R) during liver regeneration. STMS elements include neighboring constant regions (a 60-bp d[GA]-rich tract with a chi-like octamer, followed by 15 tandem d[GGA] repeats) that merge directly with 36 or 39 tandem d[GAA] repeats (Fisher or Wistar strains, respectively) interrupted by d[AA] between their 5th-6th repeat units. The Wistar STMS is flanked upstream by two regions of nearly contiguous d[CA] or d[CT] repeats in the 3' end of intron 8; and downstream, by a 283 bp 'unit' containing several inversions at its 5' end, and two polyadenylation signals at its 3' end. The 283 nt unit is expressed in Group 1 pIg-R mRNAs; but it is absent in the Group 2 family so that their GAA repeats merge with their poly A tails. In contrast to genomic sequence, GGA triplet repeats are amplified (n ≥ 24-26), whereas GAA triplet repeats are truncated variably (n ≤ 9-37) and expressed uninterruptedly in both mRNA Groups. These results suggest that 3' end processing of the rat PIGR gene may involve misalignment, slippage and premature termination of RNA polymerase II. The function of this unusual processing and possible roles of chi-like octamers in quiescent or extrahepatic tissues are discussed. PMID:7739889
Cloning and sequencing of a gene encoding the 69-kDa extracellular chitinase of Janthinobacterium lividum.

PubMed

Gleave, A P; Taylor, R K; Morris, B A; Greenwood, D R

1995-09-15

Janthinobacterium lividum secretes a major 56-kDa chitinase and a minor 69-kDa chitinase. A chitinase gene was defined on a 3-kb fragment of clone pRKT10, by virtue of fluorescent colonies in the presence of 4-methylumbelliferyl-beta-D-N,N',N''-chitotrioside. Nucleotide sequencing revealed a 1998-bp open reading frame with the potential to encode a 69,716-Da protein with amino acid sequences similar to those in other chitinases, suggesting it encodes the minor chitinase (Chi69). Chitinase activity of Escherichia coli (pRKT10) lysates was detected mainly in the periplasmic fraction and immunoblotting detected a 70-kDa protein in this fraction. Chi69 has an N-terminal secretory leader peptide preceding two probable chitin-binding domains and a catalytic domain. These functional domains are separated by linker regions of proline-threonine repeats. Amino acid sequencing of cyanogen bromide cleavage-derived peptides from the major 56-kDa chitinase suggested that Chi69 may be a precursor of Chi56. In addition, an N-terminally truncated version of Chi69 retained chitinase activity, as expected if in vivo processing of Chi69 generates Chi56.
Comparative molecular cytogenetic analyses of a major tandemly repeated DNA family and retrotransposon sequences in cultivated jute Corchorus species (Malvaceae)

PubMed Central

Begum, Rabeya; Zakrzewski, Falk; Menzel, Gerhard; Weber, Beatrice; Alam, Sheikh Shamimul; Schmidt, Thomas

2013-01-01

Background and Aims: The cultivated jute species Corchorus olitorius and Corchorus capsularis are important fibre crops. The analysis of repetitive DNA sequences, comprising a major part of plant genomes, has not been carried out in jute but is useful to investigate the long-range organization of chromosomes. The aim of this study was the identification of repetitive DNA sequences to facilitate comparative molecular and cytogenetic studies of two jute cultivars and to develop a fluorescent in situ hybridization (FISH) karyotype for chromosome identification. Methods: A plasmid library was generated from C. olitorius and C. capsularis with genomic restriction fragments of 100–500 bp, which was complemented by targeted cloning of satellite DNA by PCR. The diversity of the repetitive DNA families was analysed comparatively. The genomic abundance and chromosomal localization of different repeat classes were investigated by Southern analysis and FISH, respectively. The cytosine methylation of satellite arrays was studied by immunolabelling. Key Results: Major satellite repeats and retrotransposons have been identified from C. olitorius and C. capsularis. The satellite family CoSat I forms two undermethylated species-specific subfamilies, while the long terminal repeat (LTR) retrotransposons CoRetro I and CoRetro II show similarity to the Metaviridae of plant retroelements. FISH karyotypes were developed by multicolour FISH using these repetitive DNA sequences in combination with 5S and 18S–5·8S–25S rRNA genes, which enable unequivocal chromosome discrimination in both jute species. Conclusions: The analysis of the structure and diversity of the repeated DNA is crucial for genome sequence annotation. The reference karyotypes will be useful for breeding of jute and provide the basis for karyotyping homeologous chromosomes of wild jute species to reveal the genetic and evolutionary relationship between cultivated and wild Corchorus species. PMID:23666888
Abr1, a Transposon-Like Element in the Genome of the Cultivated Mushroom Agaricus bisporus (Lange) Imbach

PubMed Central

Sonnenberg, Anton S. M.; Baars, Johan J. P.; Mikosch, Thomas S. P.; Schaap, Peter J.; Van Griensven, Leo J. L. D.

1999-01-01

A 300-bp repetitive element was found in the genome of the white button mushroom, Agaricus bisporus, and designated Abr1. It is present in ∼15 copies per haploid genome in the commercial strain Horst U1. Analysis of seven copies showed 89 to 97% sequence identity. The repeat has features typical of class II transposons (i.e., terminal inverted repeats, subterminal repeats, and a target site duplication of 7 bp). The latter shows a consensus sequence. When used as probe on Southern blots, Abr1 identifies relatively little variation within traditional and present-day commercial strains, indicating that most strains are identical or have a common origin. In contrast to these cultivars, high variation is found among field-collected strains. Furthermore, a remarkable difference in copy numbers of Abr1 was found between A. bisporus isolates with a secondarily homothallic life cycle and those with a heterothallic life cycle. Abr1 is a type II transposon not previously reported in basidiomycetes and appears to be useful for the identification of strains within the species A. bisporus. PMID:10427018
Gene Deletion in Barley Mediated by LTR-retrotransposon BARE

PubMed Central

Shang, Yi; Yang, Fei; Schulman, Alan H.; Zhu, Jinghuan; Jia, Yong; Wang, Junmei; Zhang, Xiao-Qi; Jia, Qiaojun; Hua, Wei; Yang, Jianming; Li, Chengdao

2017-01-01

A poly-row branched spike (prbs) barley mutant was obtained from soaking a two-rowed barley inflorescence in a solution of maize genomic DNA. Positional cloning and sequencing demonstrated that the prbs mutant resulted from a 28 kb deletion including the inflorescence architecture gene HvRA2. Sequence annotation revealed that the HvRA2 gene is flanked by two LTR (long terminal repeat) retrotransposons (BARE) sharing 89% sequence identity. A recombination between the integrase (IN) gene regions of the two BARE copies resulted in the formation of an intact BARE and loss of HvRA2. No maize DNA was detected in the recombination region although the flanking sequences of HvRA2 gene showed over 73% of sequence identity with repetitive sequences on 10 maize chromosomes. It is still unknown whether the interaction of retrotransposons between barley and maize has resulted in the recombination observed in the present study. PMID:28252053
Peptides derived from human galectin-3 N-terminal tail interact with its carbohydrate recognition domain in a phosphorylation-dependent manner

DOE Office of Scientific and Technical Information (OSTI.GOV)

Berbís, M. Álvaro; André, Sabine; Cañada, F. Javier

2014-01-03

Highlights: • Galectin-3 is composed of a carbohydrate recognition domain and an N-terminal tail. • Synthetic peptides derived from the tail are shown to interact with the CRD. • This interaction is modulated by Ser- and Tyr-phosphorylation of the peptides. Abstract: Galectin-3 (Gal-3) is a multi-functional effector protein that functions in the cytoplasm and the nucleus, as well as extracellularly following non-classical secretion. Structurally, Gal-3 is unique among galectins with its carbohydrate recognition domain (CRD) attached to a rather long N-terminal tail composed mostly of collagen-like repeats (nine in the human protein) and terminating in a short non-collagenous terminal peptide sequence unique in this lectin family and not yet fully explored. Although several Ser and Tyr sites within the N-terminal tail can be phosphorylated, the physiological significance of this post-translational modification remains unclear. Here, we used a series of synthetic (phospho)peptides derived from the tail to assess phosphorylation-mediated interactions with ¹⁵N-labeled Gal-3 CRD. HSQC-derived chemical shift perturbations revealed selective interactions at the backface of the CRD that were attenuated by phosphorylation of Tyr 107 and Tyr 118, while phosphorylation of Ser 6 and Ser 12 was essential. Controls with sequence scrambling underscored inherent specificity. Our studies shed light on how phosphorylation of the N-terminal tail may impact on Gal-3 function and prompt further studies using phosphorylated full-length protein.
Formation of highly stable chimeric trimers by fusion of an adenovirus fiber shaft fragment with the foldon domain of bacteriophage t4 fibritin.

PubMed

Papanikolopoulou, Katerina; Forge, Vincent; Goeltz, Pierrette; Mitraki, Anna

2004-03-05

The folding of beta-structured, fibrous proteins is a largely unexplored area. A class of such proteins is used by viruses as adhesins, and recent studies revealed novel beta-structured motifs for them. We have been studying the folding and assembly of adenovirus fibers that consist of a globular C-terminal domain, a central fibrous shaft, and an N-terminal part that attaches to the viral capsid. The globular C-terminal, or "head" domain, has been postulated to be necessary for the trimerization of the fiber and might act as a registration signal that directs its correct folding and assembly. In this work, we replaced the head of the fiber by the trimerization domain of the bacteriophage T4 fibritin, termed "foldon." Two chimeric proteins, comprising the foldon domain connected at the C-terminal end of four fiber shaft repeats with or without the use of a natural linker sequence, fold into highly stable, SDS-resistant trimers. The structural signatures of the chimeric proteins as seen by CD and infrared spectroscopy are reported. The results suggest that the foldon domain can successfully replace the fiber head domain in ensuring correct trimerization of the shaft sequences. Biological implications and implications for engineering highly stable, beta-structured nanorods are discussed.
The paradox of MHC-DRB exon/intron evolution: alpha-helix and beta-sheet encoding regions diverge while hypervariable intronic simple repeats coevolve with beta-sheet codons.

PubMed

Schwaiger, F W; Weyers, E; Epplen, C; Brün, J; Ruff, G; Crawford, A; Epplen, J T

1993-09-01

Twenty-one different caprine and 13 ovine MHC-DRB exon 2 sequences were determined including part of the adjacent introns containing simple repetitive (gt)n(ga)m elements. The positions for highly polymorphic DRB amino acids vary slightly among ungulates and other mammals. From man and mouse to ungulates the basic (gt)n(ga)m structure is fixed in evolution for 7 × 10^7 years whereas ample variations exist in the tandem (gt)n and (ga)m dinucleotides and especially their "degenerated" derivatives. Phylogenetic trees for the alpha-helices and beta-pleated sheets of the ungulate DRB sequences suggest different evolutionary histories. In hoofed animals as well as in humans DRB beta-sheet encoding sequences and adjacent intronic repeats can be assembled into virtually identical groups suggesting coevolution of noncoding as well as coding DNA. In contrast alpha-helices and C-terminal parts of the first DRB domain evolve distinctly. In the absence of a defined mechanism causing specific, site-directed mutations, double-recombination or gene-conversion-like events would readily explain this fact. The role of the intronic simple (gt)n(ga)m repeat is discussed with respect to these genetic exchange mechanisms during evolution.
Identification of Bari Transposons in 23 Sequenced Drosophila Genomes Reveals Novel Structural Variants, MITEs and Horizontal Transfer

PubMed Central

D’Addabbo, Pietro; Caizzi, Ruggiero

2016-01-01

Bari elements are members of the Tc1-mariner superfamily of DNA transposons, originally discovered in Drosophila melanogaster, and subsequently identified in silico in 11 sequenced Drosophila genomes and as experimentally isolated in four non-sequenced Drosophila species. Bari-like elements have been also studied for their mobility both in vivo and in vitro. We analyzed 23 Drosophila genomes and carried out a detailed characterization of the Bari elements identified, including those from the heterochromatic Bari1 cluster in D. melanogaster. We have annotated 401 copies of Bari elements classified either as putatively autonomous or inactive according to the structure of the terminal sequences and the presence of a complete transposase-coding region. Analyses of the integration sites revealed that Bari transposase prefers AT-rich sequences in which the TA target is cleaved and duplicated. Furthermore evaluation of transposon’s co-occurrence near the integration sites of Bari elements showed a non-random distribution of other transposable elements. We also unveil the existence of a putatively autonomous Bari1 variant characterized by two identical long Terminal Inverted Repeats, in D. rhopaloa. In addition, we detected MITEs related to Bari transposons in 9 species. Phylogenetic analyses based on transposase gene and the terminal sequences confirmed that Bari-like elements are distributed into three subfamilies. A few inconsistencies in Bari phylogenetic tree with respect to the Drosophila species tree could be explained by the occurrence of horizontal transfer events as also suggested by the results of dS analyses. This study further clarifies the Bari transposon’s evolutionary dynamics and increases our understanding on the Tc1-mariner elements’ biology. PMID:27213270
Interactive computer graphics system for structural sizing and analysis of aircraft structures

NASA Technical Reports Server (NTRS)

Bendavid, D.; Pipano, A.; Raibstein, A.; Somekh, E.

1975-01-01

A computerized system for preliminary sizing and analysis of aircraft wing and fuselage structures was described. The system is based upon repeated application of analytical program modules, which are interactively interfaced and sequence-controlled during the iterative design process with the aid of design-oriented graphics software modules. The entire process is initiated and controlled via low-cost interactive graphics terminals driven by a remote computer in a time-sharing mode.
Novel Structure of Ty3 Reverse Transcriptase | Center for Cancer Research

Cancer.gov

Retrotransposons are mobile genetic elements that self-amplify via a single-stranded RNA intermediate, which is converted to double-stranded DNA by an encoded reverse transcriptase (RT) with both DNA polymerase (pol) and ribonuclease H (RNase H) activities. Categorized by whether they contain flanking long terminal repeat (LTR) sequences, retrotransposons play a critical role in the architecture of eukaryotic genomes and are the evolutionary origin of retroviruses, including human immunodeficiency virus (HIV).
Complete genome sequence and architecture of crucian carp Carassius auratus herpesvirus (CaHV).

PubMed

Zeng, Xiao-Tao; Chen, Zhong-Yuan; Deng, Yuan-Sheng; Gui, Jian-Fang; Zhang, Qi-Ya

2016-12-01

Crucian carp Carassius auratus herpesvirus (CaHV) was isolated from diseased crucian carp with acute gill hemorrhages and high mortality. The CaHV genome was sequenced and analyzed. The data showed that it consists of 275,348 bp and contains 150 predicted ORFs. The architecture of the CaHV genome differs from those of four cyprinid herpesviruses (CyHV1, CyHV2, SY-C1, CyHV3), with insertions, deletions and the absence of a terminal direct repeat. Phylogenetic analysis of the DNA polymerase sequences of 17 strains of Herpesvirales members, and the concatenated 12 core ORFs from 10 strains of alloherpesviruses showed that CaHV clustered together with members of the genus Cyprinivirus, family Alloherpesviridae.
Foamy Virus Vector Carries a Strong Insulator in Its Long Terminal Repeat Which Reduces Its Genotoxic Potential

PubMed Central

2017-01-01

ABSTRACT Strong viral enhancers in gammaretrovirus vectors have caused cellular proto-oncogene activation and leukemia, necessitating the use of cellular promoters in “enhancerless” self-inactivating integrating vectors. However, cellular promoters result in relatively low transgene expression, often leading to inadequate disease phenotype correction. Vectors derived from foamy virus, a nonpathogenic retrovirus, show higher preference for nongenic integrations than gammaretroviruses/lentiviruses and preferential integration near transcriptional start sites, like gammaretroviruses. We found that strong viral enhancers/promoters placed in foamy viral vectors caused extremely low immortalization of primary mouse hematopoietic stem/progenitor cells compared to analogous gammaretrovirus/lentivirus vectors carrying the same enhancers/promoters, an effect not explained solely by foamy virus' modest insertional site preference for nongenic regions compared to gammaretrovirus/lentivirus vectors. Using CRISPR/Cas9-mediated targeted insertion of analogous proviral sequences into the LMO2 gene and then measuring LMO2 expression, we demonstrate a sequence-specific effect of foamy virus, independent of insertional bias, contributing to reduced genotoxicity. We show that this effect is mediated by a 36-bp insulator located in the foamy virus long terminal repeat (LTR) that has high-affinity binding to the CCCTC-binding factor. Using our LMO2 activation assay, LMO2 expression was significantly increased when this insulator was removed from foamy virus and significantly reduced when the insulator was inserted into the lentiviral LTR. Our results elucidate a mechanism underlying the low genotoxicity of foamy virus, identify a novel insulator, and support the use of foamy virus as a vector for gene therapy, especially when strong enhancers/promoters are required. 
IMPORTANCE Understanding the genotoxic potential of viral vectors is important in designing safe and efficacious vectors for gene therapy. Self-inactivating vectors devoid of viral long-terminal-repeat enhancers have proven safe; however, transgene expression from cellular promoters is often insufficient for full phenotypic correction. Foamy virus is an attractive vector for gene therapy. We found foamy virus vectors to be remarkably less genotoxic, well below what was expected from their integration site preferences. We demonstrate that the foamy virus long terminal repeats contain an insulator element that binds CCCTC-binding factor and reduces its insertional genotoxicity. Our study elucidates a mechanism behind the low genotoxic potential of foamy virus, identifies a unique insulator, and supports the use of foamy virus as a vector for gene therapy. PMID:29046446
Characterization of the microtubule binding domain of microtubule actin crosslinking factor (MACF): identification of a novel group of microtubule associated proteins.

PubMed

Sun, D; Leung, C L; Liem, R K

2001-01-01

MACF (microtubule actin cross-linking factor) is a large, 608-kDa protein that can associate with both actin microfilaments and microtubules (MTs). Structurally, MACF can be divided into 3 domains: an N-terminal domain that contains both a calponin type actin-binding domain and a plakin domain; a rod domain that is composed of 23 dystrophin-like spectrin repeats; and a C-terminal domain that includes two EF-hand calcium-binding motifs, as well as a region that is homologous to two related proteins, GAR22 and Gas2. We have previously demonstrated that the C-terminal domain of MACF binds to MTs, although no homology was observed between this domain and other known microtubule-binding proteins. In this report, we describe the characterization of this microtubule-binding domain of MACF by transient transfection studies and in vitro binding assays. We found that the C-terminus of MACF contains at least two microtubule-binding regions, a GAR domain and a domain containing glycine-serine-arginine (GSR) repeats. In transfected cells, the GAR domain bound to and partially stabilized MTs to depolymerization by nocodazole. The GSR-containing domain caused MTs to form bundles that are still sensitive to nocodazole-induced depolymerization. When present together, these two domains acted in concert to bundle MTs and render them stable to nocodazole treatment. Recently, a study has shown that the N-terminal half of the plakin domain (called the M1 domain) of MACF also binds MTs. We therefore examined the microtubule binding ability of the M1 domain in the context of the entire plakin domain with and without the remaining N-terminal regions of two different MACF isoforms. Interestingly, in the presence of the surrounding sequences, the M1 domain did not bind MTs. 
In addition to MACF, cDNA sequences encoding the GAR and GSR-containing domains are also found in the partial human EST clone KIAA0728, which has high sequence homology to the 3' end of the MACF cDNA; hence, we refer to it as MACF2. The C-terminal domain of mouse MACF2 was cloned and characterized. The microtubule-binding properties of the MACF2 C-terminal domain are similar to those of MACF. The GAR domain was originally found in Gas2 protein and here we show that it can associate with MTs in transfected cells. Plectin and desmoplakin have GSR-containing domains at their C-termini and we further demonstrate that the GSR-containing domain of plectin, but not desmoplakin, can bind to MTs in vivo.
Molecular identification and characterization of clustered regularly interspaced short palindromic repeat (CRISPR) gene cluster in Taylorella equigenitalis.

PubMed

Hara, Yasushi; Hayashi, Kyohei; Nakajima, Takuya; Kagawa, Shizuko; Tazumi, Akihiro; Moore, John E; Matsuda, Motoo

2013-09-01

Clustered regularly interspaced short palindromic repeats (CRISPRs), of approximately 10,000 base pairs (bp) in length, were shown to occur in the Japanese Taylorella equigenitalis strain, EQ59. The locus was composed of the putative CRISPRs-associated with 5 (cas5), RAMP csd1, csd2, recB, cas1, a leader region, 13 CRISPR consensus sequence repeats (each 32 bp; 5'-TCAGCCACGTTCGCGTGGCTGTGTGTTTAAAG-3'). These were in turn separated by 12 non repetitive unique spacer regions of similar length. In addition, a leader region, a transposase/IS protein, a leader region, and cas3 were also seen. All seven putative open reading frames carry their ribosome binding sites. Promoter consensus sequences at the -35 and -10 regions and putative intrinsic ρ-independent transcription terminator regions also occurred. A possible long overlap of 170 bp in length occurred between the recB and cas1 loci. Positive reverse transcription PCR signals of cas5, RAMP csd1, csd2-recB/cas1, and cas3 were generated. A putative secondary structure of the CRISPR consensus repeats was constructed. Following this, CRISPR results of the T. equigenitalis EQ59 isolate were subsequently compared with those from the Taylorella asinigenitalis MCE3 isolate.
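The array layout described above (13 copies of a 32-bp consensus repeat separated by 12 unique spacers) can be illustrated by splitting an array on the repeat. This is a minimal sketch that assumes every repeat copy matches the consensus exactly; the function name and the spacer sequences are invented for the example, and only the repeat sequence itself is taken from the abstract.

```python
# Consensus repeat as quoted in the abstract (32 bp).
REPEAT = "TCAGCCACGTTCGCGTGGCTGTGTGTTTAAAG"

def extract_spacers(array_seq, repeat=REPEAT):
    """Return the spacers lying between consecutive exact repeat copies."""
    parts = array_seq.upper().split(repeat)
    # parts[0] and parts[-1] are the flanks outside the first and last repeat.
    return parts[1:-1]

# Twelve made-up spacers, used only to build a toy array with 13 repeats.
toy_spacers = [("ACGT"[i % 4] * 4 + "GATTACA") * 3 for i in range(12)]
toy_array = REPEAT + "".join(s + REPEAT for s in toy_spacers)
```

With n repeats the split yields n - 1 spacers, which matches the 13 repeats and 12 spacers reported for strain EQ59.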
Important role of N108 residue in binding of bovine foamy virus transactivator Tas to viral promoters.

PubMed

Bing, Tiejun; Zhang, Suzhen; Liu, Xiaojuan; Liang, Zhibin; Shao, Peng; Zhang, Song; Qiao, Wentao; Tan, Juan

2016-06-30

Bovine foamy virus (BFV) encodes the transactivator BTas, which enhances viral gene transcription by binding to the long terminal repeat promoter and the internal promoter. In this study, we investigated the different replication capacities of two similar BFV full-length DNA clones, pBS-BFV-Y and pBS-BFV-B. Here, functional analysis of several chimeric clones revealed a major role for the C-terminal region of the viral genome in causing this difference. Furthermore, BTas-B, which is located in this C-terminal region, exhibited a 20-fold higher transactivation activity than BTas-Y. Sequence alignment showed that these two sequences differ only at amino acid 108, with BTas-B containing N108 and BTas-Y containing D108 at this position. Results of mutagenesis studies demonstrated that residue N108 is important for BTas binding to viral promoters. In addition, the N108D mutation in pBS-BFV-B reduced the viral replication capacity by about 1.5-fold. Our results suggest that residue N108 is important for BTas binding to BFV promoters and has a major role in BFV replication. These findings not only advance our understanding of the transactivation mechanism of BTas, but they also highlight the importance of certain sequence polymorphisms in modulating the replication capacity of isolated BFV clones.
Draft Sequencing of the Heterozygous Diploid Genome of Satsuma (Citrus unshiu Marc.) Using a Hybrid Assembly Approach

PubMed Central

Shimizu, Tokurou; Tanizawa, Yasuhiro; Mochizuki, Takako; Nagasaki, Hideki; Yoshioka, Terutaka; Toyoda, Atsushi; Fujiyama, Asao; Kaminuma, Eli; Nakamura, Yasukazu

2017-01-01

Satsuma (Citrus unshiu Marc.) is one of the most abundantly produced mandarin varieties of citrus, known for its seedless fruit production and as a breeding parent of citrus. De novo assembly of the heterozygous diploid genome of Satsuma (“Miyagawa Wase”) was conducted by a hybrid assembly approach using short-read sequences, three mate-pair libraries, and a long-read sequence of PacBio by the PLATANUS assembler. The assembled sequence, with a total size of 359.7 Mb at the N50 length of 386,404 bp, consisted of 20,876 scaffolds. Pseudomolecules of Satsuma constructed by aligning the scaffolds to three genetic maps showed genome-wide synteny to the genomes of Clementine, pummelo, and sweet orange. Gene prediction by modeling with MAKER-P proposed 29,024 genes and 37,970 mRNA; additionally, gene prediction analysis found candidates for novel genes in several biosynthesis pathways for gibberellin and violaxanthin catabolism. BUSCO scores for the assembled scaffold and predicted transcripts, and another analysis by BAC end sequence mapping indicated the assembled genome consistency was close to those of the haploid Clementine, pummelo, and sweet orange genomes. The number of repeat elements and long terminal repeat retrotransposons were comparable to those of the seven citrus genomes; this suggested no significant failure in the assembly at the repeat region. A resequencing application using the assembled sequence confirmed that both kunenbo-A and Satsuma are offspring of Kishu, and Satsuma is a back-crossed offspring of Kishu. These results illustrated the performance of the hybrid assembly approach and its ability to construct an accurate heterozygous diploid genome. PMID:29259619
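The N50 length quoted above is a standard assembly statistic: the length of the scaffold at which the cumulative size, counted from the longest scaffold downwards, first reaches half of the total assembly size. A minimal sketch of that calculation follows; the function name and the toy lengths are hypothetical and are not the Satsuma scaffold lengths.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are taken from longest to shortest until they cover at
    least half of the summed length; the last length taken is the N50.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() needs at least one length")

# Toy scaffold lengths for illustration only.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
```

For the toy list the total is 32, and the two longest scaffolds (8 + 8) already reach half of it, so the N50 is 8.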
Major repeat components covering one-third of the ginseng (Panax ginseng C.A. Meyer) genome and evidence for allotetraploidy.

PubMed

Choi, Hong-Il; Waminal, Nomar E; Park, Hye Mi; Kim, Nam-Hoon; Choi, Beom Soon; Park, Minkyu; Choi, Doil; Lim, Yong Pyo; Kwon, Soo-Jin; Park, Beom-Seok; Kim, Hyun Hee; Yang, Tae-Jin

2014-03-01

Ginseng (Panax ginseng) is a famous medicinal herb, but the composition and structure of its genome are largely unknown. Here we characterized the major repeat components and inspected their distribution in the ginseng genome. By analyzing three repeat-rich bacterial artificial chromosome (BAC) sequences from ginseng, we identified complex insertion patterns of 34 long terminal repeat retrotransposons (LTR-RTs) and 11 LTR-RT derivatives accounting for more than 80% of the BAC sequences. The LTR-RTs were classified into three Ty3/gypsy (PgDel, PgTat and PgAthila) and two Ty1/Copia (PgTork and PgOryco) families. Mapping of 30-Gbp Illumina whole-genome shotgun reads to the BAC sequences revealed that these five LTR-RT families occupy at least 34% of the ginseng genome. The Ty3/Gypsy families were predominant, comprising 74 and 33% of the BAC sequences and the genome, respectively. In particular, the PgDel family accounted for 29% of the genome and presumably played major roles in enlargement of the size of the ginseng genome. Fluorescence in situ hybridization (FISH) revealed that the PgDel1 elements are distributed throughout the chromosomes along dispersed heterochromatic regions except for ribosomal DNA blocks. The intensity of the PgDel2 FISH signals was biased toward 24 out of 48 chromosomes. Unique gene probes showed two pairs of signals with different locations, one pair in subtelomeric regions on PgDel2-rich chromosomes and the other in interstitial regions on PgDel2-poor chromosomes, demonstrating allotetraploidy in ginseng. Our findings promote understanding of the evolution of the ginseng genome and of that of related species in the Araliaceae.

The cDNA sequence of mouse Pgp-1 and homology to human CD44 cell surface antigen and proteoglycan core/link proteins.

PubMed

Wolffe, E J; Gause, W C; Pelfrey, C M; Holland, S M; Steinberg, A D; August, J T

1990-01-05

We describe the isolation and sequencing of a cDNA encoding mouse Pgp-1. An oligonucleotide probe corresponding to the NH2-terminal sequence of the purified protein was synthesized by the polymerase chain reaction and used to screen a mouse macrophage lambda gt11 library. A cDNA clone with an insert of 1.2 kilobases was selected and sequenced. In Northern blot analysis, only cells expressing Pgp-1 contained mRNA species that hybridized with this Pgp-1 cDNA. The nucleotide sequence of the cDNA has a single open reading frame that yields a protein-coding sequence of 1076 base pairs followed by a 132-base pair 3'-untranslated sequence that includes a putative polyadenylation signal but no poly(A) tail. The translated sequence comprises a 13-amino acid signal peptide followed by a polypeptide core of 345 residues corresponding to an Mr of 37,800. Portions of the deduced amino acid sequence were identical to those obtained by amino acid sequence analysis from the purified glycoprotein, confirming that the cDNA encodes Pgp-1. The predicted structure of Pgp-1 includes an NH2-terminal extracellular domain (residues 14-265), a transmembrane domain (residues 266-286), and a cytoplasmic tail (residues 287-358). Portions of the mouse Pgp-1 sequence are highly similar to that of the human CD44 cell surface glycoprotein implicated in cell adhesion. The protein also shows sequence similarity to the proteoglycan tandem repeat sequences found in cartilage link protein and cartilage proteoglycan core protein which are thought to be involved in binding to hyaluronic acid.
Sequence and structural implications of a bovine corneal keratan sulfate proteoglycan core protein. Protein 37B represents bovine lumican and proteins 37A and 25 are unique

NASA Technical Reports Server (NTRS)

Funderburgh, J. L.; Funderburgh, M. L.; Brown, S. J.; Vergnes, J. P.; Hassell, J. R.; Mann, M. M.; Conrad, G. W.; Spooner, B. S. (Principal Investigator)

1993-01-01

Amino acid sequence from tryptic peptides of three different bovine corneal keratan sulfate proteoglycan (KSPG) core proteins (designated 37A, 37B, and 25) showed similarities to the sequence of a chicken KSPG core protein lumican. Bovine lumican cDNA was isolated from a bovine corneal expression library by screening with chicken lumican cDNA. The bovine cDNA codes for a 342-amino acid protein, M(r) 38,712, containing amino acid sequences identified in the 37B KSPG core protein. The bovine lumican is 68% identical to chicken lumican, with an 83% identity excluding the N-terminal 40 amino acids. Location of 6 cysteine and 4 consensus N-glycosylation sites in the bovine sequence were identical to those in chicken lumican. Bovine lumican had about 50% identity to bovine fibromodulin and 20% identity to bovine decorin and biglycan. About two-thirds of the lumican protein consists of a series of 10 amino acid leucine-rich repeats that occur in regions of calculated high beta-hydrophobic moment, suggesting that the leucine-rich repeats contribute to beta-sheet formation in these proteins. Sequences obtained from 37A and 25 core proteins were absent in bovine lumican, thus predicting a unique primary structure and separate mRNA for each of the three bovine KSPG core proteins.
Behavioral Context of Call Production by Eastern North Pacific Blue Whales

DTIC Science & Technology

2007-01-25

pairs occurring in a repeated song sequence; B calls from a different blue whale are also evident; spectrogram parameters: fast Fourier transform (FFT...Acoustic data were viewed in spectrogram form (fast Fourier transform [FFT] length 1 s, 80% overlap, Hanning window) to determine the presence of calls...duration to song A and B units (Table 2), but the intermittent timing clearly distinguishes them from song. Whales producing singular calls were
Role of C-terminal residues in oligomerization and stability of lambda CII: implications for lysis-lysogeny decision of the phage.

PubMed

Datta, Ajit Bikram; Roy, Siddhartha; Parrack, Pradeep

2005-01-14

A crucial element in the lysis-lysogeny decision of the temperate coliphage lambda is the phage protein CII, which has several interesting properties. It promotes lysogeny through activation of three phage promoters p(E), p(I) and p(aQ), recognizing a direct repeat sequence TTGCN6TTGC at each. The three-dimensional structure of CII, a homo-tetramer of 97-residue subunits, is unknown. It is an unstable protein in vivo, being rapidly degraded by the host protease HflB (FtsH). This instability is essential for the function of CII in the lysis-lysogeny switch. From NMR and limited proteolysis we show that about 15 C-terminal residues of CII are highly flexible, and may act as a target for proteolysis in vivo. From in vitro transcription, isothermal calorimetry and gel chromatography of CII (1-97) and its truncated fragments CIIA (4-81/82) and CIIB (4-69), we find that residues 70-81/82 are essential for (a) tetramer formation, (b) operator binding and (c) transcription activation. Presumably, tetramerization is necessary for the latter functions. Based on these results, we propose a model for CII structure, in which protein-protein contacts for dimer and tetramer formation are different. The implications of tetrameric organization, essential for CII activity, for the recognition of the direct repeat sequence are discussed.
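The recognition motif TTGCN6TTGC quoted in this abstract is a direct repeat of TTGC separated by six unspecified bases, and it maps directly onto a pattern search. This is a minimal sketch, assuming N stands for any of A, C, G or T; the function name and the test sequence are hypothetical and are not promoter sequences from the study.

```python
import re

# TTGC, any six bases, TTGC. The lookahead lets overlapping sites be reported.
CII_SITE = re.compile(r"(?=(TTGC[ACGT]{6}TTGC))")

def find_cii_sites(seq):
    """Return (0-based start, matched 14-mer) for every TTGCN6TTGC site."""
    return [(m.start(), m.group(1)) for m in CII_SITE.finditer(seq.upper())]

# Made-up sequence containing one site, for illustration only.
example = "AAA" + "TTGC" + "GTACGT" + "TTGC" + "AAA"
```

Because the two TTGC half-sites are in the same orientation, only one strand pattern is needed per site, unlike an inverted repeat, which would read the same on both strands.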
A Cdk9-PP1 switch regulates the elongation-termination transition of RNA polymerase II.

PubMed

Parua, Pabitra K; Booth, Gregory T; Sansó, Miriam; Benjamin, Bradley; Tanny, Jason C; Lis, John T; Fisher, Robert P

2018-06-13

The end of the RNA polymerase II (Pol II) transcription cycle is strictly regulated to prevent interference between neighbouring genes and to safeguard transcriptome integrity. The accumulation of Pol II downstream of the cleavage and polyadenylation signal can facilitate the recruitment of factors involved in mRNA 3'-end formation and termination, but how this sequence is initiated remains unclear. In a chemical-genetic screen, human protein phosphatase 1 (PP1) isoforms were identified as substrates of positive transcription elongation factor b (P-TEFb), also known as the cyclin-dependent kinase 9 (Cdk9)-cyclin T1 (CycT1) complex. Here we show that Cdk9 and PP1 govern phosphorylation of the conserved elongation factor Spt5 in the fission yeast Schizosaccharomyces pombe. Cdk9 phosphorylates both Spt5 and a negative regulatory site on the PP1 isoform Dis2. Sites targeted by Cdk9 in the Spt5 carboxy-terminal domain can be dephosphorylated by Dis2 in vitro, and dis2 mutations retard Spt5 dephosphorylation after inhibition of Cdk9 in vivo. Chromatin immunoprecipitation and sequencing analysis indicates that Spt5 is dephosphorylated as transcription complexes traverse the cleavage and polyadenylation signal, concomitant with the accumulation of Pol II phosphorylated at residue Ser2 of the carboxy-terminal domain consensus heptad repeat. A conditionally lethal Dis2-inactivating mutation attenuates the drop in Spt5 phosphorylation on chromatin, promotes transcription beyond the normal termination zone (as detected by precision run-on transcription and sequencing) and is genetically suppressed by the ablation of Cdk9 target sites in Spt5. These results suggest that the transition of Pol II from elongation to termination coincides with a Dis2-dependent reversal of Cdk9 signalling, a switch that is analogous to a Cdk1-PP1 circuit that controls mitotic progression.
Tn5401, a new class II transposable element from Bacillus thuringiensis.

PubMed Central

Baum, J A

1994-01-01

A new class II (Tn3-like) transposable element, designated Tn5401, was recovered from a sporulation-deficient variant of Bacillus thuringiensis subsp. morrisoni EG2158 following its insertion into a recombinant plasmid. Sequence analysis of the insert revealed a 4,837-bp transposon with two large open reading frames, in the same orientation, encoding proteins of 36 kDa (306 residues) and 116 kDa (1,005 residues) and 53-bp terminal inverted repeats. The deduced amino acid sequence for the 36-kDa protein shows 24% sequence identity with the TnpI recombinase of the B. thuringiensis transposon Tn4430, a member of the phage integrase family of site-specific recombinases. The deduced amino acid sequence for the 116-kDa protein shows 42% sequence identity with the transposase of Tn3 but only 28% identity with the TnpA transposase of Tn4430. Two small open reading frames of unknown function, designated orf1 (85 residues) and orf2 (74 residues), were also identified. Southern blot analysis indicated that Tn5401, in contrast to Tn4430, is not commonly found among different subspecies of B. thuringiensis and is not typically associated with known insecticidal crystal protein genes. Transposition was studied with B. thuringiensis by using plasmid pEG922, a temperature-sensitive shuttle vector containing Tn5401. Tn5401 transposed to both chromosomal and plasmid target sites but displayed an apparent preference for plasmid sites. Transposition was replicative and resulted in the generation of a 5-bp duplication at the target site. Transcriptional start sites within Tn5401 were mapped by primer extension analysis. Two promoters, designated PL and PR, direct the transcription of orf1-orf2 and tnpI-tnpA, respectively, and are negatively regulated by TnpI. Sequence comparison of the promoter regions of Tn5401 and Tn4430 suggests that the conserved sequence element ATGTCCRCTAAY mediates TnpI binding and cointegrate resolution. 
The same element is contained within the 53-bp terminal inverted repeats, thus accounting for their unusual lengths and suggesting an additional role for TnpI in regulating Tn5401 transposition. Images PMID:7514590
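The conserved element ATGTCCRCTAAY is written with IUPAC ambiguity codes (R = A or G, Y = C or T), so locating it in a sequence requires expanding those codes and checking both strands. The following is only a sketch: the function names and test sequences are hypothetical, and it illustrates the notation rather than the sequence comparison performed in the paper.

```python
import re

# Only the IUPAC codes needed for this consensus are listed.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_consensus(seq, consensus="ATGTCCRCTAAY"):
    """Return sorted (start, strand) hits; start is 0-based on the given strand."""
    seq = seq.upper()
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in consensus) + "))")
    n, k = len(seq), len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    # Search the reverse complement and map positions back to the given strand.
    hits += [(n - m.start() - k, "-")
             for m in pattern.finditer(reverse_complement(seq))]
    return sorted(hits)

# Made-up test sequences: one forward-strand site, one reverse-strand site.
forward_example = "GG" + "ATGTCCACTAAC" + "GG"
reverse_example = "TT" + reverse_complement("ATGTCCGCTAAT") + "T"
```

Checking both strands matters here because the abstract reports the element within both terminal inverted repeats, where the two copies necessarily lie in opposite orientations.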
2-D Structure of the A Region of Xist RNA and Its Implication for PRC2 Association

PubMed Central

Maenner, Sylvain; Blaud, Magali; Fouillen, Laetitia; Savoye, Anne; Marchand, Virginie; Dubois, Agnès; Sanglier-Cianférani, Sarah; Van Dorsselaer, Alain; Clerc, Philippe; Avner, Philip; Visvikis, Athanase; Branlant, Christiane

2010-01-01

In placental mammals, inactivation of one of the X chromosomes in female cells ensures sex chromosome dosage compensation. The 17 kb non-coding Xist RNA is crucial to this process and accumulates on the future inactive X chromosome. The most conserved Xist RNA region, the A region, contains eight or nine repeats separated by U-rich spacers. It is implicated in the recruitment of late inactivated X genes to the silencing compartment and likely in the recruitment of complex PRC2. Little is known about the structure of the A region and more generally about Xist RNA structure. Knowledge of its structure is restricted to an NMR study of a single A repeat element. Our study is the first experimental analysis of the structure of the entire A region in solution. By the use of chemical and enzymatic probes and FRET experiments, using oligonucleotides carrying fluorescent dyes, we resolved problems linked to sequence redundancies and established a 2-D structure for the A region that contains two long stem-loop structures each including four repeats. Interactions formed between repeats and between repeats and spacers stabilize these structures. Conservation of the spacer terminal sequences allows formation of such structures in all sequenced Xist RNAs. By combination of RNP affinity chromatography, immunoprecipitation assays, mass spectrometry, and Western blot analysis, we demonstrate that the A region can associate with components of the PRC2 complex in mouse ES cell nuclear extracts. Whilst a single four-repeat motif is able to associate with components of this complex, recruitment of Suz12 is clearly more efficient when the entire A region is present. Our data with their emphasis on the importance of inter-repeat pairing change fundamentally our conception of the 2-D structure of the A region of Xist RNA and support its possible implication in recruitment of the PRC2 complex. PMID:20052282
Human mRNA polyadenylate binding protein: evolutionary conservation of a nucleic acid binding motif.

PubMed Central

Grange, T; de Sa, C M; Oddos, J; Pictet, R

1987-01-01

We have isolated a full-length cDNA coding for the human poly(A) binding protein. The cDNA-derived 73 kd basic translation product has the same Mr, isoelectric point and peptidic map as the poly(A) binding protein. DNA sequence analysis reveals a 70,244 dalton protein. The N terminal part, highly homologous to the yeast poly(A) binding protein, is sufficient for poly(A) binding activity. This domain consists of a four-fold repeated unit of approximately 80 amino acids present in other nucleic acid binding proteins. In the C terminal part there is, as in the yeast protein, a sequence of approximately 150 amino acids, rich in proline, alanine and glutamine which together account for 48% of the residues. A 2.9 kb mRNA corresponding to this cDNA has been detected in several vertebrate cell types and in Drosophila melanogaster at every developmental stage including oogenesis. Images PMID:2885805
Identification of a Novel Virulence Determinant with Serum Opacification Activity in Streptococcus suis

PubMed Central

Baums, Christoph G.; Kaim, Ute; Fulde, Marcus; Ramachandran, Girish; Goethe, Ralph; Valentin-Weigand, Peter

2006-01-01

Streptococcus suis serotype 2 is a porcine and human pathogen with adhesive and invasive properties. In other streptococci, large surface-associated proteins (>100 kDa) of the MSCRAMM family (microbial surface components recognizing adhesive matrix molecules) are key players in interactions with host tissue. In this study, we identified a novel opacity factor of S. suis (OFS) with structural homology to members of the MSCRAMM family. The N-terminal region of OFS is homologous to the respective regions of fibronectin-binding protein A (FnBA) of Streptococcus dysgalactiae and the serum opacity factor (SOF) of Streptococcus pyogenes. Similar to these two proteins, the N-terminal domain of OFS opacified horse serum. Serum opacification activity was detectable in sodium dodecyl sulfate extracts of wild-type S. suis but not in extracts of isogenic ofs knockout mutants. Heterologous expression of OFS in Lactococcus lactis demonstrated that a high level of expression of OFS is sufficient to provide surface-associated serum opacification activity. Furthermore, serum opacification could be inhibited by an antiserum against recombinant OFS. The C-terminal repetitive sequence elements of OFS differed significantly from the respective repeat regions of FnBA and SOF as well as from the consensus sequence of the fibronectin-binding repeats of MSCRAMMs. Accordingly, fibronectin binding was not detectable in recombinant OFS. To investigate the putative function of OFS in the pathogenesis of invasive S. suis diseases, piglets were experimentally infected with an isogenic mutant strain in which the ofs gene had been knocked out by an in-frame deletion. The mutant was severely attenuated in virulence but not in colonization, demonstrating that OFS represents a novel virulence determinant of S. suis. PMID:17057090
Myelodysplastic syndromes and acute myeloid leukemia in cats infected with feline leukemia virus clone33 containing a unique long terminal repeat.

PubMed

Hisasue, Masaharu; Nagashima, Naho; Nishigaki, Kazuo; Fukuzawa, Isao; Ura, Shigeyoshi; Katae, Hiromi; Tsuchiya, Ryo; Yamada, Takatsugu; Hasegawa, Atsuhiko; Tsujimoto, Hajime

2009-03-01

Feline leukemia virus (FeLV) clone33 was obtained from a domestic cat with acute myeloid leukemia (AML). The long terminal repeat (LTR) of this virus, like the LTRs present in FeLV from other cats with AML, differs from the LTRs of other known FeLV in that it has 3 tandem direct 47-bp repeats in the upstream region of the enhancer (URE). Here, we injected cats with FeLV clone33 and found 41% developed myelodysplastic syndromes (MDS) characterized by peripheral blood cytopenias and dysplastic changes in the bone marrow. Some of the cats with MDS eventually developed AML. The bone marrow of the majority of cats with FeLV clone33-induced MDS produced fewer erythroid and myeloid colonies upon being cultured with erythropoietin or granulocyte-macrophage colony-stimulating factor (GM-CSF) than bone marrow from normal control cats. Furthermore, the bone marrow of some of the cats expressed high levels of the apoptosis-related genes TNF-alpha and survivin. Analysis of the proviral sequences obtained from 13 cats with naturally occurring MDS revealed that they also bear the characteristic URE repeats seen in the LTR of FeLV clone33 and other proviruses from cats with AML. Deletions and mutations within the enhancer elements are frequently observed in naturally occurring MDS as well as AML. These results suggest that FeLV variants that bear URE repeats in their LTR strongly associate with the induction of both MDS and AML in cats.
Comparative Genomic Analysis Reveals Multiple Long Terminal Repeats, Lineage-Specific Amplification, and Frequent Interelement Recombination for Cassandra Retrotransposon in Pear (Pyrus bretschneideri Rehd.)

PubMed Central

Yin, Hao; Du, Jianchang; Li, Leiting; Jin, Cong; Fan, Lian; Li, Meng; Wu, Jun; Zhang, Shaoling

2014-01-01

Cassandra transposable elements belong to a specific group of terminal-repeat retrotransposons in miniature (TRIM). Although Cassandra TRIM elements have been found in almost all vascular plants, detailed investigations on the nature, abundance, amplification timeframe, and evolution have not been performed in an individual genome. We therefore conducted a comprehensive analysis of Cassandra retrotransposons using the newly sequenced pear genome along with four other Rosaceae species, including apple, peach, mei, and woodland strawberry. Our data reveal several interesting findings for this particular retrotransposon family: 1) A large number of the intact copies contain three, four, or five long terminal repeats (LTRs) (∼20% in pear); 2) intact copies and solo LTRs with or without target site duplications are both common (∼80% vs. 20%) in each genome; 3) the elements exhibit an overall unbiased distribution among the chromosomes; 4) the elements are most successfully amplified in pear (5,032 copies); and 5) the evolutionary relationships of these elements vary among different lineages, species, and evolutionary time. These results indicate that Cassandra retrotransposons contain more complex structures (elements with multiple LTRs) than what we have known previously, and that frequent interelement unequal recombination followed by transposition may play a critical role in shaping and reshaping host genomes. Thus this study provides insights into the property, propensity, and molecular mechanisms governing the formation and amplification of Cassandra retrotransposons, and enhances our understanding of the structural variation, evolutionary history, and transposition process of LTR retrotransposons in plants. PMID:24899073
Genes Altered by Intracisternal A Particles in Mouse Mammary Tumorigenesis

DTIC Science & Technology

1997-07-01

mouse Mus musculus as well as most other rodents (1). They are defective retroviruses which contain 3’ and 5’ long terminal repeat (LTR) sequences and... musculus (C57BL/6J) X Mus spretus backcross was obtained for The Jackson Laboratory (Bar Harbor, Maine) and used for localization of the pl7b(kokopelli...understand the nature of the potential mutation found in the tumors I decided to localize pl7b within the mouse genome. I screened a Mus musculus musculus X
Chlorovirus Skp1-binding ankyrin repeat protein interplay and mimicry of cellular ubiquitin ligase machinery.

PubMed

Noel, Eric A; Kang, Ming; Adamec, Jiri; Van Etten, James L; Oyler, George A

2014-12-01

The ubiquitin-proteasome system is targeted by many viruses that have evolved strategies to redirect host ubiquitination machinery. Members of the genus Chlorovirus are proposed to share an ancestral lineage with a broader group of related viruses, nucleo-cytoplasmic large DNA viruses (NCLDV). Chloroviruses encode an Skp1 homolog and ankyrin repeat (ANK) proteins. Several chlorovirus-encoded ANK repeats contain C-terminal domains characteristic of cellular F-boxes or related NCLDV chordopox PRANC (pox protein repeats of ankyrin at C-terminal) domains. These observations suggested that this unique combination of Skp1 and ANK repeat proteins might form complexes analogous to the cellular Skp1-Cul1-F-box (SCF) ubiquitin ligase complex. We identified two ANK proteins from the prototypic chlorovirus Paramecium bursaria chlorella virus-1 (PBCV-1) that functioned as binding partners for the virus-encoded Skp1, proteins A682L and A607R. These ANK proteins had a C-terminal Skp1 interactional motif that functioned similarly to cellular F-box domains. A C-terminal motif of ANK protein A682L binds Skp1 proteins from widely divergent species. Yeast two-hybrid analyses using serial domain deletion constructs confirmed the C-terminal localization of the Skp1 interactional motif in PBCV-1 A682L. ANK protein A607R represents an ANK family with one member present in all 41 sequenced chloroviruses. A comprehensive phylogenetic analysis of these related ANK and viral Skp1 proteins suggested partnered function tailored to the host alga or common ancestral heritage. Here, we show protein-protein interaction between corresponding family clusters of virus-encoded ANK and Skp1 proteins from three chlorovirus types. Collectively, our results indicate that chloroviruses have evolved complementing Skp1 and ANK proteins that mimic cellular SCF-associated proteins. Viruses have evolved ways to direct ubiquitination events in order to create environments conducive to their replication. 
As reported in the manuscript, the large chloroviruses encode several components involved in the SCF ubiquitin ligase complex including a viral Skp1 homolog. Studies on how chloroviruses manipulate their host algal ubiquitination system will provide insights toward viral protein mimicry, substrate recognition, and key interactive domains controlling selective protein degradation. These findings may also further understanding of the evolution of other large DNA viruses, like poxviruses, that are reported to share the same monophyly lineage as chloroviruses. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
TIR-NBS-LRR genes are rare in monocots: evidence from diverse monocot orders

PubMed Central

Tarr, D Ellen K; Alexander, Helen M

2009-01-01

Background Plant resistance (R) gene products recognize pathogen effector molecules. Many R genes code for proteins containing nucleotide binding site (NBS) and C-terminal leucine-rich repeat (LRR) domains. NBS-LRR proteins can be divided into two groups, TIR-NBS-LRR and non-TIR-NBS-LRR, based on the structure of the N-terminal domain. Although both classes are clearly present in gymnosperms and eudicots, only non-TIR sequences have been found consistently in monocots. Since most studies in monocots have been limited to agriculturally important grasses, it is difficult to draw conclusions. The purpose of our study was to look for evidence of these sequences in additional monocot orders. Findings Using degenerate PCR, we amplified NBS sequences from four monocot species (C. blanda, D. marginata, S. trifasciata, and Spathiphyllum sp.), a gymnosperm (C. revoluta) and a eudicot (C. canephora). We successfully amplified TIR-NBS-LRR sequences from dicot and gymnosperm DNA, but not from monocot DNA. Using databases, we obtained NBS sequences from additional monocots, magnoliids and basal angiosperms. TIR-type sequences were not present in monocot or magnoliid sequences, but were present in the basal angiosperms. Phylogenetic analysis supported a single TIR clade and multiple non-TIR clades. Conclusion We were unable to find monocot TIR-NBS-LRR sequences by PCR amplification or database searches. In contrast to previous studies, our results represent five monocot orders (Poales, Zingiberales, Arecales, Asparagales, and Alismatales). Our results establish the presence of TIR-NBS-LRR sequences in basal angiosperms and suggest that although these sequences were present in early land plants, they have been reduced significantly in monocots and magnoliids. PMID:19785756
A Legionella pneumophila collagen-like protein encoded by a gene with a variable number of tandem repeats is involved in the adherence and invasion of host cells.

PubMed

Vandersmissen, Liesbeth; De Buck, Emmy; Saels, Veerle; Coil, David A; Anné, Jozef

2010-05-01

Legionella pneumophila is a Gram-negative, facultative intracellular pathogen and the causative agent of Legionnaires' disease, a severe pneumonia in humans. Analysis of the Legionella sequenced genomes revealed a gene with a variable number of tandem repeats (VNTRs), whose number varies between strains. We examined the strain distribution of this gene among a collection of 108 clinical, environmental and hot spring serotype I strains. Twelve variants were identified, but no correlation was observed between the number of repeat units and clinical and environmental strains. The encoded protein contains the C-terminal consensus motif of outer membrane proteins and has a large region of collagen-like repeats that is encoded by the VNTR region. We have therefore annotated this protein Lcl for Legionella collagen-like protein. Lcl was shown to contribute to the adherence and invasion of host cells and it was demonstrated that the number of repeat units present in lcl had an influence on these adhesion characteristics.
Comparative Sequence and X-Inactivation Analyses of a Domain of Escape in Human Xp11.2 and the Conserved Segment in Mouse

PubMed Central

Tsuchiya, Karen D.; Greally, John M.; Yi, Yajun; Noel, Kevin P.; Truong, Jean-Pierre; Disteche, Christine M.

2004-01-01

We have performed X-inactivation and sequence analyses on 350 kb of sequence from human Xp11.2, a region shown previously to contain a cluster of genes that escape X inactivation, and we compared this region with the region of conserved synteny in mouse. We identified several new transcripts from this region in human and in mouse, which defined the full extent of the domain escaping X inactivation in both species. In human, escape from X inactivation involves an uninterrupted 235-kb domain of multiple genes. Despite highly conserved gene content and order between the two species, Smcx is the only mouse gene from the conserved segment that escapes inactivation. As repetitive sequences are believed to facilitate spreading of X inactivation along the chromosome, we compared the repetitive sequence composition of this region between the two species. We found that long terminal repeats (LTRs) were decreased in the human domain of escape, but not in the majority of the conserved mouse region adjacent to Smcx in which genes were subject to X inactivation, suggesting that these repeats might be excluded from escape domains to prevent spreading of silencing. Our findings indicate that genomic context, as well as gene-specific regulatory elements, interact to determine expression of a gene from the inactive X-chromosome. PMID:15197169
Investigating intermolecular forces associated with thrombus initiation using optical tweezers

NASA Astrophysics Data System (ADS)

Arya, Maneesh; Lopez, Jose A.; Romo, Gabriel M.; Dong, Jing-Fei; McIntire, Larry V.; Moake, Joel L.; Anvari, Bahman

2002-05-01

Thrombus formation occurs when a platelet membrane receptor, glycoprotein (GP) Ib-IX-V complex, binds to its ligand, von Willebrand factor (vWf), in the subendothelium or plasma. To determine which GP Ib-IX-V amino acid sequences are critical for bond formation, we have used optical tweezers to measure forces involved in the binding of vWf to GP Ib-IX-V variants. Inasmuch as GP Ib(alpha) subunit is the primary component in human GP Ib-IX-V complex that binds to vWf, and that canine GP Ib(alpha), on the other hand, does not bind to human vWf, we progressively replaced human GP Ib(alpha) amino acid sequences with canine GP Ib(alpha) sequences to determine the sequences essential for vWf/GP Ib(alpha) binding. After measuring the adhesive forces between optically trapped, vWf-coated beads and GP Ib(alpha) variants expressed on mammalian cells, we determined that leucine-rich repeat 2 of GP Ib(alpha) was necessary for vWf/GP Ib-IX-V bond formation. We also found that deletion of the N-terminal flanking sequence and leucine-rich repeat 1 reduced adhesion strength to vWf but did not abolish binding. While divalent cations are known to influence binding of vWf, addition of 1 mM CaCl2 had no effect on measured vWf/GP Ib(alpha) bond strengths.
Size and sequence polymorphisms in the glutamate-rich protein gene of the human malaria parasite Plasmodium falciparum in Thailand.

PubMed

Pattaradilokrat, Sittiporn; Trakoolsoontorn, Chawinya; Simpalipan, Phumin; Warrit, Natapot; Kaewthamasorn, Morakot; Harnyuttanakorn, Pongchai

2018-01-22

The glutamate-rich protein (GLURP) of the malaria parasite Plasmodium falciparum is a key surface antigen that serves as a component of a clinical vaccine. Moreover, the GLURP gene is also employed routinely as a genetic marker for malarial genotyping in epidemiological studies. While extensive size polymorphisms in GLURP are well recorded, the extent of the sequence diversity of this gene is rarely investigated. The present study aimed to explore the genetic diversity of GLURP in natural populations of P. falciparum. Sequences of the polymorphic C-terminal repetitive R2 region of GLURP from 65 P. falciparum isolates in Thailand were generated and combined with the data from 103 worldwide isolates to generate a GLURP database. The collection comprised 168 alleles, encoding 105 unique GLURP subtypes, characterized by 18 types of amino acid repeat units (AAU). Of these, 28 GLURP subtypes, formed by 10 AAU types, were detected in P. falciparum in Thailand. Among them, 19 GLURP subtypes and 2 AAU types are described for the first time in the Thai parasite population. The AAU sequences were highly conserved, which is likely due to negative selection. Standard Fst analysis revealed the shared distributions of GLURP types among the P. falciparum populations, providing evidence of gene flow among the different demographic populations. Sequence diversity causing size variations in GLURP in Thai P. falciparum populations was detected and was caused by non-synonymous substitutions in repeat units and some insertions/deletions of aspartic acid or glutamic acid codons between repeat units. The P. falciparum population structure based on GLURP showed promising implications for the development of GLURP-based vaccines and for monitoring vaccine efficacy.
Identification and Characterization of Functionally Critical, Conserved Motifs in the Internal Repeats and N-terminal Domain of Yeast Translation Initiation Factor 4B (yeIF4B)*

PubMed Central

Zhou, Fujun; Walker, Sarah E.; Mitchell, Sarah F.; Lorsch, Jon R.; Hinnebusch, Alan G.

2014-01-01

eIF4B has been implicated in attachment of the 43 S preinitiation complex (PIC) to mRNAs and scanning to the start codon. We recently determined that the internal seven repeats (of ∼26 amino acids each) of Saccharomyces cerevisiae eIF4B (yeIF4B) compose the region most critically required to enhance mRNA recruitment by 43 S PICs in vitro and stimulate general translation initiation in yeast. Moreover, although the N-terminal domain (NTD) of yeIF4B contributes to these activities, the RNA recognition motif is dispensable. We have now determined that only two of the seven internal repeats are sufficient for wild-type (WT) yeIF4B function in vivo when all other domains are intact. However, three or more repeats are needed in the absence of the NTD or when the functions of eIF4F components are compromised. We corroborated these observations in the reconstituted system by demonstrating that yeIF4B variants with only one or two repeats display substantial activity in promoting mRNA recruitment by the PIC, whereas additional repeats are required at lower levels of eIF4A or when the NTD is missing. These findings indicate functional overlap among the 7-repeats and NTD domains of yeIF4B and eIF4A in mRNA recruitment. Interestingly, only three highly conserved positions in the 26-amino acid repeat are essential for function in vitro and in vivo. Finally, we identified conserved motifs in the NTD and demonstrate functional overlap of two such motifs. These results provide a comprehensive description of the critical sequence elements in yeIF4B that support eIF4F function in mRNA recruitment by the PIC. PMID:24285537
Molecular Dynamics Simulations of DNA-Free and DNA-Bound TAL Effectors

PubMed Central

Wan, Hua; Hu, Jian-ping; Li, Kang-shun; Tian, Xu-hong; Chang, Shan

2013-01-01

TAL (transcriptional activator-like) effectors (TALEs) are DNA-binding proteins, containing a modular central domain that recognizes specific DNA sequences. Recently, the crystallographic studies of TALEs revealed the structure of DNA-recognition domain. In this article, molecular dynamics (MD) simulations are employed to study two crystal structures of an 11.5-repeat TALE, in the presence and absence of DNA, respectively. The simulated results indicate that the specific binding of RVDs (repeat-variable diresidues) with DNA leads to the markedly reduced fluctuations of tandem repeats, especially at the two ends. In the DNA-bound TALE system, the base-specific interaction is formed mainly by the residue at position 13 within a TAL repeat. Tandem repeats with weak RVDs are unfavorable for the TALE-DNA binding. These observations are consistent with experimental studies. By using principal component analysis (PCA), the dominant motions are open-close movements between the two ends of the superhelical structure in both DNA-free and DNA-bound TALE systems. The open-close movements are found to be critical for the recognition and binding of TALE-DNA based on the analysis of free energy landscape (FEL). The conformational analysis of DNA indicates that the 5′ end of DNA target sequence has more remarkable structural deformability than the other sites. Meanwhile, the conformational change of DNA is likely associated with the specific interaction of TALE-DNA. We further suggest that the arrangement of N-terminal repeats with strong RVDs may help in the design of efficient TALEs. This study provides some new insights into the understanding of the TALE-DNA recognition mechanism. PMID:24130757

Structural Analyses of the Ankyrin Repeat Domain of TRPV6 and Related TRPV Ion Channels

DOE Office of Scientific and Technical Information (OSTI.GOV)

Phelps, C.B.; Huang, R.J.; Lishko, P.V.

2008-06-03

Transient receptor potential (TRP) proteins are cation channels composed of a transmembrane domain flanked by large N- and C-terminal cytoplasmic domains. All members of the vanilloid family of TRP channels (TRPV) possess an N-terminal ankyrin repeat domain (ARD). The ARD of mammalian TRPV6, an important regulator of calcium uptake and homeostasis, is essential for channel assembly and regulation. The 1.7 Å crystal structure of the TRPV6-ARD reveals conserved structural elements unique to the ARDs of TRPV proteins. First, a large twist between the fourth and fifth repeats is induced by residues conserved in all TRPV ARDs. Second, the third finger loop is the most variable region in sequence, length and conformation. In TRPV6, a number of putative regulatory phosphorylation sites map to the base of this third finger. Size exclusion chromatography and crystal packing indicate that the TRPV6-ARD does not assemble as a tetramer and is monomeric in solution. Adenosine triphosphate-agarose and calmodulin-agarose pull-down assays show that the TRPV6-ARD does not interact with either ligand, indicating a different functional role for the TRPV6-ARD than in the paralogous thermosensitive TRPV1 channel. Similar biochemical findings are also presented for the highly homologous mammalian TRPV5-ARD. The implications of the structural and biochemical data on the role of the ankyrin repeats in different TRPV channels are discussed.
A mammary cell-specific enhancer in mouse mammary tumor virus DNA is composed of multiple regulatory elements including binding sites for CTF/NFI and a novel transcription factor, mammary cell-activating factor.

PubMed Central

Mink, S; Härtig, E; Jennewein, P; Doppler, W; Cato, A C

1992-01-01

Mouse mammary tumor virus (MMTV) is a milk-transmitted retrovirus involved in the neoplastic transformation of mouse mammary gland cells. The expression of this virus is regulated by mammary cell type-specific factors, steroid hormones, and polypeptide growth factors. Sequences for mammary cell-specific expression are located in an enhancer element in the extreme 5' end of the long terminal repeat region of this virus. This enhancer, when cloned in front of the herpes simplex thymidine kinase promoter, endows the promoter with mammary cell-specific response. Using functional and DNA-protein-binding studies with constructs mutated in the MMTV long terminal repeat enhancer, we have identified two main regulatory elements necessary for the mammary cell-specific response. These elements consist of binding sites for a transcription factor in the family of CTF/NFI proteins and the transcription factor mammary cell-activating factor (MAF) that recognizes the sequence G Pu Pu G C/G A A G G/T. Combinations of CTF/NFI- and MAF-binding sites or multiple copies of either one of these binding sites but not solitary binding sites mediate mammary cell-specific expression. The functional activities of these two regulatory elements are enhanced by another factor that binds to the core sequence ACAAAG. Interdigitated binding sites for CTF/NFI, MAF, and/or the ACAAAG factor are also found in the 5' upstream regions of genes encoding whey milk proteins from different species. These findings suggest that mammary cell-specific regulation is achieved by a concerted action of factors binding to multiple regulatory sites. Images PMID:1328867
Polysaccharides from heterocyst and spore envelopes of a blue-green alga. [Anabaena cylindrica]

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cardemil, L.; Wolk, C.P.

The polysaccharides from the envelopes of heterocysts and spores of Anabaena cylindrica consist of repeating units containing 1 mannosyl and 3 glucosyl residues, all linked by β(1→3) glucosidic bonds, with glucose, xylose, galactose, and mannose present in side branches. Degradation of the polysaccharides with specific glycosidases has permitted identification of the linkages to almost all of the branches. When the polysaccharides, from which all but two types of side branches had been cleaved, were digested with a β(1→3) endoglucanase, glucose, a tri-, and a pentasaccharide were produced. The oligosaccharide products were identified. The backbones of the polysaccharides were sequenced from the reducing terminus by a modified Smith degradation. Analysis with NaB³H₄ at each stage of the degradation showed that the backbones terminate in the sequence Man-Glc-Glc-Glc and are therefore presumed to have the structure (Man-Glc-Glc-Glc)ₙ, and that they contain an average of from 128 to 150 sugar residues. From the information obtained, the repeating sequences of the original polysaccharides from the two types of differentiated cells of A. cylindrica could be largely deduced and appeared to be identical.
Molecular Cytogenetic Analysis of Deschampsia antarctica Desv. (Poaceae), Maritime Antarctic

PubMed Central

Amosova, Alexandra V.; Bolsheva, Nadezhda L.; Samatadze, Tatiana E.; Twardovska, Maryana O.; Zoshchuk, Svyatoslav A.; Andreev, Igor O.; Badaeva, Ekaterina D.; Kunakh, Viktor A.; Muravenko, Olga V.

2015-01-01

Deschampsia antarctica Desv. (Poaceae) (2n = 26) is one of the two vascular plants adapted to the harshest environment of the Antarctic. Although the species is a valuable model for study of environmental stress tolerance in plants, its karyotype is still poorly investigated. We conducted the first comprehensive molecular cytogenetic analysis of D. antarctica collected on four islands of the Maritime Antarctic. D. antarctica karyotypes were studied by Giemsa C- and DAPI/C-banding, Ag-NOR staining, multicolour fluorescence in situ hybridization with repeated DNA probes (pTa71, pTa794, telomere repeats, pSc119.2, pAs1) and the GAA simple sequence repeat probe. We also performed sequential rapid in situ hybridization with genomic DNA of D. caespitosa. Two chromosome pairs bearing transcriptionally active 45S rDNA loci and five pairs with 5S rDNA sites were detected. A weak intercalary site of telomere repeats was revealed on the largest chromosome in addition to telomere hybridization signals at terminal positions. This fact indirectly supports the hypothesis that chromosome fusion might have been the cause of the chromosome number in this species, which is unusual for cereals. Based on patterns of distribution of the examined molecular cytogenetic markers, all chromosomes in karyotypes were identified, and chromosome idiograms of D. antarctica were constructed. B chromosomes were found in most karyotypes of plants from Darboux Island. A mixoploid plant with mainly triploid cells bearing a Robertsonian rearrangement was detected among typical diploid specimens from Great Jalour Island. The karyotype variability found in D. antarctica is probably an expression of genome instability induced by environmental stress factors. The differences in C-banding patterns and in chromosome distribution of rDNA loci as well as homologous highly repeated DNA sequences detected between genomes of D. antarctica and its related species D. caespitosa indicate that genome reorganization involving coding and noncoding repeated DNA sequences had occurred during the divergence of these species. PMID:26394331
Molecular evolution of pentatricopeptide repeat genes reveals truncation in species lacking an editing target and structural domains under distinct selective pressures.

PubMed

Hayes, Michael L; Giang, Karolyn; Mulligan, R Michael

2012-05-14

Pentatricopeptide repeat (PPR) proteins are required for numerous RNA processing events in plant organelles including C-to-U editing, splicing, stabilization, and cleavage. Fifteen PPR proteins are known to be required for RNA editing at 21 sites in Arabidopsis chloroplasts, and belong to the PLS class of PPR proteins. In this study, we investigate the co-evolution of four PPR genes (CRR4, CRR21, CLB19, and OTP82) and their six editing targets in Brassicaceae species. PPR genes are composed of approximately 10 to 20 tandem repeats and each repeat has two α-helical regions, helix A and helix B, that are separated by short coil regions. Each repeat and structural feature was examined to determine the selective pressures on these regions. All of the PPR genes examined are under strong negative selection. Multiple independent losses of editing site targets are observed for both CRR21 and OTP82. In several species lacking the known editing target for CRR21, PPR genes are truncated near the 17th PPR repeat. The coding sequences of the truncated CRR21 genes are maintained under strong negative selection; however, the 3' UTR sequences beyond the truncation site have substantially diverged. Phylogenetic analyses of four PPR genes show that sequences corresponding to helix A are high compared to helix B sequences. Differential evolutionary selection of helix A versus helix B is observed in both plant and mammalian PPR genes. PPR genes and their cognate editing sites are mutually constrained in evolution. Editing sites are frequently lost by replacement of an edited C with a genomic T. After the loss of an editing site, the PPR genes are observed with three outcomes: first, few changes are detected in some cases; second, the PPR gene is present as a pseudogene; and third, the PPR gene is present but truncated in the C-terminal region. 
The retention of truncated forms of CRR21 that are maintained under strong negative selection even in the absence of an editing site target suggests that unrecognized function(s) might exist for this PPR protein. PPR gene sequences that encode helix A are under strong selection, and could be involved in RNA substrate recognition.
Characterisation of IS153, an IS3-family insertion sequence isolated from Lactobacillus sanfranciscensis and its use for strain differentiation.

PubMed

Ehrmann, M A; Vogel, R E

2001-11-01

An insertion sequence has been identified in the genome of Lactobacillus sanfranciscensis DSM 20451T as a segment of 1351 nucleotides containing 37-bp imperfect terminal inverted repeats. The sequence of this element encodes two out-of-phase, overlapping open reading frames, orfA and orfB, from which three putative proteins are produced. OrfAB is a transframe protein produced by -1 translational frameshifting between orfA and orfB and is presumed to be the transposase. The large orfAB of this element encodes a 342-amino-acid protein that displays similarities with transposases encoded by bacterial insertion sequences belonging to the IS3 family. In the L. sanfranciscensis type strain DSM 20451T, multiple truncated IS elements were identified. Inverse PCR was used to analyze the target sites of four of these elements, but apart from their highly AT-rich character, no sequence specificity has been identified so far. Moreover, no flanking direct repeats were identified. Multiple copies of IS153 were detected by hybridization in other strains of L. sanfranciscensis. The resulting hybridization patterns were shown to differentiate between organisms at the strain level, unlike a probe targeted against the 16S rDNA. With a PCR-based approach, IS153 or highly similar sequences were detected in L. acidophilus, L. casei, L. malefermentans, L. plantarum, L. hilgardii, L. collinoides, L. farciminis, L. sakei, L. salivarius and L. reuteri, as well as in Enterococcus faecium, Pediococcus acidilactici and P. pentosaceus.
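The idea of imperfect terminal inverted repeats in the abstract above can be illustrated with a minimal Python sketch. It compares the first k bases of an element with the reverse complement of its last k bases and counts mismatches. The function names and the toy sequences are hypothetical; they are not taken from the IS153 study, and the actual IS153 sequence is not reproduced here.

```python
# Complement table for unambiguous DNA bases only (an assumption of this sketch).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def terminal_inverted_repeat_mismatches(seq, k):
    """Count mismatches between the first k bases and the reverse
    complement of the last k bases. Zero means a perfect inverted
    repeat of length k; a small positive count means an imperfect one."""
    left = seq[:k].upper()
    right_rc = reverse_complement(seq[-k:])
    return sum(a != b for a, b in zip(left, right_rc))


# Toy elements (hypothetical): 7-base ends flanking a 6-base core.
perfect = "TGAACCG" + "AAAAAA" + "CGGTTCA"
imperfect = "TGAACCG" + "AAAAAA" + "CGGTACA"
print(terminal_inverted_repeat_mismatches(perfect, 7))
print(terminal_inverted_repeat_mismatches(imperfect, 7))
```

A real search would also have to allow for gaps and would scan for the repeat length rather than take k as given; this sketch only handles ungapped ends of a known length.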
Enhancement of single guide RNA transcription for efficient CRISPR/Cas-based genomic engineering.

PubMed

Ui-Tei, Kumiko; Maruyama, Shohei; Nakano, Yuko

2017-06-01

Genomic engineering using clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) protein is a promising approach for targeting the genomic DNA of virtually any organism in a sequence-specific manner. Recent remarkable advances in CRISPR/Cas technology have made it a feasible system for use in therapeutic applications and biotechnology. In the CRISPR/Cas system, a guide RNA (gRNA), interacting with the Cas protein, recognizes a genomic region with sequence complementarity, and the double-stranded DNA at the target site is cleaved by the Cas protein. A widely used gRNA is an RNA polymerase III (pol III)-driven single gRNA (sgRNA), which is produced by artificial fusion of CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA). However, we identified a TTTT stretch, known as a termination signal of RNA pol III, in the scaffold region of the sgRNA. Here, we show that an sgRNA carrying a TTTT stretch is transcribed less efficiently because of premature transcriptional termination, which decreases the efficiency of genome editing. Unexpectedly, the prematurely terminated sgRNA may also have the adverse effect of inducing RNA interference. These disadvantageous effects were avoided by substituting one base in the TTTT stretch.
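As a rough illustration of the check described in the abstract above, the following Python sketch scans a sequence for runs of four or more consecutive T residues and shows that a single-base substitution removes such a run. The function names and the toy sequence are hypothetical, and the position and base chosen for the substitution are arbitrary; the abstract does not say which base the authors changed.

```python
import re


def find_t_runs(seq, min_len=4):
    """Return (start, length) for every run of at least min_len
    consecutive T residues. U is treated as T so RNA input also works."""
    dna = seq.upper().replace("U", "T")
    pattern = r"T{%d,}" % min_len
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, dna)]


def substitute_base(seq, pos, base):
    """Return a copy of seq with the residue at index pos replaced."""
    return seq[:pos] + base + seq[pos + 1:]


# Toy fragment (hypothetical), not an actual sgRNA scaffold.
fragment = "GTTTTAGAGCTA"
print(find_t_runs(fragment))

# Replace the second T of the run with C, then re-check.
fixed = substitute_base(fragment, 2, "C")
print(fixed, find_t_runs(fixed))
```

In a real scaffold the substituted base would also need a compensating change on the paired strand if it lies in a stem; this sketch does not model secondary structure.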
Recent Amplification of the Kangaroo Endogenous Retrovirus, KERV, Limited to the Centromere

PubMed Central

Ferreri, Gianni C.; Brown, Judith D.; Obergfell, Craig; Jue, Nathaniel; Finn, Caitlin E.; O'Neill, Michael J.; O'Neill, Rachel J.

2011-01-01

Mammalian retrotransposons, transposable elements that are processed through an RNA intermediate, are categorized as short interspersed elements (SINEs), long interspersed elements (LINEs), and long terminal repeat (LTR) retroelements, which include endogenous retroviruses. The ability of transposable elements to autonomously amplify led to their initial characterization as selfish or junk DNA; however, it is now known that they may acquire specific cellular functions in a genome and are implicated in host defense mechanisms as well as in genome evolution. Interactions between classes of transposable elements may exert a markedly different and potentially more significant effect on a genome than interactions between members of a single class of transposable elements. We examined the genomic structure and evolution of the kangaroo endogenous retrovirus (KERV) in the marsupial genus Macropus. The complete proviral structure of the kangaroo endogenous retrovirus, phylogenetic relationship among relative retroviruses, and expression of this virus in both Macropus rufogriseus and M. eugenii are presented for the first time. In addition, we show the relative copy number and distribution of the kangaroo endogenous retrovirus in the Macropus genus. Our data indicate that amplification of the kangaroo endogenous retrovirus occurred in a lineage-specific fashion, is restricted to the centromeres, and is not correlated with LINE depletion. Finally, analysis of KERV long terminal repeat sequences using massively parallel sequencing indicates that the recent amplification in M. rufogriseus is likely due to duplications and concerted evolution rather than a high number of independent insertion events. PMID:21389136
Role of the terminator hairpin in the biogenesis of functional Hfq-binding sRNAs

PubMed Central

Morita, Teppei; Nishino, Ryo; Aiba, Hiroji

2017-01-01

Rho-independent transcription terminators of the genes encoding bacterial Hfq-binding sRNAs possess a set of seven or more T residues at the 3′ end, as noted in previous studies. Here, we have studied the role of the terminator hairpin in the biogenesis of sRNAs focusing on SgrS and RyhB in Escherichia coli. We constructed variant sRNA genes in which the GC-rich inverted repeat sequences are extended to stabilize the terminator hairpins. We demonstrate that the extension of the hairpin stem leads to generation of heterogeneous transcripts in which the poly(U) tail is shortened. The transcripts with shortened poly(U) tails no longer bind to Hfq and lose the ability to repress the target mRNAs. The shortened transcripts are generated in an in vitro transcription system with purified RNA polymerase, indicating that the generation of shortened transcripts is caused by premature transcription termination. We conclude that the terminator structure of sRNA genes is optimized to generate functional sRNAs. Thus, the Rho-independent terminators of sRNA genes possess two common features: a long T residue stretch that is a prerequisite for generation of functional sRNAs and a moderate strength of hairpin structure that ensures the termination at the seventh or longer position within the consecutive T stretch. The modulation of the termination position at the Rho-independent terminators is critical for biosynthesis of functional sRNAs. PMID:28606943
Suppressor of sable [Su(s)] and Wdr82 down-regulate RNA from heat-shock-inducible repetitive elements by a mechanism that involves transcription termination

PubMed Central

Brewer-Jensen, Paul; Wilson, Carrie B.; Abernethy, John; Mollison, Lonna; Card, Samantha

2016-01-01

Although RNA polymerase II (Pol II) productively transcribes very long genes in vivo, transcription through extragenic sequences often terminates in the promoter-proximal region and the nascent RNA is degraded. Mechanisms that induce early termination and RNA degradation are not well understood in multicellular organisms. Here, we present evidence that the suppressor of sable [su(s)] regulatory pathway of Drosophila melanogaster plays a role in this process. We previously showed that Su(s) promotes exosome-mediated degradation of transcripts from endogenous repeated elements at an Hsp70 locus (Hsp70-αβ elements). In this report, we identify Wdr82 as a component of this process and show that it works with Su(s) to inhibit Pol II elongation through Hsp70-αβ elements. Furthermore, we show that the unstable transcripts produced during this process are polyadenylated at heterogeneous sites that lack canonical polyadenylation signals. We define two distinct regions that mediate this regulation. These results indicate that the Su(s) pathway promotes RNA degradation and transcription termination through a novel mechanism. PMID:26577379
Retinoic acid-induced differentiation of retrovirus-infected HL-60 cells is associated with enhanced transcription from the viral long terminal repeat

DOE Office of Scientific and Technical Information (OSTI.GOV)

Collins, S.J.

1988-11-01

The author infected different human leukemic cell lines with an amphotropic retrovirus vector (designated PA317/N2) which confers G418 resistance and contains the Moloney murine leukemia virus long terminal repeat. In retrovirus-infected G418-resistant HL-60 cells, induction of granulocyte differentiation by retinoic acid was invariably accompanied by a marked increase (5- to 10-fold) in the transcriptional activity of the integrated retroviral long terminal repeat.
A novel peptide from the ACEI/BPP-CNP precursor in the venom of Crotalus durissus collilineatus.

PubMed

Higuchi, Shigesada; Murayama, Nobuhiro; Saguchi, Ken-ichi; Ohi, Hiroaki; Fujita, Yoshiaki; da Silva, Nelson Jorge; de Siqueira, Rodrigo José Bezerra; Lahlou, Saad; Aird, Steven D

2006-10-01

In crotaline venoms, angiotensin-converting enzyme inhibitors [ACEIs, also known as bradykinin potentiating peptides (BPPs)], are products of a gene coding for an ACEI/BPP-C-type natriuretic peptide (CNP) precursor. In the genes from Bothrops jararaca and Gloydius blomhoffii, ACEI/BPP sequences are repeated. Sequencing of a cDNA clone from venom glands of Crotalus durissus collilineatus showed that two ACEIs/BPPs are located together at the N-terminus, but without repeats. An additional sequence for CNP was unexpectedly found at the C-terminus. Homologous genes for the ACEI/BPP-CNP precursor suggest that most crotaline venoms contain both ACEIs/BPPs and CNP. The sequence of ACEIs/BPPs is separated from the CNP sequence by a long spacer sequence. Previously, there was no evidence that this spacer actually coded any expressed peptides. Aird and Kaiser (1986, unpublished) previously isolated and sequenced a peptide of 11 residues (TPPAGPDVGPR) from Crotalus viridis viridis venom. In the present study, analysis of the cDNA clone from C. d. collilineatus revealed a nearly identical sequence in the ACEI/BPP-CNP spacer. Fractionation of the crude venom by reverse phase HPLC (C(18)), and analysis of the fractions by mass spectrometry (MS) indicated a component of 1020.5 Da. Amino acid sequencing by MS/MS confirmed that C. d. collilineatus venom contains the peptide TPPAGPDGGPR. Its high proline content and paired proline residues are typical of venom hypotensive peptides, although it lacks the usual N-terminal pyroglutamate. It has no demonstrable hypotensive activity when injected intravenously in rats; however, its occurrence in the venoms of dissimilar species suggests that its presence is not accidental. Evidence suggests that these novel toxins probably activate anaphylatoxin C3a receptors.
The first armadillo repeat is involved in the recognition and regulation of beta-catenin phosphorylation by protein kinase CK1.

PubMed

Bustos, Victor H; Ferrarese, Anna; Venerando, Andrea; Marin, Oriano; Allende, Jorge E; Pinna, Lorenzo A

2006-12-26

Multiple phosphorylation of beta-catenin by glycogen synthase kinase 3 (GSK3) in the Wnt pathway is primed by CK1 through phosphorylation of Ser-45, which lacks a typical CK1 canonical sequence. Synthetic peptides encompassing amino acids 38-64 of beta-catenin are phosphorylated by CK1 on Ser-45 with low affinity (K(m) approximately 1 mM), whereas intact beta-catenin is phosphorylated at Ser-45 with very high affinity (K(m) approximately 200 nM). Peptides extended to include a putative CK1 docking motif (FXXXF) at positions 70-74 or a F74AA mutation in full-length beta-catenin had no significant effect on CK1 phosphorylation efficiency. beta-Catenin C-terminal deletion mutants up to residue 181 maintained their high affinity, whereas removal of the 131-181 fragment, corresponding to the first armadillo repeat, was deleterious, resulting in a 50-fold increase in K(m) value. Implication of the first armadillo repeat in beta-catenin targeting by CK1 is supported in that the Y142E mutation, which mimics phosphorylation of Tyr-142 by tyrosine kinases and promotes dissociation of beta-catenin from alpha-catenin, further improves CK1 phosphorylation efficiency, lowering the K(m) value to <50 nM, approximating the physiological concentration of beta-catenin. In contrast, alpha-catenin, which interacts with the N-terminal region of beta-catenin, prevents Ser-45 phosphorylation by CK1 in a dose-dependent manner. Our data show that the integrity of the N-terminal region and the first armadillo repeat are necessary and sufficient for high-affinity phosphorylation by CK1 of Ser-45. They also suggest that beta-catenin association with alpha-catenin and beta-catenin phosphorylation by CK1 at Ser-45 are mutually exclusive.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Lamb, J.; Harris, P.C.; Wood, W.G.

The authors have previously described a series of patients in whom the deletion of 1-2 megabases (Mb) of DNA from the tip of the short arm of chromosome 16 (band 16p13.3) is associated with α-thalassemia/mental retardation syndrome (ATR-16). They now show that one of these patients has a de novo truncation of the terminal 2 Mb of chromosome 16p and that telomeric sequence (TTAGGG)n has been added at the site of breakage. This suggests that the chromosomal break, which is paternal in origin and which probably arose at meiosis, has been stabilized in vivo by the direct addition of the telomeric sequence. Sequence comparisons of this breakpoint with that of a previously described chromosomal truncation (αα^TI) do not reveal extensive sequence homology. However, both breakpoints show minimal complementarity (3-4 bp) to the proposed RNA template of human telomerase at the site at which telomere repeats have been added. Unlike previously characterized individuals with ATR-16, the clinical features of this patient appear to be solely due to monosomy for the terminal portion of 16p13.3. The identification of further patients with "pure" monosomy for the tip of chromosome 16p will be important for defining the loci contributing to the phenotype of this syndrome. 33 refs., 4 figs., 1 tab.
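The telomeric sequence (TTAGGG)n mentioned in the abstract above is a tandem array of one six-base unit. A minimal Python sketch of how the copy number at the end of a sequence could be counted is given below. The function name and the toy sequences are hypothetical, and the sketch only recognizes perfect, uninterrupted copies that run to the very end of the input.

```python
import re


def count_terminal_repeats(seq, unit="TTAGGG"):
    """Count perfect tandem copies of unit at the 3' end of seq.
    Returns 0 when the sequence does not end in a full copy."""
    pattern = r"(?:%s)+$" % re.escape(unit)
    match = re.search(pattern, seq.upper())
    return len(match.group()) // len(unit) if match else 0


# Toy example (hypothetical): unique sequence followed by three repeat units.
healed_end = "ACGTACGT" + "TTAGGG" * 3
print(count_terminal_repeats(healed_end))
```

Real telomeric arrays contain variant units and sequencing errors, so a practical tool would tolerate mismatches; that is left out of this sketch.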
Analysis of sequence repeats of proteins in the PDB.

PubMed

Mary Rajathei, David; Selvaraj, Samuel

2013-12-01

Internal repeats in protein sequences play a significant role in the evolution of protein structure and function. Applications of different bioinformatics tools help in the identification and characterization of these repeats. In the present study, we analyzed sequence repeats in a non-redundant set of proteins available in the Protein Data Bank (PDB). We used RADAR for detecting internal repeats in a protein, PDBeFOLD for assessing structural similarity, PDBsum for finding functional involvement and Pfam for domain assignment of the repeats in a protein. Through the analysis of sequence repeats, we found that the identity of the sequence repeats falls in the range of 20-40% and that the superimposed structures of most of the sequence repeats maintain a similar overall fold. Analysis of sequence repeats at the functional level reveals that most of them are involved in the function of the protein through functionally involved residues in the repeat regions. We also found that sequence repeats in single and two domain proteins often contained conserved sequence motifs for the function of the domain. Copyright © 2013 Elsevier Ltd. All rights reserved.
Polymorphism of CRISPR shows separated natural groupings of Shigella subtypes and evidence of horizontal transfer of CRISPR

PubMed Central

Yang, Chaojie; Li, Peng; Su, Wenli; Li, Hao; Liu, Hongbo; Yang, Guang; Xie, Jing; Yi, Shengjie; Wang, Jian; Cui, Xianyan; Wu, Zhihao; Wang, Ligui; Hao, Rongzhang; Jia, Leili; Qiu, Shaofu; Song, Hongbin

2015-01-01

Clustered, regularly interspaced, short palindromic repeats (CRISPR) act as an adaptive RNA-mediated immune mechanism in bacteria. They can also be used for identification and evolutionary studies based on polymorphisms within the CRISPR locus. We amplified and analyzed 6 CRISPR loci from 237 Shigella strains belonging to the 4 species groups, as well as 13 Escherichia coli strains. The CRISPR-associated (cas) gene sequence arrays of these strains were screened and compared. The CRISPR sequences from Shigella were conserved among subtypes, suggesting that CRISPR may represent a new identification tool for the detection and discrimination of Shigella species. Secondary structure analysis showed a different stem-loop structure at the terminal repeat, suggesting a distinct recognition mechanism in the formation of crRNA. In addition, the presence of “self-target” spacers and polymorphisms within CRISPR in Shigella indicated a selective pressure for inhibition of this system, which has the potential to damage “self DNA.” Homology analysis of spacers showed that CRISPR might be involved in the regulation of virulence transmission. Phylogenetic analysis based on CRISPR sequences from Shigella and E. coli indicated that although phenotypic properties maintain convergent evolution, the 4 Shigella species do not represent natural groupings. Surprisingly, comparative analysis of Shigella repeats with other species provided new evidence for CRISPR horizontal transfer. Our results suggested that CRISPR analysis is applicable for the detection of Shigella species and for investigation of evolutionary relationships. PMID:26327282
Identification of a "glycine-loop"-like coiled structure in the 34 AA Pro,Gly,Met repeat domain of the biomineral-associated protein, PM27.

PubMed

Wustman, Brandon A; Santos, Rudolpho; Zhang, Bo; Evans, John Spencer

2002-12-05

Fracture resistance in biomineralized structures has been linked to the presence of proteins, some of which possess sequences that are associated with elastic behavior. One such protein superfamily, the Pro,Gly-rich sea urchin intracrystalline spicule matrix proteins, form protein-protein supramolecular assemblies that modify the microstructure and fracture-resistant properties of the calcium carbonate mineral phase within embryonic sea urchin spicules and adult sea urchin spines. In this report, we detail the identification of a repetitive keratin-like "glycine-loop"- or coil-like structure within the 34-AA (AA: amino acid) N-terminal domain, (PGMG)(8)PG, of the spicule matrix protein, PM27. The identification of this repetitive structural motif was accomplished using two capped model peptides: a 9-AA sequence, GPGMGPGMG, and a 34-AA peptide representing the entire motif. Using CD, NMR spectrometry, and molecular dynamics simulated annealing/minimization simulations, we have determined that the 9-AA model peptide adopts a loop-like structure at pH 7.4. The structure of the 34-AA polypeptide resembles a coil structure consisting of repeating loop motifs that do not exhibit long-range ordering. Given that loop structures have been associated with protein elastic behavior and protein motion, it is plausible that the 34-AA Pro,Gly,Met repeat sequence motif in PM27 represents a putative elastic or mobile domain. Copyright 2002 Wiley Periodicals, Inc.
Simple Sequence Repeats Provide a Substrate for Phenotypic Variation in the Neurospora crassa Circadian Clock

PubMed Central

Michael, Todd P.; Park, Sohyun; Kim, Tae-Sung; Booth, Jim; Byer, Amanda; Sun, Qi; Chory, Joanne; Lee, Kwangwon

2007-01-01

Background: WHITE COLLAR-1 (WC-1) mediates interactions between the circadian clock and the environment by acting as both a core clock component and as a blue light photoreceptor in Neurospora crassa. Loss of the amino-terminal polyglutamine (NpolyQ) domain in WC-1 results in an arrhythmic circadian clock; this data is consistent with this simple sequence repeat (SSR) being essential for clock function. Methodology/Principal Findings: Since SSRs are often polymorphic in length across natural populations, we reasoned that investigating natural variation of the WC-1 NpolyQ may provide insight into its role in the circadian clock. We observed significant phenotypic variation in the period, phase and temperature compensation of circadian regulated asexual conidiation across 143 N. crassa accessions. In addition to the NpolyQ, we identified two other simple sequence repeats in WC-1. The sizes of all three WC-1 SSRs correlated with polymorphisms in other clock genes, latitude and circadian period length. Furthermore, in a cross between two N. crassa accessions, the WC-1 NpolyQ co-segregated with period length. Conclusions/Significance: Natural variation of the WC-1 NpolyQ suggests a mechanism by which period length can be varied and selected for by the local environment that does not deleteriously affect WC-1 activity. Understanding natural variation in the N. crassa circadian clock will facilitate an understanding of how fungi exploit their environments. PMID:17726525

Novel Virulent and Broad-Host-Range Erwinia amylovora Bacteriophages Reveal a High Degree of Mosaicism and a Relationship to Enterobacteriaceae Phages

PubMed Central

Born, Yannick; Fieseler, Lars; Marazzi, Janine; Lurz, Rudi; Duffy, Brion; Loessner, Martin J.

2011-01-01

A diverse set of 24 novel phages infecting the fire blight pathogen Erwinia amylovora was isolated from fruit production environments in Switzerland. Based on initial screening, four phages (L1, M7, S6, and Y2) with broad host ranges were selected for detailed characterization and genome sequencing. Phage L1 is a member of the Podoviridae, with a 39.3-kbp genome featuring invariable genome ends with direct terminal repeats. Phage S6, another podovirus, was also found to possess direct terminal repeats but has a larger genome (74.7 kbp), and the virus particle exhibits a complex tail fiber structure. Phages M7 and Y2 both belong to the Myoviridae family and feature long, contractile tails and genomes of 84.7 kbp (M7) and 56.6 kbp (Y2), respectively, with direct terminal repeats. The architecture of all four phage genomes is typical for tailed phages, i.e., organized into function-specific gene clusters. All four phages completely lack genes or functions associated with lysogeny control, which correlates well with their broad host ranges and indicates strictly lytic (virulent) lifestyles without the possibility for host lysogenization. Comparative genomics revealed that M7 is similar to E. amylovora virus ΦEa21-4, whereas L1, S6, and Y2 are unrelated to any other E. amylovora phage. Instead, they feature similarities to enterobacterial viruses T7, N4, and ΦEcoM-GJ1. In a series of laboratory experiments, we provide proof of concept that specific two-phage cocktails offer the potential for biocontrol of the pathogen. PMID:21764969
Novel virulent and broad-host-range Erwinia amylovora bacteriophages reveal a high degree of mosaicism and a relationship to Enterobacteriaceae phages.

PubMed

Born, Yannick; Fieseler, Lars; Marazzi, Janine; Lurz, Rudi; Duffy, Brion; Loessner, Martin J

2011-09-01

A diverse set of 24 novel phages infecting the fire blight pathogen Erwinia amylovora was isolated from fruit production environments in Switzerland. Based on initial screening, four phages (L1, M7, S6, and Y2) with broad host ranges were selected for detailed characterization and genome sequencing. Phage L1 is a member of the Podoviridae, with a 39.3-kbp genome featuring invariable genome ends with direct terminal repeats. Phage S6, another podovirus, was also found to possess direct terminal repeats but has a larger genome (74.7 kbp), and the virus particle exhibits a complex tail fiber structure. Phages M7 and Y2 both belong to the Myoviridae family and feature long, contractile tails and genomes of 84.7 kbp (M7) and 56.6 kbp (Y2), respectively, with direct terminal repeats. The architecture of all four phage genomes is typical for tailed phages, i.e., organized into function-specific gene clusters. All four phages completely lack genes or functions associated with lysogeny control, which correlates well with their broad host ranges and indicates strictly lytic (virulent) lifestyles without the possibility for host lysogenization. Comparative genomics revealed that M7 is similar to E. amylovora virus ΦEa21-4, whereas L1, S6, and Y2 are unrelated to any other E. amylovora phage. Instead, they feature similarities to enterobacterial viruses T7, N4, and ΦEcoM-GJ1. In a series of laboratory experiments, we provide proof of concept that specific two-phage cocktails offer the potential for biocontrol of the pathogen.
Identification and functional characterization of BTas transactivator as a DNA-binding protein.

PubMed

Tan, Juan; Hao, Peng; Jia, Rui; Yang, Wei; Liu, Ruichang; Wang, Jinzhong; Xi, Zhen; Geng, Yunqi; Qiao, Wentao

2010-09-30

The genome of bovine foamy virus (BFV) encodes a transcriptional transactivator, namely BTas, that remarkably enhances gene expression by binding to the viral long-terminal repeat promoter (LTR) and internal promoter (IP). In this report, we characterized the functional domains of BFV BTas. BTas contains two major functional domains: the N-terminal DNA-binding domain (residues 1-133) and the C-terminal activation domain (residues 198-249). The complete BTas responsive regions were mapped to the positions -380/-140 of LTR and 9205/9276 of IP. Four BTas responsive elements were identified at the positions -368/-346, -327/-307, -306/-285 and -186/-165 of the BFV LTR, and one element was identified at the position 9243/9264 of the BFV IP. Unlike other foamy viruses, the five BTas responsive elements in BFV shared obvious sequence homology. These data suggest that among the complex retroviruses, BFV appears to have a unique transactivation mechanism. Crown Copyright 2010. Published by Elsevier Inc. All rights reserved.
Characterization of a tandemly repeated DNA sequence family originally derived by retroposition of tRNA(Glu) in the newt.

PubMed

Nagahashi, S; Endoh, H; Suzuki, Y; Okada, N

1991-11-20

A previous report from this laboratory showed that in vitro transcription of total genomic DNA of the newt Cynops pyrrhogaster resulted in a discrete-sized 8 S RNA, which represented highly repetitive and transcribable sequences with a glutamic acid tRNA-like structure in the newt genome. We isolated four independent clones from a newt genomic library and determined the complete sequences of three 2000 to 2400 base-pair PstI fragments spanning the 8 S RNA gene. The glutamic acid tRNA-related segment in the 8 S RNA gene contains the CCA sequence expected as the 3' terminus of a tRNA molecule. Further, the 11 nucleotides located 13 nucleotides upstream from one of the two transcription initiation sites of the 8 S RNA were found to be repeated in the region upstream from the termination site, suggesting that the original unit, which is shorter than the 8 S RNA, was retrotransposed via cDNA intermediates from the Pol III transcript. In the upstream region of the 8 S RNA gene, a 360 nucleotide unit containing the glutamic acid tRNA-related segment was found to be duplicated (clones NE1 and NE10) or triplicated (clone NE3). Except for the difference in the number of the 360 nucleotide unit, the three sequences of the 2000 to 2400 base-pair PstI fragment were essentially the same with only a few mutations and minor deletions. Inverse polymerase chain reaction and sequence determination of the products, together with a Southern hybridization experiment, demonstrated that the family consists of a tandemly repeated unit of 3300, 3700 or 4100 base-pairs. Thus during evolution, this family in the newt was created by retroposition via cDNA intermediates, followed by duplication or triplication of the 360 nucleotide unit and multiplication of the 3300 to 4100 base-pair region at the DNA level.
Sequences in the intergenic spacer influence RNA Pol I transcription from the human rRNA promoter

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, W.M.; Sylvester, J.E.

1994-09-01

In most eucaryotic species, ribosomal genes are tandemly repeated about 100-5000 times per haploid genome. The 43 kb human rDNA repeat consists of a 13 kb coding region for the 18S, 5.8S, 28S ribosomal RNAs (rRNAs) and transcribed spacers separated by a 30 kb intergenic spacer. For species such as frog, mouse and rat, sequences in the intergenic spacer other than the gene promoter have been shown to modulate transcription of the ribosomal gene. These sequences are spacer promoters, enhancers and the terminator for spacer transcription. We are addressing whether the human ribosomal gene promoter is similarly influenced. In vitro transcription run-off assays have revealed that the 4.5 kb region (CBE), directly upstream of the gene promoter, has cis-stimulation and trans-competition properties. This suggests that the CBE fragment contains an enhancer(s) for ribosomal gene transcription. Further experiments have shown that a fragment (approximately 1.6 kb) within the CBE fragment also has trans-competition function. Deletion subclones of this region are being tested to delineate the exact sequences responsible for these modulating activities. Previous sequence analysis and functional studies have revealed that CBE contains regions of DNA capable of adopting alternative structures such as bent DNA, Z-DNA, and triple-stranded DNA. Whether these structures are required for modulating transcription remains to be determined as does the specific DNA-protein interaction involved.
DNA Data Visualization (DDV): Software for Generating Web-Based Interfaces Supporting Navigation and Analysis of DNA Sequence Data of Entire Genomes.

PubMed

Neugebauer, Tomasz; Bordeleau, Eric; Burrus, Vincent; Brzezinski, Ryszard

2015-01-01

Data visualization methods are necessary during the exploration and analysis activities of an increasingly data-intensive scientific process. There are few existing visualization methods for raw nucleotide sequences of a whole genome or chromosome. Software for data visualization should allow the researchers to create accessible data visualization interfaces that can be exported and shared with others on the web. Herein, novel software developed for generating DNA data visualization interfaces is described. The software converts DNA data sets into images that are further processed as multi-scale images to be accessed through a web-based interface that supports zooming, panning and sequence fragment selection. Nucleotide composition frequencies and GC skew of a selected sequence segment can be obtained through the interface. The software was used to generate DNA data visualization of human and bacterial chromosomes. Examples of visually detectable features such as short and long direct repeats, long terminal repeats, mobile genetic elements, heterochromatic segments in microbial and human chromosomes, are presented. The software and its source code are available for download and further development. The visualization interfaces generated with the software allow for the immediate identification and observation of several types of sequence patterns in genomes of various sizes and origins. The visualization interfaces generated with the software are readily accessible through a web browser. This software is a useful research and teaching tool for genetics and structural genomics.
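The two per-segment statistics mentioned in this abstract, nucleotide composition frequencies and GC skew, can be illustrated with a minimal Python sketch. It uses the conventional definition of GC skew, (G - C) / (G + C); the function names are hypothetical and this is not taken from the DDV source code, which the abstract does not show.

```python
from collections import Counter

def composition(seq):
    """Return the fraction of A, C, G and T in seq (other symbols are ignored)."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return {base: 0.0 for base in "ACGT"}
    return {base: counts[base] / total for base in "ACGT"}

def gc_skew(seq):
    """Conventional GC skew, (G - C) / (G + C); 0.0 if the segment has no G or C."""
    s = seq.upper()
    g, c = s.count("G"), s.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0

if __name__ == "__main__":
    segment = "GGGCATAT"  # toy segment, not real genomic data
    print(composition(segment))
    print(gc_skew(segment))
```

A positive skew indicates an excess of G over C on the strand being read; computing it over successive windows along a chromosome is the usual way such a statistic is plotted.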
The Mr 140,000 Intermediate Chain of Chlamydomonas Flagellar Inner Arm Dynein Is a WD-Repeat Protein Implicated in Dynein Arm Anchoring

PubMed Central

Yang, Pinfen; Sale, Winfield S.

1998-01-01

Previous structural and biochemical studies have revealed that the inner arm dynein I1 is targeted and anchored to a unique site located proximal to the first radial spoke in each 96-nm axoneme repeat on flagellar doublet microtubules. To determine whether intermediate chains mediate the positioning and docking of dynein complexes, we cloned and characterized the 140-kDa intermediate chain (IC140) of the I1 complex. Sequence and secondary structural analysis, with particular emphasis on β-sheet organization, predicted that IC140 contains seven WD repeats. Reexamination of other members of the dynein intermediate chain family of WD proteins indicated that these polypeptides also bear seven WD/β-sheet repeats arranged in the same pattern along each intermediate chain protein. A polyclonal antibody was raised against a 53-kDa fusion protein derived from the C-terminal third of IC140. The antibody is highly specific for IC140 and does not bind to other dynein intermediate chains or proteins in Chlamydomonas flagella. Immunofluorescent microscopy of Chlamydomonas cells confirmed that IC140 is distributed along the length of both flagellar axonemes. In vitro reconstitution experiments demonstrated that the 53-kDa C-terminal fusion protein binds specifically to axonemes lacking the I1 complex. Chemical cross-linking indicated that IC140 is closely associated with a second intermediate chain in the I1 complex. These data suggest that IC140 contains domains responsible for the assembly and docking of the I1 complex to the doublet microtubule cargo. PMID:9843573
Evidence that a proposed repeated segment of glutamine residues is expressed in the Huntington disease protein

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jou, Y.S.; Myers, R.M.

1994-09-01

Huntington disease (HD) appears to be caused by a mutation that results in an expanded number of CAG repeats at the 5' end of the gene. The nucleotide sequence of the gene and cDNA clones predicts a 347 kDa protein that contains a stretch of polyglutamine, encoded by the CAG repeat, located 17 amino acids downstream from the proposed translation initiation site. Because understanding the mechanisms of the pathology of HD depends on whether the CAG repeat is expressed in the protein, we used antibodies directed against portions of the predicted HD gene product to probe the structure of the protein in tissue culture cells. Two peptides, one located amino-terminal to the proposed polyglutamine stretch (hd1 peptide FESLKSFQQ from amino acids 11-19) and one located in the carboxy-terminal half of the predicted protein (hd2 peptide QQPRNKPLK from amino acids 2531-2539), were used to elicit polyclonal antibodies in NZW rabbits. We affinity-purified the antibodies and used them to analyze the HD protein. Both antisera specifically recognize the peptides used to elicit them, as well as the appropriate portions of the HD protein expressed in E. coli. Western blot analysis showed that both antisera recognize a protein with an apparent molecular weight of approximately 350,000 in human, monkey, rat and mouse cell lines, including two neuronal cell lines. These results, in combination with immunoprecipitation experiments, suggest strongly that the proposed polyglutamine stretch is indeed translated in the HD protein and is evolutionarily conserved in various mammalian species.
In vitro selection of DNA elements highly responsive to the human T-cell lymphotropic virus type I transcriptional activator, Tax.

PubMed

Paca-Uccaralertkun, S; Zhao, L J; Adya, N; Cross, J V; Cullen, B R; Boros, I M; Giam, C Z

1994-01-01

The human T-cell lymphotropic virus type I (HTLV-I) transactivator, Tax, the ubiquitous transcriptional factor cyclic AMP (cAMP) response element-binding protein (CREB protein), and the 21-bp repeats in the HTLV-I transcriptional enhancer form a ternary nucleoprotein complex (L. J. Zhao and C. Z. Giam, Proc. Natl. Acad. Sci. USA 89:7070-7074, 1992). Using an antibody directed against the COOH-terminal region of Tax along with purified Tax and CREB proteins, we selected DNA elements bound specifically by the Tax-CREB complex in vitro. Two distinct but related groups of sequences containing the cAMP response element (CRE) flanked by long runs of G and C residues in the 5' and 3' regions, respectively, were preferentially recognized by Tax-CREB. In contrast, CREB alone binds only to CRE motifs (GNTGACG[T/C]) without neighboring G- or C-rich sequences. The Tax-CREB-selected sequences bear a striking resemblance to the 5' or 3' two-thirds of the HTLV-I 21-bp repeats and are highly inducible by Tax. Gel electrophoretic mobility shift assays, DNA transfection, and DNase I footprinting analyses indicated that the G- and C-rich sequences flanking the CRE motif are crucial for Tax-CREB-DNA ternary complex assembly and Tax transactivation but are not in direct contact with the Tax-CREB complex. These data show that Tax recruits CREB to form a multiprotein complex that specifically recognizes the viral 21-bp repeats. The expanded DNA binding specificity of Tax-CREB and the obligatory role the ternary Tax-CREB-DNA complex plays in transactivation reveal a novel mechanism for regulating the transcriptional activity of leucine zipper proteins like CREB.
Comparative analysis of complete orthologous centromeres from two subspecies of rice reveals rapid variation of centromere organization and structure.

PubMed

Wu, Jianzhong; Fujisawa, Masaki; Tian, Zhixi; Yamagata, Harumi; Kamiya, Kozue; Shibata, Michie; Hosokawa, Satomi; Ito, Yukiyo; Hamada, Masao; Katagiri, Satoshi; Kurita, Kanako; Yamamoto, Mayu; Kikuta, Ari; Machita, Kayo; Karasawa, Wataru; Kanamori, Hiroyuki; Namiki, Nobukazu; Mizuno, Hiroshi; Ma, Jianxin; Sasaki, Takuji; Matsumoto, Takashi

2009-12-01

Centromeres are sites for assembly of the chromosomal structures that mediate faithful segregation at mitosis and meiosis. This function is conserved across species, but the DNA components that are involved in kinetochore formation differ greatly, even between closely related species. To shed light on the nature, evolutionary timing and evolutionary dynamics of rice centromeres, we decoded a 2.25-Mb DNA sequence covering the centromeric region of chromosome 8 of an indica rice variety, 'Kasalath' (Kas-Cen8). Analysis of repetitive sequences in Kas-Cen8 led to the identification of 222 long terminal repeat (LTR)-retrotransposon elements and 584 CentO satellite monomers, which account for 59.2% of the region. A comparison of the Kas-Cen8 sequence with that of japonica rice 'Nipponbare' (Nip-Cen8) revealed that about 66.8% of the Kas-Cen8 sequence was collinear with that of Nip-Cen8. Although the 27 putative genes are conserved between the two subspecies, only 55.4% of the total LTR-retrotransposon elements in 'Kasalath' had orthologs in 'Nipponbare', thus reflecting recent proliferation of a considerable number of LTR-retrotransposons since the divergence of two rice subspecies of indica and japonica within Oryza sativa. Comparative analysis of the subfamilies, time of insertion, and organization patterns of inserted LTR-retrotransposons between the two Cen8 regions revealed variations between 'Kasalath' and 'Nipponbare' in the preferential accumulation of CRR elements, and the expansion of CentO satellite repeats within the core domain of Cen8. Together, the results provide insights into the recent proliferation of LTR-retrotransposons, and the rapid expansion of CentO satellite repeats, underlying the dynamic variation and plasticity of plant centromeres.
Properties of an equine herpesvirus 1 mutant devoid of the internal inverted repeat sequence of the genomic short region

PubMed Central

Ahn, ByungChul; Zhang, Yunfei; Osterrieder, Nikolaus; O'Callaghan, Dennis J.

2010-01-01

The 150 kbp genome of equine herpesvirus-1 (EHV-1) is composed of a unique long (UL) region and a unique short (US) segment, which is flanked by identical internal and terminal repeat (IR and TR) sequences of 12.7 kbp. We constructed an EHV-1 lacking the entire IR (vL11ΔIR) and showed that the IR is dispensable for EHV-1 replication but that the vL11ΔIR exhibits a smaller plaque size and delayed growth kinetics. Western blot analyses of cells infected with vL11ΔIR showed that the synthesis of viral proteins encoded by the immediate-early, early, and late genes was reduced at immediate-early and early times, but by late stages of replication reached wild type levels. Intranasal infection of CBA mice revealed that the vL11ΔIR was significantly attenuated as mice infected with the vL11ΔIR showed a reduced lung viral titer and greater ability to survive infection compared to mice infected with parental or revertant virus. PMID:21176938
Genome Sequence, Structural Proteins, and Capsid Organization of the Cyanophage Syn5: A “Horned” Bacteriophage of Marine Synechococcus

PubMed Central

Pope, Welkin H.; Weigele, Peter R.; Chang, Juan; Pedulla, Marisa L.; Ford, Michael E.; Houtz, Jennifer M.; Jiang, Wen; Chiu, Wah; Hatfull, Graham F.; Hendrix, Roger W.; King, Jonathan

2010-01-01

Marine Synechococcus spp. and marine Prochlorococcus spp. are numerically dominant photoautotrophs in the open oceans and contributors to the global carbon cycle. Syn5 is a short-tailed cyanophage isolated from the Sargasso Sea on Synechococcus strain WH8109. Syn5 has been grown in WH8109 to high titer in the laboratory and purified and concentrated retaining infectivity. Genome sequencing and annotation of Syn5 revealed that the linear genome is 46,214 bp with a 237 bp terminal direct repeat. Sixty-one open reading frames (ORFs) were identified. Based on genomic organization and sequence similarity to known protein sequences within GenBank, Syn5 shares features with T7-like phages. The presence of a putative integrase suggests access to a temperate life-cycle. Assignment of eleven ORFs to structural proteins found within the phage virion was confirmed by mass-spectrometry and N-terminal sequencing. Eight of these identified structural proteins exhibited amino acid sequence similarity to enteric phage proteins. The remaining three virion proteins did not resemble any known phage sequences in GenBank as of August 2006. Cryoelectron micrographs of purified Syn5 virions revealed that the capsid has a single “horn”, a novel fibrous structure protruding from the opposing end of the capsid from the tail of the virion. The tail appendage displayed an apparent three-fold rather than six-fold symmetry. An 18 Å-resolution icosahedral reconstruction of the capsid revealed a T=7 lattice, but with an unusual pattern of surface knobs. This phage/host system should allow detailed investigation of the physiology and biochemistry of phage propagation in marine photosynthetic bacteria. PMID:17383677
Utility of next-generation RNA-sequencing in identifying chimeric transcription involving human endogenous retroviruses.

PubMed

Sokol, Martin; Jessen, Karen Margrethe; Pedersen, Finn Skou

2016-01-01

Several studies have shown that human endogenous retroviruses and endogenous retrovirus-like repeats (here collectively HERVs) impose direct regulation on human genes through enhancer and promoter motifs present in their long terminal repeats (LTRs). Although chimeric transcription, in which novel gene isoforms containing retroviral and human sequence are transcribed from viral promoters, is commonly associated with disease, regulation by HERVs is beneficial in other settings; for example, in human testis chimeric isoforms of TP63 induced by an ERV9 LTR protect the male germ line upon DNA damage by inducing apoptosis, whereas in the human globin locus the γ- and β-globin switch during normal hematopoiesis is mediated by complex interactions of an ERV9 LTR and surrounding human sequence. The advent of deep sequencing or next-generation sequencing (NGS) has revolutionized the way researchers solve important scientific questions and develop novel hypotheses in relation to human genome regulation. We recently applied next-generation paired-end RNA-sequencing (RNA-seq) together with chromatin immunoprecipitation with sequencing (ChIP-seq) to examine ERV9 chimeric transcription in human reference cell lines from Encyclopedia of DNA Elements (ENCODE). This led to the discovery of advanced regulation mechanisms by ERV9s and other HERVs across numerous human loci including transcription of large gene-unannotated genomic regions, as well as cooperative regulation by multiple HERVs and non-LTR repeats such as Alu elements. In this article, well-established examples of human gene regulation by HERVs are reviewed followed by a description of paired-end RNA-seq, and its application in identifying chimeric transcription genome-wide. Based on integrative analyses of RNA-seq and ChIP-seq data, we then present novel examples of regulation by ERV9s of tumor suppressor genes CADM2 and SEMA3A, as well as transcription of an unannotated region. Taken together, this article highlights the high suitability of contemporary sequencing methods in future analyses of human biology in relation to evolutionary acquired retroviruses in the human genome. © 2016 APMIS. Published by John Wiley & Sons Ltd.
A codon-usage variant in the (GGN)n trinucleotide polymorphism of the androgen receptor gene as an aid in the prenatal diagnosis of ambiguous genitalia due to partial androgen insensitivity

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lumbroso, R.; Vasiliou, M.; Beitel, L.K.

1994-09-01

Exon 1 at the X-linked androgen receptor (AR) locus encodes an N-terminal modulatory domain that contains two large homopolyamino acid tracts: (CAG;glutamine;Gln)11-33 and (GGN;glycine;Gly)15-27. Certain AR mutations cause partial androgen insensitivity (PAI) with frank genital ambiguity that may engender appreciable parental anxiety and patient morbidity. If the AR mutation in a PAI family is unknown, the AR's intragenic trinucleotide repeat polymorphisms may be used for prenatal diagnosis. However, intergenerational instability of repeat size may be worrisome, particularly when the informative alleles differ by only a few repeats. Here, we report the discovery of a codon-usage (silent substitution) variant in the GGN repeat, and describe its use as a source of complementary information for prenatal diagnosis. The standard sense sequence of the (GGN)n tract is (GGT)3 GGG (GGT)2 (GGC)9-21. On 4 of 27 X chromosomes we noted that the internal GGT sequence was expanded to 3 or 4 repeats. We used an internal (GGT)4 repeat in a total (GGN)24 tract together with a (CAG)20 tract to distinguish an X chromosome with a mutant AR allele from another X chromosome, bearing a normal allele, that had an internal (GGT)2 repeat in a total (GGN)23 tract together with a (CAG)21 tract. Subsequently, we found the base change leading to a pathogenic amino acid substitution (M779I) in codon 6 of the mutant AR gene in an affected maternal aunt and the fetus at risk. This confirmed the prenatal diagnosis based on the intragenic trinucleotide repeat polymorphisms, and it strengthened the prediction of external genital ambiguity using our previous experience with M779I in another family.
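The way a codon-usage variant separates two GGN tracts of nearly equal length can be sketched in Python by run-length encoding the codons. The function name is hypothetical, and the example alleles are illustrative assumptions: the (GGC) run lengths of 16 and 17 are inferred here from the reported totals of 24 and 23 GGN codons, on the assumption that both alleles otherwise follow the standard tract layout; they are not stated in the abstract.

```python
from itertools import groupby

def codon_runs(tract):
    """Split an in-frame tract into codons and run-length encode them."""
    usable = len(tract) - len(tract) % 3  # ignore any trailing partial codon
    codons = [tract[i:i + 3] for i in range(0, usable, 3)]
    return [(codon, len(list(group))) for codon, group in groupby(codons)]

if __name__ == "__main__":
    # Assumed layouts: (GGT)3 GGG (GGT)k (GGC)m, with m chosen to match the totals.
    variant = "GGT" * 3 + "GGG" + "GGT" * 4 + "GGC" * 16   # total of 24 GGN codons
    standard = "GGT" * 3 + "GGG" + "GGT" * 2 + "GGC" * 17  # total of 23 GGN codons
    for name, tract in (("variant", variant), ("standard", standard)):
        runs = codon_runs(tract)
        print(name, runs, "total:", sum(n for _, n in runs))
```

Under these assumptions the two tracts differ by only one codon in total length, but the internal GGT run (4 versus 2) gives a second, sequence-based marker that does not depend on sizing alone.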
Role of the terminator hairpin in the biogenesis of functional Hfq-binding sRNAs.

PubMed

Morita, Teppei; Nishino, Ryo; Aiba, Hiroji

2017-09-01

Rho-independent transcription terminators of the genes encoding bacterial Hfq-binding sRNAs possess a set of seven or more T residues at the 3' end, as noted in previous studies. Here, we have studied the role of the terminator hairpin in the biogenesis of sRNAs focusing on SgrS and RyhB in Escherichia coli. We constructed variant sRNA genes in which the GC-rich inverted repeat sequences are extended to stabilize the terminator hairpins. We demonstrate that the extension of the hairpin stem leads to generation of heterogeneous transcripts in which the poly(U) tail is shortened. The transcripts with shortened poly(U) tails no longer bind to Hfq and lose the ability to repress the target mRNAs. The shortened transcripts are generated in an in vitro transcription system with purified RNA polymerase, indicating that the generation of shortened transcripts is caused by premature transcription termination. We conclude that the terminator structure of sRNA genes is optimized to generate functional sRNAs. Thus, the Rho-independent terminators of sRNA genes possess two common features: a long T residue stretch that is a prerequisite for generation of functional sRNAs and a moderate strength of hairpin structure that ensures the termination at the seventh or longer position within the consecutive T stretch. The modulation of the termination position at the Rho-independent terminators is critical for biosynthesis of functional sRNAs. © 2017 Morita et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Opposite consequences of two transcription pauses caused by an intrinsic terminator oligo(U): antitermination versus termination by bacteriophage T7 RNA polymerase.

PubMed

Lee, Sooncheol; Kang, Changwon

2011-05-06

The RNA oligo(U) sequence, along with an immediately preceding RNA hairpin structure, is an essential cis-acting element for bacterial class I intrinsic termination. This sequence not only causes a pause in transcription during the beginning of the termination process but also facilitates transcript release at the end of the process. In this study, the oligo(U) sequence of the bacteriophage T7 intrinsic terminator Tφ, rather than the hairpin structure, induced pauses of phage T7 RNA polymerase not only at the termination site, triggering a termination process, but also 3 bp upstream, exerting an antitermination effect. The upstream pause presumably allowed RNA to form a thermodynamically more stable secondary structure rather than a terminator hairpin and to persist because the 5'-half of the terminator hairpin-forming sequence could be sequestered by a farther upstream sequence via sequence-specific hybridization, prohibiting formation of the terminator hairpin and termination. The putative antiterminator RNA structure lacked several base pairs essential for termination when probed using RNases A, T1, and V1. When the antiterminator was destabilized by incorporation of IMP into nascent RNA at G residue positions, antitermination was abolished. Furthermore, antitermination strength increased with more stable antiterminator secondary structures and longer pauses. Thus, the oligo(U)-mediated pause prior to the termination site can exert a cis-acting antitermination activity on intrinsic terminator Tφ, and the termination efficiency depends primarily on the termination-interfering pause that precedes the termination-facilitating pause at the termination site.
The LINE-1 DNA sequences in four mammalian orders predict proteins that conserve homologies to retrovirus proteins.

PubMed Central

Fanning, T; Singer, M

1987-01-01

Recent work suggests that one or more members of the highly repeated LINE-1 (L1) DNA family found in all mammals may encode one or more proteins. Here we report the sequence of a portion of an L1 cloned from the domestic cat (Felis catus). These data permit comparison of the L1 sequences in four mammalian orders (Carnivore, Lagomorph, Rodent and Primate) and the comparison supports the suggested coding potential. In two separate, noncontiguous regions in the carboxy terminal half of the proteins predicted from the DNA sequences, there are several strongly conserved segments. In one region, these share homology with known or suspected reverse transcriptases, as described by others in rodents and primates. In the second region, closer to the carboxy terminus, the strongly conserved segments are over 90% homologous among the four orders. One of the latter segments is cysteine rich and resembles the putative metal binding domains of nucleic acid binding proteins, including those of TFIIIA and retroviruses. PMID:3562227
Zaba: a novel miniature transposable element present in genomes of legume plants.

PubMed

Macas, J; Neumann, P; Pozárková, D

2003-08-01

A novel family of miniature transposable elements, named Zaba, was identified in pea (Pisum sativum) and subsequently also in other legume species using computer analysis of their DNA sequences. Zaba elements are 141-190 bp long, generate 10-bp target site duplications, and their terminal inverted repeats make up most of the sequence. Zaba elements thus resemble class 3 foldback transposons. The elements are only moderately repetitive in pea (tens to hundreds of copies per haploid genome), but they are present in up to thousands of copies in the genomes of several Medicago and Vicia species. More detailed analysis of the elements from pea, including isolation of new sequences from a genomic library, revealed that a fraction of these elements are truncated, and that their last transposition probably did not occur recently. A search for Zaba sequences in EST databases showed that at least some elements are transcribed, most probably due to their association with genic regions.
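A terminal inverted repeat means that the 5' end of an element matches the reverse complement of its 3' end. The following Python sketch measures the length of a perfect terminal inverted repeat in a candidate element; the function names and the toy sequence are hypothetical, and the abstract does not describe the actual computer analysis used to find Zaba elements, which would also need to tolerate mismatches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def tir_length(seq):
    """Length of the perfect terminal inverted repeat shared by both ends of seq."""
    s = seq.upper()
    rc = reverse_complement(s)
    length = 0
    # Compare the 5' half with the reverse complement of the 3' end, base by base.
    for left, right in zip(s[: len(s) // 2], rc):
        if left != right:
            break
        length += 1
    return length

if __name__ == "__main__":
    # Toy element: 5-bp arm, 3-bp spacer, reverse complement of the arm.
    element = "GGCAT" + "AAC" + reverse_complement("GGCAT")
    print(tir_length(element))
```

For a foldback-like element, where the inverted repeats make up most of the sequence, the value returned would approach half the element length.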
DNA sequences of three beta-1,4-endoglucanase genes from Thermomonospora fusca.

PubMed Central

Lao, G; Ghangas, G S; Jung, E D; Wilson, D B

1991-01-01

The DNA sequences of the Thermomonospora fusca genes encoding cellulases E2 and E5 and the N-terminal end of E4 were determined. Each sequence contains an identical 14-bp inverted repeat upstream of the initiation codon. There were no significant homologies between the coding regions of the three genes. The E2 gene is 73% identical to the celA gene from Microbispora bispora, but this was the only homology found with other cellulase genes. E2 belongs to a family of cellulases that includes celA from M. bispora, cenA from Cellulomonas fimi, casA from an alkalophilic Streptomyces strain, and cellobiohydrolase II from Trichoderma reesei. E4 shows 44% identity to an avocado cellulase, while E5 belongs to the Bacillus cellulase family. There were strong similarities between the amino acid sequences of the E2 and E5 cellulose binding domains, and these regions also showed homology with C. fimi and Pseudomonas fluorescens cellulose binding domains. PMID:1904434
Surface display of a massively variable lipoprotein by a Legionella diversity-generating retroelement.

PubMed

Arambula, Diego; Wong, Wenge; Medhekar, Bob A; Guo, Huatao; Gingery, Mari; Czornyj, Elizabeth; Liu, Minghsun; Dey, Sanghamitra; Ghosh, Partho; Miller, Jeff F

2013-05-14

Diversity-generating retroelements (DGRs) are a unique family of retroelements that confer selective advantages to their hosts by facilitating localized DNA sequence evolution through a specialized error-prone reverse transcription process. We characterized a DGR in Legionella pneumophila, an opportunistic human pathogen that causes Legionnaires' disease. The L. pneumophila DGR is found within a horizontally acquired genomic island, and it can theoretically generate 10^26 unique nucleotide sequences in its target gene, legionella determinant target A (ldtA), creating a repertoire of 10^19 distinct proteins. Expression of the L. pneumophila DGR resulted in transfer of DNA sequence information from a template repeat to a variable repeat (VR) accompanied by adenine-specific mutagenesis of progeny VRs at the 3' end of ldtA. ldtA encodes a twin-arginine translocated lipoprotein that is anchored in the outer leaflet of the outer membrane, with its C-terminal variable region surface exposed. Related DGRs were identified in L. pneumophila clinical isolates that encode unique target proteins with homologous VRs, demonstrating the adaptability of DGR components. This work characterizes a DGR that diversifies a bacterial protein and confirms the hypothesis that DGR-mediated mutagenic homing occurs through a conserved mechanism. Comparative bioinformatics predicts that surface display of massively variable proteins is a defining feature of a subset of bacterial DGRs.

Unprecedented large inverted repeats at the replication terminus of circular bacterial chromosomes suggest a novel mode of chromosome rescue

PubMed Central

El Kafsi, Hela; Loux, Valentin; Mariadassou, Mahendra; Blin, Camille; Chiapello, Hélène; Abraham, Anne-Laure; Maguin, Emmanuelle; van de Guchte, Maarten

2017-01-01

The first Lactobacillus delbrueckii ssp. bulgaricus genome sequence revealed the presence of a very large inverted repeat (IR), a DNA sequence arrangement which thus far seemed inconceivable in a non-manipulated circular bacterial chromosome, at the replication terminus. This intriguing observation prompted us to investigate if similar IRs could be found in other bacteria. IRs with sizes varying from 38 to 76 kbp were found at the replication terminus of all 5 L. delbrueckii ssp. bulgaricus chromosomes analysed, but in none of 1373 other chromosomes. They represent the first naturally occurring very large IRs detected in circular bacterial genomes. A comparison of the L. bulgaricus replication terminus regions and the corresponding regions without IR in 5 L. delbrueckii ssp. lactis genomes leads us to propose a model for the formation and evolution of the IRs. The DNA sequence data are consistent with a novel model of chromosome rescue after premature replication termination or irreversible chromosome damage near the replication terminus, involving mechanisms analogous to those proposed in the formation of very large IRs in human cancer cells. We postulate that the L. delbrueckii ssp. bulgaricus-specific IRs in different strains derive from a single ancestral IR of at least 93 kbp. PMID:28281695
The Diversity of Prokaryotic DDE Transposases of the Mutator Superfamily, Insertion Specificity, and Association with Conjugation Machineries

PubMed Central

Guérillot, Romain; Siguier, Patricia; Gourbeyre, Edith; Chandler, Michael; Glaser, Philippe

2014-01-01

Transposable elements (TEs) are major components of both prokaryotic and eukaryotic genomes and play a significant role in their evolution. In this study, we have identified new prokaryotic DDE transposase families related to the eukaryotic Mutator-like transposases. These genes were retrieved by cascade PSI-Blast using as initial query the transposase of the streptococcal integrative and conjugative element (ICE) TnGBS2. By combining secondary structure predictions and protein sequence alignments, we predicted the DDE catalytic triad and the DNA-binding domain recognizing the terminal inverted repeats. Furthermore, we systematically characterized the organization and the insertion specificity of the TEs relying on these prokaryotic Mutator-like transposases (p-MULT) for their mobility. Strikingly, two distant TE families target their integration upstream σA dependent promoters. This allowed us to identify a transposase sequence signature associated with this unique insertion specificity and to show that the dissymmetry between the two inverted repeats is responsible for the orientation of the insertion. Surprisingly, while DDE transposases are generally associated with small and simple transposons such as insertion sequences (ISs), p-MULT encoding TEs show an unprecedented diversity with several families of IS, transposons, and ICEs ranging in size from 1.1 to 52 kb. PMID:24418649
Localization of Action of the IS50-Encoded Transposase Protein

PubMed Central

Phadnis, Suhas H.; Sasakawa, Chihiro; Berg, Douglas E.

1986-01-01

The movement of the bacterial insertion sequence IS50 and of composite elements containing direct terminal repeats of IS50 involves the two ends of IS50, designated O (outside) and I (inside), which are weakly matched in DNA sequence, and an IS50-encoded protein, transposase, which recognizes the O and I ends and acts preferentially in cis. Previous data had suggested that, initially, transposase interacts preferentially with the O end sequence and then, in a second step, with either an O or an I end. To better understand the cis action of transposase and how IS50 ends are selected, we generated a series of composite transposons which contain direct repeats of IS50 elements. In each transposon, one IS50 element encoded transposase (tnp+), and the other contained a null (tnp-) allele. In each of the five sets of composite transposons studied, the transposon for which the tnp+ IS50 element contained its O end was more active than a complementary transposon for which the tnp- IS50 element contained its O end. This pattern of O end use suggests models in which the cis action of transposase and its choice of ends is determined by protein tracking along DNA molecules. PMID:3007274
Small leucine-rich repeat proteoglycans associated with mature insoluble elastin serve as binding sites for galectins.

PubMed

Itoh, Aiko; Nonaka, Yasuhiro; Ogawa, Takashi; Nakamura, Takanori; Nishi, Nozomu

2017-11-01

We previously reported that galectin-9 (Gal-9), an immunomodulatory animal lectin, could bind to insoluble collagen preparations and exerted direct cytocidal effects on immune cells. In the present study, we found that mature insoluble elastin is capable of binding Gal-9 and other members of the human galectin family. Lectin blot analysis of a series of commercial water-soluble elastin preparations, PES-(A) ~ PES-(E), revealed that only PES-(E) contained substances recognized by Gal-9. Gal-9-interacting substances in PES-(E) were affinity-purified, digested with trypsin and then analyzed by reversed-phase HPLC. Peptide fragments derived from five members of the small leucine-rich repeat proteoglycan family, versican, lumican, osteoglycin/mimecan, prolargin, and fibromodulin, were identified by N-terminal amino acid sequence analysis. The results indicate that Gal-9 and possibly other galectins recognize glycans attached to small leucine-rich repeat proteoglycans associated with insoluble elastin and also indicate the possibility that mature insoluble elastin serves as an extracellular reservoir for galectins.
Unique molecular architecture of silk fibroin in the waxmoth, Galleria mellonella.

PubMed

Zurovec, Michal; Sehnal, Frantisek

2002-06-21

Proteins of silk fibers are characterized by reiterations of amino acid repeats. Physical properties of the fiber are determined by the amino acid composition, the complexity of repetitive units, and arrangement of these units into higher order arrays. Except for very short motifs of 6-10 residues, the length of repetitive units and the number of these units concatenated in higher order assemblies vary in all spider and lepidopteran silks analyzed so far. This paper describes an exceptional silk protein represented by the 500-kDa heavy chain fibroin (H-fibroin) of the waxmoth, Galleria mellonella. Its non-repetitive N-terminal (175 residues) and C-terminal (60 residues) parts, the overall gene organization, and the nucleotide sequence around the TATA box show that it is homologous to the H-fibroins of other Lepidoptera. However, over 95% of the protein consists of highly ordered repetitive structures that are unmatched in other species. The repetitive region includes 11 assemblies AB(1)AB(1)AB(1)AB(2)(AB(2))AB(2) of remarkably conserved polypeptide repeats A (63 amino acid residues), B(1) (43 residues), and B(2) (18 residues). The repeats contain a high proportion of Gly (31.6%), Ala (23.8%), Ser (18.1%), and of residues with long hydrophobic side chains (16% for Leu, Ile, and Val combined). The presence of the GLGGLG and SSAASAA(AA) motifs suggests formation of pleated beta-sheets and their stacking into crystallites. Conspicuous conservation of the apolar sequence VIVI followed by DD or ED is interpreted as indicating the importance of hydrophobicity and electrostatic charge in H-fibroin cross-linking. The environment of G. mellonella larvae within bee cultures requires continuous production of silk that must be both strong and elastic. The spectacular arrangement of the repetitive H-fibroin region apparently evolved to meet these requirements.
Insertion of reticuloendotheliosis virus long terminal repeat into CVI988 strain of Marek’s disease virus results in enhanced growth and protection

USDA-ARS's Scientific Manuscript database

It has been reported that co-cultivation of a JM/102W strain, a virulent strain of Marek’s disease virus (MDV), with reticuloendotheliosis virus (REV) resulted in the integration of REV long terminal repeat (LTR) into the MDV repeat region. The resulting virus, RM1, was unable to transform T-cells ...
Circularized Chromosome with a Large Palindromic Structure in Streptomyces griseus Mutants

PubMed Central

Uchida, Tetsuya; Ishihara, Naoto; Zenitani, Hiroyuki; Hiratsu, Keiichiro; Kinashi, Haruyasu

2004-01-01

Streptomyces linear chromosomes display various types of rearrangements after telomere deletion, including circularization, arm replacement, and amplification. We analyzed the new chromosomal deletion mutants Streptomyces griseus 301-22-L and 301-22-M. In these mutants, chromosomal arm replacement resulted in long terminal inverted repeats (TIRs) at both ends; different sizes were deleted again and recombined inside the TIRs, resulting in a circular chromosome with an extremely large palindrome. Short palindromic sequences were found in parent strain 2247, and these sequences might have played a role in the formation of this unique structure. Dynamic structural changes of Streptomyces linear chromosomes shown by this and previous studies revealed extraordinary strategies of members of this genus to keep a functional chromosome, even if it is linear or circular. PMID:15150216
Molecular interactions involved in the transactivation of the human T-cell leukemia virus type 1 promoter mediated by Tax and CREB-2 (ATF-4).

PubMed

Gachon, F; Thebault, S; Peleraux, A; Devaux, C; Mesnard, J M

2000-05-01

The human T-cell leukemia virus type 1 (HTLV-1) Tax protein activates viral transcription through three 21-bp repeats located in the U3 region of the HTLV-1 long terminal repeat and called Tax-responsive elements (TxREs). Each TxRE contains nucleotide sequences corresponding to imperfect cyclic AMP response elements (CRE). In this study, we demonstrate that the bZIP transcriptional factor CREB-2 is able to bind in vitro to the TxREs and that CREB-2 binding to each of the 21-bp motifs is enhanced by Tax. We also demonstrate that Tax can weakly interact with CREB-2 bound to a cellular palindromic CRE motif such as that found in the somatostatin promoter. Mutagenesis of Tax and CREB-2 demonstrates that both N- and C-terminal domains of Tax and the C-terminal region of CREB-2 are required for direct interaction between the two proteins. In addition, the Tax mutant M47, defective for HTLV-1 activation, is unable to form in vitro a ternary complex with CREB-2 and TxRE. In agreement with recent results suggesting that Tax can recruit the coactivator CREB-binding protein (CBP) on the HTLV-1 promoter, we provide evidence that Tax, CREB-2, and CBP are capable of cooperating to stimulate viral transcription. Taken together, our data highlight the major role played by CREB-2 in Tax-mediated transactivation.
The Carboxy-Terminal Domain of Erb1 Is a Seven-Bladed β-Propeller that Binds RNA

PubMed Central

Marcin, Wegrecki; Neira, Jose Luis; Bravo, Jeronimo

2015-01-01

Erb1 (Eukaryotic Ribosome Biogenesis 1) protein is essential for the maturation of the ribosomal 60S subunit. Functional studies in yeast and mammalian cells showed that, together with Nop7 and Ytm1, it forms a stable subcomplex called PeBoW that is crucial for correct rRNA processing. The exact function of the protein within the process remains unknown. The N-terminal region of the protein includes a well-conserved region shown to be involved in PeBoW complex formation, whereas the carboxy-terminal half was predicted to contain seven WD40 repeats. This first structural report on Erb1 from yeast describes the architecture of a seven-bladed β-propeller domain that revealed a characteristic extra motif formed by two α-helices and a β-strand that insert within the second WD repeat. We performed analysis of molecular surface and crystal packing, together with multiple sequence alignment and comparison of the structure with other β-propellers, in order to identify areas that are more likely to mediate protein-protein interactions. The abundance of positively charged residues on the surface of the domain led us to investigate whether the propeller of Erb1 might be involved in RNA binding. Three independent assays confirmed that the protein interacted in vitro with polyuridylic acid (polyU), thus suggesting a possible role of the domain in rRNA rearrangement during ribosome biogenesis. PMID:25880847
Molecular Interactions Involved in the Transactivation of the Human T-Cell Leukemia Virus Type 1 Promoter Mediated by Tax and CREB-2 (ATF-4)

PubMed Central

Gachon, Frederic; Thebault, Sabine; Peleraux, Annick; Devaux, Christian; Mesnard, Jean-Michel

2000-01-01

The human T-cell leukemia virus type 1 (HTLV-1) Tax protein activates viral transcription through three 21-bp repeats located in the U3 region of the HTLV-1 long terminal repeat and called Tax-responsive elements (TxREs). Each TxRE contains nucleotide sequences corresponding to imperfect cyclic AMP response elements (CRE). In this study, we demonstrate that the bZIP transcriptional factor CREB-2 is able to bind in vitro to the TxREs and that CREB-2 binding to each of the 21-bp motifs is enhanced by Tax. We also demonstrate that Tax can weakly interact with CREB-2 bound to a cellular palindromic CRE motif such as that found in the somatostatin promoter. Mutagenesis of Tax and CREB-2 demonstrates that both N- and C-terminal domains of Tax and the C-terminal region of CREB-2 are required for direct interaction between the two proteins. In addition, the Tax mutant M47, defective for HTLV-1 activation, is unable to form in vitro a ternary complex with CREB-2 and TxRE. In agreement with recent results suggesting that Tax can recruit the coactivator CREB-binding protein (CBP) on the HTLV-1 promoter, we provide evidence that Tax, CREB-2, and CBP are capable of cooperating to stimulate viral transcription. Taken together, our data highlight the major role played by CREB-2 in Tax-mediated transactivation. PMID:10779337
Two different factors act separately or together to specify functionally distinct activities at a single transcriptional enhancer.

PubMed Central

DeFranco, D; Yamamoto, K R

1986-01-01

The expression of genes fused downstream of the Moloney murine sarcoma virus (MoMSV) long terminal repeat is stimulated by glucocorticoids. We mapped the glucocorticoid response element that conferred this hormonal regulation and found that it is a hormone-dependent transcriptional enhancer, designated Sg; it resides within DNA fragments that also carry a previously described enhancer element (B. Levinson, G. Khoury, G. Vande Woude, and P. Gruss, Nature [London] 295:568-572, 1982), here termed Sa, whose activity is independent of the hormone. Nuclease footprinting revealed that purified glucocorticoid receptor bound at multiple discrete sites within and at the borders of the tandemly repeated sequence motif that defines Sa. The Sa and Sg activities stimulated the apparent efficiency of cognate or heterologous promoter utilization, individually providing modest enhancement and in concert yielding higher levels of activity. A deletion mutant lacking most of the tandem repeat but retaining a single receptor footprint sequence lost Sa activity but still conferred Sg activity. The two enhancer components could also be distinguished physiologically: both were operative within cultured rat fibroblasts, but only Sg activity was detectable in rat exocrine pancreas cells. Therefore, the sequence determinants of Sa and Sg activity may be interdigitated, and when both components are active, the receptor and a putative Sa factor can apparently bind and act simultaneously. We concluded that MoMSV enhancer activity is effected by at least two distinct binding factors, suggesting that combinatorial regulation of promoter function can be mediated even from a single genetic element. Images PMID:3023887
Molecular characterization and in situ mRNA localization of the neural recognition molecule J1-160/180: a modular structure similar to tenascin

PubMed Central

1993-01-01

The oligodendrocyte-derived extracellular matrix glycoprotein J1-160/180 is a recognition molecule expressed exclusively in the central nervous system. J1-160/180 has been shown to be adhesive for astrocytes and repellent towards neurons and growth cones. We report here the complete nucleotide sequence of J1-160/180 in the rat. The predicted amino acid sequence showed a structural architecture very similar to tenascin: a cysteine-rich amino terminal region is followed by 4.5 epidermal growth factor-like repeats, 9 fibronectin type III homologous repeats and a domain homologous to fibrinogen. Sequence comparison analysis revealed highest homology of rat J1-160/180 to mouse tenascin and chicken restrictin with a similarity of 66% and 85%, respectively. The J1-160/180-coding mRNA is derived from a single copy gene. Using the polymerase chain reaction we could show that two J1-160/180 isoforms are generated by alternative splicing of the sixth fibronectin type III homologous repeat. Localization of J1-160/180 mRNA by in situ hybridization in the cerebellum, hippocampus and olfactory bulb confirmed the expression of J1-160/180 by oligodendrocytes with a peak of transcription at 7-14 d after birth, indicating a functional role during myelination. In addition, J1-160/180-specific RNA was found in a small subset of neurons in all three structures of the CNS analyzed. These neurons continue to express J1-160/180 in the adult. PMID:7679676
Molecular identification of Mango, Mangifera indica L.var. totupura

PubMed Central

Jagarlamudi, Sankar; G, Rosaiah; Kurapati, Ravi Kumar; Pinnamaneni, Rajasekhar

2011-01-01

Mango (Mangifera indica), belonging to the Anacardiaceae family, is a fruit that grows in tropical regions. It is considered the King of fruits. The present work was taken up to identify a tool for identifying the mango species at the molecular level. The chloroplast trnL-F region was amplified from extracted total genomic DNA using the polymerase chain reaction (PCR) and sequenced. Sequence of the dominant DGGE band revealed that the species in the tested leaves was Mangifera indica (100% similarity to the ITS sequences of Mangifera indica). This sequence was deposited in NCBI with the accession no. GQ927757. Abbreviations AFLP - Amplified fragment length polymorphism, cpDNA - Chloroplast DNA, DGGE - Denaturing gradient gel electrophoresis, DNA - Deoxyribo nucleic acid, EDTA - Ethylenediamine tetraacetic acid, HCl - Hydrochloric acid, ISSR - Inter simple sequence repeats, ITS - Internal transcribed spacer, MATAB - Methyl Ammonium Bromide, Na2SO3 - Sodium sulphite, NaCl - Sodium chloride, NCBI - National Centre for Biotechnology Information, PCR - Polymerase chain reaction, PEG - Polyethylene glycol, RAPD - Randomly amplified polymorphic DNA, trnL-F - Transfer RNA genes start codon- termination codon. PMID:21423885
Sequence repeats and protein structure

NASA Astrophysics Data System (ADS)

Hoang, Trinh X.; Trovato, Antonio; Seno, Flavio; Banavar, Jayanth R.; Maritan, Amos

2012-11-01

Repeats are frequently found in known protein sequences. The level of sequence conservation in tandem repeats correlates with their propensities to be intrinsically disordered. We employ a coarse-grained model of a protein with a two-letter amino acid alphabet, hydrophobic (H) and polar (P), to examine the sequence-structure relationship in the realm of repeated sequences. A fraction of repeated sequences comprises a distinct class of bad folders, whose folding temperatures are much lower than those of random sequences. Imperfection in sequence repetition improves the folding properties of the bad folders while deteriorating those of the good folders. Our results may explain why nature has utilized repeated sequences for their versatility and especially to design functional proteins that are intrinsically unstructured at physiological temperatures.
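As an illustration of the two-letter representation mentioned above, the following minimal Python sketch reduces an amino acid sequence to an H/P string and finds its smallest repeating unit. The hydrophobic residue set, the function names, and the example peptide are assumptions made for illustration; they are not taken from the paper, whose coarse-grained folding model is not reproduced here.

```python
# Assumed hydrophobic residue set, for illustration only; the abstract
# does not state how residues are assigned to the H and P classes.
HYDROPHOBIC = set("AVLIMFWC")


def to_hp(protein: str) -> str:
    """Map a one-letter amino acid sequence onto the H/P alphabet."""
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in protein.upper())


def smallest_repeat_unit(seq: str) -> str:
    """Return the shortest unit whose exact tandem repetition yields seq.

    A sequence with no perfect tandem repetition is its own unit.
    """
    n = len(seq)
    for k in range(1, n + 1):
        if n % k == 0 and seq[:k] * (n // k) == seq:
            return seq[:k]
    return seq


# Hypothetical peptide with a perfect three-residue repeat.
peptide = "LKSLKSLKS"
hp = to_hp(peptide)
unit = smallest_repeat_unit(hp)
print(hp, unit, len(hp) // len(unit))
```

Only perfect repetition is detected here; scoring imperfect repeats, which the abstract reports as affecting foldability, would need a mismatch-tolerant comparison.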
Distribution and Evolution of Yersinia Leucine-Rich Repeat Proteins

PubMed Central

Hu, Yueming; Huang, He; Hui, Xinjie; Cheng, Xi; White, Aaron P.

2016-01-01

Leucine-rich repeat (LRR) proteins are widely distributed in bacteria, playing important roles in various protein-protein interaction processes. In Yersinia, the well-characterized type III secreted effector YopM also belongs to the LRR protein family and is encoded by virulence plasmids. However, little has been known about other LRR members encoded by Yersinia genomes or their evolution. In this study, the Yersinia LRR proteins were comprehensively screened, categorized, and compared. The LRR proteins encoded by chromosomes (LRR1 proteins) appeared to be more similar to each other and different from those encoded by plasmids (LRR2 proteins) with regard to repeat-unit length, amino acid composition profile, and gene expression regulation circuits. LRR1 proteins were also different from LRR2 proteins in that the LRR1 proteins contained an E3 ligase domain (NEL domain) in the C-terminal region or an NEL domain-encoding nucleotide relic in flanking genomic sequences. The LRR1 protein-encoding genes (LRR1 genes) varied dramatically and were categorized into 4 subgroups (a to d), with the LRR1a to -c genes evolving from the same ancestor and LRR1d genes evolving from another ancestor. The consensus and ancestor repeat-unit sequences were inferred for different LRR1 protein subgroups by use of a maximum parsimony modeling strategy. Structural modeling disclosed very similar repeat-unit structures between LRR1 and LRR2 proteins despite the different unit lengths and amino acid compositions. Structural constraints may serve as the driving force to explain the observed mutations in the LRR regions. This study suggests that there may be functional variation and lays the foundation for future experiments investigating the functions of the chromosomally encoded LRR proteins of Yersinia. PMID:27217422
DOE Office of Scientific and Technical Information (OSTI.GOV)

Sharma, Manisha; Jamieson, Cara; Lui, Christina

β-catenin is a key mediator of Wnt signaling and its deregulated nuclear accumulation can drive cancer progression. While the central armadillo (Arm) repeats of β-catenin stimulate nuclear entry, the N- and C-terminal “tail” sequences are thought to regulate turnover and transactivation. We show here that the N- and C-tails are also potent transport sequences. The unstructured tails of β-catenin, when individually fused to a GFP-reporter, could enter and exit the nucleus rapidly in live cells. Proximity ligation assays and pull-down assays identified a weak interaction between the tail sequences and the FG-repeats of nucleoporins, consistent with a possible direct translocation of β-catenin through the nuclear pore complex. Extensive alanine mutagenesis of the tail sequences revealed that nuclear translocation of β-catenin was dependent on specific uniformly distributed patches of hydrophobic residues, whereas the mutagenesis of acidic amino acids had no effect. Moreover, the mutation of hydrophobic patches within the N-tail and C-tail of full length β-catenin reduced nuclear transport rate and diminished its ability to activate transcription. We propose that the tail sequences can contribute to β-catenin transport and suggest a possible similar role for hydrophobic unstructured regions in other proteins. - Highlights: • We show that the N- and C-tails of beta-catenin possess nuclear transport activity. • Nuclear transport of the N- or C-tails requires specific hydrophobic amino acids. • Mutagenesis of the N-terminus diminished nuclear entry of full-length beta-catenin. • We propose the N-tail contributes to beta-catenin nuclear entry and transactivation.
Length variation and sequence divergence in mitochondrial control region of Schizothoracine (Teleostei: Cyprinidae) species.

PubMed

Syed, Mudasir Ahmad; Bhat, Farooz Ahmad; Balkhi, Masood-ul Hassan; Bhat, Bilal Ahmad

2016-01-01

Schizothoracine fish, commonly called snow trouts, inhabit the entire network of snow- and spring-fed cool waters of Kashmir, India. Of more than 10 species reported earlier, only five have been found: Schizothorax niger, Schizothorax esocinus, Schizothorax plagiostomus, Schizothorax curvifrons and Schizothorax labiatus. Reports on the relationship between these species are contradictory. To understand the evolutionary relation of these species, we examined the sequence information of the mitochondrial D-loop of 25 individuals representing five species. Sequence alignment showed that the D-loop region is highly variable, and length variation was observed in the di-nucleotide (TA)n microsatellite between and within species. Interestingly, all these species have a (TA)n microsatellite that is not associated with longer tandem repeats at the 3' end of the mitochondrial control region and do not show heteroplasmy. Our analysis also indicates the presence of four conserved sequence blocks (CSB), CSB-D, CSB-I, CSB-II and CSB-III, four termination-associated sequence (TAS) motifs and a 15-bp pyrimidine block within the mitochondrial control region, that are highly conserved within genus Schizothorax when compared with other species. The phylogenetic analysis carried out by Maximum likelihood (ML), Neighbor Joining (NJ) and Bayesian inference (BI) generated almost identical results. The resultant BI tree showed a close genetic relationship of all the five species and supports two distinct groupings of S. esocinus species. Besides the species relation, the presence of length variation in tandem repeats is attributed to differences in predicting the stability of secondary structures. The role of CSBs and TASs, reported so far as main regulatory signals, would explain the conservation of these elements in evolution.
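Length variation in a (TA)n microsatellite, as described above, can be measured with a simple pattern search. The sketch below is a minimal Python illustration; the function name and the two fragments are hypothetical and are not data from the study.

```python
import re


def longest_ta_run(seq: str) -> int:
    """Return n for the longest uninterrupted (TA)n run in seq, or 0 if none."""
    runs = re.findall(r"(?:TA)+", seq.upper())
    return max((len(run) // 2 for run in runs), default=0)


# Hypothetical control-region fragments that differ only in repeat number.
fragments = {
    "individual_1": "GGC" + "TA" * 5 + "CCG",
    "individual_2": "GGC" + "TA" * 8 + "CCG",
}

for name, seq in fragments.items():
    print(name, "(TA)n with n =", longest_ta_run(seq))
```

Comparing n across individuals of one species and across species gives the within- and between-species length variation the abstract refers to.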
Human T-lymphotropic virus type 1 (HTLV-1) genetic typing in Kakeroma Island, an island at the crossroads of the Ryukyuans and Wajin in Japan, providing further insights into the origin of the virus in Japan.

PubMed

Eguchi, Katsuyuki; Fujii, Hidefumi; Oshima, Kengo; Otani, Masashi; Matsuo, Toshiaki; Yamamoto, Taro

2009-08-01

Peripheral blood samples were collected from 23 human T-lymphotropic virus type-1 (HTLV-1) carriers residing in Kakeroma Island, Japan (Kagoshima Prefecture, Oshima County, Setouchi Town), one of the most highly endemic areas in Japan. The samples were subjected to amplification by PCR and sequencing of the Long Terminal Repeat in order to reconstruct a phylogenetic tree of HTLV-1 isolates. Restriction Fragment Length Polymorphism (RFLP) analysis of env region was also conducted for subgrouping of HTLV-1. Although one sample could not be amplified by PCR, and three more could not be sequenced due to the existence of conspicuous nonspecific bands or repeated sequences, the phylogenetic analysis revealed that the remaining 19 isolates obtained from Kakeroma Island belonged to either the Transcontinental or the Japanese subgroups of the Cosmopolitan subtype, one of the three major subtypes. The RFLP data corresponded closely with the typing data throughout the sequencing. The proportion of the Transcontinental subgroup among the isolates was 26.3% (5 of 19) by sequence analysis and 27.3% (6 of 22) by RFLP. Unlike in Taiwan, China and Okinawa, the Japanese subgroup was dominant in Kakeroma Island. The analysis would also suggest that the Japanese subgroup seems not to have derived from the Transcontinental subgroup, but rather that the Transcontinental subgroup came to Japan first and was followed later by the Japanese one. 2009 Wiley-Liss, Inc.
A variant Tc4 transposable element in the nematode C. elegans could encode a novel protein.

PubMed Central

Li, W; Shaw, J E

1993-01-01

A variant C. elegans Tc4 transposable element, Tc4-rh1030, has been sequenced and is 3483 bp long. The Tc4 element that had been analyzed previously is 1605 bp long, consists of two 774-bp nearly perfect inverted terminal repeats connected by a 57-bp loop, and lacks significant open reading frames. In Tc4-rh1030, by comparison, a 2343-bp novel sequence is present in place of a 477-bp segment in one of the inverted repeats. The novel sequence of Tc4-rh1030 is present about five times per haploid genome and is invariably associated with Tc4 elements; we have used the designation Tc4v to denote this variant subfamily of Tc4 elements. Sequence analysis of three cDNA clones suggests that a Tc4v element contains at least five exons that could encode a novel basic protein of 537 amino acid residues. On northern blots, a 1.6-kb Tc4v-specific transcript was detected in the mutator strain TR679 but not in the wild-type strain N2; Tc4 elements are known to transpose in TR679 but appear to be quiescent in N2. We have analyzed transcripts produced by an unc-33 gene that has the Tc4-rh1030 insertional mutation in its transcribed region; all or almost all of the Tc4v sequence is frequently spliced out of the mutant unc-33 transcripts, sometimes by means of non-consensus splice acceptor sites. Images PMID:8382791
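The structure described for the shorter Tc4 element (two inverted terminal repeats joined by a loop) can be checked computationally by comparing a sequence with its own reverse complement. The following Python sketch assumes perfect repeats, whereas the abstract describes nearly perfect ones; the function names and the toy element are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def perfect_itr_length(seq: str) -> int:
    """Length of the perfect inverted terminal repeat shared by both ends.

    The 5' end is compared base by base with the reverse complement of
    the 3' end; counting stops at the first mismatch or at the midpoint.
    """
    s = seq.upper()
    rc = reverse_complement(s)
    n = 0
    while n < len(s) // 2 and s[n] == rc[n]:
        n += 1
    return n


# Lengths quoted in the abstract: two 774-bp repeats plus a 57-bp loop.
assert 2 * 774 + 57 == 1605

# Hypothetical toy element: 8-bp repeat, 3-bp loop, inverted copy of the repeat.
left = "CCCCAAAA"
toy_element = left + "GGT" + reverse_complement(left)
print(perfect_itr_length(toy_element))
```

Handling nearly perfect repeats such as those of Tc4 would require allowing mismatches, for example by aligning the two ends instead of comparing them position by position.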
Isolation and sequence analysis of the wheat B genome subtelomeric DNA.

PubMed

Salina, Elena A; Sergeeva, Ekaterina M; Adonina, Irina G; Shcherban, Andrey B; Afonnikov, Dmitry A; Belcram, Harry; Huneau, Cecile; Chalhoub, Boulos

2009-09-05

Telomeric and subtelomeric regions are essential for genome stability and regular chromosome replication. In this work, we have characterized the wheat BAC (bacterial artificial chromosome) clones containing Spelt1 and Spelt52 sequences, which belong to the subtelomeric repeats of the B/G genomes of wheats and Aegilops species from the section Sitopsis. The BAC library from Triticum aestivum cv. Renan was screened using Spelt1 and Spelt52 as probes. Nine positive clones were isolated; of them, clone 2050O8 was localized mainly to the distal parts of wheat chromosomes by in situ hybridization. The distribution of the other clones indicated the presence of different types of repetitive sequences in BACs. Use of different approaches allowed us to prove that seven of the nine isolated clones belonged to the subtelomeric chromosomal regions. Clone 2050O8 was sequenced and its sequence of 119,737 bp was annotated. It is composed of 33% transposable elements (TEs), 8.2% Spelt52 (namely, the subfamily Spelt52.2) and five non-TE-related genes. DNA transposons are predominant, making up 24.6% of the entire BAC clone, whereas retroelements account for 8.4% of the clone length. The full-length CACTA transposon Caspar covers 11,666 bp, encoding a transposase and CTG-2 proteins, and this transposon accounts for 40% of the DNA transposons. The in situ hybridization data for 2050O8 derived subclones in combination with the BLAST search against wheat mapped ESTs (expressed sequence tags) suggest that clone 2050O8 is located in the terminal bin 4BL-10 (0.95-1.0). Additionally, four of the predicted 2050O8 genes showed significant homology to four putative orthologous rice genes in the distal part of rice chromosome 3S and confirm the synteny to wheat 4BL. Satellite DNA sequences from the subtelomeric regions of diploid wheat progenitor can be used for selecting the BAC clones from the corresponding regions of hexaploid wheat chromosomes. 
It has been demonstrated for the first time that Spelt52 sequences were involved in the evolution of terminal regions of common wheat chromosomes. Our research provides new insights into the microcollinearity in the terminal regions of wheat chromosomes 4BL and rice chromosome 3S.
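The composition figures reported for clone 2050O8 can be cross-checked with a few lines of arithmetic. The Python sketch below uses only numbers quoted in the abstract; the variable names are illustrative, and the base-pair totals are derived from the rounded percentages, so they are approximate.

```python
# Values quoted in the abstract for BAC clone 2050O8.
CLONE_LENGTH_BP = 119_737
DNA_TRANSPOSON_PCT = 24.6
RETROELEMENT_PCT = 8.4
CASPAR_BP = 11_666

# DNA transposons plus retroelements should give the total TE fraction.
total_te_pct = DNA_TRANSPOSON_PCT + RETROELEMENT_PCT

# Approximate DNA transposon content in bp, derived from the rounded percentage.
dna_transposon_bp = CLONE_LENGTH_BP * DNA_TRANSPOSON_PCT / 100

# Share of the DNA transposon content contributed by the Caspar element.
caspar_share_pct = 100 * CASPAR_BP / dna_transposon_bp

print(f"Total TE content: {total_te_pct:.1f}%")
print(f"DNA transposons: about {dna_transposon_bp:.0f} bp")
print(f"Caspar share of DNA transposons: {caspar_share_pct:.1f}%")
```

Both derived values agree with the abstract after rounding: the two TE classes sum to about 33% of the clone, and Caspar makes up roughly 40% of the DNA transposon content.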

AT-rich sequence elements promote nascent transcript cleavage leading to RNA polymerase II termination

PubMed Central

White, Eleanor; Kamieniarz-Gdula, Kinga; Dye, Michael J.; Proudfoot, Nick J.

2013-01-01

RNA Polymerase II (Pol II) termination is dependent on RNA processing signals as well as specific terminator elements located downstream of the poly(A) site. One of the two major terminator classes described so far is the Co-Transcriptional Cleavage (CoTC) element. We show that homopolymer A/T tracts within the human β-globin CoTC-mediated terminator element play a critical role in Pol II termination. These short A/T tracts, dispersed within seemingly random sequences, are strong terminator elements, and bioinformatics analysis confirms the presence of such sequences in 70% of the putative terminator regions (PTRs) genome-wide. PMID:23258704
Effects of pre- and pro-sequence of thaumatin on the secretion by Pichia pastoris.

PubMed

Ide, Nobuyuki; Masuda, Tetsuya; Kitabatake, Naofumi

2007-11-23

Thaumatin is a 22-kDa sweet-tasting protein containing eight disulfide bonds. When thaumatin is expressed in Pichia pastoris using the thaumatin cDNA fused with both the alpha-factor signal sequence and the Kex2 protease cleavage site from Saccharomyces cerevisiae, the N-terminal sequence of the secreted thaumatin molecule is not processed correctly. To examine the role of the thaumatin cDNA-encoded N-terminal pre-sequence and C-terminal pro-sequence on the processing of thaumatin and efficiency of thaumatin production in P. pastoris, four expression plasmids with different pre-sequence and pro-sequence were constructed and transformed into P. pastoris. The transformants containing pre-thaumatin gene that has the native plant signal, secreted thaumatin molecules in the medium. The N-terminal amino acid sequence of the secreted thaumatin molecule was processed correctly. The production yield of thaumatin was not affected by the C-terminal pro-sequence, and the pro-sequence was not processed in P. pastoris, indicating that pro-sequence is not necessary for thaumatin synthesis.
One precursor, three apolipoproteins: the relationship between two crustacean lipoproteins, the large discoidal lipoprotein and the high density lipoprotein/β-glucan binding protein.

PubMed

Stieb, Stefanie; Roth, Ziv; Dal Magro, Christina; Fischer, Sabine; Butz, Eric; Sagi, Amir; Khalaila, Isam; Lieb, Bernhard; Schenk, Sven; Hoeger, Ulrich

2014-12-01

The novel discoidal lipoprotein (dLp) recently detected in the crayfish differs from other crustacean lipoproteins in its large size, apoprotein composition and high lipid binding capacity. We identified the dLp sequence by transcriptome analyses of the hepatopancreas and mass spectrometry. Further de novo assembly of the NGS data followed by BLAST searches using the sequence of the high density lipoprotein/β-glucan binding protein (HDL-BGBP) of Astacus leptodactylus as query revealed a putative precursor molecule with an open reading frame of 14.7 kb and a deduced primary structure of 4889 amino acids. The presence of an N-terminal lipid binding domain and a DUF 1943 domain suggests a relationship with the large lipid transfer proteins. Two putative dibasic furin cleavage sites were identified bordering the sequence of the HDL-BGBP. When subjected to mass spectroscopic analyses, tryptic peptides of the large apoprotein of dLp matched the N-terminal part of the precursor, while the peptides obtained for its small apoprotein matched the C-terminal part. Repeating the analysis in the prawn Macrobrachium rosenbergii revealed a similar protein with identical domain architecture, suggesting that our findings do not represent an isolated instance. Our results indicate that the above three apolipoproteins (i.e., HDL-BGBP and both the large and the small subunit of dLp) are translated as a large precursor. Cleavage at the furin type sites releases two subunits forming a heterodimeric dLp particle, while the remaining part forms an HDL-BGBP whose relationship with other lipoproteins as well as specific functions are yet to be elucidated.
RTS,S/AS01 malaria vaccine mismatch observed among Plasmodium falciparum isolates from southern and central Africa and globally.

PubMed

Pringle, Julia C; Carpi, Giovanna; Almagro-Garcia, Jacob; Zhu, Sha Joe; Kobayashi, Tamaki; Mulenga, Modest; Bobanga, Thierry; Chaponda, Mike; Moss, William J; Norris, Douglas E

2018-04-26

The RTS,S/AS01 malaria vaccine encompasses the central repeats and C-terminal of Plasmodium falciparum circumsporozoite protein (PfCSP). Although no Phase II clinical trial studies observed evidence of strain-specific immunity, recent studies show a decrease in vaccine efficacy against non-vaccine strain parasites. In light of goals to reduce malaria morbidity, anticipating the effectiveness of RTS,S/AS01 is critical to planning widespread vaccine introduction. We deep sequenced C-terminal Pfcsp from 77 individuals living along the international border in Luapula Province, Zambia and Haut-Katanga Province, the Democratic Republic of the Congo (DRC) and compared translated amino acid haplotypes to the 3D7 vaccine strain. Only 5.2% of the 193 PfCSP sequences from the Zambia-DRC border region matched 3D7 at all 84 amino acids. To further contextualize the genetic diversity sampled in this study with global PfCSP diversity, we analyzed an additional 3,809 Pfcsp sequences from the Pf3k database and constructed a haplotype network representing 15 countries from Africa and Asia. The diversity observed in our samples was similar to the diversity observed in the global haplotype network. These observations underscore the need for additional research assessing genetic diversity in P. falciparum and the impact of PfCSP diversity on RTS,S/AS01 efficacy.
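The comparison described above (translated haplotypes scored against a vaccine reference at every position) can be sketched in a few lines of Python. The function names and the ten-residue toy sequences are hypothetical and are not PfCSP data. The arithmetic check at the end shows that 5.2% of 193 sequences is consistent with 10 exact matches, which is an inference from the reported percentage and not a figure stated in the abstract.

```python
def mismatch_positions(reference: str, haplotype: str) -> list[int]:
    """Return the 1-based positions at which haplotype differs from reference."""
    if len(reference) != len(haplotype):
        raise ValueError("sequences must be aligned to the same length")
    return [i + 1 for i, (r, h) in enumerate(zip(reference, haplotype)) if r != h]


def percent_identical(reference: str, haplotypes: list[str]) -> float:
    """Percentage of haplotypes that match the reference at every position."""
    identical = sum(1 for h in haplotypes if not mismatch_positions(reference, h))
    return 100.0 * identical / len(haplotypes)


# Hypothetical toy data standing in for aligned amino acid haplotypes.
reference = "ACDEFGHIKL"
samples = ["ACDEFGHIKL", "ACDEFQHIKL", "ACSEFGHNKL", "ACDEFGHIKL"]

for s in samples:
    print(s, mismatch_positions(reference, s))
print(f"{percent_identical(reference, samples):.1f}% identical to reference")

# Consistency check on the reported figure: 10 of 193 rounds to 5.2%.
print(round(100 * 10 / 193, 1))
```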
Reversal of a Neurospora Translocation by Crossing over Involving Displaced rDNA, and Methylation of the rDNA Segments That Result from Recombination

PubMed Central

Perkins, David D.; Metzenberg, Robert L.; Raju, Namboori B.; Selker, Eric U.; Barry, Edward G.

1986-01-01

In translocation OY321 of Neurospora crassa, the nucleolus organizer is divided into two segments, a proximal portion located interstitially in one interchange chromosome, and a distal portion now located terminally on another chromosome, linkage group I. In crosses of Translocation x Translocation, exceptional progeny are recovered nonselectively in which the chromosome sequence has apparently reverted to Normal. Genetic, cytological, and molecular evidence indicates that reversion is the result of meiotic crossing over between homologous displaced rDNA repeats. Marker linkages are wild type in these exceptional progeny. They differ from wild type, however, in retaining an interstitial block of rRNA genes which can be demonstrated cytologically by the presence of a second, small interstitial nucleolus and genetically by linkage of an rDNA restriction site polymorphism to the mating-type locus in linkage group I. The interstitial rDNA is more highly methylated than the terminal rDNA. The mechanism by which methylation enzymes distinguish between interstitial rDNA and terminal rDNA is unknown. Some hypotheses are considered. PMID:2947829
The transcriptional terminator sequences downstream of the covR gene terminate covR/S operon transcription to generate covR monocistronic transcripts in Streptococcus pyogenes.

PubMed

Chiang-Ni, Chuan; Tsou, Chih-Cheng; Lin, Yee-Shin; Chuang, Woei-Jer; Lin, Ming-T; Liu, Ching-Chuan; Wu, Jiunn-Jong

2008-12-31

CovR/S is an important two-component regulatory system that regulates about 15% of gene expression in Streptococcus pyogenes. The covR/S locus was identified as an operon generating an RNA transcript around 2.5 kb in size. In this study, we found that the covR/S operon produces three RNA transcripts (around 2.5, 1.0, and 0.8 kb in size). Using RNA transcriptional terminator sequence prediction and transcriptional terminator analysis, we identified two atypical rho-independent terminator sequences downstream of the covR gene and showed that these terminator sequences terminate RNA transcription efficiently. These results indicate that the covR/S operon generates both the covR/S transcript and monocistronic covR transcripts.
Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Torella, JP; Lienert, F; Boehm, CR

2014-08-07

Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies (for example, repeated terminator and insulator sequences) that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.
Unique nucleotide sequence (UNS)-guided assembly of repetitive DNA parts for synthetic biology applications

PubMed Central

Torella, Joseph P.; Lienert, Florian; Boehm, Christian R.; Chen, Jan-Hung; Way, Jeffrey C.; Silver, Pamela A.

2016-01-01

Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts and hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies — for example repeated terminator and insulator sequences — that complicate recombination-based assembly. We and others have recently developed DNA assembly methods that we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly-assembled constructs, or into high-quality combinatorial libraries in only 2–3 days. If the DNA parts must be generated from scratch, an additional 2–5 days are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques. PMID:25101822
Evolutionary dynamics of retrotransposons assessed by high-throughput sequencing in wild relatives of wheat.

PubMed

Senerchia, Natacha; Wicker, Thomas; Felber, François; Parisod, Christian

2013-01-01

Transposable elements (TEs) represent a major fraction of plant genomes and drive their evolution. An improved understanding of genome evolution requires the dynamics of a large number of TE families to be considered. We put forward an approach bypassing the required step of a complete reference genome to assess the evolutionary trajectories of high copy number TE families from genome snapshots with high-throughput sequencing. Low coverage sequencing of the complex genomes of Aegilops cylindrica and Ae. geniculata using 454 identified more than 70% of the sequences as known TEs, mainly long terminal repeat (LTR) retrotransposons. Comparing the abundance of reads as well as patterns of sequence diversity and divergence within and among genomes assessed the dynamics of 44 major LTR retrotransposon families of the 165 identified. In particular, molecular population genetics on individual TE copies distinguished recently active from quiescent families and highlighted different evolutionary trajectories of retrotransposons among related species. This work presents a suite of tools suitable for current sequencing data, making it possible to address the genome-wide evolutionary dynamics of TEs at the family level and advancing our understanding of the evolution of nonmodel genomes.
NMR Analysis of Amide Hydrogen Exchange Rates in a Pentapeptide-Repeat Protein from A. thaliana.

PubMed

Xu, Shenyuan; Ni, Shuisong; Kennedy, Michael A

2017-05-23

At2g44920 from Arabidopsis thaliana is a pentapeptide-repeat protein (PRP) composed of 25 repeats capped by N- and C-terminal α-helices. PRP structures are dominated by four-sided right-handed β-helices typically consisting of mixtures of type II and type IV β-turns. PRPs adopt repeated five-residue (Rfr) folds with an Rfr consensus sequence (STAV)(D/N)(L/F)(S/T/R)(X). Unlike other PRPs, At2g44920 consists exclusively of type II β-turns. At2g44920 is predicted to be located in the thylakoid lumen although its biochemical function remains unknown. Given its unusual structure, we investigated the biophysical properties of At2g44920 as a representative of the β-helix family to determine if it had exceptional global stability, backbone dynamics, or amide hydrogen exchange rates. Circular dichroism measurements yielded a melting point of 62.8°C, indicating unexceptional global thermal stability. Nuclear spin relaxation measurements indicated that the Rfr-fold core was rigid with order parameters ranging from 0.7 to 0.9. At2g44920 exhibited a striking range of amide hydrogen exchange rates spanning 10 orders of magnitude, with lifetimes ranging from minutes to several months. A weak correlation was found among hydrogen exchange rates, hydrogen bonding energies, and amino acid solvent-accessible areas. Analysis of contributions from fast (approximately picosecond to nanosecond) backbone dynamics to amide hydrogen exchange rates revealed that the average order parameter of amides undergoing fast exchange was significantly smaller compared to those undergoing slow exchange. Importantly, the activation energies for amide hydrogen exchange were found to be generally higher for the slowest exchanging amides in the central Rfr coil and decreased toward the terminal coils. 
This could be explained by assuming that the concerted motions of two preceding or following coils required for hydrogen bond disruption and amide hydrogen exchange have a higher activation energy compared to that required for displacement of a single coil to facilitate amide hydrogen exchange in either the terminal or penultimate coils. Copyright © 2017 Biophysical Society. Published by Elsevier Inc. All rights reserved.
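The Rfr consensus quoted in the abstract above, (STAV)(D/N)(L/F)(S/T/R)(X), can be read as a five-position pattern. The following is a minimal sketch of that reading, assuming that "(STAV)" means any one of S, T, A or V and that X is any residue. The function name and the toy sequence are hypothetical, and real pentapeptide-repeat proteins often deviate from the consensus, so a strict match like this would undercount repeats.

```python
import re

# Assumed reading of the consensus: one residue per position, X = any residue.
RFR_CONSENSUS = re.compile(r"[STAV][DN][LF][STR].")

def count_rfr_repeats(protein):
    """Count non-overlapping pentapeptides that match the strict consensus."""
    return len(RFR_CONSENSUS.findall(protein.upper()))

# Toy sequence (not from At2g44920): two consensus pentapeptides, then filler.
toy = "ADLSG" + "SNFTK" + "MMMMM"
print(count_rfr_repeats(toy))
```

A position-specific scoring matrix would tolerate partial matches better than a regular expression; the regular expression is used here only because it mirrors the notation in the abstract.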
Sequence-specific epigenetic effects of the maternal somatic genome on developmental rearrangements of the zygotic genome in Paramecium primaurelia.

PubMed Central

Meyer, E; Butler, A; Dubrana, K; Duharcourt, S; Caron, F

1997-01-01

In ciliates, the germ line genome is extensively rearranged during the development of the somatic macronucleus from a mitotic product of the zygotic nucleus. Germ line chromosomes are fragmented in specific regions, and a large number of internal sequence elements are eliminated. It was previously shown that transformation of the vegetative macronucleus of Paramecium primaurelia with a plasmid containing a subtelomeric surface antigen gene can affect the processing of the homologous germ line genomic region during development of a new macronucleus in sexual progeny of transformed clones. The gene and telomere-proximal flanking sequences are deleted from the new macronuclear genome, although the germ line genome remains wild type. Here we show that plasmids containing nonoverlapping segments of the same genomic region are able to induce similar terminal deletions; the locations of deletion end points depend on the particular sequence used. Transformation of the maternal macronucleus with a sequence internal to a macronuclear chromosome also causes the occurrence of internal deletions between short direct repeats composed of alternating thymines and adenines. The epigenetic influence of maternal macronuclear sequences on developmental rearrangements of the zygotic genome thus appears to be both sequence specific and general, suggesting that this trans-nucleus effect is mediated by pairing of homologous sequences. PMID:9199294
Transcription of telomeric DNA leads to high levels of homologous recombination and t-loops.

PubMed

Kar, Anirban; Willcox, Smaranda; Griffith, Jack D

2016-11-02

The formation of DNA loops at chromosome ends (t-loops) and the transcription of telomeres producing G-rich RNA (TERRA) represent two central features of telomeres. To explore a possible link between them we employed artificial human telomeres containing long arrays of TTAGGG repeats flanked by the T7 or T3 promoters. Transcription of these DNAs generates a high frequency of t-loops within individual molecules and homologous recombination events between different DNAs at their telomeric sequences. T-loop formation does not require a single strand overhang, arguing that both terminal strands insert into the preceding duplex. The loops are very stable and some RNase H resistant TERRA remains at the t-loop, likely adding to their stability. Transcription of DNAs containing TTAGTG or TGAGTG repeats showed greatly reduced loop formation. While in the cell multiple pathways may lead to t-loop formation, the pathway revealed here does not depend on the shelterins but rather on the unique character of telomeric DNA when it is opened for transcription. Hence, telomeric sequences may have evolved to facilitate their ability to loop back on themselves. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
A novel family of sequence-specific endoribonucleases associated with the clustered regularly interspaced short palindromic repeats.

PubMed

Beloglazova, Natalia; Brown, Greg; Zimmerman, Matthew D; Proudfoot, Michael; Makarova, Kira S; Kudritska, Marina; Kochinyan, Samvel; Wang, Shuren; Chruszcz, Maksymilian; Minor, Wladek; Koonin, Eugene V; Edwards, Aled M; Savchenko, Alexei; Yakunin, Alexander F

2008-07-18

Clustered regularly interspaced short palindromic repeats (CRISPRs) together with the associated CAS proteins protect microbial cells from invasion by foreign genetic elements using presently unknown molecular mechanisms. All CRISPR systems contain proteins of the CAS2 family, suggesting that these uncharacterized proteins play a central role in this process. Here we show that the CAS2 proteins represent a novel family of endoribonucleases. Six purified CAS2 proteins from diverse organisms cleaved single-stranded RNAs preferentially within U-rich regions. A representative CAS2 enzyme, SSO1404 from Sulfolobus solfataricus, cleaved the phosphodiester linkage on the 3'-side and generated 5'-phosphate- and 3'-hydroxyl-terminated oligonucleotides. The crystal structure of SSO1404 was solved at 1.6 Å resolution revealing the first ribonuclease with a ferredoxin-like fold. Mutagenesis of SSO1404 identified six residues (Tyr-9, Asp-10, Arg-17, Arg-19, Arg-31, and Phe-37) that are important for enzymatic activity and suggested that Asp-10 might be the principal catalytic residue. Thus, CAS2 proteins are sequence-specific endoribonucleases, and we propose that their role in the CRISPR-mediated anti-phage defense might involve degradation of phage or cellular mRNAs.
The coactivator CBP stimulates human T-cell lymphotrophic virus type I Tax transactivation in vitro.

PubMed

Kashanchi, F; Duvall, J F; Kwok, R P; Lundblad, J R; Goodman, R H; Brady, J N

1998-12-18

Tax interacts with the cellular cyclic AMP-responsive element binding protein (CREB) and facilitates the binding of the coactivator CREB binding protein (CBP), forming a multimeric complex on the cyclic AMP-responsive element (CRE)-like sites in the human T-cell lymphotrophic virus type I (HTLV-I) promoter. The trimeric complex is believed to recruit additional regulatory proteins to the HTLV-I long terminal repeat, but there has been no direct evidence that CBP is required for Tax-mediated transactivation. We present evidence that Tax and CBP activate transcription from the HTLV-I 21 base pair repeats on naked DNA templates. Transcriptional activation of the HTLV-I sequences required both Tax and CBP and could be mediated by either the N-terminal activation domain of CBP or the full-length protein. Fluorescence polarization binding assays indicated that CBP does not markedly enhance the affinity of Tax for the trimeric complex. Transcription analyses suggest that CBP activates Tax-dependent transcription by promoting transcriptional initiation and reinitiation. The ability of CBP to activate the HTLV-I promoter does not involve the stabilization of Tax binding, but rather depends upon gene activation properties of the co-activator that function in the context of a naked DNA template.
TRF1 and TRF2 binding to telomeres is modulated by nucleosomal organization

PubMed Central

Galati, Alessandra; Micheli, Emanuela; Alicata, Claudia; Ingegnere, Tiziano; Cicconi, Alessandro; Pusch, Miriam Caroline; Giraud-Panis, Marie-Josèphe; Gilson, Eric; Cacchione, Stefano

2015-01-01

The ends of eukaryotic chromosomes need to be protected from the activation of a DNA damage response that leads the cell to replicative senescence or apoptosis. In mammals, protection is accomplished by a six-factor complex named shelterin, which organizes the terminal TTAGGG repeats in a still ill-defined structure, the telomere. The stable interaction of shelterin with telomeres mainly depends on the binding of two of its components, TRF1 and TRF2, to double-stranded telomeric repeats. Tethering of TRF proteins to telomeres occurs in a chromatin environment characterized by a very compact nucleosomal organization. In this work we show that binding of TRF1 and TRF2 to telomeric sequences is modulated by the histone octamer. By means of in vitro models, we found that TRF2 binding is strongly hampered by the presence of telomeric nucleosomes, whereas TRF1 binds efficiently to telomeric DNA in a nucleosomal context and is able to remodel telomeric nucleosomal arrays. Our results indicate that the different behavior of TRF proteins partly depends on the interaction with histone tails of their divergent N-terminal domains. We propose that the interplay between the histone octamer and TRF proteins plays a role in the steps leading to telomere deprotection. PMID:25999344
Genomics of Three New Bacteriophages Useful in the Biocontrol of Salmonella

PubMed Central

Bardina, Carlota; Colom, Joan; Spricigo, Denis A.; Otero, Jennifer; Sánchez-Osuna, Miquel; Cortés, Pilar; Llagostera, Montserrat

2016-01-01

Non-typhoid Salmonella is the principal pathogen related to food-borne diseases throughout the world. Widespread antibiotic resistance has adversely affected human health and has encouraged the search for alternative antimicrobial agents. The advances in bacteriophage therapy highlight their use in controlling a broad spectrum of food-borne pathogens. One requirement for the use of bacteriophages as antibacterials is the characterization of their genomes. In this work, complete genome sequencing and molecular analyses were carried out for three new virulent Salmonella-specific bacteriophages (UAB_Phi20, UAB_Phi78, and UAB_Phi87) able to infect a broad range of Salmonella strains. Sequence analysis of the genomes of UAB_Phi20, UAB_Phi78, and UAB_Phi87 bacteriophages revealed no known virulence-associated or antibiotic resistance genes and no potential immunoreactive food allergens. The UAB_Phi20 genome comprised 41,809 base pairs with 80 open reading frames (ORFs), 24 of them with an assigned function. The genome sequence showed high homology between UAB_Phi20 and Salmonella bacteriophage P22 and other members of the P22likeviruses genus of the Podoviridae family, including ST64T and ST104. The DNA of UAB_Phi78 contained 44,110 bp, including direct terminal repeats (DTR) of 179 bp; 58 putative ORFs were predicted, and 20 were assigned a function. This bacteriophage was assigned to the SP6likeviruses genus of the Podoviridae family based on its high similarity not only with SP6 but also with the K1-5, K1E, and K1F bacteriophages, all of which infect Escherichia coli. The UAB_Phi87 genome sequence consisted of 87,669 bp with terminal direct repeats of 608 bp; although 148 ORFs were identified, putative functions could be assigned to only 29 of them. Sequence comparisons revealed the mosaic structure of UAB_Phi87 and its high similarity with bacteriophages Felix O1 and wV8 of E. coli with respect to genetic content and functional organization. 
Phylogenetic analysis of large terminase subunits confirms their packaging strategies and grouping to the different phage genus type. All these studies are necessary for the development and the use of an efficient cocktail with commercial applications in bacteriophage therapy against Salmonella. PMID:27148229
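The direct terminal repeats reported in the abstract above (179 bp and 608 bp) are stretches in which the start of the linear genome is repeated exactly at its end. The following is a minimal sketch of how such a repeat could be located, assuming the input sequence includes both copies of the repeat (deposited assemblies do not always do so). The function name and the toy genome are hypothetical, and this is not the method used by the authors.

```python
def direct_terminal_repeat_length(genome, min_len=10):
    """Length of the longest exact prefix that also ends the sequence.

    Only lengths up to half the genome are tried, so the two copies
    cannot overlap. Returns 0 if no repeat of at least min_len exists.
    """
    genome = genome.upper()
    for k in range(len(genome) // 2, min_len - 1, -1):
        if genome[:k] == genome[-k:]:
            return k
    return 0

# Toy genome: a 12-nt repeat at both ends around a 30-nt filler.
repeat = "ACGGTCATTGCA"
toy_genome = repeat + "T" * 30 + repeat
print(direct_terminal_repeat_length(toy_genome))
```

Exact matching is the simplest choice; real data with sequencing errors would need an alignment-based comparison of the two ends instead.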
Characterisation of a large family of polymorphic collagen-like proteins in the endospore-forming bacterium Pasteuria ramosa.

PubMed

McElroy, Kerensa; Mouton, Laurence; Du Pasquier, Louis; Qi, Weihong; Ebert, Dieter

2011-09-01

Collagen-like proteins containing glycine-X-Y repeats have been identified in several pathogenic bacteria potentially involved in virulence. Recently, a collagen-like surface protein, Pcl1a, was identified in Pasteuria ramosa, a spore-forming parasite of Daphnia. Here we characterise 37 novel putative P. ramosa collagen-like protein genes (PCLs). PCR amplification and sequencing across 10 P. ramosa strains showed they were polymorphic, distinguishing genotypes matching known differences in Daphnia/P. ramosa interaction specificity. Thirty PCLs could be divided into four groups based on sequence similarity, conserved N- and C-terminal regions and G-X-Y repeat structure. Group 1, Group 2 and Group 3 PCLs formed triplets within the genome, with one member from each group represented in each triplet. Maximum-likelihood trees suggested that these groups arose through multiple instances of triplet duplication. For Group 1, 2, 3 and 4 PCLs, X was typically proline and Y typically threonine, consistent with other bacterial collagen-like proteins. The amino acid composition of Pcl2 closely resembled Pcl1a, with X typically being glutamic acid or aspartic acid and Y typically being lysine or glutamine. Pcl2 also showed sequence similarity to Pcl1a and contained a predicted signal peptide, cleavage site and transmembrane domain, suggesting that it is a surface protein. Copyright © 2011 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
The mitochondrial genome of Hydra oligactis (Cnidaria, Hydrozoa) sheds new light on animal mtDNA evolution and cnidarian phylogeny.

PubMed

Kayal, Ehsan; Lavrov, Dennis V

2008-02-29

The 16,314-nucleotide sequence of the linear mitochondrial DNA (mtDNA) molecule of Hydra oligactis (Cnidaria, Hydrozoa)--the first from the class Hydrozoa--has been determined. This sequence contains genes for 13 energy pathway proteins, small and large subunit rRNAs, and methionine and tryptophan tRNAs, as is typical for cnidarians. All genes have the same transcriptional orientation and their arrangement in the genome is similar to that of the jellyfish Aurelia aurita. In addition, a partial copy of cox1 is present at one end of the molecule in a transcriptional orientation opposite to the rest of the genes, forming part of an inverted terminal repeat characteristic of linear mtDNA and linear mitochondrial plasmids. The sequence close to at least one end of the molecule contains several homonucleotide runs as well as small inverted repeats that are able to form strong secondary structures and may be involved in mtDNA maintenance and expression. Phylogenetic analysis of mitochondrial genes of H. oligactis and other cnidarians supports the Medusozoa hypothesis but also suggests that Anthozoa may be paraphyletic, with octocorallians more closely related to the Medusozoa than to the Hexacorallia. The latter inference implies that Anthozoa is paraphyletic and that the polyp (rather than a medusa) is the ancestral body type in Cnidaria.
Recombination Analysis of Herpes Simplex Virus 1 Reveals a Bias toward GC Content and the Inverted Repeat Regions

PubMed Central

Lee, Kyubin; Kolb, Aaron W.; Sverchkov, Yuriy; Cuellar, Jacqueline A.; Craven, Mark

2015-01-01

ABSTRACT Herpes simplex virus 1 (HSV-1) causes recurrent mucocutaneous ulcers and is the leading cause of infectious blindness and sporadic encephalitis in the United States. HSV-1 has been shown to be highly recombinogenic; however, to date, there has been no genome-wide analysis of recombination. To address this, we generated 40 HSV-1 recombinants derived from two parental strains, OD4 and CJ994. The 40 OD4-CJ994 HSV-1 recombinants were sequenced using the Illumina sequencing system, and recombination breakpoints were determined for each of the recombinants using the Bootscan program. Breakpoints occurring in the terminal inverted repeats were excluded from analysis to prevent double counting, resulting in a total of 272 breakpoints in the data set. By placing windows around the 272 breakpoints followed by Monte Carlo analysis comparing actual data to simulated data, we identified a recombination bias toward both high GC content and intergenic regions. A Monte Carlo analysis also suggested that recombination did not appear to be responsible for the generation of the spontaneous nucleotide mutations detected following sequencing. Additionally, kernel density estimation analysis across the genome found that the large, inverted repeats comprise a recombination hot spot. IMPORTANCE Herpes simplex virus 1 (HSV-1) is the leading cause of sporadic encephalitis and blinding keratitis in developed countries. HSV-1 has been shown to be highly recombinogenic, and recombination itself appears to be a significant component of genome replication. To date, there has been no genome-wide analysis of recombination. Here we present the findings of the first genome-wide study of recombination performed by generating and sequencing 40 HSV-1 recombinants derived from the OD4 and CJ994 parental strains, followed by bioinformatics analysis. Recombination breakpoints were determined, yielding 272 breakpoints in the full data set. 
Kernel density analysis determined that the large inverted repeats constitute a recombination hot spot. Additionally, Monte Carlo analyses found biases toward high GC content and intergenic and repetitive regions. PMID:25926637
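The abstract above describes placing windows around breakpoints and comparing observed data to simulated data by Monte Carlo analysis, without giving the window size or the null model. The following is a minimal sketch of one such test for GC bias, assuming a null model in which simulated breakpoints fall uniformly at random along the genome. All names, the window half-width and the number of simulations are hypothetical choices, not the authors' settings.

```python
import random

def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def mean_window_gc(genome, positions, half_width):
    """Mean GC fraction over windows centred on each position."""
    values = [gc_fraction(genome[max(0, p - half_width):p + half_width])
              for p in positions]
    return sum(values) / len(values)

def monte_carlo_gc_bias(genome, breakpoints, half_width=50, n_sim=1000, seed=0):
    """Return (observed mean GC, one-sided empirical p-value).

    Assumed null model: breakpoints placed uniformly at random.
    """
    rng = random.Random(seed)
    observed = mean_window_gc(genome, breakpoints, half_width)
    at_least_as_high = 0
    for _ in range(n_sim):
        simulated = [rng.randrange(len(genome)) for _ in breakpoints]
        if mean_window_gc(genome, simulated, half_width) >= observed:
            at_least_as_high += 1
    # Add-one correction keeps the empirical p-value above zero.
    return observed, (at_least_as_high + 1) / (n_sim + 1)

# Toy genome: AT-rich first half, GC-rich second half, breakpoints in the latter.
toy_genome = "AT" * 500 + "GC" * 500
toy_breakpoints = list(range(1100, 1900, 100))
observed_gc, p_value = monte_carlo_gc_bias(toy_genome, toy_breakpoints)
print(observed_gc, p_value)
```

A small p-value here means that randomly placed windows rarely reach the GC content seen around the observed breakpoints, which is the sense in which a "bias toward high GC content" would be inferred under these assumptions.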
A genome-wide screening of BEL-Pao like retrotransposons in Anopheles gambiae by the LTR_STRUC program.

PubMed

Marsano, Renè Massimiliano; Caizzi, Ruggiero

2005-09-12

The advanced state of assembly of the genomic sequence of the nematoceran Anopheles gambiae allowed us to perform a genome-wide analysis looking for Long Terminal Repeats (LTRs) in the range of 10 kb by means of the LTR_STRUC tool. More than three hundred sequences were retrieved and 210 were treated as putative complete retrotransposons that were individually analysed with respect to known retrotransposons of A. gambiae and D. melanogaster. The results show that the vast majority of the retrotransposons analysed belong to the Ty3/gypsy class and only 8% to the Ty1/copia class. In addition, phylogenetic analysis allowed us to characterize in more detail the relationship of a large BEL-Pao lineage in which a single family was shown to harbour an additional env gene.

An Epstein-Barr virus immediate-early gene product trans-activates gene expression from the human immunodeficiency virus long terminal repeat.

PubMed

Kenney, S; Kamine, J; Markovitz, D; Fenrick, R; Pagano, J

1988-03-01

Acquired immunodeficiency syndrome patients are frequently coinfected with Epstein-Barr virus (EBV). In this report, we demonstrate that an EBV immediate-early gene product, BamHI MLF1, stimulates expression of the bacterial chloramphenicol acetyltransferase (CAT) gene linked to the human immunodeficiency virus (HIV) promoter. The HIV promoter sequences necessary for trans-activation by EBV do not include the tat-responsive sequences. In addition, in contrast to the other herpesvirus trans-activators previously studied, the EBV BamHI MLF1 gene product appears to function in part by a posttranscriptional mechanism, since it increases pHIV-CAT protein activity more than it increases HIV-CAT mRNA. This ability of an EBV gene product to activate HIV gene expression may have biologic consequences in persons coinfected with both viruses.
Turnover of R1 (Type I) and R2 (Type II) Retrotransposable Elements in the Ribosomal DNA of Drosophila melanogaster

PubMed Central

Jakubczak, J. L.; Zenni, M. K.; Woodruff, R. C.; Eickbush, T. H.

1992-01-01

R1 and R2 are distantly related non-long terminal repeat retrotransposable elements each of which inserts into a specific site in the 28S rRNA genes of most insects. We have analyzed aspects of R1 and R2 abundance and sequence variation in 27 geographical isolates of Drosophila melanogaster. The fraction of 28S rRNA genes containing these elements varied greatly between strains, 17-67% for R1 elements and 2-28% for R2 elements. The total percentage of the rDNA repeats inserted ranged from 32 to 77%. The fraction of the rDNA repeats that contained both of these elements suggested that R1 and R2 exhibit neither an inhibition of nor preference for insertion into a 28S gene already containing the other type of element. Based on the conservation of restriction sites in the elements of all strains, and sequence analysis of individual elements from three strains, nucleotide divergence is very low for R1 and R2 elements within or between strains (<0.6%). This sequence uniformity is the expected result of the forces of concerted evolution (unequal crossovers and gene conversion) which act on the rRNA genes themselves. Evidence for the role of retrotransposition in the turnover of R1 and R2 was obtained by using naturally occurring 5' length polymorphisms of the elements as markers for independent transposition events. The pattern of these different length 5' truncations of R1 and R2 was found to be diverse and unique to most strains analyzed. Because recombination can only, with time, amplify or eliminate those length variants already present, the diversity found in each strain suggests that retrotransposition has played a critical role in maintaining these elements in the rDNA repeats of D. melanogaster. PMID:1317313
Comparison of simple sequence repeats in 19 Archaea.

PubMed

Trivedi, S

2006-12-05

All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.
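The search for di- to penta-nucleotide repeats described in the abstract above can be illustrated with a short scan for perfect tandem repeats. The following is a minimal sketch and a simplified stand-in, not the SPUTNIK algorithm. The function name, the minimum of three copies and the toy sequence are hypothetical, and imperfect repeats are not detected.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=5, min_repeats=3):
    """Return sorted (start, motif, copies) for perfect tandem repeats.

    Motifs of min_unit..max_unit nucleotides are scanned; a motif that is
    itself a repeat of a shorter unit (e.g. 'AA' or 'ACAC') is skipped.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if any(k % d == 0 and motif == motif[:d] * (k // d)
                   for d in range(1, k)):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)

# Toy sequence: (AC)4 followed by a trinucleotide repeat.
toy = "TTG" + "AC" * 4 + "GG" + "ATG" * 3 + "CC"
print(find_ssrs(toy))
```

Because the scan reports the leftmost match, the trinucleotide repeat in the toy sequence is reported with the motif GAT rather than ATG; the two are rotations of the same repeat. SSR density for coding and non-coding regions could then be obtained by dividing the number of hits in each region by its length.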
The mitochondrial genome of the pathogenic yeast Candida subhashii: GC-rich linear DNA with a protein covalently attached to the 5′ termini

PubMed Central

Fricova, Dominika; Valach, Matus; Farkas, Zoltan; Pfeiffer, Ilona; Kucsera, Judit; Tomaska, Lubomir; Nosek, Jozef

2010-01-01

As a part of our initiative aimed at a large-scale comparative analysis of fungal mitochondrial genomes, we determined the complete DNA sequence of the mitochondrial genome of the yeast Candida subhashii and found that it exhibits a number of peculiar features. First, the mitochondrial genome is represented by linear dsDNA molecules of uniform length (29 795 bp), with an unusually high content of guanine and cytosine residues (52.7 %). Second, the coding sequences lack introns; thus, the genome has a relatively compact organization. Third, the termini of the linear molecules consist of long inverted repeats and seem to contain a protein covalently bound to terminal nucleotides at the 5′ ends. This architecture resembles the telomeres in a number of linear viral and plasmid DNA genomes classified as invertrons, in which the terminal proteins serve as specific primers for the initiation of DNA synthesis. Finally, although the mitochondrial genome of C. subhashii contains essentially the same set of genes as other closely related pathogenic Candida species, we identified additional ORFs encoding two homologues of the family B protein-priming DNA polymerases and an unknown protein. The terminal structures and the genes for DNA polymerases are reminiscent of linear mitochondrial plasmids, indicating that this genome architecture might have emerged from fortuitous recombination between an ancestral, presumably circular, mitochondrial genome and an invertron-like element. PMID:20395267
Characterization of a linear DNA plasmid from the filamentous fungal plant pathogen Glomerella musae [Anamorph: Colletotrichum musae (Berk. and Curt.) Arx].

USGS Publications Warehouse

Freeman, S.; Redman, R.S.; Grantham, G.; Rodriguez, R.J.

1997-01-01

A 7.4-kilobase (kb) DNA plasmid was isolated from Glomerella musae isolate 927 and designated pGML1. Exonuclease treatments indicated that pGML1 was a linear plasmid with blocked 5' termini. Cell-fractionation experiments combined with sequence-specific PCR amplification revealed that pGML1 resided in mitochondria. The pGML1 plasmid hybridized to cesium chloride-fractionated nuclear DNA but not to A + T-rich mitochondrial DNA. An internal 7.0-kb section of pGML1 was cloned and did not hybridize with either nuclear or mitochondrial DNA from G. musae. Sequence analysis revealed identical terminal inverted repeats (TIR) of 520 bp at the ends of the cloned 7.0-kb section of pGML1. The occurrence of pGML1 did not correspond with the pathogenicity of G. musae on banana fruit. Four additional isolates of G. musae possessed extrachromosomal DNA fragments similar in size and sequence to pGML1.
The complete genome sequence and proteomics of Yersinia pestis phage Yep-phi.

PubMed

Zhao, Xiangna; Wu, Weili; Qi, Zhizhen; Cui, Yujun; Yan, Yanfeng; Guo, Zhaobiao; Wang, Zuyun; Wang, Hu; Deng, Haijun; Xue, Yan; Chen, Weijun; Wang, Xiaoyi; Yang, Ruifu

2011-01-01

Yep-phi, a lytic phage of Yersinia pestis, was isolated in China and is routinely used as a diagnostic phage for the identification of the plague pathogen. Yep-phi has an isometric hexagonal head containing dsDNA and a short non-contractile conical tail. In this study, we sequenced the Yep-phi genome (GenBank accession no. HQ333270) and performed proteomics analysis. The genome consists of 38,616 bp of DNA, including direct terminal repeats of 222 bp, and is predicted to contain 45 ORFs. Most structural proteins were identified by proteomics analysis. Compared with the three available genome sequences of lytic phages for Y. pestis, the phages could be divided into two subgroups. Yep-phi displays marked homology to the bacteriophages Berlin (GenBank accession no. AM183667) and Yepe2 (GenBank accession no. EU734170), and these comprise one subgroup. The other subgroup is represented by bacteriophage ΦA1122 (GenBank accession no. AY247822). Potential recombination was detected among the Yep-phi subgroup.
The active site of O-GlcNAc transferase imposes constraints on substrate sequence

PubMed Central

Rafie, Karim; Blair, David E.; Borodkin, Vladimir S.; Albarbarawi, Osama; van Aalten, Daan M. F.

2016-01-01

O-GlcNAc transferase (OGT) glycosylates a diverse range of intracellular proteins with O-linked N-acetylglucosamine (O-GlcNAc), an essential and dynamic post-translational modification in metazoa. Although this enzyme modifies hundreds of proteins with O-GlcNAc, it is not understood how OGT achieves substrate specificity. In this study, we describe the application of a high-throughput OGT assay on a library of peptides. The sites of O-GlcNAc modification were mapped by ETD-mass spectrometry, and found to correlate with previously detected O-GlcNAc sites. Crystal structures of four acceptor peptides in complex with human OGT suggest that a combination of size and conformational restriction defines sequence specificity in the −3 to +2 subsites. This work reveals that while the N-terminal TPR repeats of hOGT may play a role in substrate recognition, the sequence restriction imposed by the peptide-binding site makes a significant contribution to O-GlcNAc site specificity. PMID:26237509
Sequencing and generation of an infectious clone of the pathogenic goose parvovirus strain LH.

PubMed

Wang, Jianye; Duan, Jinkun; Zhu, Liqian; Jiang, Zhiwei; Zhu, Guoqiang

2015-03-01

In this study, the complete genome of the virulent strain LH of goose parvovirus (GPV) was sequenced and cloned into the pBluescript II (SK) plasmid vector. Sequence alignments of the inverted terminal repeats (ITR) of GPV strains revealed a common 14-nt-pair deletion in the stem of the palindromic structure in the LH strain and three other strains isolated after 1982 when compared to three GPV strains isolated earlier than that time. Transfection of 11-day-old embryonated goose eggs with the plasmid pLH, which contains the entire genome of strain LH, resulted in successful rescue of the infectious virus. Death of embryos after transfection via the chorioallantoic membrane infiltration route occurred earlier than when transfection was done via the allantoic cavity inoculation route. The rescued virus exhibited virulence similar to that of its parental virus, as evaluated by the mortality rate in goslings. Generation of the pathogenic infectious clone provides us with a powerful tool to elucidate the molecular pathogenesis of GPV in the future.
[Influence of antisense RNA and sequences of viral transactivators traps on RNA synthesis of HTLV-1 virus].

PubMed

Borisenko, A S; Kotus, E V; Kaloshin, A A

2008-01-01

A significant number of scientific publications devoted to the inhibition of viral replication by antisense RNA (asRNA) genes show that this approach is useful for gene therapy of viral infections. To investigate the possibility of suppressing HTLV-1 reproduction with asRNA, we constructed recombinant plasmids containing asRNA genes against the U3 region of the long terminal repeat and against the X gene, either under the control of the myeloproliferative sarcoma virus (MPSV) promoter or without such a promoter. Using stable calcium-phosphate transfection with subsequent selection in the presence of G-418, we obtained RaHOS line-based cell clones carrying both asRNA genes and sequences able to bind HTLV-1 transactivator proteins (i.e. "traps" of viral transactivators, TVT). Dot-hybridization analysis of viral RNA extracted from the RaHOS cell clones showed that TVT sequences suppress viral RNA synthesis by 90% and that asRNA against the X gene suppresses it by 50%.
A retrotransposable element from the mosquito Anopheles gambiae.

PubMed Central

Besansky, N J

1990-01-01

A family of middle repetitive elements from the African malaria vector Anopheles gambiae is described. Approximately 100 copies of the element, designated T1Ag, are dispersed in the genome. Full-length elements are 4.6 kilobase pairs in length, but truncation of the 5' end is common. Nucleotide sequences of one full-length, two 5'-truncated, and two 5' ends of T1Ag elements were determined and aligned to define a consensus sequence. Sequence analysis revealed two long, overlapping open reading frames followed by a polyadenylation signal, AATAAA, and a tail consisting of tandem repetitions of the motif TGAAA. No direct or inverted long terminal repeats (LTRs) were detected. The first open reading frame, 442 amino acids in length, includes a domain resembling that of nucleic acid-binding proteins. The second open reading frame, 975 amino acids long, resembles the reverse transcriptases of a category of retrotransposable elements without LTRs, variously termed class II retrotransposons, class III elements or non-LTR retrotransposons. Similarity at the sequence and structural levels places T1Ag in this category. PMID:1689457
Genome Sequence of the Bacterium Streptomyces davawensis JCM 4913 and Heterologous Production of the Unique Antibiotic Roseoflavin

PubMed Central

Jankowitsch, Frank; Schwarz, Julia; Rückert, Christian; Gust, Bertolt; Szczepanowski, Rafael; Blom, Jochen; Pelzer, Stefan; Kalinowski, Jörn

2012-01-01

Streptomyces davawensis JCM 4913 synthesizes the antibiotic roseoflavin, a structural riboflavin (vitamin B2) analog. Here, we report the 9,466,619-bp linear chromosome of S. davawensis JCM 4913 and an 89,331-bp linear plasmid. The sequence has an average G+C content of 70.58% and contains six rRNA operons (16S-23S-5S) and 69 tRNA genes. The 8,616 predicted protein-coding sequences include 32 clusters coding for secondary metabolites, several of which are unique to S. davawensis. The chromosome contains long terminal inverted repeats of 33,255 bp each and atypical telomeres. Sequence analysis with regard to riboflavin biosynthesis revealed three different patterns of gene organization in Streptomyces species. Heterologous expression of a set of genes present on a subgenomic fragment of S. davawensis resulted in the production of roseoflavin by the host Streptomyces coelicolor M1152. Phylogenetic analysis revealed that S. davawensis is a close relative of Streptomyces cinnabarinus, and much to our surprise, we found that the latter bacterium is a roseoflavin producer as well. PMID:23043000
Active site of tripeptidyl peptidase II from human erythrocytes is of the subtilisin type.

PubMed Central

Tomkinson, B; Wernstedt, C; Hellman, U; Zetterqvist, O

1987-01-01

The present report presents evidence that the amino acid sequence around the serine of the active site of human tripeptidyl peptidase II is of the subtilisin type. The enzyme from human erythrocytes was covalently labeled at its active site with [3H]diisopropyl fluorophosphate, and the protein was subsequently reduced, alkylated, and digested with trypsin. The labeled tryptic peptides were purified by gel filtration and repeated reversed-phase HPLC, and their amino-terminal sequences were determined. Residue 9 contained the radioactive label and was, therefore, considered to be the active serine residue. The primary structure of the part of the active site (residues 1-10) containing this residue was concluded to be Xaa-Thr-Gln-Leu-Met-Asx-Gly-Thr-Ser-Met. This amino acid sequence is homologous to the sequence surrounding the active serine of the microbial peptidases subtilisin and thermitase. These data demonstrate that human tripeptidyl peptidase II represents a potentially distinct class of human peptidases and raise the question of an evolutionary relationship between the active site of a mammalian peptidase and that of the subtilisin family of serine peptidases. PMID:3313395
Recombination, rearrangement, reshuffling, and divergence in a centromeric region of rice.

PubMed

Ma, Jianxin; Bennetzen, Jeffrey L

2006-01-10

Centromeres have many unusual biological properties, including kinetochore attachment and severe repression of local meiotic recombination. These properties are partly an outcome, partly a cause, of unusual DNA structure in the centromeric region. Although several plant and animal genomes have been sequenced, most centromere sequences have not been completed or analyzed in depth. To shed light on the unique organization, variability, and evolution of centromeric DNA, detailed analysis of a 1.97-Mb sequence that includes centromere 8 (CEN8) of japonica rice was undertaken. Thirty-three long-terminal repeat (LTR)-retrotransposon families (including 11 previously unknown) were identified in the CEN8 region, totaling 245 elements and fragments that account for 67% of the region. The ratio of solo LTRs to intact elements in the CEN8 region is approximately 0.9:1, compared with approximately 2.2:1 in noncentromeric regions of rice. However, the ratio of solo LTRs to intact elements in the core of the CEN8 region (approximately 2.5:1) is higher than in any other region investigated in rice, suggesting a hotspot for unequal recombination. Comparison of the CEN8 region of japonica and its orthologous segments from indica rice indicated that approximately 15% of the intact retrotransposons and solo LTRs were inserted into CEN8 after the divergence of japonica and indica from a common ancestor, compared with approximately 50% for previously studied euchromatic regions. Frequent DNA rearrangements were observed in the CEN8 region, including a 212-kb subregion that was found to be composed of three rearranged tandem repeats. Phylogenetic analysis also revealed recent segmental duplication and extensive rearrangement and reshuffling of the CentO satellite repeats.
The primary structures of two yeast enolase genes. Homology between the 5' noncoding flanking regions of yeast enolase and glyceraldehyde-3-phosphate dehydrogenase genes.

PubMed

Holland, M J; Holland, J P; Thill, G P; Jackson, K A

1981-02-10

Segments of yeast genomic DNA containing two enolase structural genes have been isolated by subculture cloning procedures using a cDNA hybridization probe synthesized from purified yeast enolase mRNA. Based on restriction endonuclease and transcriptional maps of these two segments of yeast DNA, each hybrid plasmid contains a region of extensive nucleotide sequence homology which forms hybrids with the cDNA probe. The DNA sequences which flank this homologous region in the two hybrid plasmids are nonhomologous, indicating that these sequences are nontandemly repeated in the yeast genome. The complete nucleotide sequence of the coding as well as the flanking noncoding regions of these genes has been determined. The amino acid sequence predicted from one reading frame of both structural genes is extremely similar to that determined for yeast enolase (Chin, C. C. Q., Brewer, J. M., Eckard, E., and Wold, F. (1981) J. Biol. Chem. 256, 1370-1376), confirming that these isolated structural genes encode yeast enolase. The nucleotide sequences of the coding regions of the genes are approximately 95% homologous, and neither gene contains an intervening sequence. Codon utilization in the enolase genes follows the same biased pattern previously described for two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes (Holland, J. P., and Holland, M. J. (1980) J. Biol. Chem. 255, 2596-2605). DNA blotting analysis confirmed that the isolated segments of yeast DNA are colinear with yeast genomic DNA and that there are two nontandemly repeated enolase genes per haploid yeast genome. The noncoding portions of the two enolase genes adjacent to the initiation and termination codons are approximately 70% homologous and contain sequences thought to be involved in the synthesis and processing of messenger RNA. Finally, there are regions of extensive homology between the two enolase structural genes and two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes within the 5' noncoding portions of these glycolytic genes.
[Mutation Analysis of 19 STR Loci in 20 723 Cases of Paternity Testing].

PubMed

Bi, J; Chang, J J; Li, M X; Yu, C Y

2017-06-01

To observe and analyze the confirmed cases of paternity testing, and to explore the mutation rules of STR loci. The mutant STR loci were screened from 20 723 confirmed cases of paternity testing by the Goldeneye 20A system. The mutation rates, and the sources, fragment length, steps and increased or decreased repeat sequences of mutant alleles were counted for the analysis of the characteristics of mutation-related factors. A total of 548 mutations were found on 19 STR loci, and 557 mutation events were observed. The loci mutation rate was 0.07‰-2.23‰. The ratio of paternal to maternal mutant events was 3.06:1. One-step mutation was the main mutation type, and the number of increased repeat sequences was almost the same as the number of decreased repeat sequences. The repeat sequences were more likely to decrease in mutations of two steps and above. Mutation mainly occurred in medium alleles, where the number of increased repeat sequences was almost the same as the number of decreased repeat sequences. In long allele mutations, decreased repeat sequences were significantly more frequent than increased repeat sequences. The number of increased repeat sequences was almost the same as the number of decreased repeat sequences in paternal mutations, while decreased repeat sequences were more frequent than increased ones in maternal mutations. There are significant differences in the mutation rate of each locus. When one or two loci do not conform to the genetic law, additional detection systems should be added, and the PI value should be calculated together with the information on the mutant STR loci in order to further clarify the identification opinions. Copyright© by the Editorial Department of Journal of Forensic Medicine
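
The step and direction counts described in the record above can be sketched as follows. This is only an illustration under stated assumptions: alleles are given as whole repeat numbers, the parental source allele is already known, and the function names and example values are hypothetical rather than taken from the study.

```python
from collections import Counter
from typing import Iterable, Tuple


def classify_str_mutation(parent_repeats: int, child_repeats: int) -> Tuple[int, str]:
    """Return (number of steps, 'gain' or 'loss') for one mutation event.

    A step is a change of one repeat unit between the parental source
    allele and the mutant allele observed in the child.
    """
    diff = child_repeats - parent_repeats
    if diff == 0:
        raise ValueError("alleles are identical; no mutation to classify")
    return abs(diff), ("gain" if diff > 0 else "loss")


def summarize(events: Iterable[Tuple[int, int]]) -> Counter:
    """Count events by (steps, direction) over (parent, child) allele pairs."""
    return Counter(classify_str_mutation(p, c) for p, c in events)


# Hypothetical events, not data from the cited paper.
example_events = [(15, 16), (15, 14), (20, 18)]
summary = summarize(example_events)
```

In casework the source allele is often ambiguous, so a real analysis would need a rule for assigning it (for example, the smallest possible step), which this sketch leaves out.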
Predicting repeat protein folding kinetics from an experimentally determined folding energy landscape

PubMed Central

Street, Timothy O; Barrick, Doug

2009-01-01

The Notch ankyrin domain is a repeat protein whose folding has been characterized through equilibrium and kinetic measurements. In previous work, equilibrium folding free energies of truncated constructs were used to generate an experimentally determined folding energy landscape (Mello and Barrick, Proc Natl Acad Sci USA 2004;101:14102–14107). Here, this folding energy landscape is used to parameterize a kinetic model in which local transition probabilities between partly folded states are based on energy values from the landscape. The landscape-based model correctly predicts highly diverse experimentally determined folding kinetics of the Notch ankyrin domain and sequence variants. These predictions include monophasic folding and biphasic unfolding, curvature in the unfolding limb of the chevron plot, population of a transient unfolding intermediate, relative folding rates of 19 variants spanning three orders of magnitude, and a change in the folding pathway that results from C-terminal stabilization. These findings indicate that the folding pathway(s) of the Notch ankyrin domain are thermodynamically selected: the primary determinants of kinetic behavior can be simply deduced from the local stability of individual repeats. PMID:19177351
Sequences required for transcription termination at the intrinsic lambdatI terminator.

PubMed

Martínez-Trujillo, Miguel; Sánchez-Trujillo, Alejandra; Ceja, Víctor; Avila-Moreno, Federico; Bermúdez-Cruz, Rosa María; Court, Donald; Montañez, Cecilia

2010-02-01

The lambdatI terminator is located approximately 280 bp beyond the lambdaint gene, and it has a typical structure of an intrinsic terminator. To identify sequences required for lambdatI transcription termination a set of deletion mutants were generated, either from the 5' or the 3' end onto the lambdatI region. The termination efficiency was determined by measuring galactokinase (galK) levels by Northern blot assays and by in vitro transcription termination. The importance of the uridines and the stability of the stem structure in the termination were demonstrated. The nontranscribed DNA beyond the 3' end also affects termination. Additionally, sequences upstream have a small effect on transcription termination. The in vivo RNA termination sites at lambdatI were determined by S1 mapping and were located at 8 different positions. Processing of transcripts from the 3' end confirmed the importance of the hairpin stem in protection against exonuclease.
Cell type-specific termination of transcription by transposable element sequences.

PubMed

Conley, Andrew B; Jordan, I King

2012-09-30

Transposable elements (TEs) encode sequences necessary for their own transposition, including signals required for the termination of transcription. TE sequences within the introns of human genes show an antisense orientation bias, which has been proposed to reflect selection against TE sequences in the sense orientation owing to their ability to terminate the transcription of host gene transcripts. While there is evidence in support of this model for some elements, the extent to which TE sequences actually terminate transcription of human genes across the genome remains an open question. Using high-throughput sequencing data, we have characterized over 9,000 distinct TE-derived sequences that provide transcription termination sites for 5,747 human genes across eight different cell types. Rarefaction curve analysis suggests that there may be twice as many TE-derived termination sites (TE-TTS) genome-wide among all human cell types. The local chromatin environment for these TE-TTS is similar to that seen for 3' UTR canonical TTS and distinct from the chromatin environment of other intragenic TE sequences. However, those TE-TTS located within the introns of human genes were found to be far more cell type-specific than the canonical TTS. TE-TTS were much more likely to be found in the sense orientation than other intragenic TE sequences of the same TE family and TE-TTS in the sense orientation terminate transcription more efficiently than those found in the antisense orientation. Alu sequences were found to provide a large number of relatively weak TTS, whereas LTR elements provided a smaller number of much stronger TTS. TE sequences provide numerous termination sites to human genes, and TE-derived TTS are particularly cell type-specific. Thus, TE sequences provide a powerful mechanism for the diversification of transcriptional profiles between cell types and among evolutionary lineages, since most TE-TTS are evolutionarily young. The extent of transcription termination by TEs seen here, along with the preference for sense-oriented TE insertions to provide TTS, is consistent with the observed antisense orientation bias of human TEs.
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats

PubMed Central

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-01-01

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. PMID:26481363
Sequence characterization of microsatellite DNA sequences in Pacific abalone (Haliotis discus hannai)

NASA Astrophysics Data System (ADS)

Li, Qi; Akihiro, Kijima

2007-01-01

A microsatellite-enriched library was constructed using a magnetic bead hybridization selection method, and the microsatellite DNA sequences were analyzed in Pacific abalone Haliotis discus hannai. Three hundred and fifty white colonies were screened using a PCR-based technique, and 84 clones were identified as potentially containing a microsatellite repeat motif. The 84 clones were sequenced, and 42 microsatellites and 4 minisatellites with a minimum of five repeats were found (13.1% of white colonies screened). Besides the CA motif contained in the oligoprobe, we also found 16 other types of microsatellite repeats, including a dinucleotide repeat, two tetranucleotide repeats, twelve pentanucleotide repeats and a hexanucleotide repeat. According to Weber (1990), the microsatellite sequences obtained could be categorized structurally into perfect repeats (73.3%), imperfect repeats (13.3%), and compound repeats (13.4%). Among the microsatellite repeats, relatively short arrays (<20 repeats) were most abundant, accounting for 75.0%. The greatest microsatellite length was 48 repeats, and the average number of repeats was 13.4. The data on the composition and length distribution of microsatellites obtained in the present study can be useful for choosing the repeat motifs for microsatellite isolation in other abalone species.
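
The screening criterion mentioned in the record above (a minimum of five repeats) can be illustrated with a short regular-expression sketch. It finds only perfect repeats of 2 to 6 bp motifs. The function name, the motif length range, and the toy sequence are assumptions made for illustration and do not describe the software used in the study.

```python
import re
from typing import List, Tuple


def find_microsatellites(seq: str, min_repeats: int = 5,
                         min_motif: int = 2, max_motif: int = 6
                         ) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for each perfect microsatellite.

    The lazy group tries the shortest motif first, and the backreference
    then requires at least (min_repeats - 1) further identical copies.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Toy sequence (illustrative only): seven CA repeats and five ACG repeats.
toy = "TTG" + "CA" * 7 + "GGT" + "ACG" * 5 + "T"
hits = find_microsatellites(toy)
```

Imperfect and compound repeats, which the abstract also reports, would require an approach that tolerates mismatches and are not detected by this pattern.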

Molecular analysis of the glucocerebrosidase gene locus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Winfield, S.L.; Martin, B.M.; Fandino, A.

1994-09-01

Gaucher disease is due to a deficiency in the activity of the lysosomal enzyme glucocerebrosidase. Both the functional gene for this enzyme and a pseudogene are located in close proximity on chromosome 1q21. Analysis of the mutations present in patient samples has suggested interaction between the functional gene and the pseudogene in the origin of mutant genotypes. To investigate the involvement of regions flanking the functional gene and pseudogene in the origin of mutations found in Gaucher disease, a YAC clone containing DNA from this locus has been subcloned and characterized. The original YAC containing approximately 360 kb was truncated with the use of fragmentation plasmids to about 85 kb. A lambda library derived from this YAC was screened to obtain clones containing glucocerebrosidase sequences. PCR amplification was used to identify subclones containing 5', central, or 3' sequences of the functional gene or of the pseudogene. Clones spanning the entire distance from the last exon of the functional gene to intron 1 of the pseudogene, the 5' end of the functional gene and 16 kb of 5' flanking region and approximately 15 kb of 3' flanking region of the pseudogene were sequenced. Sequence data from 48 kb of intergenic and flanking regions of the glucocerebrosidase gene and its pseudogene has been generated. A large number of Alu sequences and several simple repeats have been found. Two of these repeats exhibit fragment length polymorphism. There is almost 100% homology between the 3' flanking regions of the functional gene and the pseudogene, extending to about 4 kb past the termination codons. A much lower degree of homology is observed in the 5' flanking region. Patient samples are currently being screened for polymorphisms in these flanking regions.
Comprehensive definition of genome features in Spirodela polyrhiza by high-depth physical mapping and short-read DNA sequencing strategies.

PubMed

Michael, Todd P; Bryant, Douglas; Gutierrez, Ryan; Borisjuk, Nikolai; Chu, Philomena; Zhang, Hanzhong; Xia, Jing; Zhou, Junfei; Peng, Hai; El Baidouri, Moaine; Ten Hallers, Boudewijn; Hastie, Alex R; Liang, Tiffany; Acosta, Kenneth; Gilbert, Sarah; McEntee, Connor; Jackson, Scott A; Mockler, Todd C; Zhang, Weixiong; Lam, Eric

2017-02-01

Spirodela polyrhiza is a fast-growing aquatic monocot with highly reduced morphology, genome size and number of protein-coding genes. Considering these biological features of Spirodela and its basal position in the monocot lineage, understanding its genome architecture could shed light on plant adaptation and genome evolution. Like many draft genomes, however, the 158-Mb Spirodela genome sequence has not been resolved to chromosomes, and important genome characteristics have not been defined. Here we deployed rapid genome-wide physical maps combined with high-coverage short-read sequencing to resolve the 20 chromosomes of Spirodela and to empirically delineate its genome features. Our data revealed a dramatic reduction in the number of the rDNA repeat units in Spirodela to fewer than 100, which is even fewer than that reported for yeast. Consistent with its unique phylogenetic position, small RNA sequencing revealed 29 Spirodela-specific microRNA, with only two being shared with Elaeis guineensis (oil palm) and Musa balbisiana (banana). Combining DNA methylation data and small RNA sequencing enabled the accurate prediction of 20.5% long terminal repeats (LTRs) that doubled the previous estimate, and revealed a high Solo:Intact LTR ratio of 8.2. Interestingly, we found that Spirodela has the lowest global DNA methylation levels (9%) of any plant species tested. Taken together our results reveal a genome that has undergone reduction, likely through eliminating non-essential protein coding genes, rDNA and LTRs. In addition to delineating the genome features of this unique plant, the methodologies described and large-scale genome resources from this work will enable future evolutionary and functional studies of this basal monocot family. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.
Molecular coevolution of mammalian ribosomal gene terminator sequences and the transcription termination factor TTF-I.

PubMed Central

Evers, R; Grummt, I

1995-01-01

Both the DNA elements and the nuclear factors that direct termination of ribosomal gene transcription exhibit species-specific differences. Even between mammals--e.g., human and mouse--the termination signals are not identical and the respective transcription termination factors (TTFs) which bind to the terminator sequence are not fully interchangeable. To elucidate the molecular basis for this species-specificity, we have cloned TTF-I from human and mouse cells and compared their structural and functional properties. Recombinant TTF-I exhibits species-specific DNA binding and terminates transcription both in cell-free transcription assays and in transfection experiments. Chimeric constructs of mouse TTF-I and human TTF-I reveal that the major determinant for species-specific DNA binding resides within the C terminus of TTF-I. Replacing 31 C-terminal amino acids of mouse TTF-I with the homologous human sequences relaxes the DNA-binding specificity and, as a consequence, allows the chimeric factor to bind the human terminator sequence and to specifically stop rDNA transcription. PMID:7597036
Orpinomyces cellulase CelE protein and coding sequences

DOEpatents

Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

2000-08-29

A cDNA designated celE cloned from Orpinomyces PC-2 encodes a polypeptide (CelE) of 477 amino acids. CelE is highly homologous to CelB of Orpinomyces (72.3% identity) and Neocallimastix (67.9% identity), and like them, it has a non-catalytic repeated peptide domain (NCRPD) at the C-terminal end. The catalytic domain of CelE is homologous to glycosyl hydrolases of Family 5, found in several anaerobic bacteria. The celE gene is devoid of introns. The recombinant proteins CelE and CelB of Orpinomyces PC-2 randomly hydrolyze carboxymethylcellulose and cello-oligosaccharides in the pattern of endoglucanases.
Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution.

PubMed

Melters, Daniël P; Bradnam, Keith R; Young, Hugh A; Telis, Natalie; May, Michael R; Ruby, J Graham; Sebra, Robert; Peluso, Paul; Eid, John; Rank, David; Garcia, José Fernando; DeRisi, Joseph L; Smith, Timothy; Tobias, Christian; Ross-Ibarra, Jeffrey; Korf, Ian; Chan, Simon W L

2013-01-30

Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes.
Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution

PubMed Central

2013-01-01

Background: Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Results: Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. Conclusions: While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes. PMID:23363705
Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm

PubMed Central

Glunčić, Matko; Paar, Vladimir

2013-01-01

The main feature of the global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. The efficacy is due to the use of the complete set of a K-string ensemble, which enables a new method of direct mapping of a symbolic DNA sequence into the frequency domain, with straightforward identification of repeats as peaks in the GRM diagram. In this way, we obtain a very fast, efficient and highly automated repeat-finding tool. The method is robust to substitutions and insertions/deletions, as well as to various complexities of the sequence pattern. We present several case studies of GRM use in order to illustrate its capabilities: identification of α-satellite tandem repeats and higher order repeats (HORs), identification of Alu dispersed repeats and of Alu tandems, identification of the Period 3 pattern in exons, implementation of the ‘magnifying glass’ effect, identification of a complex HOR pattern, identification of inter-tandem transitional dispersed repeat sequences and identification of long segmental duplications. The GRM algorithm is particularly convenient in cases of large repeat units, of highly mutated and/or complex repeats, and of global repeat maps for large genomic sequences (chromosomes and genomes). PMID:22977183
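As a rough illustration of how repeat periods can surface from K-string statistics, the sketch below histograms the distances between consecutive occurrences of each k-mer; a tandem repeat of period p then appears as a peak at distance p. This is a simplified stand-in and not the GRM algorithm itself. The function name `kmer_spacing_histogram`, the default `k`, and the toy sequence are assumptions made for the example.

```python
from collections import Counter

def kmer_spacing_histogram(seq, k=8):
    """Count distances between consecutive occurrences of each k-mer.

    Hypothetical helper for illustration only; not part of the GRM software.
    """
    last_seen = {}       # k-mer -> index of its most recent occurrence
    spacing = Counter()  # distance -> number of times that distance was seen
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            spacing[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    return spacing

# Made-up input: a 12-bp monomer repeated 20 times between short flanks.
monomer = "ACGTTGCAAGCT"
toy = "TTTTT" + monomer * 20 + "GGGGG"

hist = kmer_spacing_histogram(toy, k=8)
period, count = hist.most_common(1)[0]
print(period, count)  # the dominant spacing corresponds to the monomer length
```

Exact k-mer matching is used here for brevity, so unlike the method described in the abstract this sketch is not robust to substitutions or indels.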
Identification and characterization of tandem repeats in exon III of dopamine receptor D4 (DRD4) genes from different mammalian species.

PubMed

Larsen, Svend Arild; Mogensen, Line; Dietz, Rune; Baagøe, Hans Jørgen; Andersen, Mogens; Werge, Thomas; Rasmussen, Henrik Berg

2005-12-01

In this study we have identified and characterized dopamine receptor D4 (DRD4) exon III tandem repeats in 33 publicly available nucleotide sequences from different mammalian species. We found that the tandem repeat in canids could be described in a novel and simple way, namely, as a structure composed of 15- and 12-bp modules. Tandem repeats composed of 18-bp modules were found in sequences from the horse, zebra, onager, and donkey, Asiatic bear, polar bear, common raccoon, dolphin, harbor porpoise, and domestic cat. Several of these sequences have been analyzed previously without a tandem repeat being found. In the domestic cow and gray seal we identified tandem repeats composed of 36-bp modules, each consisting of two closely related 18-bp basic units. A tandem repeat consisting of 9-bp modules was identified in sequences from mink and ferret. In the European otter we detected an 18-bp tandem repeat, while a tandem repeat consisting of 27-bp modules was identified in a sequence from European badger. Both these tandem repeats were composed of 9-bp basic units, which were closely related to the 9-bp repeat modules identified in the mink and ferret. Tandem repeats could not be identified in sequences from rodents. All tandem repeats possessed a high GC content with a strong bias for C. On phylogenetic analysis of the tandem repeats, evolutionarily related species were clustered into the same groups. The degree of conservation of the tandem repeats varied significantly between species. The deduced amino acid sequences of most of the tandem repeats exhibited a high propensity for disorder. This was also the case with an amino acid sequence of the human DRD4 exon III tandem repeat, which was included in the study for comparative purposes. We identified proline-containing motifs for SH3 and WW domain binding proteins, potential phosphorylation sites, PDZ domain binding motifs, and FHA domain binding motifs in the amino acid sequences of the tandem repeats. The numbers of potential functional sites varied markedly between species. Our observations provide a platform for future studies of the architecture and evolution of the DRD4 exon III tandem repeat, and they suggest that differences in the structure of this tandem repeat contribute to specialization and generation of diversity in receptor function.
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

PubMed

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-11-16

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Capillary electrophoresis of Big-Dye terminator sequencing reactions for human mtDNA Control Region haplotyping in the identification of human remains.

PubMed

Montesino, Marta; Prieto, Lourdes

2012-01-01

The cycle sequencing reaction with Big-Dye terminators provides the methodology to analyze mtDNA Control Region amplicons by means of capillary electrophoresis. DNA sequencing with ddNTPs or terminators was developed by (1). The progressive automation of the method by combining the use of fluorescent-dye terminators with cycle sequencing has made it possible to increase the sensitivity and efficiency of the method and hence has allowed its introduction into the forensic field. PCR-generated mitochondrial DNA products are the templates for sequencing reactions. Different sets of primers can be used to generate amplicons of different sizes according to the quality and quantity of the DNA extract, providing sequence data for different ranges inside the Control Region.
Amyloid-like self-assembly of peptide sequences from the adenovirus fiber shaft: insights from molecular dynamics simulations.

PubMed

Tamamis, Phanourios; Kasotakis, Emmanouil; Mitraki, Anna; Archontis, Georgios

2009-11-26

The self-assembly of peptides and proteins into nanostructures is related to the fundamental problems of protein folding and misfolding and has potential applications in medicine, materials science and nanotechnology. Natural peptides, corresponding to sequence repeats from self-assembling proteins, may constitute elementary building blocks of such nanostructures. In this work, we study by implicit-solvent replica-exchange simulations the self-assembly of two amyloidogenic sequences derived from the naturally occurring fiber shaft of the adenovirus, the octapeptide NSGAITIG (asparagine-serine-glycine-alanine-isoleucine-threonine-isoleucine-glycine) and its hexapeptide counterpart, GAITIG. In accordance with their amyloidogenic capacity, both peptides readily form intermolecular beta-sheets, stabilized by extensive main- and side-chain contacts involving the C-terminal moieties (segments 3-8 and 2-6, respectively). The structural and energetic properties of these sheets are analyzed extensively. The N-terminal residues Asn1 and Ser2 of the octapeptide remain disordered in the sheets, suggesting that these residues are exposed at the exterior of the fibrils and accessible. On the basis of insight provided by the simulations, cysteine residues were recently substituted at positions 1 and 2 of NSGAITIG; the newly designed peptides maintain their amyloidogenic properties and can bind to silver, gold and platinum nanoparticles [Kasotakis et al. Biopolymers 2009, 92, 164-172]. Computational investigation can identify suitable positions for rational modification of peptide building blocks, aiming at the fabrication of novel biomaterials.
Conserved structure and inferred evolutionary history of long terminal repeats (LTRs)

PubMed Central

2013-01-01

Background: Long terminal repeats (LTRs, consisting of U3-R-U5 portions) are important elements of retroviruses and related retrotransposons. They are difficult to analyse due to their variability. The aim was to obtain a more comprehensive view of structure, diversity and phylogeny of LTRs than hitherto possible. Results: Hidden Markov models (HMM) were created for 11 clades of LTRs belonging to Retroviridae (class III retroviruses), animal Metaviridae (Gypsy/Ty3) elements and plant Pseudoviridae (Copia/Ty1) elements, complementing our work with Orthoretrovirus HMMs. The great variation in LTR length of plant Metaviridae and the few divergent animal Pseudoviridae prevented building HMMs from both of these groups. Animal Metaviridae LTRs had the same conserved motifs as retroviral LTRs, confirming that the two groups are closely related. The conserved motifs were the short inverted repeats (SIRs), integrase recognition signals (5´TGTTRNR…YNYAACA 3´); the polyadenylation signal or AATAAA motif; a GT-rich stretch downstream of the polyadenylation signal; and a less conserved AT-rich stretch corresponding to the core promoter element, the TATA box. Plant Pseudoviridae LTRs differed slightly in having a conserved TATA-box, TATATA, but no conserved polyadenylation signal, plus a much shorter R region. The sensitivity of the HMMs for detection in genomic sequences was around 50% for most models, at a relatively high specificity, suitable for genome screening. The HMMs yielded consensus sequences, which were aligned by creating an HMM model (a ‘Superviterbi’ alignment). This yielded a phylogenetic tree that was compared with a Pol-based tree. Both LTR and Pol trees supported monophyly of retroviruses. In both, Pseudoviridae was ancestral to all other LTR retrotransposons. However, the LTR trees showed the chromovirus portion of Metaviridae clustering together with Pseudoviridae, dividing Metaviridae into two portions with distinct phylogeny.
Conclusion: The HMMs clearly demonstrated a unitary conserved structure of LTRs, supporting the view that they arose once during evolution. We attempted to follow the evolution of LTRs by tracing their functional foundations, that is, acquisition of RNAse H, a combined promoter/polyadenylation site, integrase, hairpin priming and the primer binding site (PBS). Available information did not support a simple evolutionary chain of events. PMID:23369192
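The conserved LTR motifs quoted in this abstract (TGTTRNR, YNYAACA, AATAAA) are written with IUPAC ambiguity codes. The sketch below shows one simple way to scan a sequence for such motifs by translating them into regular expressions. It is only a plain pattern match, not the HMM approach used in the study; the helper names and the toy sequence are assumptions made for the example.

```python
import re

# Subset of the standard IUPAC nucleotide codes needed for these motifs.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}

def iupac_to_regex(motif):
    """Translate an IUPAC motif string into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))

def find_motif(seq, motif):
    """Return 0-based start positions of non-overlapping motif matches."""
    return [m.start() for m in iupac_to_regex(motif).finditer(seq.upper())]

# Made-up LTR-like sequence assembled from labelled pieces (not real data).
toy_ltr = ("TGTTGCA"      # 5' short inverted repeat, matches TGTTRNR
           + "GGCT"
           + "AATAAA"     # polyadenylation signal
           + "GGTGTGTC"   # GT-rich stretch
           + "TATATA"
           + "CGCAACA")   # 3' short inverted repeat, matches YNYAACA

for motif in ("TGTTRNR", "AATAAA", "YNYAACA"):
    print(motif, find_motif(toy_ltr, motif))
```

Exact matching of this kind misses diverged copies, which is one reason profile methods such as HMMs are used for real LTR detection.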
3' terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing.

PubMed

Goldfarb, Katherine C; Cech, Thomas R

2013-09-21

Post-transcriptional 3' end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3' RACE coupled with high-throughput sequencing to characterize the 3' terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. The 3' terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3' terminus of an in vitro transcribed MRP RNA control and the differing 3' terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). 3' RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3' terminal sequences of noncoding RNAs.
Repeated sequence sets in mitochondrial DNA molecules of root knot nematodes (Meloidogyne): nucleotide sequences, genome location and potential for host-race identification.

PubMed Central

Okimoto, R; Chamberlin, H M; Macfarlane, J L; Wolstenholme, D R

1991-01-01

Within a 7 kb segment of the mtDNA molecule of the root knot nematode, Meloidogyne javanica, that lacks standard mitochondrial genes, are three sets of strictly tandemly arranged, direct repeat sequences: approximately 36 copies of a 102 ntp sequence that contains a TaqI site; 11 copies of a 63 ntp sequence, and 5 copies of an 8 ntp sequence. The 7 kb repeat-containing segment is bounded by putative tRNAasp and tRNAf-met genes and the arrangement of sequences within this segment is: the tRNAasp gene; a unique 1,528 ntp segment that contains two highly stable hairpin-forming sequences; the 102 ntp repeat set; the 8 ntp repeat set; a unique 1,068 ntp segment; the 63 ntp repeat set; and the tRNAf-met gene. The nucleotide sequences of the 102 ntp copies and the 63 ntp copies have been conserved among the species examined. Data from Southern hybridization experiments indicate that 102 ntp and 63 ntp repeats occur in the mtDNAs of three, two and two races of M. incognita, M. hapla and M. arenaria, respectively. Nucleotide sequences of the M. incognita Race-3 102 ntp repeat were found to be either identical or highly similar to those of the M. javanica 102 ntp repeat. Differences in migration distance and number of 102 ntp repeat-containing bands seen in Southern hybridization autoradiographs of restriction-digested mtDNAs of M. javanica and the different host races of M. incognita, M. hapla and M. arenaria are sufficient to distinguish the different host races of each species. Images PMID:2027769
LTRs of endogenous retroviruses as a source of Tbx6 binding sites

NASA Astrophysics Data System (ADS)

Yasuhiko, Yukuto; Hirabayashi, Yoko; Ono, Ryuichi

2017-06-01

Retrotransposons are abundant in mammalian genomes and can modulate the expression of surrounding genes by disrupting endogenous binding sites for transcription factors (TFs) or providing novel TF binding sites within retrotransposon sequences. Here, we show that a (C/T)CACACCT sequence motif in ORR1A, ORR1B, ORR1C and ORR1D, Long Terminal Repeats (LTRs) of MaLR endogenous retrovirus (ERV), is the direct target of Tbx6, an evolutionarily conserved family of T-box transcription factors. Moreover, by comparing gene expression between control mice (Tbx6 +/-) and Tbx6-deficient mice (Tbx6 -/-), we demonstrate that at least four genes, Twist2, Pitx2, Oscp1, and Nfxl1, are down-regulated with Tbx6 deficiency. These results suggest that ORR1A, ORR1B, ORR1C and ORR1D may contribute to the evolution of mammalian embryogenesis.
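A motif such as (C/T)CACACCT can occur on either DNA strand, so a scan needs to cover the reverse complement as well. The sketch below is one minimal way to do that with a regular expression; the function names and the toy sequence are assumptions for illustration and do not reproduce the authors' analysis.

```python
import re

# (C/T)CACACCT written as a regular expression.
TBX6_MOTIF = re.compile(r"[CT]CACACCT")
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def scan_both_strands(seq):
    """Return (strand, forward-strand start) for every motif occurrence."""
    seq = seq.upper()
    hits = [("+", m.start()) for m in TBX6_MOTIF.finditer(seq)]
    rc = reverse_complement(seq)
    # Map reverse-strand match coordinates back onto the forward strand.
    hits += [("-", len(seq) - m.end()) for m in TBX6_MOTIF.finditer(rc)]
    return sorted(hits, key=lambda hit: hit[1])

# Made-up sequence: one forward-strand site and one reverse-strand site.
toy = "GGA" + "TCACACCT" + "TTT" + "AGGTGTGG" + "CC"
print(scan_both_strands(toy))
```

A sequence match alone only marks a candidate site; binding would still need experimental support of the kind reported in the abstract.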
An Epstein-Barr virus immediate-early gene product trans-activates gene expression from the human immunodeficiency virus long terminal repeat.

PubMed Central

Kenney, S; Kamine, J; Markovitz, D; Fenrick, R; Pagano, J

1988-01-01

Acquired immunodeficiency syndrome patients are frequently coinfected with Epstein-Barr virus (EBV). In this report, we demonstrate that an EBV immediate-early gene product, BamHI MLF1, stimulates expression of the bacterial chloramphenicol acetyltransferase (CAT) gene linked to the human immunodeficiency virus (HIV) promoter. The HIV promoter sequences necessary for trans-activation by EBV do not include the tat-responsive sequences. In addition, in contrast to the other herpesvirus trans-activators previously studied, the EBV BamHI MLF1 gene product appears to function in part by a posttranscriptional mechanism, since it increases pHIV-CAT protein activity more than it increases HIV-CAT mRNA. This ability of an EBV gene product to activate HIV gene expression may have biologic consequences in persons coinfected with both viruses. Images PMID:2830625
Epstein-Barr virus immediate-early gene product trans-activates gene expression from the human immunodeficiency virus long terminal repeat

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kenney, S.; Kamine, J.; Markovitz, D.

Acquired immunodeficiency syndrome patients are frequently coinfected with Epstein-Barr virus (EBV). In this report, the authors demonstrate that an EBV immediate-early gene product, BamHI MLF1, stimulates expression of the bacterial chloramphenicol acetyltransferase (CAT) gene linked to the human immunodeficiency virus (HIV) promoter. The HIV promoter sequences necessary for trans-activation by EBV do not include the tat-responsive sequences. In addition, in contrast to the other herpesvirus trans-activators previously studied, the EBV BamHI MLF1 gene product appears to function in part by a posttranscriptional mechanism, since it increases pHIV-CAT protein activity more than it increases HIV-CAT mRNA. This ability of an EBV gene product to activate HIV gene expression may have biologic consequences in persons coinfected with both viruses.
LTRs of Endogenous Retroviruses as a Source of Tbx6 Binding Sites

PubMed Central

Yasuhiko, Yukuto; Hirabayashi, Yoko; Ono, Ryuichi

2017-01-01

Retrotransposons are abundant in mammalian genomes and can modulate the expression of surrounding genes by disrupting endogenous binding sites for transcription factors (TFs) or providing novel TF binding sites within retrotransposon sequences. Here, we show that a (C/T)CACACCT sequence motif in ORR1A, ORR1B, ORR1C, and ORR1D, Long Terminal Repeats (LTRs) of MaLR endogenous retrovirus (ERV), is the direct target of Tbx6, an evolutionarily conserved family of T-box TFs. Moreover, by comparing gene expression between control mice (Tbx6 +/−) and Tbx6-deficient mice (Tbx6 −/−), we demonstrate that at least four genes, Twist2, Pitx2, Oscp1, and Nfxl1, are down-regulated with Tbx6 deficiency. These results suggest that ORR1A, ORR1B, ORR1C and ORR1D may contribute to the evolution of mammalian embryogenesis. PMID:28664156
LTRs of Endogenous Retroviruses as a Source of Tbx6 Binding Sites.

PubMed

Yasuhiko, Yukuto; Hirabayashi, Yoko; Ono, Ryuichi

2017-01-01

Retrotransposons are abundant in mammalian genomes and can modulate the expression of surrounding genes by disrupting endogenous binding sites for transcription factors (TFs) or providing novel TF binding sites within retrotransposon sequences. Here, we show that a (C/T)CACACCT sequence motif in ORR1A, ORR1B, ORR1C, and ORR1D, Long Terminal Repeats (LTRs) of MaLR endogenous retrovirus (ERV), is the direct target of Tbx6, an evolutionarily conserved family of T-box TFs. Moreover, by comparing gene expression between control mice (Tbx6 +/-) and Tbx6-deficient mice (Tbx6 -/-), we demonstrate that at least four genes, Twist2, Pitx2, Oscp1, and Nfxl1, are down-regulated with Tbx6 deficiency. These results suggest that ORR1A, ORR1B, ORR1C and ORR1D may contribute to the evolution of mammalian embryogenesis.
A novel class of dual-family immunophilins.

PubMed

Adams, Brian; Musiyenko, Alla; Kumar, Rajinder; Barik, Sailen

2005-07-01

Immunophilins are protein chaperones with peptidylprolyl isomerase activity that belong to one of two large families, the cyclosporin-binding cyclophilins (CyPs) and the FK506-binding proteins (FKBPs). Each family displays characteristic and conserved sequence features that differ between the two families. We report a novel group of dual-family immunophilins that contain both CyP and FKBP domains for which we propose the name FCBP (FK506- and cyclosporin-binding protein). The FCBP of Toxoplasma gondii, a protozoan parasite, contained N-terminal FKBP and C-terminal CyP domains joined by tetratricopeptide repeats. Structure-function analysis revealed that both domains were functional and exhibited family-specific drug sensitivity. The individual domains of FCBP inhibited calcineurin (protein phosphatase 2B) in the presence of the appropriate drugs. In binding studies, FCBP recruited calcineurin in the presence of FK506 and a putative target of rapamycin homolog in the presence of rapamycin. Two additional FCBP sequences in Flavobacterium and one in Treponema (spirochete) were also identified in which the CyP and FKBP domains were in the reverse order. T. gondii growth was inhibited by cyclosporin and FK506 in a moderately synergistic manner. The knockdown of FCBP by RNA interference revealed its essentiality for T. gondii growth. Clearly, the FCBPs are novel chaperones and potential targets of multiple immunosuppressant drugs.

A polyvalent hybrid protein elicits antibodies against the diverse allelic types of block 2 in Plasmodium falciparum merozoite surface protein 1.

PubMed

Tetteh, Kevin K A; Conway, David J

2011-10-13

Merozoite surface protein 1 (MSP1) of Plasmodium falciparum has been implicated as an important target of acquired immunity, and candidate components for a vaccine include polymorphic epitopes in the N-terminal polymorphic block 2 region. We designed a polyvalent hybrid recombinant protein incorporating sequences of the three major allelic types of block 2 together with a composite repeat sequence of one of the types and N-terminal flanking T cell epitopes, and compared this with a series of recombinant proteins containing modular sub-components and similarly expressed in Escherichia coli. Immunogenicity of the full polyvalent hybrid protein was tested in both mice and rabbits, and comparative immunogenicity studies of the sub-component modules were performed in mice. The full hybrid protein induced high titre antibodies against each of the major block 2 allelic types expressed as separate recombinant proteins and against a wide range of allelic types naturally expressed by a panel of diverse P. falciparum isolates, while the sub-component modules had partial antigenic coverage as expected. This encourages further development and evaluation of the full MSP1 block 2 polyvalent hybrid protein as a candidate blood-stage component of a malaria vaccine. Copyright © 2011 Elsevier Ltd. All rights reserved.
Identification of a polymorphic collagen-like protein in the crustacean bacteria Pasteuria ramosa.

PubMed

Mouton, Laurence; Traunecker, Emmanuel; McElroy, Kerensa; Du Pasquier, Louis; Ebert, Dieter

2009-12-01

Pasteuria ramosa is a spore-forming bacterium that infects Daphnia species. Previous results demonstrated a high specificity of host clone/parasite genotype interactions. Surface proteins of bacteria often play an important role in attachment to host cells prior to infection. We analyzed surface proteins of P. ramosa spores by two-dimensional gel electrophoresis. For the first time, we prove that two isolates selected for their differences in infectivity reveal few but clear-cut differences in protein patterns. Using internal sequencing and LC/MS/MS, we identified a collagen-like protein named Pcl1a (Pasteuria collagen-like protein 1a). This protein, reconstructed with the help of Pasteuria genome sequences, contains three domains: a 75-amino-acid amino-terminal domain with a potential transmembrane helix domain, a central collagen-like region (CLR) containing Gly-Xaa-Yaa (GXY) repeats, and a 7-amino-acid carboxy-terminal domain. The CLR region is polymorphic among the two isolates with amino-acid substitutions and a variable number of GXY triplets. Collagen-like proteins are rare in prokaryotes, although they have been described in several pathogenic bacteria, including Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis, closely related to Pasteuria species, in which they could be involved in the adherence of bacteria to host cells.
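The collagen-like region described here is defined by consecutive Gly-Xaa-Yaa (GXY) triplets, and the two isolates differ in how many triplets they carry. The sketch below shows one simple way to locate such runs in a protein sequence and count their triplets. The function name, the minimum run length and the toy sequence are assumptions for illustration, not the procedure used by the authors.

```python
import re

def collagen_like_runs(protein, min_triplets=3):
    """Return (start, n_triplets) for each run of consecutive G-X-Y triplets.

    Hypothetical helper: a run is a stretch in which every third residue,
    starting from the first, is glycine.
    """
    pattern = re.compile(r"(?:G[A-Z]{2}){%d,}" % min_triplets)
    return [(m.start(), len(m.group()) // 3)
            for m in pattern.finditer(protein.upper())]

# Made-up sequence: short N-terminal part, four GXY triplets, short C-terminal tail.
toy = "MKTLV" + "GPPGAPGEKGTP" + "AHHHHHH"
print(collagen_like_runs(toy))
```

Running the same function on two aligned isolate sequences and comparing the triplet counts would give a first, rough view of the kind of length polymorphism mentioned in the abstract.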
A conserved region of leptospiral immunoglobulin-like A and B proteins as a DNA vaccine elicits a prophylactic immune response against leptospirosis.

PubMed

Forster, Karine M; Hartwig, Daiane D; Seixas, Fabiana K; Bacelo, Kátia L; Amaral, Marta; Hartleben, Cláudia P; Dellagostin, Odir A

2013-05-01

The leptospiral immunoglobulin-like (Lig) proteins LigA and LigB possess immunoglobulin-like domains with 90-amino-acid repeats and are adhesion molecules involved in pathogenicity. They are conserved in pathogenic Leptospira spp. and thus are of interest for use as serodiagnostic antigens and in recombinant vaccine formulations. The N-terminal amino acid sequences of the LigA and LigB proteins are identical, but the C-terminal sequences vary. In this study, we evaluated the protective potential of five truncated forms of LigA and LigB proteins from Leptospira interrogans serovar Canicola as DNA vaccines using the pTARGET mammalian expression vector. Hamsters immunized with the DNA vaccines were subjected to a heterologous challenge with L. interrogans serovar Copenhageni strain Spool via the intraperitoneal route. Immunization with a DNA vaccine encoding LigBrep resulted in the survival of 5/8 (62.5%) hamsters against lethal infection (P < 0.05). None of the control hamsters or animals immunized with the other vaccine preparations survived. The vaccine induced an IgG antibody response and, additionally, conferred sterilizing immunity in 80% of the surviving animals. Our results indicate that the LigBrep DNA vaccine is a promising candidate for inclusion in a protective leptospiral vaccine.
A Conserved Region of Leptospiral Immunoglobulin-Like A and B Proteins as a DNA Vaccine Elicits a Prophylactic Immune Response against Leptospirosis

PubMed Central

Forster, Karine M.; Hartwig, Daiane D.; Seixas, Fabiana K.; Bacelo, Kátia L.; Amaral, Marta; Hartleben, Cláudia P.

2013-01-01

The leptospiral immunoglobulin-like (Lig) proteins LigA and LigB possess immunoglobulin-like domains with 90-amino-acid repeats and are adhesion molecules involved in pathogenicity. They are conserved in pathogenic Leptospira spp. and thus are of interest for use as serodiagnostic antigens and in recombinant vaccine formulations. The N-terminal amino acid sequences of the LigA and LigB proteins are identical, but the C-terminal sequences vary. In this study, we evaluated the protective potential of five truncated forms of LigA and LigB proteins from Leptospira interrogans serovar Canicola as DNA vaccines using the pTARGET mammalian expression vector. Hamsters immunized with the DNA vaccines were subjected to a heterologous challenge with L. interrogans serovar Copenhageni strain Spool via the intraperitoneal route. Immunization with a DNA vaccine encoding LigBrep resulted in the survival of 5/8 (62.5%) hamsters against lethal infection (P < 0.05). None of the control hamsters or animals immunized with the other vaccine preparations survived. The vaccine induced an IgG antibody response and, additionally, conferred sterilizing immunity in 80% of the surviving animals. Our results indicate that the LigBrep DNA vaccine is a promising candidate for inclusion in a protective leptospiral vaccine. PMID:23486420
An Evolutionarily Young Polar Bear (Ursus maritimus) Endogenous Retrovirus Identified from Next Generation Sequence Data.

PubMed

Tsangaras, Kyriakos; Mayer, Jens; Alquezar-Planas, David E; Greenwood, Alex D

2015-11-24

Transcriptome analysis of polar bear (Ursus maritimus) tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV). Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs) of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos) and black bear (Ursus americanus) but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed, including a recombinant ERV that contained one LTR belonging to each group, indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs) and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggest that UrsusERV is a recent cross-species transmission from an unknown reservoir and place the viral group among the youngest ERVs identified in mammals.
An Evolutionarily Young Polar Bear (Ursus maritimus) Endogenous Retrovirus Identified from Next Generation Sequence Data

PubMed Central

Tsangaras, Kyriakos; Mayer, Jens; Alquezar-Planas, David E.; Greenwood, Alex D.

2015-01-01

Transcriptome analysis of polar bear (Ursus maritimus) tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV). Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs) of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos) and black bear (Ursus americanus) but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed, including a recombinant ERV that contained one LTR belonging to each group, indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs) and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggest that UrsusERV is a recent cross-species transmission from an unknown reservoir and place the viral group among the youngest ERVs identified in mammals. PMID:26610552
Analysis of sequence variability in the macronuclear DNA of Paramecium tetraurelia: A somatic view of the germline

PubMed Central

Duret, Laurent; Cohen, Jean; Jubin, Claire; Dessen, Philippe; Goût, Jean-François; Mousset, Sylvain; Aury, Jean-Marc; Jaillon, Olivier; Noël, Benjamin; Arnaiz, Olivier; Bétermier, Mireille; Wincker, Patrick; Meyer, Eric; Sperling, Linda

2008-01-01

Ciliates are the only unicellular eukaryotes known to separate germinal and somatic functions. Diploid but silent micronuclei transmit the genetic information to the next sexual generation. Polyploid macronuclei express the genetic information from a streamlined version of the genome but are replaced at each sexual generation. The macronuclear genome of Paramecium tetraurelia was recently sequenced by a shotgun approach, providing access to the gene repertoire. The 72-Mb assembly represents a consensus sequence for the somatic DNA, which is produced after sexual events by reproducible rearrangements of the zygotic genome involving elimination of repeated sequences, precise excision of unique-copy internal eliminated sequences (IES), and amplification of the cellular genes to high copy number. We report use of the shotgun sequencing data (>10^6 reads representing 13× coverage of a completely homozygous clone) to evaluate variability in the somatic DNA produced by these developmental genome rearrangements. Although DNA amplification appears uniform, both of the DNA elimination processes produce sequence heterogeneity. The variability that arises from IES excision allowed identification of hundreds of putative new IESs, compared to 42 that were previously known, and revealed cases of erroneous excision of segments of coding sequences. We demonstrate that IESs in coding regions are under selective pressure to introduce premature termination of translation in case of excision failure. PMID:18256234
Innate Immune Complexity in the Purple Sea Urchin: Diversity of the Sp185/333 System

PubMed Central

Smith, L. Courtney

2012-01-01

The California purple sea urchin, Strongylocentrotus purpuratus, is a long-lived echinoderm with a complex and sophisticated innate immune system. There are several large gene families that function in immunity in this species including the Sp185/333 gene family that has ∼50 (±10) members. The family shows intriguing sequence diversity and encodes a broad array of diverse yet similar proteins. The genes have two exons of which the second encodes the mature protein and has repeats and blocks of sequence called elements. Mosaics of element patterns plus single nucleotide polymorphism-based variants of the elements result in significant sequence diversity among the genes yet maintain similar structure among the members of the family. Sequence of a bacterial artificial chromosome insert shows a cluster of six tightly linked Sp185/333 genes that are flanked by GA microsatellites. The sequences between the GA microsatellites, in which the Sp185/333 genes and flanking regions are located, are much more similar to each other than are the sequences outside the microsatellites, suggesting processes such as gene conversion, recombination, or duplication. However, close linkage does not correspond with greater sequence similarity compared to randomly cloned and sequenced genes that are unlikely to be linked. There are three segmental duplications that are bounded by GAT microsatellites and include three almost identical genes plus flanking regions. RNA editing is detectable throughout the mRNAs based on comparisons to the genes, which, in combination with putative post-translational modifications to the proteins, results in broad arrays of Sp185/333 proteins that differ among individuals. The mature proteins have an N-terminal glycine-rich region, a central RGD motif, and a C-terminal histidine-rich region. The Sp185/333 proteins are localized to the cell surface and are found within vesicles in subsets of polygonal and small phagocytes. The coelomocyte proteome shows full-length and truncated proteins, including some with missense sequence. Current results suggest that both native Sp185/333 proteins and a recombinant protein bind bacteria and are likely important in sea urchin innate immunity. PMID:22566951
25 CFR 11.1114 - Termination.

Code of Federal Regulations, 2010 CFR

2010-04-01

... functions; (iii) The parent(s) has subjected the minor to willful and repeated acts of sexual abuse; (iv... Minor-in-Need-of-Care Procedure § 11.1114 Termination. (a) Parental rights to a child may be terminated by the children's court according to the procedures in this section. (b) Proceedings to terminate...
Genetic Studies of the Prp17 Gene of Saccharomyces Cerevisiae: A Domain Essential for Function Maps to a Nonconserved Region of the Protein

PubMed Central

Seshadri, V.; Vaidya, V. C.; Vijayraghavan, U.

1996-01-01

The PRP17 gene product is required for the second step of pre-mRNA splicing reactions. The C-terminal half of this protein bears four repeat units with homology to the β transducin repeat. Missense mutations in three temperature-sensitive prp17 mutants map to a region in the N-terminal half of the protein. We have generated, in vitro, 11 missense alleles at the β transducin repeat units and find that only one affects function in vivo. A phenotypically silent missense allele at the fourth repeat unit enhances the slow-growing phenotype conferred by an allele at the third repeat, suggesting an interaction between these domains. Although many missense mutations in highly conserved amino acids lack phenotypic effects, deletion analysis suggests an essential role for these units. Only mutations in the N-terminal nonconserved domain of PRP17 are synthetically lethal in combination with mutations in PRP16 and PRP18, two other gene products required for the second splicing reaction. A mutually allele-specific interaction between prp17 and snr7, with mutations in U5 snRNA, was observed. We therefore suggest that the functional region of Prp17p that interacts with Prp18p, Prp16p, and U5 snRNA is in the N terminal region of the protein. PMID:8722761
On the phylogenetic placement of human T cell leukemia virus type 1 sequences associated with an Andean mummy.

PubMed

Coulthart, Michael B; Posada, David; Crandall, Keith A; Dekaban, Gregory A

2006-03-01

Recently, the putative finding of ancient human T cell leukemia virus type 1 (HTLV-1) long terminal repeat (LTR) DNA sequences in association with a 1500-year-old Chilean mummy has stirred vigorous debate. The debate is based partly on the inherent uncertainties associated with phylogenetic reconstruction when only short sequences of closely related genotypes are available. However, a full analysis of what phylogenetic information is present in the mummy data has not previously been published, leaving open the question of what precisely is the range of admissible interpretation. To fulfill this need, we re-analyzed the mummy data in a new way. We first performed phylogenetic analysis of 188 published LTR DNA sequences from extant strains belonging to the HTLV-1 Cosmopolitan clade, using the method of statistical parsimony which is designed both to optimize phylogenetic resolution among sequences with little evolutionary divergence, and to permit precise mapping of individual sequence mutations onto branches of a divergence network. We then deduced possible phylogenetic positions for the two main categories of published Chilean mummy sequences, based on their published 157-nucleotide LTR sequences. The possible phylogenetic placements for one of the mummy sequence categories are consistent with a modern origin. However, one of these placements for the other mummy sequence category falls very close to the root of the Cosmopolitan clade, consistent with an ancient origin for both this mummy sequence and the Cosmopolitan clade.
Molecular Dynamics Simulation of Rap1 Myb-type domain in Saccharomyces cerevisiae

PubMed Central

Mukherjee, Koel; Pandey, Dev Mani; Vidyarthi, Ambarish Saran

2012-01-01

The telomere is a nucleoprotein complex that plays an important role in stability and maintenance and consists of random repeats of species-specific motifs. In the budding yeast Saccharomyces cerevisiae, Repressor Activator Protein 1 (Rap1) is a sequence-specific protein that is involved in transcriptional regulation. Rap1 consists of three active domains: an N-terminal BRCT domain, a DNA-binding domain and a C-terminal RCT domain. In this study the unknown 3D structure of the Myb-type domain (61 residues) within the DNA-binding domain was modeled with Modeller7 and verified using several online bioinformatics tools (ProCheck, WhatIf, Verify3D). The dynamics of the Myb-type domain of Rap1 were studied through simulations using GROMACS software. Time-dependent interactions among the molecules were analyzed by Root Mean Square Deviation (RMSD), Radius of Gyration (Rg) and Root Mean Square Fluctuation (RMSF) plots. Motional properties in reduced dimension were also examined by Principal Component Analysis (PCA). The results indicated that Rap1 interacts with the DNA major groove through its helix-turn-helix motifs. Helix 3 was rigid and showed little fluctuation, as it interacts with the DNA major groove. Helix 2 and the N-terminal region showed considerable fluctuation over the time scale. PMID:23144544
Molecular Dynamics Simulation of Rap1 Myb-type domain in Saccharomyces cerevisiae.

PubMed

Mukherjee, Koel; Pandey, Dev Mani; Vidyarthi, Ambarish Saran

2012-01-01

The telomere is a nucleoprotein complex that plays an important role in stability and maintenance and consists of random repeats of species-specific motifs. In the budding yeast Saccharomyces cerevisiae, Repressor Activator Protein 1 (Rap1) is a sequence-specific protein that is involved in transcriptional regulation. Rap1 consists of three active domains: an N-terminal BRCT domain, a DNA-binding domain and a C-terminal RCT domain. In this study the unknown 3D structure of the Myb-type domain (61 residues) within the DNA-binding domain was modeled with Modeller7 and verified using several online bioinformatics tools (ProCheck, WhatIf, Verify3D). The dynamics of the Myb-type domain of Rap1 were studied through simulations using GROMACS software. Time-dependent interactions among the molecules were analyzed by Root Mean Square Deviation (RMSD), Radius of Gyration (Rg) and Root Mean Square Fluctuation (RMSF) plots. Motional properties in reduced dimension were also examined by Principal Component Analysis (PCA). The results indicated that Rap1 interacts with the DNA major groove through its helix-turn-helix motifs. Helix 3 was rigid and showed little fluctuation, as it interacts with the DNA major groove. Helix 2 and the N-terminal region showed considerable fluctuation over the time scale.
End Joining-Mediated Gene Expression in Mammalian Cells Using PCR-Amplified DNA Constructs that Contain Terminator in Front of Promoter.

PubMed

Nakamura, Mikiko; Suzuki, Ayako; Akada, Junko; Tomiyoshi, Keisuke; Hoshida, Hisashi; Akada, Rinji

2015-12-01

Mammalian gene expression constructs are generally prepared in a plasmid vector, in which a promoter and terminator are located upstream and downstream of a protein-coding sequence, respectively. In this study, we found that front terminator constructs (DNA constructs containing a terminator upstream of a promoter rather than downstream of a coding region) could sufficiently express proteins as a result of end joining of the introduced DNA fragment. By taking advantage of front terminator constructs, FLAG substitutions and deletions were generated using mutagenesis primers to identify amino acids specifically recognized by commercial FLAG antibodies. A minimal epitope sequence for polyclonal FLAG antibody recognition was also identified. In addition, we analyzed the sequence of a C-terminal Ser-Lys-Leu peroxisome localization signal, and identified the key residues necessary for peroxisome targeting. Moreover, front terminator constructs of hepatitis B surface antigen were used for deletion analysis, leading to the identification of regions required for particle formation. Collectively, these results indicate that front terminator constructs allow for easy manipulation of C-terminal protein-coding sequences, and suggest that direct gene expression with PCR-amplified DNA is useful for high-throughput protein analysis in mammalian cells.
Predicted stem-loop structures and variation in nucleotide sequence of 3' noncoding regions among animal calicivirus genomes.

PubMed

Seal, B S; Neill, J D; Ridpath, J F

1994-07-01

Caliciviruses are nonenveloped with a polyadenylated genome of approximately 7.6 kb and a single capsid protein. The "RNA Fold" computer program was used to analyze 3'-terminal noncoding sequences of five feline calicivirus (FCV), rabbit hemorrhagic disease virus (RHDV), and two San Miguel sea lion virus (SMSV) isolates. The FCV 3'-terminal sequences are 40-46 nucleotides in length and 72-91% similar. The FCV sequences were predicted to contain two possible duplex structures and one stem-loop structure with free energies of -2.1 to -18.2 kcal/mole. The RHDV genomic 3'-terminal RNA sequences are 54 nucleotides in length and share 49% sequence similarity to homologous regions of the FCV genome. The RHDV sequence was predicted to form two duplex structures in the 3'-terminal noncoding region with a single stem-loop structure, resembling that of FCV. In contrast, the SMSV 1 and 4 genomic 3'-terminal noncoding sequences were 185 and 182 nucleotides in length, respectively. Ten possible duplex structures were predicted with an average structural free energy of -35 kcal/mole. Sequence similarity between the two SMSV isolates was 75%. Furthermore, extensive cloverleaflike structures are predicted in the 3' noncoding region of the SMSV genome, in contrast to the predicted single stem-loop structures of FCV or RHDV.
The complete mitochondrial genome structure of snow leopard Panthera uncia.

PubMed

Wei, Lei; Wu, Xiaobing; Jiang, Zhigang

2009-05-01

The complete mitochondrial genome (mtDNA) of snow leopard Panthera uncia was obtained by using the polymerase chain reaction (PCR) technique based on the PCR fragments of 30 primers we designed. The entire mtDNA sequence was 16,773 base pairs (bp) in length, and the base composition was: A-5,357 bp (31.9%); C-4,444 bp (26.5%); G-2,428 bp (14.5%); T-4,544 bp (27.1%). The structural characteristics of the P. uncia mitochondrial genome were highly similar to those of Felis catus, Acinonyx jubatus, Neofelis nebulosa and other mammals. However, we found several distinctive features of the mitochondrial genome of Panthera uncia. First, the termination codon of COIII was TAA, which differed from those of F. catus, A. jubatus and N. nebulosa. Second, tRNA(Ser)(AGY), which lacked the "DHU" arm, could not be folded into the typical cloverleaf-shaped structure. Third, in the control region, a long repetitive sequence (32 bp) was found with 2 repeats in the RS-2 region, while one short repetitive segment (9 bp) was found with 15 repeats in the RS-3 region. We performed phylogenetic analysis based on a 3,816 bp concatenated sequence of 12S rRNA, 16S rRNA, ND2, ND4, ND5, Cyt b and ATP8 for P. uncia and other related species. The result indicated that P. uncia and P. leo were sister species, which differed from previous findings.
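The base composition reported in this record can be checked with simple arithmetic. The short Python sketch below sums the four base counts quoted in the abstract and recomputes the percentages; the function name `base_composition` is a hypothetical helper chosen for illustration and is not part of the cited study.

```python
# Base counts (bp) as quoted in the abstract above.
counts = {"A": 5357, "C": 4444, "G": 2428, "T": 4544}


def base_composition(base_counts):
    """Return (total length, percentage per base rounded to 0.1%)."""
    total = sum(base_counts.values())
    percents = {
        base: round(100 * n / total, 1) for base, n in base_counts.items()
    }
    return total, percents


total, percents = base_composition(counts)
print(total)     # total genome length implied by the four counts
print(percents)  # percentage of each base
```

Under these inputs the counts sum to the stated genome length of 16,773 bp, and the rounded percentages agree with the values given in the abstract.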
New molecular markers and cytogenetic probes enable chromosome identification of wheat-Thinopyrum intermedium introgression lines for improving protein and gluten contents.

PubMed

Li, Guangrong; Wang, Hongjin; Lang, Tao; Li, Jianbo; La, Shixiao; Yang, Ennian; Yang, Zujun

2016-10-01

New molecular markers were developed for targeting Thinopyrum intermedium 1St#2 chromosome, and novel FISH probe representing the terminal repeats was produced for identification of Thinopyrum chromosomes. Thinopyrum intermedium has been used as a valuable resource for improving the disease resistance and yield potential of wheat. A wheat-Th. intermedium ssp. trichophorum chromosome 1St#2 substitution and translocation has displayed superior grain protein and wet gluten content. With the aim to develop a number of chromosome 1St#2 specific molecular and cytogenetic markers, a high throughput, low-cost specific-locus amplified fragment sequencing (SLAF-seq) technology was used to compare the sequences between a wheat-Thinopyrum 1St#2 (1D) substitution and the related species Pseudoroegneria spicata (St genome, 2n = 14). A total of 5142 polymorphic fragments were analyzed and 359 different SLAF markers for 1St#2 were predicted. Thirty-seven specific molecular markers were validated by PCR from 50 randomly selected SLAFs. Meanwhile, the distribution of transposable elements (TEs) at the family level between wheat and St genomes was compared using the SLAFs. A new oligo-nucleotide probe named Oligo-pSt122 from high SLAF reads was produced for fluorescence in situ hybridization (FISH), and was observed to hybridize to the terminal region of 1St#L and also onto the terminal heterochromatic region of Th. intermedium genomes. The genome-wide markers and repetitive based probe Oligo-pSt122 will be valuable for identifying Thinopyrum chromosome segments in wheat backgrounds.
YAC cloning Mus musculus telomeric DNA: physical, genetic, in situ and STS markers for the distal telomere of chromosome 10.

PubMed

Kipling, D; Wilson, H E; Thomson, E J; Cooke, H J

1995-06-01

Three Mus musculus DBA/2 YAC libraries were constructed using a half-YAC telomere cloning vector. This functional complementation approach yields libraries which include terminal restriction fragments of the mouse genome. Screening all three libraries led to the isolation of 32 independent clones which carry linear YACs containing the mouse terminal repeat sequence, (TTAGGG)n. These YACs provide a resource to isolate regions of the mouse genome close to chromosome termini and excluded from existing conventional YAC libraries. To demonstrate their utility, a hybridization probe was isolated from Mtel-1, the first (TTAGGG)n-containing YAC isolated. This probe detects an approximately 70 kb KpnI fragment in the mouse genome which is sensitive to pretreatment with BAL31 exonuclease. A PCR-based genetic marker generated from the sequence of this probe maps 4.4 cM from the most distal anchor locus on chromosome 10 in the EUCIB interspecific backcross. STS primers for this locus, D10Hgu1, were used to isolate YAC 110F4 from a commercially available mouse YAC library. Fluorescence in situ hybridization demonstrates that YAC 110F4 hybridizes to the distal telomere of chromosome 10. Clones in this collection of telomere YACs therefore partially overlap clones in conventional YAC libraries, and thus the previously unavailable terminal regions of the mouse genome can now be linked with the developing mouse STS YAC contig. Genetic markers such as D10Hgu1 allow the ends of the mouse genetic map to be defined, thus closing the map.
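As an illustration of what a (TTAGGG)n terminal repeat looks like in sequence data, the following Python sketch scans a string for runs of tandem TTAGGG units. It is a minimal in silico example under stated assumptions, not the hybridization-based screening used in the study: the function name `find_telomeric_runs`, the `min_units` threshold, and the toy sequence are all hypothetical, and only the forward strand is searched (the complementary CCCTAA strand would need a second pass).

```python
import re


def find_telomeric_runs(seq, unit="TTAGGG", min_units=3):
    """Return (start, end, n_units) for each run of >= min_units tandem copies."""
    # Builds a pattern such as (?:TTAGGG){3,}
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_units},}}")
    return [
        (m.start(), m.end(), (m.end() - m.start()) // len(unit))
        for m in pattern.finditer(seq.upper())
    ]


# Toy sequence: four tandem units flanked by unrelated bases (illustrative only).
toy = "ACGT" + "TTAGGG" * 4 + "CCA"
print(find_telomeric_runs(toy))
```

Raising `min_units` reduces chance matches in long sequences, at the cost of missing short or interrupted arrays.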
Characterization of transposable elements in the ectomycorrhizal fungus Laccaria bicolor.

PubMed

Labbé, Jessy; Murat, Claude; Morin, Emmanuelle; Tuskan, Gerald A; Le Tacon, François; Martin, Francis

2012-01-01

The publicly available Laccaria bicolor genome sequence has provided a considerable genomic resource allowing systematic identification of transposable elements (TEs) in this symbiotic ectomycorrhizal fungus. Using a TE-specific annotation pipeline we have characterized and analyzed TEs in the L. bicolor S238N-H82 genome. TEs occupy 24% of the 60 Mb L. bicolor genome and represent 25,787 full-length and partial copy elements distributed within 171 families. The most abundant elements were the Copia-like. TEs are not randomly distributed across the genome, but are tightly nested or clustered. The majority of TEs exhibits signs of ancient transposition except some intact copies of terminal inverted repeats (TIRS), long terminal repeats (LTRs) and a large retrotransposon derivative (LARD) element. There were three main periods of TE expansion in L. bicolor: the first from 57 to 10 Mya, the second from 5 to 1 Mya and the most recent from 0.5 Mya ago until now. LTR retrotransposons are closely related to retrotransposons found in another basidiomycete, Coprinopsis cinerea. This analysis 1) represents an initial characterization of TEs in the L. bicolor genome, 2) contributes to improve genome annotation and a greater understanding of the role TEs played in genome organization and evolution and 3) provides a valuable resource for future research on the genome evolution within the Laccaria genus.
Characterization of Transposable Elements in the Ectomycorrhizal Fungus Laccaria bicolor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Labbe, Jessy L; Murat, Claude; Morin, Emmanuelle

2012-01-01

Background: The publicly available Laccaria bicolor genome sequence has provided a considerable genomic resource allowing systematic identification of transposable elements (TEs) in this symbiotic ectomycorrhizal fungus. Using a TE-specific annotation pipeline we have characterized and analyzed TEs in the L. bicolor S238N-H82 genome. Methodology/Principal Findings: TEs occupy 24% of the 60 Mb L. bicolor genome and represent 25,787 full-length and partial copy elements distributed within 171 families. The most abundant elements were the Copia-like. TEs are not randomly distributed across the genome, but are tightly nested or clustered. The majority of TEs exhibits signs of ancient transposition except some intact copies of terminal inverted repeats (TIRS), long terminal repeats (LTRs) and a large retrotransposon derivative (LARD) element. There were three main periods of TE expansion in L. bicolor: the first from 57 to 10 Mya, the second from 5 to 1 Mya and the most recent from 0.5 Mya ago until now. LTR retrotransposons are closely related to retrotransposons found in another basidiomycete, Coprinopsis cinerea. Conclusions: This analysis 1) represents an initial characterization of TEs in the L. bicolor genome, 2) contributes to improve genome annotation and a greater understanding of the role TEs played in genome organization and evolution and 3) provides a valuable resource for future research on the genome evolution within the Laccaria genus.

Characterization of Transposable Elements in Laccaria bicolor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Labbe, Jessy L; Murat, Claude; Morin, Emmanuelle

2012-01-01

Background: The publicly available Laccaria bicolor genome sequence has provided a considerable genomic resource allowing systematic identification of transposable elements (TEs) in this symbiotic ectomycorrhizal fungus. Using a TE-specific annotation pipeline we have characterized and analyzed TEs in the L. bicolor S238N-H82 genome. Methodology/Principal Findings: TEs occupy 24% of the 60 Mb L. bicolor genome and represent 25,787 full-length and partial copy elements distributed within 172 families. The most abundant elements were the Copia-like. TEs are not randomly distributed across the genome, but are tightly nested or clustered. The majority of TEs are ancient except some terminal inverted repeats (TIRS), long terminal repeats (LTRs) and a large retrotransposon derivative (LARD) element. There were three main periods of TE expansion in L. bicolor: the first from 57 to 10 Mya, the second from 5 to 1 Mya and the most recent from 500,000 years ago until now. LTR retrotransposons are closely related to retrotransposons found in another basidiomycete, Coprinopsis cinerea. Conclusions: This analysis represents an initial characterization of TEs in the L. bicolor genome, contributes to genome assembly and to a greater understanding of the role TEs played in genome organization and evolution, and provides a valuable resource for the ongoing Laccaria Pan-Genome project supported by the U.S.-DOE Joint Genome Institute.
Hypervariability of ribosomal DNA at multiple chromosomal sites in lake trout (Salvelinus namaycush).

PubMed

Zhuo, L; Reed, K M; Phillips, R B

1995-06-01

Variation in the intergenic spacer (IGS) of the ribosomal DNA (rDNA) of lake trout (Salvelinus namaycush) was examined. Digestion of genomic DNA with restriction enzymes showed that almost every individual had a unique combination of length variants with most of this variation occurring within rather than between populations. Sequence analysis of a 2.3 kilobase (kb) EcoRI-DraI fragment spanning the 3' end of the 28S coding region and approximately 1.8 kb of the IGS revealed two blocks of repetitive DNA. Putative transcriptional termination sites were found approximately 220 bases (b) downstream from the end of the 28S coding region. Comparison of the 2.3-kb fragments with two longer (3.1 kb) fragments showed that the major difference in length resulted from variation in the number of short (89 b) repeats located 3' to the putative terminator. Repeat units within a single nucleolus organizer region (NOR) appeared relatively homogeneous and genetic analysis found variants to be stably inherited. A comparison of the number of spacer-length variants with the number of NORs found that the number of length variants per individual was always less than the number of NORs. Examination of spacer variants in five populations showed that populations with more NORs had more spacer variants, indicating that variants are present at different rDNA sites on nonhomologous chromosomes.
A parvovirus isolated from royal python (Python regius) is a member of the genus Dependovirus.

PubMed

Farkas, Szilvia L; Zádori, Zoltán; Benko, Mária; Essbauer, Sandra; Harrach, Balázs; Tijssen, Peter

2004-03-01

Parvoviruses were isolated from Python regius and Boa constrictor snakes and propagated in viper heart (VH-2) and iguana heart (IgH-2) cells. The full-length genome of a snake parvovirus was cloned and both strands were sequenced. The organization of the 4432-nt-long genome was found to be typical of parvoviruses. This genome was flanked by inverted terminal repeats (ITRs) of 154 nt, containing 122 nt terminal hairpins and contained two large open reading frames, encoding the non-structural and structural proteins. Genes of this new parvovirus were most similar to those from waterfowl parvoviruses and from adeno-associated viruses (AAVs), albeit to a relatively low degree and with some organizational differences. The structure of its ITRs also closely resembled those of AAVs. Based on these data, we propose to classify this virus, the first serpentine parvovirus to be identified, as serpentine adeno-associated virus (SAAV) in the genus Dependovirus.
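The defining property of an inverted terminal repeat, as described in this record, is that one end of the genome is the reverse complement of the other. The Python sketch below tests that property on a toy sequence. It is an illustrative check only, not the authors' method: the function names are hypothetical, the toy sequence is invented, and exact matching is assumed (real ITRs may carry mismatches).

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_inverted_terminal_repeat(genome, length):
    """True if the first `length` nt equal the reverse complement of the last `length` nt."""
    genome = genome.upper()
    if length <= 0 or 2 * length > len(genome):
        return False
    return genome[:length] == reverse_complement(genome[-length:])


# Toy genome: a 6-nt terminus, a spacer, and the reverse complement of that terminus.
toy = "AACCGT" + "TTTT" + "ACGGTT"
print(has_inverted_terminal_repeat(toy, 6))
```

For a genome like the one described here, the same call would be made with `length=154` on the full 4432-nt sequence, assuming the two ITR copies are exact.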
Cracking the ANP32 whips: Important functions, unequal requirement, and hints at disease implications

PubMed Central

Reilly, Patrick T; Yu, Yun; Hamiche, Ali; Wang, Lishun

2014-01-01

The acidic (leucine-rich) nuclear phosphoprotein 32 kDa (ANP32) family is composed of small, evolutionarily conserved proteins characterized by an N-terminal leucine-rich repeat domain and a C-terminal low-complexity acidic region. The mammalian family members (ANP32A, ANP32B, and ANP32E) are ascribed physiologically diverse functions including chromatin modification and remodelling, apoptotic caspase modulation, protein phosphatase inhibition, as well as regulation of intracellular transport. In addition to reviewing the widespread literature on the topic, we present a concept of the ANP32s as having a whip-like structure. We also present hypotheses that ANP32C and other intronless sequences should not currently be considered bona fide family members, that their disparate necessity in development may be due to compensatory mechanisms, that their contrasting roles in cancer are likely context-dependent, along with an underlying hypothesis that ANP32s represent an important node of physiological regulation by virtue of their diverse biochemical activities. PMID:25156960
Expression, purification and crystallization of the C-terminal LRR domain of Streptococcus pyogenes protein 0843.

PubMed

Haikarainen, Teemu; Loimaranta, Vuokko; Prieto-Lopez, Carlos; Battula, Pradeep; Finne, Jukka; Papageorgiou, Anastassios C

2013-05-01

Streptococcus pyogenes protein 0843 (Spy0843) is a recently identified protein with a potential adhesin function. Sequence analysis has shown that Spy0843 contains two leucine-rich repeat (LRR) domains that mediate interactions with the gp340 receptor. Here, the C-terminal LRR domain was overexpressed in Escherichia coli, purified and crystallized in the presence of 1.7-1.8 M ammonium sulfate pH 7.4 as precipitant. Data were collected from a single crystal to 1.59 Å resolution at 100 K at a synchrotron-radiation source. The crystal was found to belong to space group I41, with unit-cell parameters a = b = 121.4, c = 51.5 Å and one molecule in the asymmetric unit. Elucidation of the crystal structure will provide insights into the interactions of Spy0843 with the gp340 receptor and a better understanding of the role of Spy0843 in streptococcal infections.
Expression, purification and crystallization of the C-terminal LRR domain of Streptococcus pyogenes protein 0843

PubMed Central

Haikarainen, Teemu; Loimaranta, Vuokko; Prieto-Lopez, Carlos; Battula, Pradeep; Finne, Jukka; Papageorgiou, Anastassios C.

2013-01-01

Streptococcus pyogenes protein 0843 (Spy0843) is a recently identified protein with a potential adhesin function. Sequence analysis has shown that Spy0843 contains two leucine-rich repeat (LRR) domains that mediate interactions with the gp340 receptor. Here, the C-terminal LRR domain was overexpressed in Escherichia coli, purified and crystallized in the presence of 1.7–1.8 M ammonium sulfate pH 7.4 as precipitant. Data were collected from a single crystal to 1.59 Å resolution at 100 K at a synchrotron-radiation source. The crystal was found to belong to space group I41, with unit-cell parameters a = b = 121.4, c = 51.5 Å and one molecule in the asymmetric unit. Elucidation of the crystal structure will provide insights into the interactions of Spy0843 with the gp340 receptor and a better understanding of the role of Spy0843 in streptococcal infections. PMID:23695577
A Novel Family of Sequence-specific Endoribonucleases Associated with the Clustered Regularly Interspaced Short Palindromic Repeats

DOE Office of Scientific and Technical Information (OSTI.GOV)

Beloglazova, Natalia; Brown, Greg; Zimmerman, Matthew D.

Clustered regularly interspaced short palindromic repeats (CRISPRs) together with the associated CAS proteins protect microbial cells from invasion by foreign genetic elements using presently unknown molecular mechanisms. All CRISPR systems contain proteins of the CAS2 family, suggesting that these uncharacterized proteins play a central role in this process. Here we show that the CAS2 proteins represent a novel family of endoribonucleases. Six purified CAS2 proteins from diverse organisms cleaved single-stranded RNAs preferentially within U-rich regions. A representative CAS2 enzyme, SSO1404 from Sulfolobus solfataricus, cleaved the phosphodiester linkage on the 3'-side and generated 5'-phosphate- and 3'-hydroxyl-terminated oligonucleotides. The crystal structure of SSO1404 was solved at 1.6 Å resolution revealing the first ribonuclease with a ferredoxin-like fold. Mutagenesis of SSO1404 identified six residues (Tyr-9, Asp-10, Arg-17, Arg-19, Arg-31, and Phe-37) that are important for enzymatic activity and suggested that Asp-10 might be the principal catalytic residue. Thus, CAS2 proteins are sequence-specific endoribonucleases, and we propose that their role in the CRISPR-mediated anti-phage defense might involve degradation of phage or cellular mRNAs.
Isolation, sequencing and expression of RED, a novel human gene encoding an acidic-basic dipeptide repeat.

PubMed

Assier, E; Bouzinba-Segard, H; Stolzenberg, M C; Stephens, R; Bardos, J; Freemont, P; Charron, D; Trowsdale, J; Rich, T

1999-04-16

A novel human gene RED, and the murine homologue, MuRED, were cloned. These genes were named after the extensive stretch of alternating arginine (R) and glutamic acid (E) or aspartic acid (D) residues that they contain. We term this the 'RED' repeat. The genes of both species were expressed in a wide range of tissues and we have mapped the human gene to chromosome 5q22-24. MuRED and RED shared 98% sequence identity at the amino acid level. The open reading frame of both genes encodes a 557 amino acid protein. RED fused to a fluorescent tag was expressed in nuclei of transfected cells and localised to nuclear dots. Co-localisation studies showed that these nuclear dots did not contain either PML or Coilin, which are commonly found in the POD or coiled body nuclear compartments. Deletion of the amino terminal 265 amino acids resulted in a failure to sort efficiently to the nucleus, though nuclear dots were formed. Deletion of a further 50 amino acids from the amino terminus generates a protein that can sort to the nucleus but is unable to generate nuclear dots. Neither construct localised to the nucleolus. The characteristics of RED and its nuclear localisation implicate it as a regulatory protein, possibly involved in transcription.
Single inverted terminal repeats of the Junonia coenia Densovirus promotes somatic chromosomal integration of vector plasmids in insect cells and supports high efficiency expression

USDA-ARS's Scientific Manuscript database

Plasmids that contain a disrupted genome of the Junonia coenia densovirus (JcDNV) integrate into the chromosomes of the somatic cells of insects. When subcloned individually, both the P9 inverted terminal repeat (P9-ITR) and the P93-ITR promote the chromosomal integration of vector plasmids in insec...
Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.

PubMed

Rehm, Charlotte; Wurmthaler, Lena A; Li, Yuanhao; Frickey, Tancred; Hartig, Jörg S

2015-01-01

In prokaryotes simple sequence repeats (SSRs) with unit sizes of 1-5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats was noticed in bacteria, reports about SSRs of 6-9 nt are rare. In particular G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes shows putative G4-forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation of the number of repetitive units was observed with repeat numbers ranging from two up to 26 units. However, a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria.
Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.

PubMed Central

Rehm, Charlotte; Wurmthaler, Lena A.; Li, Yuanhao; Frickey, Tancred; Hartig, Jörg S.

2015-01-01

In prokaryotes simple sequence repeats (SSRs) with unit sizes of 1–5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats was noticed in bacteria, reports about SSRs of 6–9 nt are rare. In particular G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes shows putative G4-forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation of the number of repetitive units was observed with repeat numbers ranging from two up to 26 units. However, a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria. PMID:26695179
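As a rough illustration of the repeat type described in this record, the sketch below scans a DNA string for tandem runs of the GGGNATC unit and reports how many units each run contains. It is only a minimal sketch and not the authors' pipeline: the function name `find_repeat_runs`, the `min_units` parameter, and the toy sequence are assumptions made for the example.

```python
import re

# One GGGNATC unit, where N is any of the four bases.
UNIT = r"GGG[ACGT]ATC"
UNIT_LENGTH = 7

def find_repeat_runs(seq, min_units=2):
    """Return (start, n_units) for every run of at least min_units tandem units.

    Hypothetical helper for illustration; only the forward strand is scanned.
    """
    pattern = re.compile(r"(?:%s){%d,}" % (UNIT, min_units))
    return [
        (m.start(), len(m.group(0)) // UNIT_LENGTH)
        for m in pattern.finditer(seq.upper())
    ]

# Toy sequence (an assumption, not real genomic data): four tandem units,
# the unit count the abstract reports as preferred.
example = "TTA" + "GGGAATC" + "GGGTATC" + "GGGCATC" + "GGGGATC" + "CCA"
runs = find_repeat_runs(example)
print(runs)
```

A real analysis would also scan the reverse complement and record whether each run falls in a coding or non-coding region, which this sketch leaves out.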
Analysis of simple sequence repeat (SSR) structure and sequence within Epichloë endophyte genomes reveals impacts on gene structure and insights into ancestral hybridization events.

PubMed

Clayton, William; Eaton, Carla Jane; Dupont, Pierre-Yves; Gillanders, Tim; Cameron, Nick; Saikia, Sanjay; Scott, Barry

2017-01-01

Epichloë grass endophytes comprise a group of filamentous fungi of both sexual and asexual species. Because these endophytes are known for the beneficial characteristics they endow upon their grass hosts, their identification has been of great agronomic and scientific interest. Simple sequence repeat loci and the variation in their repeat elements have been used to rapidly identify endophyte species and strains; however, little is known of how the structure of repeat elements changes between species and strains, or where these repeat elements are located in the fungal genome. We report on an in-depth analysis of the structure and genomic location of the simple sequence repeat locus B10, commonly used for Epichloë endophyte species identification. The B10 repeat was found to be located within an exon of a putative bZIP transcription factor, suggesting possible impacts on polypeptide sequence and thus protein function. Analysis of this repeat in the asexual endophyte hybrid Epichloë uncinata revealed that the structure of B10 alleles reflects the ancestral species that hybridized to give rise to this species. Understanding the structure and sequence of these simple sequence repeats provides a useful set of tools for readily distinguishing strains and for gaining insights into the ancestral species that have undergone hybridization events.
3′ terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing

PubMed Central

2013-01-01

Background: Post-transcriptional 3′ end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3′ RACE coupled with high-throughput sequencing to characterize the 3′ terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. Results: The 3′ terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3′ terminus of an in vitro transcribed MRP RNA control and the differing 3′ terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). Conclusions: 3′ RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3′ terminal sequences of noncoding RNAs. PMID:24053768
TRAP: automated classification, quantification and annotation of tandemly repeated sequences.

PubMed

Sobreira, Tiago José P; Durham, Alan M; Gruber, Arthur

2006-02-01

TRAP, the Tandem Repeats Analysis Program, is a Perl program that provides a unified set of analyses for the selection, classification, quantification and automated annotation of tandemly repeated sequences. TRAP uses the results of the Tandem Repeats Finder program to perform a global analysis of the satellite content of DNA sequences, permitting researchers to easily assess the tandem repeat content for both individual sequences and whole genomes. The results can be generated in convenient formats such as HTML and comma-separated values. TRAP can also be used to automatically generate annotation data in the format of feature table and GFF files.
Weight distributions for turbo codes using random and nonrandom permutations

NASA Technical Reports Server (NTRS)

Dolinar, S.; Divsalar, D.

1995-01-01

This article takes a preliminary look at the weight distributions achievable for turbo codes using random, nonrandom, and semirandom permutations. Due to the recursiveness of the encoders, it is important to distinguish between self-terminating and non-self-terminating input sequences. The non-self-terminating sequences have little effect on decoder performance, because they accumulate high encoded weight until they are artificially terminated at the end of the block. From probabilistic arguments based on selecting the permutations randomly, it is concluded that the self-terminating weight-2 data sequences are the most important consideration in the design of constituent codes; higher-weight self-terminating sequences have successively decreasing importance. Also, increasing the number of codes and, correspondingly, the number of permutations makes it more and more likely that the bad input sequences will be broken up by one or more of the permuters. It is possible to design nonrandom permutations that ensure that the minimum distance due to weight-2 input sequences grows roughly as the square root of (2N), where N is the block length. However, these nonrandom permutations amplify the bad effects of higher-weight inputs, and as a result they are inferior in performance to randomly selected permutations. But there are 'semirandom' permutations that perform nearly as well as the designed nonrandom permutations with respect to weight-2 input sequences and are not as susceptible to being foiled by higher-weight inputs.
Regions of conservation and divergence in the 3' untranslated sequences of genomic RNA from Ross River virus isolates.

PubMed

Faragher, S G; Dalgarno, L

1986-07-20

The 3' untranslated (UT) sequences of the genomic RNAs of five geographic variants of the alphavirus Ross River virus (RRV) were determined and compared with the 3' UT sequence of RRV T48, the prototype strain. Part of the 3' UT region of Getah virus, a close serological relative of RRV, was also sequenced. The RRV 3' UT region varies markedly in length between variants. Large deletions or insertions, sequence rearrangements and single nucleotide substitutions are observed. A sequence tract of 49 to 58 nucleotides, which is repeated as four blocks in the RRV T48 3' UT region, occurs only once in the 3' UT region of one RRV strain (NB5092), indicating that the existence of repeat sequence blocks is not essential for RRV replication. However, the precise sequence of the 3' proximal copy of the repeat block and its position relative to the poly(A) tail were identical in all RRV isolates examined, suggesting that it has an important role in RRV replication. Nucleotide substitutions between RRV variants are distributed non-randomly along the length of the 3' UT region. The sequence of 120 to 130 nucleotides adjacent to the poly(A) tail is strongly conserved. Getah virus RNA contains three repeat sequence blocks in the 3' UT region. These are similar in sequence to those in RRV RNA but differ in their arrangement. Homology between the RRV and Getah 3' UT sequences is greatest in the 3' proximal repeat sequence block that shows three differences in 49 nucleotides. The 3' proximal repeat in Getah RNA occurs at the same position, relative to the poly(A) tail, as in all RRV variants. The RRV and Getah virus 3' UT sequences show extensive homology in the region between the 3' proximal repeat and the poly(A) tail but, apart from the repeat blocks themselves, they show no significant homology elsewhere.
Crystal structure of the Xpo1p nuclear export complex bound to the SxFG/PxFG repeats of the nucleoporin Nup42p.

PubMed

Koyama, Masako; Hirano, Hidemi; Shirai, Natsuki; Matsuura, Yoshiyuki

2017-10-01

Xpo1p (yeast CRM1) is the major nuclear export receptor that carries a plethora of proteins and ribonucleoproteins from the nucleus to the cytoplasm. The passage of the Xpo1p nuclear export complex through nuclear pore complexes (NPCs) is facilitated by interactions with nucleoporins (Nups) containing extensive repeats of phenylalanine-glycine (so-called FG repeats), although the precise role of each Nup in the nuclear export reaction remains incompletely understood. Here we report structural and biochemical characterization of the interactions between the Xpo1p nuclear export complex and the FG repeats of Nup42p, a nucleoporin that is localized at the cytoplasmic face of yeast NPCs and has a characteristic SxFG/PxFG sequence repeat motif. The crystal structure of the Xpo1p-PKI-Nup42p-Gsp1p-GTP complex identified three binding sites for the SxFG/PxFG repeats on HEAT repeats 14-20 of Xpo1p. Mutational analyses of Nup42p showed that the conserved serines and prolines in the SxFG/PxFG repeats contribute to Xpo1p-Nup42p binding. Our structural and biochemical data suggest that SxFG/PxFG-Nups such as Nup42p and Nup159p at the cytoplasmic face of NPCs provide high-affinity docking sites for the Xpo1p nuclear export complex in the terminal stage of NPC passage and that subsequent disassembly of the nuclear export complex facilitates recycling of free Xpo1p back to the nucleus. © 2017 Molecular Biology Society of Japan and John Wiley & Sons Australia, Ltd.
Unique Features of the Loblolly Pine (Pinus taeda L.) Megagenome Revealed Through Sequence Annotation

PubMed Central

Wegrzyn, Jill L.; Liechty, John D.; Stevens, Kristian A.; Wu, Le-Shin; Loopstra, Carol A.; Vasquez-Gross, Hans A.; Dougherty, William M.; Lin, Brian Y.; Zieve, Jacob J.; Martínez-García, Pedro J.; Holt, Carson; Yandell, Mark; Zimin, Aleksey V.; Yorke, James A.; Crepeau, Marc W.; Puiu, Daniela; Salzberg, Steven L.; de Jong, Pieter J.; Mockaitis, Keithanne; Main, Doreen; Langley, Charles H.; Neale, David B.

2014-01-01

The largest genus in the conifer family Pinaceae is Pinus, with over 100 species. The size and complexity of their genomes (∼20–40 Gb, 2n = 24) have delayed the arrival of a well-annotated reference sequence. In this study, we present the annotation of the first whole-genome shotgun assembly of loblolly pine (Pinus taeda L.), which comprises 20.1 Gb of sequence. The MAKER-P annotation pipeline combined evidence-based alignments and ab initio predictions to generate 50,172 gene models, of which 15,653 are classified as high confidence. Clustering these gene models with 13 other plant species resulted in 20,646 gene families, of which 1554 are predicted to be unique to conifers. Among the conifer gene families, 159 are composed exclusively of loblolly pine members. The gene models for loblolly pine have the highest median and mean intron lengths of 24 fully sequenced plant genomes. Conifer genomes are full of repetitive DNA, with the most significant contributions from long-terminal-repeat retrotransposons. In depth analysis of the tandem and interspersed repetitive content yielded a combined estimate of 82%. PMID:24653211
Active site of tripeptidyl peptidase II from human erythrocytes is of the subtilisin type

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tomkinson, B.; Wernstedt, C.; Hellman, U.

1987-11-01

The present report presents evidence that the amino acid sequence around the serine of the active site of human tripeptidyl peptidase II is of the subtilisin type. The enzyme from human erythrocytes was covalently labeled at its active site with [3H]diisopropyl fluorophosphate, and the protein was subsequently reduced, alkylated, and digested with trypsin. The labeled tryptic peptides were purified by gel filtration and repeated reversed-phase HPLC, and their amino-terminal sequences were determined. Residue 9 contained the radioactive label and was, therefore, considered to be the active serine residue. The primary structure of the part of the active site (residues 1-10) containing this residue was concluded to be Xaa-Thr-Gln-Leu-Met-Asx-Gly-Thr-Ser-Met. This amino acid sequence is homologous to the sequence surrounding the active serine of the microbial peptidases subtilisin and thermitase. These data demonstrate that human tripeptidyl peptidase II represents a potentially distinct class of human peptidases and raise the question of an evolutionary relationship between the active site of a mammalian peptidase and that of the subtilisin family of serine peptidases.
New dye-labeled terminators for improved DNA sequencing patterns.

PubMed Central

Rosenblum, B B; Lee, L G; Spurgeon, S L; Khan, S H; Menchen, S M; Heiner, C R; Chen, S M

1997-01-01

We have used two new dye sets for automated dye-labeled terminator DNA sequencing. One set consists of four, 4,7-dichlororhodamine dyes (d-rhodamines). The second set consists of energy-transfer dyes that use the 5-carboxy-d-rhodamine dyes as acceptor dyes and the 5- or 6-carboxy isomers of 4'-aminomethylfluorescein as the donor dye. Both dye sets utilize a new linker between the dye and the nucleotide, and both provide more even peak heights in terminator sequencing than the dye-terminators consisting of unsubstituted rhodamine dyes. The unsubstituted rhodamine terminators produced electropherograms in which weak G peaks are observed after A peaks and occasionally C peaks. The number of weak G peaks has been reduced or eliminated with the new dye terminators. The general improvement in peak evenness improves accuracy for the automated base-calling software. The improved signal-to-noise ratio of the energy-transfer dye-labeled terminators combined with more even peak heights results in successful sequencing of high molecular weight DNA templates such as bacterial artificial chromosome DNA. PMID:9358158

Pathogenic Leptospira species express surface-exposed proteins belonging to the bacterial immunoglobulin superfamily

PubMed Central

Matsunaga, James; Barocchi, Michele A.; Croda, Julio; Young, Tracy A.; Sanchez, Yolanda; Siqueira, Isadora; Bolin, Carole A.; Reis, Mitermayer G.; Riley, Lee W.; Haake, David A.; Ko, Albert I.

2005-01-01

Summary Proteins with bacterial immunoglobulin-like (Big) domains, such as the Yersinia pseudotuberculosis invasin and Escherichia coli intimin, are surface-expressed proteins that mediate host mammalian cell invasion or attachment. Here, we report the identification and characterization of a new family of Big domain proteins, referred to as Lig (leptospiral Ig-like) proteins, in pathogenic Leptospira. Screening of L. interrogans and L. kirschneri expression libraries with sera from leptospirosis patients identified 13 lambda phage clones that encode tandem repeats of the 90 amino acid Big domain. Two lig genes, designated ligA and ligB, and one pseudo-gene, ligC, were identified. The ligA and ligB genes encode amino-terminal lipoprotein signal peptides followed by 10 or 11 Big domain repeats and, in the case of ligB, a unique carboxy-terminal non-repeat domain. The organization of ligC is similar to that of ligB but contains mutations that disrupt the reading frame. The lig sequences are present in pathogenic but not saprophytic Leptospira species. LigA and LigB are expressed by a variety of virulent leptospiral strains. Loss of Lig protein and RNA transcript expression is correlated with the observed loss of virulence during culture attenuation of pathogenic strains. High-pressure freeze substitution followed by immunocytochemical electron microscopy confirmed that the Lig proteins were localized to the bacterial surface. Immunoblot studies with patient sera found that the Lig proteins are a major antigen recognized during the acute host infection. These observations demonstrate that the Lig proteins are a newly identified surface protein of pathogenic Leptospira, which by analogy to other bacterial immunoglobulin superfamily virulence factors, may play a role in host cell attachment and invasion during leptospiral pathogenesis. PMID:12890019
Pathogenic Leptospira species express surface-exposed proteins belonging to the bacterial immunoglobulin superfamily.

PubMed

Matsunaga, James; Barocchi, Michele A; Croda, Julio; Young, Tracy A; Sanchez, Yolanda; Siqueira, Isadora; Bolin, Carole A; Reis, Mitermayer G; Riley, Lee W; Haake, David A; Ko, Albert I

2003-08-01

Proteins with bacterial immunoglobulin-like (Big) domains, such as the Yersinia pseudotuberculosis invasin and Escherichia coli intimin, are surface-expressed proteins that mediate host mammalian cell invasion or attachment. Here, we report the identification and characterization of a new family of Big domain proteins, referred to as Lig (leptospiral Ig-like) proteins, in pathogenic Leptospira. Screening of L. interrogans and L. kirschneri expression libraries with sera from leptospirosis patients identified 13 lambda phage clones that encode tandem repeats of the 90 amino acid Big domain. Two lig genes, designated ligA and ligB, and one pseudogene, ligC, were identified. The ligA and ligB genes encode amino-terminal lipoprotein signal peptides followed by 10 or 11 Big domain repeats and, in the case of ligB, a unique carboxy-terminal non-repeat domain. The organization of ligC is similar to that of ligB but contains mutations that disrupt the reading frame. The lig sequences are present in pathogenic but not saprophytic Leptospira species. LigA and LigB are expressed by a variety of virulent leptospiral strains. Loss of Lig protein and RNA transcript expression is correlated with the observed loss of virulence during culture attenuation of pathogenic strains. High-pressure freeze substitution followed by immunocytochemical electron microscopy confirmed that the Lig proteins were localized to the bacterial surface. Immunoblot studies with patient sera found that the Lig proteins are a major antigen recognized during the acute host infection. These observations demonstrate that the Lig proteins are a newly identified surface protein of pathogenic Leptospira, which by analogy to other bacterial immunoglobulin superfamily virulence factors, may play a role in host cell attachment and invasion during leptospiral pathogenesis.
RAG1 Core and V(D)J Recombination Signal Sequences Were Derived from Transib Transposons

PubMed Central

2005-01-01

The V(D)J recombination reaction in jawed vertebrates is catalyzed by the RAG1 and RAG2 proteins, which are believed to have emerged approximately 500 million years ago from transposon-encoded proteins. Yet no transposase sequence similar to RAG1 or RAG2 has been found. Here we show that the approximately 600-amino acid “core” region of RAG1 required for its catalytic activity is significantly similar to the transposase encoded by DNA transposons that belong to the Transib superfamily. This superfamily was discovered recently based on computational analysis of the fruit fly and African malaria mosquito genomes. Transib transposons also are present in the genomes of sea urchin, yellow fever mosquito, silkworm, dog hookworm, hydra, and soybean rust. We demonstrate that recombination signal sequences (RSSs) were derived from terminal inverted repeats of an ancient Transib transposon. Furthermore, the critical DDE catalytic triad of RAG1 is shared with the Transib transposase as part of conserved motifs. We also studied several divergent proteins encoded by the sea urchin and lancelet genomes that are 25%−30% identical to the RAG1 N-terminal domain and the RAG1 core. Our results provide the first direct evidence linking RAG1 and RSSs to a specific superfamily of DNA transposons and indicate that the V(D)J machinery evolved from transposons. We propose that only the RAG1 core was derived from the Transib transposase, whereas the N-terminal domain was assembled from separate proteins of unknown function that may still be active in sea urchin, lancelet, hydra, and starlet sea anemone. We also suggest that the RAG2 protein was not encoded by ancient Transib transposons but emerged in jawed vertebrates as a counterpart of RAG1 necessary for the V(D)J recombination reaction. PMID:15898832
The complete mitochondrial genome of Pomacea canaliculata (Gastropoda: Ampullariidae).

PubMed

Zhou, Xuming; Chen, Yu; Zhu, Shanliang; Xu, Haigen; Liu, Yan; Chen, Lian

2016-01-01

The mitochondrial genome of Pomacea canaliculata (Gastropoda: Ampullariidae) is the first complete mtDNA sequence reported in the genus Pomacea. The total length of the mtDNA is 15,707 bp, and it contains 13 protein-coding genes, 2 ribosomal RNAs, 22 transfer RNAs, and a 359 bp non-coding region. The A + T content of the overall base composition of the H-strand is 71.7% (T: 41%, C: 12.7%, A: 30.7%, G: 15.6%). The ATP6, ATP8, CO1, CO2, ND1-3, ND5, ND6, ND4L and Cyt b genes begin with ATG as the start codon, whereas CO3 and ND4 begin with ATA. The ATP8, CO2-3, ND4L, ND2-6 and Cyt b genes are terminated with TAA as the stop codon, whereas ATP6, ND1, and CO1 end with TAG. A long non-coding region is found, and a 23 bp repeat unit is repeated 11 times in this region.
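The figures quoted in this record can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the numbers as printed; it assumes that the "long non-coding region" holding the repeats is the 359 bp region mentioned earlier in the abstract, which the record does not state explicitly.

```python
# Base composition of the H-strand as quoted in the abstract (percent).
composition = {"T": 41.0, "C": 12.7, "A": 30.7, "G": 15.6}

# A + T content should reproduce the quoted 71.7%.
at_content = composition["A"] + composition["T"]

# The four percentages should sum to 100%.
total = sum(composition.values())

# Eleven copies of a 23 bp unit; under the stated assumption this span
# must fit inside the 359 bp non-coding region.
repeat_span = 23 * 11
noncoding_length = 359

print(round(at_content, 1), round(total, 1), repeat_span, repeat_span <= noncoding_length)
```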
MERE1, a low-copy-number copia-type retroelement in Medicago truncatula active during tissue culture.

PubMed

Rakocevic, Alexandra; Mondy, Samuel; Tirichine, Leïla; Cosson, Viviane; Brocard, Lysiane; Iantcheva, Anelia; Cayrel, Anne; Devier, Benjamin; Abu El-Heba, Ghada Ahmed; Ratet, Pascal

2009-11-01

We have identified an active Medicago truncatula copia-like retroelement called Medicago RetroElement1-1 (MERE1-1) as an insertion in the symbiotic NSP2 gene. MERE1-1 belongs to a low-copy-number family in the sequenced Medicago genome. These copies are highly related, but only three of them have a complete coding region and polymorphism exists between the long terminal repeats of these different copies. This retroelement family is present in all M. truncatula ecotypes tested but also in other legume species like Lotus japonicus. It is active only during tissue culture in both R108 and Jemalong Medicago accessions and inserts preferentially in genes.
Mitochondrial DNA repairs double-strand breaks in yeast chromosomes.

PubMed

Ricchetti, M; Fairhead, C; Dujon, B

1999-11-04

The endosymbiotic theory for the origin of eukaryotic cells proposes that genetic information can be transferred from mitochondria to the nucleus of a cell, and genes that are probably of mitochondrial origin have been found in nuclear chromosomes. Occasionally, short or rearranged sequences homologous to mitochondrial DNA are seen in the chromosomes of different organisms including yeast, plants and humans. Here we report a mechanism by which fragments of mitochondrial DNA, in single or tandem array, are transferred to yeast chromosomes under natural conditions during the repair of double-strand breaks in haploid mitotic cells. These repair insertions originate from noncontiguous regions of the mitochondrial genome. Our analysis of the Saccharomyces cerevisiae mitochondrial genome indicates that the yeast nuclear genome does indeed contain several short sequences of mitochondrial origin which are similar in size and composition to those that repair double-strand breaks. These sequences are located predominantly in non-coding regions of the chromosomes, frequently in the vicinity of retrotransposon long terminal repeats, and appear as recent integration events. Thus, colonization of the yeast genome by mitochondrial DNA is an ongoing process.
Complete Genome Sequence of Germline Chromosomally Integrated Human Herpesvirus 6A and Analyses Integration Sites Define a New Human Endogenous Virus with Potential to Reactivate as an Emerging Infection.

PubMed

Tweedy, Joshua; Spyrou, Maria Alexandra; Pearson, Max; Lassner, Dirk; Kuhl, Uwe; Gompels, Ursula A

2016-01-15

Human herpesvirus-6A and B (HHV-6A, HHV-6B) have recently defined endogenous genomes, resulting from integration into the germline: chromosomally-integrated "CiHHV-6A/B". These affect approximately 1.0% of human populations, giving potential for virus gene expression in every cell. We previously showed that CiHHV-6A was more divergent than CiHHV-6B by examining four genes in 44 European CiHHV-6A/B cardiac/haematology patients. There was evidence for gene expression/reactivation, implying functional non-defective genomes. To further define the relationship between HHV-6A and CiHHV-6A we used next-generation sequencing to characterize genomes from three CiHHV-6A cardiac patients. Comparisons to known exogenous HHV-6A showed CiHHV-6A genomes formed a separate clade; including all 85 non-interrupted genes and necessary cis-acting signals for reactivation as infectious virus. Greater single nucleotide polymorphism (SNP) density was defined in 16 genes and the direct repeats (DR) terminal regions. Using these SNPs, deep sequencing analyses demonstrated superinfection with exogenous HHV-6A in two of the CiHHV-6A patients with recurrent cardiac disease. Characterisation of the integration sites in twelve patients identified the human chromosome 17p subtelomere as a prevalent site, which had specific repeat structures and phylogenetically related CiHHV-6A coding sequences indicating common ancestral origins. Overall CiHHV-6A genomes were similar, but distinct from known exogenous HHV-6A virus, and have the capacity to reactivate as emerging virus infections.
Comparative Genomics of Carp Herpesviruses

PubMed Central

Kurobe, Tomofumi; Gatherer, Derek; Cunningham, Charles; Korf, Ian; Fukuda, Hideo; Hedrick, Ronald P.; Waltzek, Thomas B.

2013-01-01

Three alloherpesviruses are known to cause disease in cyprinid fish: cyprinid herpesviruses 1 and 3 (CyHV1 and CyHV3) in common carp and koi and cyprinid herpesvirus 2 (CyHV2) in goldfish. We have determined the genome sequences of CyHV1 and CyHV2 and compared them with the published CyHV3 sequence. The CyHV1 and CyHV2 genomes are 291,144 and 290,304 bp, respectively, in size, and thus the CyHV3 genome, at 295,146 bp, remains the largest recorded among the herpesviruses. Each of the three genomes consists of a unique region flanked at each terminus by a sizeable direct repeat. The CyHV1, CyHV2, and CyHV3 genomes are predicted to contain 137, 150, and 155 unique, functional protein-coding genes, respectively, of which six, four, and eight, respectively, are duplicated in the terminal repeat. The three viruses share 120 orthologous genes in a largely colinear arrangement, of which up to 55 are also conserved in the other member of the genus Cyprinivirus, anguillid herpesvirus 1. Twelve genes are conserved convincingly in all sequenced alloherpesviruses, and two others are conserved marginally. The reference CyHV3 strain has been reported to contain five fragmented genes that are presumably nonfunctional. The CyHV2 strain has two fragmented genes, and the CyHV1 strain has none. CyHV1, CyHV2, and CyHV3 have five, six, and five families of paralogous genes, respectively. One family unique to CyHV1 is related to cellular JUNB, which encodes a transcription factor involved in oncogenesis. To our knowledge, this is the first time that JUNB-related sequences have been reported in a herpesvirus. PMID:23269803
Complete Genome Sequence of Germline Chromosomally Integrated Human Herpesvirus 6A and Analyses Integration Sites Define a New Human Endogenous Virus with Potential to Reactivate as an Emerging Infection

PubMed Central

Tweedy, Joshua; Spyrou, Maria Alexandra; Pearson, Max; Lassner, Dirk; Kuhl, Uwe; Gompels, Ursula A.

2016-01-01

Human herpesvirus-6A and B (HHV-6A, HHV-6B) have recently defined endogenous genomes, resulting from integration into the germline: chromosomally-integrated “CiHHV-6A/B”. These affect approximately 1.0% of human populations, giving potential for virus gene expression in every cell. We previously showed that CiHHV-6A was more divergent than CiHHV-6B by examining four genes in 44 European CiHHV-6A/B cardiac/haematology patients. There was evidence for gene expression/reactivation, implying functional non-defective genomes. To further define the relationship between HHV-6A and CiHHV-6A we used next-generation sequencing to characterize genomes from three CiHHV-6A cardiac patients. Comparisons to known exogenous HHV-6A showed CiHHV-6A genomes formed a separate clade; including all 85 non-interrupted genes and necessary cis-acting signals for reactivation as infectious virus. Greater single nucleotide polymorphism (SNP) density was defined in 16 genes and the direct repeats (DR) terminal regions. Using these SNPs, deep sequencing analyses demonstrated superinfection with exogenous HHV-6A in two of the CiHHV-6A patients with recurrent cardiac disease. Characterisation of the integration sites in twelve patients identified the human chromosome 17p subtelomere as a prevalent site, which had specific repeat structures and phylogenetically related CiHHV-6A coding sequences indicating common ancestral origins. Overall CiHHV-6A genomes were similar, but distinct from known exogenous HHV-6A virus, and have the capacity to reactivate as emerging virus infections. PMID:26784220
Murine recessive hereditary spherocytosis, sph/sph, is caused by a mutation in the erythroid alpha-spectrin gene.

PubMed

Wandersee, N J; Birkenmeier, C S; Gifford, E J; Mohandas, N; Barker, J E

2000-01-01

Spectrin, a heterodimer of alpha- and beta-subunits, is the major protein component of the red blood cell membrane skeleton. The mouse mutation, sph, causes an alpha-spectrin-deficient hereditary spherocytosis with the severe phenotype typical of recessive hereditary spherocytosis in humans. The sph mutation maps to the erythroid alpha-spectrin locus, Spna1, on Chromosome 1. Scanning electron microscopy, osmotic gradient ektacytometry, cDNA cloning, RT-PCR, nucleic acid sequencing, and Northern blot analyses were used to characterize the wild type and sph alleles of the Spna1 locus. Our results confirm the spherocytic nature of sph/sph red blood cells and document a mild spherocytic transition in the +/sph heterozygotes. Sequencing of the full length coding region of the Spna1 wild type allele from the C57BL/6J strain of mice reveals a 2414 residue deduced amino acid sequence that shows the typical 106-amino-acid repeat structure previously described for other members of the spectrin protein family. Sequence analysis of RT-PCR clones from sph/sph alpha-spectrin mRNA identified a single base deletion in repeat 5 that would cause a frame shift and premature termination of the protein. This deletion was confirmed in sph/sph genomic DNA. Northern blot analyses of the distribution of Spna1 mRNA in non-erythroid tissues detects the expression of 8, 2.5 and 2.0 kb transcripts in adult heart. These results predict the heart as an additional site where alpha-spectrin mutations may produce a phenotype and raise the possibility that a novel functional class of small alpha-spectrin isoforms may exist.
Terminator Detection by Support Vector Machine Utilizing a Stochastic Context-Free Grammar

DOE Office of Scientific and Technical Information (OSTI.GOV)

Francis-Lyon, Patricia; Cristianini, Nello; Holbrook, Stephen

2006-12-30

A 2-stage detector was designed to find rho-independent transcription terminators in the Escherichia coli genome. The detector includes a Stochastic Context Free Grammar (SCFG) component and a Support Vector Machine (SVM) component. To find terminators, the SCFG searches the intergenic regions of nucleotide sequence for local matches to a terminator grammar that was designed and trained utilizing examples of known terminators. The grammar selects sequences that are the best candidates for terminators and assigns them a prefix, stem-loop, suffix structure using the Cocke-Younger-Kasami (CYK) algorithm, modified to incorporate energy effects of base pairing. The parameters from this inferred structure are passed to the SVM classifier, which distinguishes terminators from non-terminators that score high according to the terminator grammar. The SVM was trained with negative examples drawn from intergenic sequences that include both featureless and RNA gene regions (which were assigned prefix, stem-loop, suffix structure by the SCFG), so that it successfully distinguishes terminators from either of these. The classifier was found to be 96.4% successful during testing.
Amyloid fibril formation from sequences of a natural beta-structured fibrous protein, the adenovirus fiber.

PubMed

Papanikolopoulou, Katerina; Schoehn, Guy; Forge, Vincent; Forsyth, V Trevor; Riekel, Christian; Hernandez, Jean-François; Ruigrok, Rob W H; Mitraki, Anna

2005-01-28

Amyloid fibrils are fibrous beta-structures that derive from abnormal folding and assembly of peptides and proteins. Despite a wealth of structural studies on amyloids, the nature of the amyloid structure remains elusive; possible connections to natural, beta-structured fibrous motifs have been suggested. In this work we focus on understanding amyloid structure and formation from sequences of a natural, beta-structured fibrous protein. We show that short peptides (25 to 6 amino acids) corresponding to repetitive sequences from the adenovirus fiber shaft have an intrinsic capacity to form amyloid fibrils as judged by electron microscopy, Congo Red binding, infrared spectroscopy, and x-ray fiber diffraction. In the presence of the globular C-terminal domain of the protein that acts as a trimerization motif, the shaft sequences adopt a triple-stranded, beta-fibrous motif. We discuss the possible structure and arrangement of these sequences within the amyloid fibril, as compared with the one adopted within the native structure. A 6-amino acid peptide, corresponding to the last beta-strand of the shaft, was found to be sufficient to form amyloid fibrils. Structural analysis of these amyloid fibrils suggests that perpendicular stacking of beta-strand repeat units is an underlying common feature of amyloid formation.
A Tool for Multiple Targeted Genome Deletions that Is Precise, Scar-Free, and Suitable for Automation.

PubMed

Aubrey, Wayne; Riley, Michael C; Young, Michael; King, Ross D; Oliver, Stephen G; Clare, Amanda

2015-01-01

Many advances in synthetic biology require the removal of a large number of genomic elements from a genome. Most existing deletion methods leave behind markers, and as there are a limited number of markers, such methods can only be applied a fixed number of times. Deletion methods that recycle markers generally are either imprecise (remove untargeted sequences), or leave scar sequences which can cause genome instability and rearrangements. No existing marker recycling method is automation-friendly. We have developed a novel openly available deletion tool that consists of: 1) a method for deleting genomic elements that can be repeatedly used without limit, is precise, scar-free, and suitable for automation; and 2) software to design the method's primers. Our tool is sequence agnostic and could be used to delete large numbers of coding sequences, promoter regions, transcription factor binding sites, terminators, etc in a single genome. We have validated our tool on the deletion of non-essential open reading frames (ORFs) from S. cerevisiae. The tool is applicable to arbitrary genomes, and we provide primer sequences for the deletion of: 90% of the ORFs from the S. cerevisiae genome, 88% of the ORFs from S. pombe genome, and 85% of the ORFs from the L. lactis genome.
Saccharomyces cerevisiae SSB1 protein and its relationship to nucleolar RNA-binding proteins.

PubMed

Jong, A Y; Clark, M W; Gilbert, M; Oehm, A; Campbell, J L

1987-08-01

To better define the function of Saccharomyces cerevisiae SSB1, an abundant single-stranded nucleic acid-binding protein, we determined the nucleotide sequence of the SSB1 gene and compared it with those of other proteins of known function. The amino acid sequence contains 293 amino acid residues and has an Mr of 32,853. There are several stretches of sequence characteristic of other eucaryotic single-stranded nucleic acid-binding proteins. At the amino terminus, residues 39 to 54 are highly homologous to a peptide in calf thymus UP1 and UP2 and a human heterogeneous nuclear ribonucleoprotein. Residues 125 to 162 constitute a fivefold tandem repeat of the sequence RGGFRG, the composition of which suggests a nucleic acid-binding site. Near the C terminus, residues 233 to 245 are homologous to several RNA-binding proteins. Of 18 C-terminal residues, 10 are acidic, a characteristic of the procaryotic single-stranded DNA-binding proteins and eucaryotic DNA- and RNA-binding proteins. In addition, examination of the subcellular distribution of SSB1 by immunofluorescence microscopy indicated that SSB1 is a nuclear protein, predominantly located in the nucleolus. Sequence homologies and the nucleolar localization make it likely that SSB1 functions in RNA metabolism in vivo, although an additional role in DNA metabolism cannot be excluded.
A Tool for Multiple Targeted Genome Deletions that Is Precise, Scar-Free, and Suitable for Automation

PubMed Central

Aubrey, Wayne; Riley, Michael C.; Young, Michael; King, Ross D.; Oliver, Stephen G.; Clare, Amanda

2015-01-01

Many advances in synthetic biology require the removal of a large number of genomic elements from a genome. Most existing deletion methods leave behind markers, and as there are a limited number of markers, such methods can only be applied a fixed number of times. Deletion methods that recycle markers generally are either imprecise (remove untargeted sequences), or leave scar sequences which can cause genome instability and rearrangements. No existing marker recycling method is automation-friendly. We have developed a novel openly available deletion tool that consists of: 1) a method for deleting genomic elements that can be repeatedly used without limit, is precise, scar-free, and suitable for automation; and 2) software to design the method’s primers. Our tool is sequence agnostic and could be used to delete large numbers of coding sequences, promoter regions, transcription factor binding sites, terminators, etc in a single genome. We have validated our tool on the deletion of non-essential open reading frames (ORFs) from S. cerevisiae. The tool is applicable to arbitrary genomes, and we provide primer sequences for the deletion of: 90% of the ORFs from the S. cerevisiae genome, 88% of the ORFs from S. pombe genome, and 85% of the ORFs from the L. lactis genome. PMID:26630677
The M-T Hook Structure Is Critical for Design of HIV-1 Fusion Inhibitors*

PubMed Central

Chong, Huihui; Yao, Xue; Sun, Jianping; Qiu, Zonglin; Zhang, Meng; Waltersperger, Sandro; Wang, Meitian; Cui, Sheng; He, Yuxian

2012-01-01

CP621-652 is a potent HIV-1 fusion inhibitor peptide derived from the C-terminal heptad repeat of gp41. We recently identified that its N-terminal residues Met-626 and Thr-627 adopt a unique hook-like structure (termed M-T hook) thus stabilizing the interaction of the inhibitor with the deep pocket on the N-terminal heptad repeat. In this study, we further demonstrated that the M-T hook structure is a key determinant of CP621-652 in terms of its thermostability and anti-HIV activity. To directly define the structure and function of the M-T hook, we generated the peptide MT-C34 by incorporating Met-626 and Thr-627 into the N terminus of the C-terminal heptad repeat-derived peptide C34. The high resolution crystal structure (1.9 Å) of MT-C34 complexed by an N-terminal heptad repeat-derived peptide reveals that the M-T hook conformation is well preserved at the N-terminal extreme of the inhibitor. Strikingly, addition of two hook residues could dramatically enhance the binding affinity and thermostability of 6-helix bundle core. Compared with C34, MT-C34 exhibited significantly increased activity to inhibit HIV-1 envelope-mediated cell fusion (6.6-fold), virus entry (4.5-fold), and replication (6-fold). Mechanistically, MT-C34 had a 10.5-fold higher increase than C34 in blocking 6-helix bundle formation. We further showed that MT-C34 possessed higher potency against T20 (Enfuvirtide, Fuzeon)-resistant HIV-1 variants. Therefore, this study provides convincing data for our proposed concept that the M-T hook structure is critical for designing HIV-1 fusion inhibitors. PMID:22879603
Helix Unwinding and Base Flipping Enable Human MTERF1 to Terminate Mitochondrial Transcription

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yakubovskaya, E.; Mejia, E; Byrnes, J

2010-01-01

Defects in mitochondrial gene expression are associated with aging and disease. Mterf proteins have been implicated in modulating transcription, replication and protein synthesis. We have solved the structure of a member of this family, the human mitochondrial transcriptional terminator MTERF1, bound to dsDNA containing the termination sequence. The structure indicates that upon sequence recognition MTERF1 unwinds the DNA molecule, promoting eversion of three nucleotides. Base flipping is critical for stable binding and transcriptional termination. Additional structural and biochemical results provide insight into the DNA binding mechanism and explain how MTERF1 recognizes its target sequence. Finally, we have demonstrated that the mitochondrial pathogenic G3249A and G3244A mutations interfere with key interactions for sequence recognition, eliminating termination. Our results provide insight into the role of mterf proteins and suggest a link between mitochondrial disease and the regulation of mitochondrial transcription.
Recombination-dependent replication and gene conversion homogenize repeat sequences and diversify plastid genome structure.

PubMed

Ruhlman, Tracey A; Zhang, Jin; Blazier, John C; Sabir, Jamal S M; Jansen, Robert K

2017-04-01

There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact, more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could not only create inversion isomers of so-called single copy regions, but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule real-time (SMRT) sequences to characterize repeat structure in the plastome of Monsonia emarginata (Geraniaceae). We used OrgConv and inspected nucleotide alignments to infer ancestral nucleotides and identify gene conversion among repeats and mapped long (>1 kb) SMRT reads against the unit-genome assembly to identify alternative sequence arrangements. Although M. emarginata lacks the canonical IR, we found that large repeats (>1 kilobase; kb) represent ∼22% of the plastome nucleotide content. Among the largest repeats (>2 kb), we identified GC-biased gene conversion and mapping filtered, long SMRT reads to the M. emarginata unit-genome assembly revealed alternative, substoichiometric sequence arrangements. We offer a model based on RDR and gene conversion between long repeated sequences in the M. emarginata plastome and provide support that both intra-and intermolecular recombination between large repeats, particularly in repeat-rich plastomes, varies unit-genome structure while homogenizing the nucleotide sequence of repeats. © 2017 Botanical Society of America.
Characterization of the IGF-II Binding Site of the IGF-II/Man-6-P Receptor Extracellular Domain.

NASA Astrophysics Data System (ADS)

Garmroudi, Farideh

1995-01-01

In mammals, insulin-like growth factor II (IGF-II) and glycoproteins bearing the mannose 6-phosphate (Man-6-P) recognition marker bind with high affinity to the same receptor. The functional consequences of IGF-II binding to the receptor at the cell surface are not clear. In these studies, we sought to broaden our understanding of the functional regions of the receptor regarding its IGF-II binding site. The IGF-II binding/cross-linking domain of the IGF-II/Man-6-P receptor was mapped by sequencing receptor fragments covalently attached to IGF-II. Purified rat placental or bovine liver receptors were affinity-labeled with 125I-IGF-II and digested with endoproteinase Glu-C. Analysis of digests by gel electrophoresis revealed a major radiolabeled band of 18 kDa, which was purified by gel filtration chromatography followed by reverse-phase HPLC and electroblotting. Sequence analysis revealed that the peptide S(H)VNSXPMF, located within extracellular repeat 10 and beginning with serine 1488 of the bovine receptor, was the best candidate for the IGF-II cross-linked peptide. These data indicated that residues within repeats 10-11 were important for IGF-II binding. To define the location of the IGF-II binding site further, a nested set of six human receptor cDNA constructs was designed to produce epitope-tagged fusion proteins encompassing the region between repeats 8 and 11 of the human IGF-II/Man-6-P receptor extracellular domain. These truncated receptors were transiently expressed in COS-7 cells, immunoprecipitated and analyzed for their abilities to bind and cross-link to IGF-II. All of the constructs were capable of binding/cross-linking to IGF-II, except for the 9.0-11 construct. Displacement curve analysis indicated that the truncated receptors were approximately equivalent in IGF-II binding affinity, but were of 5- to 10-fold lower affinity than full-length receptors. Sequencing of the 9.0-11 construct indicated the presence of a point mutation substituting threonine for isoleucine at position 1621, which is located in the N-terminal half of repeat 11, and was found to abrogate IGF-II binding. Collectively, our work indicates that repeat 11 of the IGF-II/Man-6-P receptor's extracellular domain encompasses the elements both for binding and cross-linking to IGF-II.
Amino terminus of substance P potentiates kainic acid-induced activity in the mouse spinal cord.

PubMed

Larson, A A; Sun, X

1992-12-01

Sensitization to the behavioral effects produced by repeated injections of kainic acid (KA) into the mouse spinal cord area has been previously shown to be abolished by pretreatment with capsaicin, a neurotoxin of substance P (SP)-containing primary afferent C-fibers. While SP has a variety of well characterized biological actions that are mediated by interactions of its COOH terminus with neurokinin receptors, more recently we have characterized an amino-terminally directed SP binding site. The present studies were initiated to determine whether behavioral sensitization to repeated injections of intrathecally administered KA is mediated by the COOH or NH2 terminal of SP. In the present studies, pretreatment with SP(1-7), an NH2-terminal fragment of SP, but not SP(5-11), a COOH-terminal fragment, potentiated KA-induced behavioral activity in mice. Pretreatment with [D-Pro2,D-Phe7]SP(1-7), an inhibitor of SP NH2-terminal binding, blocked the potentiative effect of SP(1-7) as well as the sensitization to repeated injections of KA. In contrast, [D-Pro2,D-Trp7,9]SP, a neurokinin antagonist, had little effect on behavioral sensitization to KA. The present study suggests that SP has an important modulatory role on excitatory amino acid activity in the spinal cord that is mediated by an action of the NH2 terminal of SP at a non-neurokinin receptor.

Energy efficiency in wireless communication systems

DOEpatents

Caffrey, Michael Paul; Palmer, Joseph McRae

2012-12-11

Wireless communication systems and methods utilize one or more remote terminals, one or more base terminals, and a communication channel between the remote terminal(s) and base terminal(s). The remote terminal applies a direct sequence spreading code to a data signal at a spreading factor to provide a direct sequence spread spectrum (DSSS) signal. The DSSS signal is transmitted over the communication channel to the base terminal which can be configured to despread the received DSSS signal by a spreading factor matching the spreading factor utilized to spread the data signal. The remote terminal and base terminal can dynamically vary the matching spreading factors to adjust the data rate based on an estimation of operating quality over time between the remote terminal and base terminal such that the amount of data being transmitted is substantially maximized while providing a specified quality of service.
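The spreading and despreading steps described in this record can be illustrated with a minimal Python sketch. It is not the patented method: the pseudo-random chip code, the function names, and the noise-free channel are all assumptions made for illustration, and the quality-based adaptation of the spreading factor is not modeled. The sketch only shows that a receiver using a matching code and spreading factor recovers the data, and that a larger spreading factor sends more chips per bit (a lower data rate at a fixed chip rate).

```python
import random


def make_chip_code(spreading_factor, seed=0):
    """Return a pseudo-random +/-1 chip sequence (hypothetical stand-in for a real spreading code)."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(spreading_factor)]


def spread(bits, code):
    """Map bits {0, 1} to symbols {-1, +1} and multiply each symbol by the full chip code."""
    chips = []
    for bit in bits:
        symbol = 1 if bit else -1
        chips.extend(symbol * c for c in code)
    return chips


def despread(chips, code):
    """Correlate each block of len(code) chips with the code; the sign gives the bit."""
    n = len(code)
    bits = []
    for start in range(0, len(chips), n):
        block = chips[start:start + n]
        correlation = sum(x * c for x, c in zip(block, code))
        bits.append(1 if correlation > 0 else 0)
    return bits


data = [1, 0, 1, 1, 0]
for spreading_factor in (4, 16):
    code = make_chip_code(spreading_factor)
    signal = spread(data, code)
    recovered = despread(signal, code)
    print(spreading_factor, len(signal), recovered == data)
```

With a fixed chip rate, the spreading factor of 16 carries one quarter of the data rate of the factor of 4, which is the trade-off the record describes adjusting over time.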
Terminal-Repeat Retrotransposons with GAG Domain in Plant Genomes: A New Testimony on the Complex World of Transposable Elements

PubMed Central

Chaparro, Cristian; Gayraud, Thomas; de Souza, Rogerio Fernandes; Domingues, Douglas Silva; Akaffou, Sélastique; Laforga Vanzela, Andre Luis; de Kochko, Alexandre; Rigoreau, Michel; Crouzillat, Dominique; Hamon, Serge; Hamon, Perla; Guyot, Romain

2015-01-01

A novel structure of nonautonomous long terminal repeat (LTR) retrotransposons called terminal repeat with GAG domain (TR-GAG) has been described in plants, in monocotyledonous, dicotyledonous, and basal angiosperm genomes. TR-GAGs are relatively short elements (<4 kb) showing the typical features of LTR-retrotransposons. However, they carry only one open reading frame coding for the GAG precursor protein involved, for instance, in transposition, the assembly, and the packaging of the element into the virus-like particle. GAG precursors show similarities with both Copia and Gypsy GAG proteins, suggesting evolutionary relationships of TR-GAG elements with both families. Despite the lack of the enzymatic machinery required for their mobility, strong evidence suggests that TR-GAGs are still active. TR-GAGs represent ubiquitous nonautonomous structures that could be involved in the molecular diversities of plant genomes. PMID:25573958
Novel surface attachment mechanism of the Streptococcus pneumoniae protein PspA.

PubMed Central

Yother, J; White, J M

1994-01-01

Pneumococcal surface protein A (PspA) of Streptococcus pneumoniae has been found to utilize a novel mechanism for anchoring to the bacterial cell surface. In contrast to that of surface proteins from other gram-positive bacteria, PspA anchoring required choline-mediated interactions between the membrane-associated lipoteichoic acid and the C-terminal repeat region of PspA. Release of PspA from the cell surface could be effected by deletion of 5 of the 10 C-terminal repeat units, by high concentrations of choline, or by growth in choline-deficient medium. Other pneumococcal proteins, including autolysin, which has a similar C-terminal repeat region, were not released by these treatments. The attachment mechanism utilized by PspA thus appears to be uniquely adapted to exploit the unusual structure of the pneumococcal cell surface. Further, it has provided the means for rapid and simple isolation of immunogenic PspA from S. pneumoniae. Images PMID:7910604
Methods for sequencing GC-rich and CCT repeat DNA templates

DOEpatents

Robinson, Donna L.

2007-02-20

The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.
The Contribution of Short Repeats of Low Sequence Complexity to Large Conifer Genomes

Treesearch

A. Schmidt; R.L. Doudrick; J.S. Heslop-Harrison; T. Schmidt

2000-01-01

Abstract: The abundance and genomic organization of six simple sequence repeats, consisting of di-, tri-, and tetranucleotide sequence motifs, and a minisatellite repeat have been analyzed in different gymnosperms by Southern hybridization. Within the gymnosperm genomes investigated, the abundance and genomic organization of micro- and...
Always look on both sides: Phylogenetic information conveyed by simple sequence repeat allele sequences

USDA-ARS's Scientific Manuscript database

Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily,...
Vertical Transmission of the Retrotransposable Elements R1 and R2 during the Evolution of the Drosophila Melanogaster Species Subgroup

PubMed Central

Eickbush, D. G.; Eickbush, T. H.

1995-01-01

R1 and R2 are non-long-terminal repeat retrotransposable elements that insert into specific sequences of insect 28S ribosomal RNA genes. These elements have been extensively described in Drosophila melanogaster. To determine whether these elements have been horizontally or vertically transmitted, we characterized R1 and R2 elements from the seven other members of the melanogaster species subgroup by genomic blotting and nucleotide sequencing. Each species was found to have homogeneous families of R1 and R2 elements with the exception of erecta and orena, which have no R2 elements. The DNA sequences of multiple R1 and R2 copies from each species indicated nucleotide divergence within each species averaged only 0.48% for R1 and 0.35% for R2, well below the level of divergence among the species. Most copies of R1 and R2 (40 of 47) sequenced from the seven species were potentially functional, as indicated by the absence of premature termination codons or translational frameshifts that would destroy the open reading frame of the element. The sequence relationships of both the R1 and R2 elements from the various members of the melanogaster subgroup closely followed that of the species phylogeny, suggesting that R1 and R2 have been stably maintained by vertical transmission since the origin of this species subgroup 17-20 million years ago. The remarkable stability of R1 and R2, compared to what has been suggested for transposable elements that insert at multiple locations in these same species, may be due to their unique specificity for sites in the rRNA gene locus. Under low copy number conditions, when it is essential for any mobile element to transpose, the insertion specificities of R1 and R2 ensure uniform developmentally regulated target sites that can be occupied with little or no detrimental effect on the host. PMID:7713424
[Bioinformatics Analysis of Clustered Regularly Interspaced Short Palindromic Repeats in the Genomes of Shigella].

PubMed

Wang, Pengfei; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Wang, Linlin; Guo, Xiangjiao; Yang, Haiyan; Xi, Yuanlin

2015-04-01

This study aimed to explore the features of clustered regularly interspaced short palindromic repeat (CRISPR) structures in Shigella using bioinformatics. We used bioinformatics methods, including BLAST, alignment and RNA structure prediction, to analyze the CRISPR structures of Shigella genomes. The results showed that CRISPRs existed in the four groups of Shigella, and that the flanking sequences of upstream CRISPRs could be classified into the same group as those of the downstream CRISPRs. We also found some relatively conserved palindromic motifs in the leader sequences. Repeat sequences fell into the same group as their corresponding flanking sequences, and could be classified into two different types by their RNA secondary structures, which contain a "stem" and a "ring". Some spacers were found to be homologous to partial sequences of plasmids or phages. The study indicated that there were correlations between repeat sequences and flanking sequences, and the repeats might act as a kind of recognition mechanism to mediate the interaction between foreign genetic elements and Cas proteins.
Phosphoenolpyruvate carboxykinase of Trypanosoma brucei is targeted to the glycosomes by a C-terminal sequence.

PubMed

Sommer, J M; Nguyen, T T; Wang, C C

1994-08-15

Import of proteins into the glycosomes of T. brucei resembles the peroxisomal protein import in that C-terminal SKL-like tripeptide sequences can function as targeting signals. Many of the glycosomal proteins do not, however, possess such C-terminal tripeptide signals. Among these, phosphoenolpyruvate carboxykinase (PEPCK (ATP)) was thought to be targeted to the glycosomes by an N-terminal or an internal targeting signal. A limited similarity to the N-terminal targeting signal of rat peroxisomal thiolase exists at the N-terminus of T. brucei PEPCK. However, we found that this peroxisomal targeting signal does not function for glycosomal protein import in T. brucei. Further studies of the PEPCK gene revealed that the C-terminus of the predicted protein does not correspond to the previously deduced protein sequence of 472 amino acids due to a -1 frame shift error in the original DNA sequence. Readjusting the reading frame of the sequence results in a predicted protein of 525 amino acids in length ending in a tripeptide serine-arginine-leucine (SRL), which is a potential targeting signal for import into the glycosomes. A fusion protein of firefly luciferase, without its own C-terminal SKL targeting signal, and T. brucei PEPCK is efficiently imported into the glycosomes when expressed in procyclic trypanosomes. Deletion of the C-terminal SRL tripeptide or the last 29 amino acids of PEPCK reduced the import only by about 50%, while a deletion of the last 47 amino acids completely abolished the import. These results suggest that T. brucei PEPCK may contain a second, internal glycosomal targeting signal upstream of the C-terminal SRL sequence.
A novel species-specific tandem repeat DNA family from Sinapis arvensis: detection of telomere-like sequences.

PubMed

Kapila, R; Das, S; Srivastava, P S; Lakshmikumaran, M

1996-08-01

DNA sequences representing a tandemly repeated DNA family of the Sinapis arvensis genome were cloned and characterized. The 700-bp tandem repeat family is represented by two clones, pSA35 and pSA52, which are 697 and 709 bp in length, respectively. Dot matrix analysis of the sequences indicates the presence of repeated elements within each monomeric unit. Sequence analysis of the repetitive region of clones pSA35 and pSA52 shows that there are several copies of a 7-bp repeat element organized in tandem. The consensus sequence of this repeat element is 5'-TTTAGGG-3'. These elements are highly mutated and the difference in length between the two clones is due to different copy numbers of these elements. The repetitive region of clone pSA35 has 26 copies of the element TTTAGGG, whereas clone pSA52 has 28 copies. The repetitive region in both clones is flanked on either side by inverted repeats that may be footprints of a transposition event. Sequence comparison indicates that the element TTTAGGG is identical to telomeric repeats present in Arabidopsis, maize, tomato, and other plants. However, Bal31 digestion kinetics indicates non-telomeric localization of the 700-bp tandem repeats. The clones represent a novel repeat family as (i) they contain telomere-like motifs as subrepeats within each unit; and (ii) they do not hybridize to related crucifers and are species-specific in nature.
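Counting copies of the 7-bp element within a repeat unit, as this record reports for pSA35 and pSA52, can be sketched in Python as follows. This is an illustrative sketch only: the abstract does not say how the copies were counted, the mismatch tolerance is an assumption (the elements are described as highly mutated, so exact matching would undercount), and the toy sequence is invented and is not the pSA35 or pSA52 sequence.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def count_motif_copies(seq, motif="TTTAGGG", max_mismatches=1):
    """Count non-overlapping windows within max_mismatches of the motif (left-to-right scan)."""
    seq = seq.upper()
    k = len(motif)
    count = 0
    i = 0
    while i + k <= len(seq):
        if hamming(seq[i:i + k], motif) <= max_mismatches:
            count += 1
            i += k  # jump past this copy so that copies do not overlap
        else:
            i += 1
    return count


# Invented toy sequence: three exact copies followed by one copy with a single substitution.
toy = "ACGT" + "TTTAGGG" * 3 + "TTTAGCG" + "CCCC"
print(count_motif_copies(toy, max_mismatches=0))
print(count_motif_copies(toy, max_mismatches=1))
```

Raising `max_mismatches` picks up more degenerate copies but also increases the chance of spurious hits, so any threshold would need to be checked against the actual clone sequences.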
A genome-wide BAC-end sequence survey provides first insights into sweetpotato (Ipomoea batatas (L.) Lam.) genome composition.

PubMed

Si, Zengzhi; Du, Bing; Huo, Jinxi; He, Shaozhen; Liu, Qingchang; Zhai, Hong

2016-11-21

Sweetpotato, Ipomoea batatas (L.) Lam., is an important food crop grown widely around the world. However, little is known about its genome because the species is a highly heterozygous hexaploid, so more in-depth knowledge of the sweetpotato genome is needed. In this study, the first bacterial artificial chromosome (BAC) library of sweetpotato was constructed. Clones from the BAC library were end-sequenced and analyzed to provide genome-wide information about this species. The BAC library contained 240,384 clones with an average insert size of 101 kb and had 7.93-10.82× coverage of the genome; the probability of isolating any single-copy DNA sequence from the library was more than 99%. Both ends of 8310 BAC clones randomly selected from the library were sequenced to generate 11,542 high-quality BAC-end sequences (BESs), with a cumulative length of 7,595,261 bp and an average length of 658 bp. Analysis of the BESs revealed that 12.17% of the sweetpotato genome was known repetitive DNA, including 7.37% long terminal repeat (LTR) retrotransposons, 1.15% non-LTR retrotransposons and 1.42% Class II DNA transposons; 18.31% of the genome was identified as sweetpotato-unique repetitive DNA, and 10.00% was predicted to be coding regions. In total, 3,846 simple sequence repeats (SSRs) were identified, with a density of one SSR per 1.93 kb. From these, 288 SSR primers were designed and tested for length polymorphism using 20 sweetpotato accessions, and 173 (60.07%) of them produced polymorphic bands. Sweetpotato BESs had significant hits to the genome sequences of I. trifida and more matches to the whole-genome sequences of Solanum lycopersicum than to those of Vitis vinifera, Theobroma cacao and Arabidopsis thaliana. The first BAC library for sweetpotato has been successfully constructed. The high-quality BESs provide first insights into sweetpotato genome composition, and have significant hits to the genome sequences of I. trifida and more matches to the whole-genome sequences of Solanum lycopersicum. These resources will serve as a robust platform for high-resolution mapping, gene cloning, genome sequence assembly, comparative genomics and evolutionary studies in sweetpotato.
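The coverage and ">99%" figures in this abstract can be reproduced with the standard Clarke-Carbon relation, P = 1 - (1 - f)^N, where f is the insert size divided by the genome size and N is the number of clones. The sketch below is illustrative only: the abstract does not state the genome sizes used, so they are back-calculated here from the reported coverage range, and the function names are hypothetical.

```python
import math

def library_coverage(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Genome equivalents held in the library: N * insert / genome."""
    return n_clones * insert_bp / genome_bp

def p_recover(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Clarke-Carbon probability that a given single-copy sequence is in the library."""
    f = insert_bp / genome_bp          # fraction of the genome in one clone
    return 1.0 - (1.0 - f) ** n_clones

N_CLONES = 240_384      # from the abstract
INSERT_BP = 101_000     # average insert size, from the abstract

for coverage in (7.93, 10.82):
    # Assumption: genome size implied by the reported coverage, not a value from the paper.
    genome_bp = N_CLONES * INSERT_BP / coverage
    exact = p_recover(N_CLONES, INSERT_BP, genome_bp)
    approx = 1.0 - math.exp(-coverage)   # Poisson approximation for large N
    print(f"{coverage:5.2f}x  genome ~ {genome_bp / 1e9:.2f} Gb  "
          f"P = {exact:.5f}  (approx {approx:.5f})")
```

Even at the lower coverage bound the recovery probability is well above 0.99, which is consistent with the claim in the abstract.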
PpRT1: the first complete gypsy-like retrotransposon isolated in Pinus pinaster.

PubMed

Rocheta, Margarida; Cordeiro, Jorge; Oliveira, M; Miguel, Célia

2007-02-01

We have isolated and characterized a complete retrotransposon sequence, named PpRT1, from the genome of Pinus pinaster. PpRT1 is 5,966 bp long and is closely related to the IFG7 gypsy retrotransposon from Pinus radiata. The long terminal repeats (LTRs) are 333 bp each and show 5.4% sequence divergence between them. In addition to the characteristic polypurine tract (PPT) and the primer binding site (PBS), PpRT1 carries internal regions with homology to the retroviral genes gag and pol. The pol region contains sequence motifs related to the enzymes protease, reverse transcriptase, RNAseH and integrase, in the order typical of Ty3/gypsy-like retrotransposons. PpRT1 was extended from an EST database sequence, indicating that it is transcribed in pine tissues. Southern blot analyses indicate, however, that PpRT1 is present in a single copy or a low number of copies in the P. pinaster genome. The differences in nucleotide sequence found between PpRT1 and IFG7 may explain the strikingly different copy numbers in the genomes of the two pine species. Based on the homologies observed when comparing the LTR regions of different gypsy elements, we propose that the highly conserved LTR regions may be useful for amplifying other retrotransposon sequences of the same or a closely related retrotransposon family.
Identification of the Maize Gravitropism Gene lazy plant1 by a Transposon-Tagging Genome Resequencing Strategy

PubMed Central

Howard, Thomas P.; Hayward, Andrew P.; Tordillos, Anthony; Fragoso, Christopher; Moreno, Maria A.; Tohme, Joe; Kausch, Albert P.; Mottinger, John P.; Dellaporta, Stephen L.

2014-01-01

Since their initial discovery, transposons have been widely used as mutagens for forward and reverse genetic screens in a range of organisms. The problems of high copy number and sequence divergence among related transposons have often limited the efficiency with which tagged genes can be identified. A method was developed to identify the locations of Mutator (Mu) transposons in the Zea mays genome using a simple enrichment method combined with genome resequencing to identify transposon junction fragments. The sequencing library was prepared from genomic DNA by digesting with a restriction enzyme that cuts within a perfectly conserved motif of the Mu terminal inverted repeats (TIR). Paired-end reads containing Mu TIR sequences were computationally identified, and chromosomal sequences flanking the transposon were mapped to the maize reference genome. This method has been used to identify Mu insertions in a number of alleles and to isolate the previously unidentified lazy plant1 (la1) gene. The la1 gene is required for the negatively gravitropic response of shoots, and mutant plants lack the ability to sense gravity. Using bioinformatic and fluorescence microscopy approaches, we show that the la1 gene encodes a cell membrane and nuclear localized protein. Our Mu-Taq method is readily adaptable to identify the genomic locations of any insertion of a known sequence in any organism using any sequencing platform. PMID:24498020
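The computational step described above (finding reads that contain the transposon end and keeping the adjacent chromosomal sequence for mapping) can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the `TIR_END` motif is a made-up placeholder rather than the real Mu TIR sequence, exact string matching stands in for whatever matching the study used, and the function names are hypothetical.

```python
# Hypothetical placeholder for the conserved end of the TIR (NOT the real Mu sequence).
TIR_END = "GGCTATTTCGTC"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def tir_flanks(reads, tir_end=TIR_END, min_flank=10):
    """Return the sequence downstream of the TIR end for every read that contains it.

    Each read is checked in both orientations; flanks shorter than `min_flank`
    are dropped because they would be too short to map uniquely.
    """
    flanks = []
    for read in reads:
        for oriented in (read, revcomp(read)):
            pos = oriented.find(tir_end)
            if pos != -1:
                flank = oriented[pos + len(tir_end):]
                if len(flank) >= min_flank:
                    flanks.append(flank)
                break  # do not report the same read twice
    return flanks

reads = [
    TIR_END + "AAACCCGGGTTTAC",                  # junction read, forward orientation
    revcomp("TT" + TIR_END + "CCCCAAAATTTTGG"),  # junction read, reverse orientation
    "ACGTACGTACGTACGTACGT",                      # read with no transposon sequence
]
print(tir_flanks(reads))
```

In a real analysis the extracted flanks would then be aligned to the reference genome with a read mapper; that step is outside the scope of this sketch.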
Stability of Tandem Repeats in the Drosophila Melanogaster HSR-Omega Nuclear RNA

PubMed Central

Hogan, N. C.; Slot, F.; Traverse, K. L.; Garbe, J. C.; Bendena, W. G.; Pardue, M. L.

1995-01-01

The Drosophila melanogaster Hsr-omega locus produces a nuclear RNA containing >5 kb of tandem repeat sequences. These repeats are unique to Hsr-omega and show concerted evolution similar to that seen with classical satellite DNAs. In D. melanogaster the monomer is ~280 bp. Sequences of 19½ monomers differ by 8 +/- 5% (mean +/- SD), when all pairwise comparisons are considered. Differences are single nucleotide substitutions and 1-3 nucleotide deletions/insertions. Changes appear to be randomly distributed over the repeat unit. Outer repeats do not show the decrease in monomer homogeneity that might be expected if homogeneity is maintained by recombination. However, just outside the last complete repeat at each end, there are a few fragments of sequence similar to the monomer. The sequences in these flanking regions are not those predicted for sequences decaying in the absence of recombination. Instead, the fragmentation of the sequence homology suggests that flanking regions have undergone more severe disruptions, possibly during an insertion or amplification event. Hsr-omega alleles differing in the number of repeats are detected and appear to be stable over a few thousand generations; however, both increases and decreases in repeat numbers have been observed. The new alleles appear to be as stable as their predecessors. No alleles of less than ~5 kb nor more than ~16 kb of repeats were seen in any stocks examined. The evidence that there is a limit on the minimum number of repeats is consistent with the suggestion that these repeats are important in the function of the unusual Hsr-omega nuclear RNA. PMID:7540581
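The "mean +/- SD over all pairwise comparisons" statistic quoted above can be illustrated with a short sketch. It assumes the monomers have already been aligned to equal length, with "-" marking indel positions, and it counts every differing column as one difference; the abstract does not say how indels were scored, so that choice and the function names are assumptions.

```python
from itertools import combinations
from statistics import mean, stdev

def percent_difference(a: str, b: str) -> float:
    """Percentage of aligned columns at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return 100.0 * mismatches / len(a)

def pairwise_divergence(monomers):
    """Mean and sample SD of percent difference over all monomer pairs.

    Needs at least three monomers so that there are two or more pairwise values.
    """
    diffs = [percent_difference(a, b) for a, b in combinations(monomers, 2)]
    return mean(diffs), stdev(diffs)

# Toy aligned monomers (10 columns each), not real Hsr-omega sequence.
toy = ["AAAAAAAAAA", "AAAAAAAAAT", "AAAAAAAATT"]
avg, sd = pairwise_divergence(toy)
print(f"{avg:.1f} +/- {sd:.1f} %")
```

With n monomers there are n(n-1)/2 comparisons, and because each monomer appears in many pairs the values are not independent, so the SD describes spread rather than a formal standard error.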
The surface glycoprotein of feline leukemia virus isolate FeLV-945 is a determinant of altered pathogenesis in the presence or absence of the unique viral long terminal repeat.

PubMed

Bolin, Lisa L; Ahmad, Shamim; Lobelle-Rich, Patricia A; Ooms, Tara G; Alvarez-Hernandez, Xavier; Didier, Peter J; Levy, Laura S

2013-10-01

Feline leukemia virus (FeLV) is a naturally transmitted gammaretrovirus that infects domestic cats. FeLV-945, the predominant isolate associated with non-T-cell disease in a natural cohort, is a member of FeLV subgroup A but differs in sequence from the FeLV-A prototype, FeLV-A/61E, in the surface glycoprotein (SU) and long terminal repeat (LTR). Substitution of the FeLV-945 LTR into FeLV-A/61E resulted in pathogenesis indistinguishable from that of FeLV-A/61E, namely, thymic lymphoma of T-cell origin. In contrast, substitution of both FeLV-945 LTR and SU into FeLV-A/61E resulted in multicentric lymphoma of non-T-cell origin. These results implicated the FeLV-945 SU as a determinant of pathogenic spectrum. The present study was undertaken to test the hypothesis that FeLV-945 SU can act in the absence of other unique sequence elements of FeLV-945 to determine the disease spectrum. Substitution of FeLV-A/61E SU with that of FeLV-945 altered the clinical presentation and resulted in tumors that demonstrated expression of CD45R in the presence or absence of CD3. Despite the evident expression of CD45R, a typical B-cell marker, T-cell receptor beta (TCRβ) gene rearrangement indicated a T-cell origin. Tumor cells were detectable in bone marrow and blood at earlier times during the disease process, and the predominant SU genes from proviruses integrated in tumor DNA carried markers of genetic recombination. The findings demonstrate that FeLV-945 SU alters pathogenesis, although incompletely, in the absence of FeLV-945 LTR. Evidence demonstrates that FeLV-945 SU and LTR are required together to fully recapitulate the distinctive non-T-cell disease outcome seen in the natural cohort.
Construction of the first genetic linkage map of Japanese gentian (Gentianaceae)

PubMed Central

2012-01-01

Background: Japanese gentians (Gentiana triflora and Gentiana scabra) are amongst the most popular floricultural plants in Japan. However, genomic resources for Japanese gentians have not yet been developed, mainly because of the heterozygous genome structure conserved by outcrossing, the long juvenile period, and limited knowledge about the inheritance of important traits. In this study, we developed a genetic linkage map to improve breeding programs of Japanese gentians. Results: Enriched simple sequence repeat (SSR) libraries from a G. triflora double haploid line yielded almost 20,000 clones using 454 pyrosequencing technology, 6.7% of which could be used to design SSR markers. To increase the number of molecular markers, we identified three putative long terminal repeat (LTR) sequences using the recently developed inter-primer binding site (iPBS) method. We also developed retrotransposon microsatellite amplified polymorphism (REMAP) markers combining retrotransposon and inter-simple sequence repeat (ISSR) markers. In addition to SSR and REMAP markers, modified amplified fragment length polymorphism (AFLP) and random amplification polymorphic DNA (RAPD) markers were developed. Using 93 BC1 progeny from G. scabra backcrossed with a G. triflora double haploid line, 19 linkage groups were constructed with a total of 263 markers (97 SSR, 97 AFLP, 39 RAPD, and 30 REMAP markers). One phenotypic trait (stem color) and 10 functional markers related to genes controlling flower color, flowering time and cold tolerance were assigned to the linkage map, confirming its utility. Conclusions: This is the first reported genetic linkage map for Japanese gentians and for any species belonging to the family Gentianaceae. As demonstrated by mapping of functional markers and the stem color trait, our results will help to explain the genetic basis of agronomically important traits, and will be useful for marker-assisted selection in gentian breeding programs. Our map will also be an important resource for further genetic analyses such as mapping of quantitative trait loci and map-based cloning of genes in this species. PMID:23186361
Typing Clostridium difficile strains based on tandem repeat sequences

PubMed Central

2009-01-01

Background: Genotyping of epidemic Clostridium difficile strains is necessary to track their emergence and spread. Portability of genotyping data is desirable to facilitate inter-laboratory comparisons and epidemiological studies. Results: This report presents results from a systematic screen for variation in repetitive DNA in the genome of C. difficile. We describe two tandem repeat loci, designated 'TR6' and 'TR10', which display extensive sequence variation that may be useful for sequence-based strain typing. Based on an investigation of 154 C. difficile isolates comprising 75 ribotypes, tandem repeat sequencing demonstrated excellent concordance with widely used PCR ribotyping and equal discriminatory power. Moreover, tandem repeat sequences enabled the reconstruction of the isolates' largely clonal population structure and evolutionary history. Conclusion: We conclude that sequence analysis of the two repetitive loci introduced here may be highly useful for routine typing of C. difficile. Tandem repeat sequence typing resolves phylogenetic diversity to a level equivalent to PCR ribotypes. DNA sequences may be stored in databases accessible over the internet, obviating the need for the exchange of reference strains. PMID:19133124
Deletion mutants of Harvey ras p21 protein reveal the absolute requirement of at least two distant regions for GTP-binding and transforming activities.

PubMed Central

Lacal, J C; Anderson, P S; Aaronson, S A

1986-01-01

Deletions of small sequences from the viral Harvey ras gene have been generated, and the resulting ras p21 mutants have been expressed in Escherichia coli. Purification of each deleted protein allowed the in vitro characterization of the GTP-binding, GTPase and autokinase activities of the proteins. Microinjection of the highly purified proteins into quiescent NIH/3T3 cells, as well as transfection experiments utilizing a long terminal repeat (LTR)-containing vector, were used to analyze the biological activity of the deleted proteins. Two small regions located at residues 6-23 and 152-165 are shown to be absolutely required for in vitro and in vivo activities of the ras product. By contrast, the variable region comprising amino acids 165-184 was shown not to be necessary for either in vitro or in vivo activities. Thus, we demonstrate that: (i) amino acid sequences at positions 5-23 and 152-165 of ras p21 protein are probably directly involved in the GTP-binding activity; (ii) GTP-binding is required for the transforming activity of ras p21 and by extension for the normal function of the proto-oncogene product; and (iii) the variable region at the C-terminal end of the ras p21 molecule from amino acids 165 to 184 is not required for transformation. PMID:3011420
Complete nucleotide and derived amino acid sequence of cDNA encoding the mitochondrial uncoupling protein of rat brown adipose tissue: lack of a mitochondrial targeting presequence.

PubMed Central

Ridley, R G; Patel, H V; Gerber, G E; Morton, R C; Freeman, K B

1986-01-01

A cDNA clone spanning the entire amino acid sequence of the nuclear-encoded uncoupling protein of rat brown adipose tissue mitochondria has been isolated and sequenced. With the exception of the N-terminal methionine, the deduced N-terminus of the newly synthesized uncoupling protein is identical to the N-terminal 30 amino acids of the native uncoupling protein as determined by protein sequencing. This proves that the protein contains no N-terminal mitochondrial targeting prepiece and that a targeting region must reside within the amino acid sequence of the mature protein. PMID:3012461

Eggs, embryos and the evolution of imprinting: insights from the platypus genome.

PubMed

Renfree, Marilyn B; Papenfuss, Anthony T; Shaw, Geoff; Pask, Andrew J

2009-01-01

Genomic imprinting is widespread in eutherian and marsupial mammals. Although there have been many hypotheses to explain why genomic imprinting evolved in mammals, few have examined how it arose. The host defence hypothesis suggests that imprinting evolved from existing mechanisms within the cell that act to silence foreign DNA elements that insert into the genome. However, the changes to the mammalian genome that accompanied the evolution of imprinting have been hard to define due to the absence of large-scale genomic resources from all extant classes. The recent release of the platypus genome sequence has provided the first opportunity to make comparisons between prototherian (monotreme, which show no signs of imprinting) and therian (marsupial and eutherian, which have imprinting) mammals. We compared the distribution of repeat elements known to attract epigenetic silencing across the genome from monotremes and therian mammals, particularly focusing on the orthologous imprinted regions. Our analyses show that the platypus has significantly fewer repeats of certain classes in the regions of the genome that have become imprinted in therian mammals. The accumulation of repeats, especially long-terminal repeats and DNA elements, in therian imprinted genes and gene clusters therefore appears to be coincident with, and may have been a potential driving force in, the development of mammalian genomic imprinting. Comparative platypus genome analyses of orthologous imprinted regions have provided strong support for the host defence hypothesis to explain the origin of imprinting.
The Complete Mitochondrial Genome of Coptotermes ‘suzhouensis’ (syn. Coptotermes formosanus) (Isoptera: Rhinotermitidae) and Molecular Phylogeny Analysis

PubMed Central

Li, Juan; Zhu, Jin-long; Lou, Shi-di; Wang, Ping; Zhang, You-sen; Wang, Lin; Yin, Ruo-chun; Zhang, Ping-ping

2018-01-01

Coptotermes suzhouensis (Isoptera: Rhinotermitidae) is a significant subterranean termite pest of wooden structures and is widely distributed in southeastern China. The complete mitochondrial DNA sequence of C. suzhouensis was analyzed in this study. The mitogenome was a circular molecule of 15,764 bp in length, which contained 13 protein-coding genes (PCGs), 22 transfer RNA genes, two ribosomal RNA genes, and an A+T-rich region, with a gene arrangement typical of Isoptera mitogenomes. All PCGs were initiated by ATN codons and terminated by complete termination codons (TAA), except COX2, ND5, and Cytb, which ended with an incomplete termination codon T. All tRNAs displayed a typical clover-leaf structure, except for tRNASer(AGN), which did not contain the stem-loop structure in the DHU arm. The A+T content (69.23%) of the A+T-rich region (949 bp) was higher than that of the entire mitogenome (65.60%), and two different sets of repeat units (A+B) were distributed in this region. Comparison of complete mitogenome sequences with those of Coptotermes formosanus indicated that the two taxa have very high genetic similarity. Forty-one representative termite species were used to construct phylogenetic trees by maximum likelihood, maximum parsimony, and Bayesian inference methods. The phylogenetic analyses also strongly supported (BPP, MLBP, and MPBP = 100%) that all C. suzhouensis and C. formosanus samples gathered into one clade with genetic distances between 0.000 and 0.002. This study provides molecular evidence for a more robust phylogenetic position of C. suzhouensis and infers that C. suzhouensis is a synonym of C. formosanus. PMID:29718488
[Comparative analysis of clustered regularly interspaced short palindromic repeats (CRISPRs) loci in the genomes of halophilic archaea].

PubMed

Zhang, Fan; Zhang, Bing; Xiang, Hua; Hu, Songnian

2009-11-01

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a widespread system that provides acquired resistance against phages in bacteria and archaea. Here we aimed to analyze CRISPR loci genome-wide in the extremely halophilic archaea for which whole-genome sequences are currently available. We used bioinformatics methods including alignment, conservation analysis, GC content and RNA structure prediction to analyze the CRISPR structures of 7 haloarchaeal genomes. We identified CRISPR structures in 5 halophilic archaea and revealed a conserved palindromic motif in the flanking regions of these CRISPR structures. In addition, we found that the repeat sequences of large CRISPR structures in halophilic archaea were highly conserved, and that two types of predicted RNA secondary structures derived from the repeat sequences were likely determined by the fourth base of the repeat sequence. Our results support the proposal that the leader sequence may function as a recognition site through palindromic structures in the flanking regions, and that the stem-loop secondary structure formed by repeat sequences may mediate the interaction between foreign genetic elements and CAS-encoded proteins.
The complete sequence and structural analysis of human apolipoprotein B-100: relationship between apoB-100 and apoB-48 forms.

PubMed Central

Cladaras, C; Hadzopoulou-Cladaras, M; Nolte, R T; Atkinson, D; Zannis, V I

1986-01-01

We have isolated and sequenced overlapping cDNA clones covering the entire sequence of human apolipoprotein B-100 (apoB-100). DNA sequence analysis and determination of the mRNA transcription initiation site by S1 nuclease mapping showed that the apoB mRNA consists of 14,112 nucleotides, including the 5' and 3' untranslated regions, which are 128 and 301 nucleotides respectively. The DNA-derived protein sequence shows that apoB-100 is 513,000 daltons and contains 4560 amino acids including a 24-amino-acid-long signal peptide. The mol. wt of apoB-100 implies that there is one apoB molecule per LDL particle. Computer analysis of the predicted secondary structure of the protein showed that some of the potential alpha helical and beta sheet structures are amphipathic, whereas others have non-amphipathic neutral to apolar character. These latter regions may contribute to the formation of the lipid-binding domains of apoB-100. The protein contains 25 cysteines and 20 potential N-glycosylation sites. The majority of cysteines are distributed in the amino terminal portion of the protein. Four of the potential glycosylation sites are in predicted beta turn structures and may represent true glycosylation positions. ApoB lacks the tandem repeats which are characteristic of other apolipoproteins. The mean hydrophobicity (<H>) and helical hydrophobic moment (<μH>) profiles of apoB showed the presence of several potential helical regions with strong polar character and high hydrophobic moment. The region with the highest hydrophobic moment, between amino acid residues 3352 and 3369, contains five closely spaced, positively charged residues, and has sequence homology to the LDL receptor binding site of apoE. This region is flanked by three neighbouring regions with positively charged amino acids and high hydrophobic moment that are located between residues 3174 and 3681. One or more of these closely spaced apoB sequences may be involved in the formation of the LDL receptor-binding domain of apoB-100. Blotting analysis of intestinal RNA and hybridization of the blots with carboxy apoB cDNA probes produced a single 15-kb hybridization band, whereas hybridization with amino terminal probes produced two hybridization bands of 15 and 8 kb. Our data indicate that both forms of apoB mRNA contain common sequences which extend from the amino terminal of apoB-100 to the vicinity of nucleotide residue 6300. These two messages may have resulted from differential splicing of the same primary apoB mRNA transcript. PMID:3030729
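The hydrophobicity and helical hydrophobic moment profiles mentioned in this abstract are sliding-window quantities. A minimal sketch follows, using the usual definition of the moment as the magnitude of the vector sum of residue hydrophobicities placed at 100 degree intervals around an ideal alpha helix. Several choices here are assumptions rather than details from the abstract: the Kyte-Doolittle scale is used for convenience (the study may well have used a different scale), the 18-residue window simply matches the length of the 3352-3369 segment, and the function names are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale; the abstract does not name one).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydrophobicity(window: str) -> float:
    """Average hydropathy of the residues in the window."""
    return sum(KD[aa] for aa in window) / len(window)

def hydrophobic_moment(window: str, delta_deg: float = 100.0) -> float:
    """Per-residue helical hydrophobic moment for a turn angle of `delta_deg`."""
    delta = math.radians(delta_deg)
    s = sum(KD[aa] * math.sin(i * delta) for i, aa in enumerate(window))
    c = sum(KD[aa] * math.cos(i * delta) for i, aa in enumerate(window))
    return math.hypot(s, c) / len(window)

def profile(seq: str, width: int = 18):
    """(1-based start, mean hydrophobicity, hydrophobic moment) for each window."""
    return [
        (i + 1, mean_hydrophobicity(seq[i:i + width]), hydrophobic_moment(seq[i:i + width]))
        for i in range(len(seq) - width + 1)
    ]

# Toy sequences: a uniform helix has no preferred face, an L/K pattern does.
print(hydrophobic_moment("L" * 18))
print(hydrophobic_moment("LKKLLKKLLKKLLKKLLK"))
```

A uniformly hydrophobic window gives a moment near zero because the contributions cancel around the helix, while a window that segregates apolar and charged residues onto opposite faces gives a large moment; the latter is the signature of an amphipathic helix.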
N-Terminal Amino Acid Sequence Determination of Proteins by N-Terminal Dimethyl Labeling: Pitfalls and Advantages When Compared with Edman Degradation Sequence Analysis.

PubMed

Chang, Elizabeth; Pourmal, Sergei; Zhou, Chun; Kumar, Rupesh; Teplova, Marianna; Pavletich, Nikola P; Marians, Kenneth J; Erdjument-Bromage, Hediye

2016-07-01

In recent history, alternative approaches to Edman sequencing have been investigated, and to this end, the Association of Biomolecular Resource Facilities (ABRF) Protein Sequencing Research Group (PSRG) initiated studies in 2014 and 2015, looking into bottom-up and top-down N-terminal (Nt) dimethyl derivatization of standard quantities of intact proteins with the aim to determine Nt sequence information. We have expanded this initiative and used low picomole amounts of myoglobin to determine the efficiency of Nt-dimethylation. Application of this approach on protein domains, generated by limited proteolysis of overexpressed proteins, confirms that it is a universal labeling technique and is very sensitive when compared with Edman sequencing. Finally, we compared Edman sequencing and Nt-dimethylation of the same polypeptide fragments; results confirm that there is agreement in the identity of the Nt amino acid sequence between these 2 methods.
Primer-independent RNA sequencing with bacteriophage phi6 RNA polymerase and chain terminators.

PubMed

Makeyev, E V; Bamford, D H

2001-05-01

Here we propose a new general method for directly determining RNA sequence based on the use of the RNA-dependent RNA polymerase from bacteriophage phi6 and chain terminators (RdRP sequencing). The following properties of the polymerase render it appropriate for this application: (1) the phi6 polymerase can replicate a number of single-stranded RNA templates in vitro. (2) In contrast to the primer-dependent DNA polymerases utilized in the sequencing procedure by Sanger et al. (Proc Natl Acad Sci USA, 1977, 74:5463-5467), it initiates nascent strand synthesis without a primer, starting the polymerization at the very 3'-terminus of the template. (3) The polymerase can incorporate chain-terminating nucleotide analogs into the nascent RNA chain to produce a set of base-specific termination products. Consequently, the 3'-proximal or even the complete sequence of many target RNA molecules can be rapidly deduced without prior sequence information. The new technique proved useful for sequencing several synthetic ssRNA templates. Furthermore, using genomic segments of the bluetongue virus, we show that RdRP sequencing can also be applied to naturally occurring dsRNA templates. This suggests possible uses of the method in RNA virus research and diagnostics.
Genome Wide Characterization of Simple Sequence Repeats in Cucumber

USDA-ARS's Scientific Manuscript database

The whole genome of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...
Impact of protein domains on PE_PGRS30 polar localization in Mycobacteria.

PubMed

De Maio, Flavio; Maulucci, Giuseppe; Minerva, Mariachiara; Anoosheh, Saber; Palucci, Ivana; Iantomasi, Raffaella; Palmieri, Valentina; Camassa, Serena; Sali, Michela; Sanguinetti, Maurizio; Bitter, Wilbert; Manganelli, Riccardo; De Spirito, Marco; Delogu, Giovanni

2014-01-01

PE_PGRS proteins are unique to the Mycobacterium tuberculosis complex and a number of other pathogenic mycobacteria. PE_PGRS30, which is required for the full virulence of M. tuberculosis (Mtb), has three main domains: an N-terminal PE domain, a repetitive PGRS domain and a unique C-terminal domain. To investigate the role of these domains, we expressed a GFP-tagged PE_PGRS30 protein and a series of its functional deletion mutants in different mycobacterial species (Mtb, Mycobacterium bovis BCG and Mycobacterium smegmatis) and analysed protein localization by confocal microscopy. We show that PE_PGRS30 localizes at the mycobacterial cell poles in Mtb and M. bovis BCG but not in M. smegmatis and that the PGRS domain of the protein strongly contributes to protein cellular localization in Mtb. Immunofluorescence studies further showed that the unique C-terminal domain of PE_PGRS30 is not available on the surface, except when the PGRS domain is missing. Immunoblotting demonstrated that the PGRS domain is required to keep the protein strongly associated with the non-soluble cellular fraction. These results suggest that the GGA-GGN repeats of the PGRS domain contain specific sequences that contribute to protein cellular localization and that polar localization might be a key step in the PE_PGRS30-dependent virulence mechanism.
Pause, play, repeat

PubMed Central

Sansó, Miriam; Fisher, Robert P

2013-01-01

Cyclin-dependent kinases (CDKs) play a central role in governing eukaryotic cell division. It is becoming clear that the transcription cycle of RNA polymerase II (RNAP II) is also regulated by CDKs; in metazoans, the cell cycle and transcriptional CDK networks even share an upstream activating kinase, which is itself a CDK. From recent chemical-genetic analyses we know that CDKs and their substrates control events both early in transcription (the transition from initiation to elongation) and late (3′ end formation and transcription termination). Moreover, mutual dependence on CDK activity might couple the “beginning” and “end” of the cycle, to ensure the fidelity of mRNA maturation and the efficient recycling of RNAP II from sites of termination to the transcription start site (TSS). As is the case for CDKs involved in cell cycle regulation, different transcriptional CDKs act in defined sequence on multiple substrates. These phosphorylations are likely to influence gene expression by several mechanisms, including direct, allosteric effects on the transcription machinery, co-transcriptional recruitment of proteins needed for mRNA-capping, splicing and 3′ end maturation, dependent on multisite phosphorylation of the RNAP II C-terminal domain (CTD) and, perhaps, direct regulation of RNA-processing or histone-modifying machinery. Here we review these recent advances, and preview the emerging challenges for transcription-cycle research. PMID:23756342
Identification of Simple Sequence Repeats in Chloroplast Genomes of Magnoliids Through Bioinformatics Approach.

PubMed

Srivastava, Deepika; Shanker, Asheesh

2016-12-01

Magnoliids, or basal angiosperms, form a clade of commercially important plants that mainly includes spices and edible fruits. In this study, 17 chloroplast genome sequences belonging to the clade Magnoliids were screened to identify chloroplast simple sequence repeats (cpSSRs). Simple sequence repeats, or microsatellites, are short stretches of DNA built from repeat units 1-6 base pairs in length. These repeats are ubiquitous and play an important role in the development of molecular markers and in mapping traits of economic, medical or ecological interest. A total of 479 SSRs were detected, showing an average density of 1 SSR/6.91 kb. Depending on the repeat unit, the length of SSRs ranged from 12 to 24 bp for mono-, 12 to 18 bp for di-, 12 to 26 bp for tri-, 12 to 24 bp for tetra-, 15 bp for penta- and 18 bp for hexanucleotide repeats. Mononucleotide repeats were the most frequent (207, 43.21%), followed by tetranucleotide repeats (130, 27.13%). Penta- and hexanucleotide repeats were least frequent or absent in these chloroplast genomes.
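A screen of this kind can be sketched with regular expressions. The example below is an illustration rather than the tool used in the study: the minimum repeat counts are inferred from the shortest SSR lengths quoted in the abstract (12 bp for mono- to tetranucleotide, 15 bp for penta-, 18 bp for hexanucleotide repeats) and are therefore an assumption, as are the function and variable names.

```python
import re

# Minimum number of unit copies per unit length (assumed from the reported minimum lengths).
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}

def find_ssrs(seq: str):
    """Return sorted (0-based start, repeat unit, copy number) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A unit of `unit_len` bases followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves made of a shorter motif (e.g. "AA", "ATAT"),
            # so each tract is reported once under its shortest unit.
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)

# Toy sequence containing one mono-, one di- and one trinucleotide SSR.
toy = "GGC" + "A" * 12 + "CGC" + "AT" * 6 + "GCC" + "AAG" * 4 + "C"
for start, unit, copies in find_ssrs(toy):
    print(start, unit, copies)
print("density: 1 SSR per", round(len(toy) / len(find_ssrs(toy)), 1), "bp")
```

The density figure in the abstract (1 SSR/6.91 kb) is the same ratio, total sequence length divided by the number of SSRs found, computed over all 17 chloroplast genomes. This sketch detects only perfect repeats and does not handle compound or interrupted SSRs.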
The Wnt-1 (int-1) oncogene promoter and its mechanism of activation by insertion of proviral DNA of the mouse mammary tumor virus.

PubMed Central

Nusse, R; Theunissen, H; Wagenaar, E; Rijsewijk, F; Gennissen, A; Otte, A; Schuuring, E; van Ooyen, A

1990-01-01

Wnt-1 (int-1) is a cellular oncogene often activated by insertion of proviral DNA of the mouse mammary tumor virus. We have mapped the 5' end and the promoter area of the Wnt-1 gene by nuclease protection and primer extension assays. In differentiating P19 embryonal carcinoma cells, in which Wnt-1 is naturally expressed, two start sites of transcription were found, one preceded by two TATA boxes and one preceded by several GC boxes. In P19 cells, a 1-kilobase upstream sequence of Wnt-1 was able to confer differentiation-specific expression on a heterologous gene. We have investigated how Wnt-1 transcription was affected by mouse mammary tumor virus proviral integrations in various configurations near the promoters of the gene. One provirus has been inserted in the 5' nontranslated part of Wnt-1, in the same transcriptional orientation, and has functionally replaced the Wnt-1 promoters. Wnt-1 transcription in this tumor starts in the right long terminal repeat of the provirus, with considerable readthrough transcription from the left long terminal repeat. Another provirus has been inserted in the orientation opposite that of Wnt-1 into a GC box, disrupting the first Wnt-1 transcription start site but not the downstream start site. Most insertions have not structurally altered the Wnt-1 transcripts and have enhanced the activity of the normal two promoters. PMID:1695322
Proteolytic interconversion and N-terminal sequences of the Citrobacter diversus major beta-lactamases.

PubMed Central

Franceschini, N; Amicosante, G; Perilli, M; Maccarrone, M; Oratore, A; van Beeumen, J; Frère, J M

1991-01-01

The N-terminal sequences of the two major beta-lactamases produced by Citrobacter diversus differed only by the absence of the first residue in form II and the loss of five amino acid residues at the C-terminal end. Limited proteolysis of the homogeneous form I protein yielded a variety of enzymatically active products. In the major product obtained after the action of papain, the first three N-terminal residues of form I had been cleaved, whereas at the C-terminal end the treated enzyme lacked five residues. However, this cannot explain the different behaviours of form I, form II and the papain digestion product upon chromatofocusing. Form I, which was sequenced up to position 56, exhibited a very high degree of similarity with a Klebsiella oxytoca beta-lactamase. The determined sequence, which contained the active serine residue, demonstrated that the chromosome-encoded beta-lactamase of Citrobacter diversus belongs to class A. PMID:2039443
Changes in tau phosphorylation in hibernating rodents.

PubMed

León-Espinosa, Gonzalo; García, Esther; García-Escudero, Vega; Hernández, Félix; Defelipe, Javier; Avila, Jesús

2013-07-01

Tau is a cytoskeletal protein present mainly in the neurons of vertebrates. By comparing the sequence of the tau molecule among different vertebrates, it was found that the variability of the N-terminal sequence in tau protein is higher than that of the C-terminal region. The N-terminal region is involved mainly in the binding of tau to cellular membranes, whereas the C-terminal region of the tau molecule contains the microtubule-binding sites. We have compared the sequence of Syrian hamster tau with the sequences of other hibernating and nonhibernating rodents and investigated how differences in the N-terminal region of tau could affect the phosphorylation level and tau binding to cell membranes. We also describe a change in tau phosphorylation on a casein kinase 1 (ck1)-dependent site that is found only in hibernating rodents. This ck1 site seems to play an important role in the regulation of tau binding to membranes. Copyright © 2013 Wiley Periodicals, Inc.
The LINEs and SINEs of Entamoeba histolytica: comparative analysis and genomic distribution.

PubMed

Bakre, Abhijeet A; Rawal, Kamal; Ramaswamy, Ram; Bhattacharya, Alok; Bhattacharya, Sudha

2005-07-01

Autonomous non-long terminal repeat retrotransposons are commonly referred to as long interspersed elements (LINEs). Short non-autonomous elements that borrow the LINE machinery are called SINEs. The Entamoeba histolytica genome contains three classes of LINEs and SINEs. Together the EhLINEs/SINEs account for about 6% of the genome. The recognizable functional domains in all three EhLINEs included reverse transcriptase and endonuclease. A novel feature was the presence of two types of members in both EhLINE1 and 2: some with a single long ORF (less frequent) and some with two ORFs (more frequent). The two ORFs were generated by conserved changes leading to a stop codon. Computational analysis of the immediate flanking sequences for each element showed that they inserted in AT-rich sequences, with a preponderance of Ts in the upstream site. The elements were very frequently located close to protein-coding genes and other EhLINEs/SINEs. The possible influence of these elements on expression of neighboring genes needs to be determined.
Evolution of sfbI Encoding Streptococcal Fibronectin-Binding Protein I: Horizontal Genetic Transfer and Gene Mosaic Structure

PubMed Central

Towers, Rebecca J.; Fagan, Peter K.; Talay, Susanne R.; Currie, Bart J.; Sriprakash, Kadaba S.; Walker, Mark J.; Chhatwal, Gursharan S.

2003-01-01

Streptococcal fibronectin-binding protein is an important virulence factor involved in colonization and invasion of epithelial cells and tissues by Streptococcus pyogenes. In order to investigate the mechanisms involved in the evolution of sfbI, the sfbI genes from 54 strains were sequenced. Thirty-four distinct alleles were identified. Three principal mechanisms appear to have been involved in the evolution of sfbI. The amino-terminal aromatic amino acid-rich domain is the most variable region and is apparently generated by intergenic recombination of horizontally acquired DNA cassettes, resulting in a genetic mosaic in this region. Two distinct and divergent sequence types that shared only 61 to 70% identity were identified in the central proline-rich region, while variation at the 3′ end of the gene is due to deletion or duplication of defined repeat units. Potential antigenic and functional variabilities in SfbI imply significant selective pressure in vivo with direct implications for the microbial pathogenesis of S. pyogenes. PMID:14662917
Genomically Intact Endogenous Feline Leukemia Viruses of Recent Origin

PubMed Central

Roca, Alfred L.; Pecon-Slattery, Jill; O'Brien, Stephen J.

2004-01-01

We isolated and sequenced two complete endogenous feline leukemia viruses (enFeLVs), designated enFeLV-AGTT and enFeLV-GGAG. In enFeLV-AGTT, the open reading frames are reminiscent of a functioning FeLV genome, and the 5′ and 3′ long terminal repeat sequences are identical. Neither endogenous provirus is genetically fixed in cats but polymorphic, with 8.9 and 15.2% prevalence for enFeLV-AGTT and enFeLV-GGAG, respectively, among a survey of domestic cats. Neither provirus was found in the genomes of related species of the Felis genus, previously shown to harbor enFeLVs. The absence of mutational divergence, polymorphic incidence in cats, and absence in related species suggest that these enFeLVs may have entered the germ line more recently than previously believed, perhaps coincident with domestication, and reopens the question of whether some enFeLVs might be replication competent. PMID:15047851
Genetic Characterization of Feline Leukemia Virus from Florida Panthers

PubMed Central

Brown, Meredith A.; Cunningham, Mark W.; Roca, Alfred L.; Troyer, Jennifer L.; Johnson, Warren E.

2008-01-01

From 2002 through 2005, an outbreak of feline leukemia virus (FeLV) occurred in Florida panthers (Puma concolor coryi). Clinical signs included lymphadenopathy, anemia, septicemia, and weight loss; 5 panthers died. Not associated with FeLV outcome were the genetic heritage of the panthers (pure Florida vs. Texas/Florida crosses) and co-infection with feline immunodeficiency virus. Genetic analysis of panther FeLV, designated FeLV-Pco, determined that the outbreak likely came from 1 cross-species transmission from a domestic cat. The FeLV-Pco virus was closely related to the domestic cat exogenous FeLV-A subgroup in lacking recombinant segments derived from endogenous FeLV. FeLV-Pco sequences were most similar to the well-characterized FeLV-945 strain, which is highly virulent and strongly pathogenic in domestic cats because of unique long terminal repeat and envelope sequences. These unique features may also account for the severity of the outbreak after cross-species transmission to the panther. PMID:18258118
Endogenous Retrovirus EAV-HP Linked to Blue Egg Phenotype in Mapuche Fowl

PubMed Central

Alcalde, José A.; Wang, Chen; Han, Jian-Lin; Gongora, Jaime; Gourichon, David; Tixier-Boichard, Michèle; Hanotte, Olivier

2013-01-01

Oocyan or blue/green eggshell colour is an autosomal dominant trait found in native chickens (Mapuche fowl) of Chile and in some of their descendants in European and North American modern breeds. We report here the identification of an endogenous avian retroviral (EAV-HP) insertion in oocyan Mapuche fowl and European breeds. Sequencing data reveals 100% retroviral identity between the Mapuche and European insertions. Quantitative real-time PCR analysis of European oocyan chicken indicates over-expression of the SLCO1B3 gene (P<0.05) in the shell gland and oviduct. Predicted transcription factor binding sites in the long terminal repeats (LTR) indicate AhR/Ar, a modulator of oestrogen, as a possible promoter/enhancer leading to reproductive tissue-specific over-expression of the SLCO1B3 gene. Analysis of all jungle fowl species Gallus sp. supports the retroviral insertion to be a post-domestication event, while identical LTR sequences within domestic chickens are in agreement with a recent de novo mutation. PMID:23990950
Endogenous retrovirus EAV-HP linked to blue egg phenotype in Mapuche fowl.

PubMed

Wragg, David; Mwacharo, Joram M; Alcalde, José A; Wang, Chen; Han, Jian-Lin; Gongora, Jaime; Gourichon, David; Tixier-Boichard, Michèle; Hanotte, Olivier

2013-01-01

Oocyan or blue/green eggshell colour is an autosomal dominant trait found in native chickens (Mapuche fowl) of Chile and in some of their descendants in European and North American modern breeds. We report here the identification of an endogenous avian retroviral (EAV-HP) insertion in oocyan Mapuche fowl and European breeds. Sequencing data reveals 100% retroviral identity between the Mapuche and European insertions. Quantitative real-time PCR analysis of European oocyan chicken indicates over-expression of the SLCO1B3 gene (P<0.05) in the shell gland and oviduct. Predicted transcription factor binding sites in the long terminal repeats (LTR) indicate AhR/Ar, a modulator of oestrogen, as a possible promoter/enhancer leading to reproductive tissue-specific over-expression of the SLCO1B3 gene. Analysis of all jungle fowl species Gallus sp. supports the retroviral insertion to be a post-domestication event, while identical LTR sequences within domestic chickens are in agreement with a recent de novo mutation.
Repression of chimeric transcripts emanating from endogenous retrotransposons by a sequence-specific transcription factor

PubMed Central

2014-01-01

Background: Retroviral elements are pervasively transcribed and dynamically regulated during development. While multiple histone- and DNA-modifying enzymes have broadly been associated with their global silencing, little is known about how the many diverse retroviral families are each selectively recognized. Results: Here we show that the zinc finger protein Krüppel-like Factor 3 (KLF3) specifically silences transcription from the ORR1A0 long terminal repeat in murine fetal and adult erythroid cells. In the absence of KLF3, we detect widespread transcription from ORR1A0 elements driven by the master erythroid regulator KLF1. In several instances these aberrant transcripts are spliced to downstream genic exons. One such chimeric transcript produces a novel, dominant negative isoform of PU.1 that can induce erythroid differentiation. Conclusions: We propose that KLF3 ensures the integrity of the murine erythroid transcriptome through the selective repression of a particular retroelement and is likely one of multiple sequence-specific factors that cooperate to achieve global silencing. PMID:24946810

Genetic characterization of feline leukemia virus from Florida panthers.

PubMed

Brown, Meredith A; Cunningham, Mark W; Roca, Alfred L; Troyer, Jennifer L; Johnson, Warren E; O'Brien, Stephen J

2008-02-01

From 2002 through 2005, an outbreak of feline leukemia virus (FeLV) occurred in Florida panthers (Puma concolor coryi). Clinical signs included lymphadenopathy, anemia, septicemia, and weight loss; 5 panthers died. Not associated with FeLV outcome were the genetic heritage of the panthers (pure Florida vs. Texas/Florida crosses) and co-infection with feline immunodeficiency virus. Genetic analysis of panther FeLV, designated FeLV-Pco, determined that the outbreak likely came from 1 cross-species transmission from a domestic cat. The FeLV-Pco virus was closely related to the domestic cat exogenous FeLV-A subgroup in lacking recombinant segments derived from endogenous FeLV. FeLV-Pco sequences were most similar to the well-characterized FeLV-945 strain, which is highly virulent and strongly pathogenic in domestic cats because of unique long terminal repeat and envelope sequences. These unique features may also account for the severity of the outbreak after cross-species transmission to the panther.
Genome plasticity in Streptomyces: identification of 1 Mb TIRs in the S. coelicolor A3(2) chromosome.

PubMed

Weaver, David; Karoonuthaisiri, Nitsara; Tsai, Hsiu-Hwei; Huang, Chih-Hung; Ho, Mai-Lan; Gai, Shuning; Patel, Kedar G; Huang, Jianqiang; Cohen, Stanley N; Hopwood, David A; Chen, Carton W; Kao, Camilla M

2004-03-01

The chromosomes of several widely used laboratory derivatives of Streptomyces coelicolor A3(2) were found to have 1.06 Mb inverted repeat sequences at their termini (i.e. long-terminal inverted repeats; L-TIRs), which are 50 times the length of the 22 kb TIRs of the sequenced S. coelicolor strain M145. The L-TIRs include 1005 annotated genes and increase the overall chromosome size to 9.7 Mb. The 1.06 Mb L-TIRs are the longest reported thus far for an actinomycete, and are proposed to represent the chromosomal state of the original soil isolate of S. coelicolor A3(2). S. coelicolor A3(2), M600 and J1501 possess L-TIRs, whereas approximately half the examined early mutants of A3(2) generated by ultraviolet (UV) or X-ray mutagenesis have truncated their TIRs to the 22 kb length. UV radiation was found to stimulate L-TIR truncation. Two copies of a transposase gene (SCO0020) flank 1.04 Mb of DNA in the right L-TIR, and recombination between them appears to generate strains containing short TIRs. This TIR reduction mechanism may represent a general strategy by which transposable elements can modulate the structure of chromosome ends. The presence of L-TIRs in certain S. coelicolor strains represents a major chromosomal alteration in strains previously thought to be genetically similar.
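To make the notion of a terminal inverted repeat concrete, here is a minimal Python sketch that measures how far the left end of a linear sequence matches the reverse complement of the right end. It assumes perfect matching and an unambiguous A/C/G/T alphabet; the function names are hypothetical, and this is not the method used in the study, which relied on experimental mapping of the chromosome ends.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(seq):
    """Length of the longest perfect terminal inverted repeat of a linear sequence.

    The left arm is read from the 5' end; the right arm is the reverse
    complement of the 3' end. Matching stops at the first mismatch and is
    capped at half the sequence length so the two arms never overlap.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    limit = len(seq) // 2
    n = 0
    while n < limit and seq[n] == rc[n]:
        n += 1
    return n


arm = "GATTACAG"
example = arm + "TTTTT" + reverse_complement(arm)
tir_len = terminal_inverted_repeat_length(example)
```

Real TIRs of the size reported here (22 kb to 1.06 Mb) would generally need a mismatch-tolerant alignment rather than this exact comparison.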
APE-Type Non-LTR Retrotransposons of Multicellular Organisms Encode Virus-Like 2A Oligopeptide Sequences, Which Mediate Translational Recoding during Protein Synthesis

PubMed Central

Odon, Valerie; Luke, Garry A.; Roulston, Claire; Brown, Jeremy D.; Ryan, Martin D.; Sukhodub, Andriy

2013-01-01

2A oligopeptide sequences (“2As”) mediate a cotranslational recoding event termed “ribosome skipping.” Previously we demonstrated the activity of 2As (and “2A-like sequences”) within a wide range of animal RNA virus genomes and non-long terminal repeat retrotransposons (non-LTRs) in the genomes of the unicellular organisms Trypanosoma brucei (Ingi) and T. cruzi (L1Tc). Here, we report the presence of 2A-like sequences in the genomes of a wide range of multicellular organisms and, as in the trypanosome genomes, within non-LTR retrotransposons (non-LTRs)—clustering in the Rex1, Crack, L2, L2A, and CR1 clades, in addition to Ingi. These 2A-like sequences were tested for translational recoding activity, and highly active sequences were found within the Rex1, L2, CR1, and Ingi clades. The presence of 2A-like sequences within non-LTRs may not only represent a method of controlling protein biogenesis but also shows some correlation with such apurinic/apyrimidinic DNA endonuclease-type non-LTRs encoding one, rather than two, open reading frames (ORFs). Interestingly, such non-LTRs cluster with closely related elements lacking 2A-like recoding elements but retaining ORF1. Taken together, these observations suggest that acquisition of 2A-like translational recoding sequences may have played a role in the evolution of these elements. PMID:23728794
Molecular characterization and distribution of a 145-bp tandem repeat family in the genus Populus.

PubMed

Rajagopal, J; Das, S; Khurana, D K; Srivastava, P S; Lakshmikumaran, M

1999-10-01

This report aims to describe the identification and molecular characterization of a 145-bp tandem repeat family that accounts for nearly 1.5% of the Populus genome. Three members of this repeat family were cloned and sequenced from Populus deltoides and P. ciliata. The dimers of the repeat were sequenced in order to confirm the head-to-tail organization of the repeat. Hybridization-based analysis using the 145-bp tandem repeat as a probe on genomic DNA gave rise to ladder patterns which were identified to be a result of methylation and (or) sequence heterogeneity. Analysis of the methylation pattern of the repeat family using methylation-sensitive isoschizomers revealed variable methylation of the C residues and lack of methylation of the A residues. Sequence comparisons between the monomers revealed a high degree of sequence divergence that ranged between 6% and 11% in P. deltoides and between 4.2% and 8.3% in P. ciliata. This indicated the presence of sub-families within the 145-bp tandem family of repeats. Divergence was mainly due to the accumulation of point mutations and was concentrated in the central region of the repeat. The 145-bp tandem repeat family did not show significant homology to known tandem repeats from plants. A short stretch of 36 bp was found to show homology of 66.7% to a centromeric repeat from Chironomus plumosus. Dot-blot analysis and Southern hybridization data revealed the presence of the repeat family in 13 of the 14 Populus species examined. The absence of the 145-bp repeat from P. euphratica suggested that this species is relatively distant from other members of the genus, which correlates with taxonomic classifications. The widespread occurrence of the tandem family in the genus indicated that this family may be of ancient origin.
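The divergence figures quoted for the monomers (for example, 6% to 11% in P. deltoides) can be read as the percentage of differing positions between aligned sequences. The abstract does not state the exact formula, so the Python sketch below makes the assumption that divergence is the fraction of mismatched positions among aligned, non-gap columns; `percent_divergence` is a hypothetical helper name.

```python
def percent_divergence(aligned_a, aligned_b):
    """Percent of mismatched positions between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are ignored (an assumption;
    other conventions count gaps as differences).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


divergence = percent_divergence("ACGTACGTAC", "ACGTTCGTAC")
```

For a 145-bp monomer, a divergence of 6% under this definition would correspond to roughly 9 mismatched positions.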
Small tandemly repeated DNA sequences of higher plants likely originate from a tRNA gene ancestor.

PubMed Central

Benslimane, A A; Dron, M; Hartmann, C; Rode, A

1986-01-01

Several monomers (177 bp) of a tandemly arranged repetitive nuclear DNA sequence of Brassica oleracea have been cloned and sequenced. They share up to 95% homology between one another and up to 80% with other satellite DNA sequences of Cruciferae, suggesting a common ancestor. Both strands of these monomers show more than 50% homology with many tRNA genes; the best homologies have been obtained with Lys and His yeast mitochondrial tRNA genes (respectively 64% and 60%). These results suggest that small tandemly repeated DNA sequences of plants may have evolved from a tRNA gene ancestor. These tandem repeats have probably arisen via a process involving reverse transcription of polymerase III RNA intermediates, as is the case for interspersed DNA sequences of mammals. A model is proposed to explain the formation of such small tandemly repeated DNA sequences. PMID:3774553
Structural Studies of Geosmin Synthase, a Bifunctional Sesquiterpene Synthase with Alpha-Alpha Domain Architecture that Catalyzes a Unique Cyclization-Fragmentation Reaction Sequence

PubMed Central

Harris, Golda G.; Lombardi, Patrick M.; Pemberton, Travis A.; Matsui, Tsutomu; Weiss, Thomas M.; Cole, Kathryn E.; Köksal, Mustafa; Murphy, Frank V.; Vedula, L. Sangeetha; Chou, Wayne K.W.; Cane, David E.; Christianson, David W.

2015-01-01

Geosmin synthase from Streptomyces coelicolor (ScGS) catalyzes an unusual, metal-dependent terpenoid cyclization and fragmentation reaction sequence. Two distinct active sites are required for catalysis: the N-terminal domain catalyzes the ionization and cyclization of farnesyl diphosphate to form germacradienol and inorganic pyrophosphate (PPi), and the C-terminal domain catalyzes the protonation, cyclization, and fragmentation of germacradienol to form geosmin and acetone through a retro-Prins reaction. A unique αα domain architecture is predicted for ScGS based on amino acid sequence: each domain contains the metal-binding motifs typical of a class I terpenoid cyclase, and each domain requires Mg2+ for catalysis. Here, we report the X-ray crystal structure of the unliganded N-terminal domain of ScGS and the structure of its complex with 3 Mg2+ ions and alendronate. These structures highlight conformational changes required for active site closure and catalysis. Although neither full-length ScGS nor constructs of the C-terminal domain could be crystallized, homology models of the C-terminal domain were constructed based on ~36% sequence identity with the N-terminal domain. Small-angle X-ray scattering experiments yield low resolution molecular envelopes into which the N-terminal domain crystal structure and the C-terminal domain homology model were fit, suggesting possible αα domain architectures as frameworks for bifunctional catalysis. PMID:26598179
The history and advances of reversible terminators used in new generations of sequencing technology.

PubMed

Chen, Fei; Dong, Mengxing; Ge, Meng; Zhu, Lingxiang; Ren, Lufeng; Liu, Guocheng; Mu, Rong

2013-02-01

DNA sequencing using reversible terminators, as one sequencing-by-synthesis strategy, has garnered a great deal of interest due to its popular application in second-generation high-throughput DNA sequencing technology. In this review, we provided the development history, classification, and working mechanism of this technology. We also outlined the screening strategies for DNA polymerases to accommodate the reversible terminators as substrates during polymerization; particularly, we introduced the "REAP" method developed by us. At the end of this review, we discussed current limitations of this approach and provided potential solutions to extend its application. Copyright © 2013. Production and hosting by Elsevier Ltd.
Development of Pineapple Microsatellite Markers and Germplasm Genetic Diversity Analysis

PubMed Central

Tong, Helin; Chen, You; Wang, Jingyi; Chen, Yeyuan; Sun, Guangming; He, Junhu; Wu, Yaoting

2013-01-01

Two methods were used to develop pineapple microsatellite markers. Genomic library-based SSR development: using selectively amplified microsatellite assay, 86 sequences were generated from a pineapple genomic library. 91 (96.8%) of the 94 Simple Sequence Repeat (SSR) loci were dinucleotide repeats (39 AC/GT repeats and 52 GA/TC repeats, accounting for 42.9% and 57.1%, respectively), and the other three were mononucleotide repeats. Thirty-six pairs of SSR primers were designed; 24 of them generated clear bands of expected sizes, and 13 of them showed polymorphism. EST-based SSR development: 5659 pineapple EST sequences obtained from NCBI were analyzed; among 1397 nonredundant EST sequences, 843 were found to contain 1110 SSR loci (217 of them contained more than one SSR locus). The frequency of SSRs in pineapple EST sequences is 1 SSR/3.73 kb, and 44 types were found. Mononucleotide, dinucleotide, and trinucleotide repeats dominate, accounting for 95.6% in total. AG/CT and AGC/GCT were the dominant types of dinucleotide and trinucleotide repeats, accounting for 83.5% and 24.1%, respectively. Thirty pairs of primers were designed for each of 30 randomly selected sequences; 26 of them generated clear and reproducible bands, and 22 of them showed polymorphism. Eighteen pairs of primers obtained by one or the other of the two methods above that showed polymorphism were selected to carry out germplasm genetic diversity analysis for 48 breeds of pineapple; similarity coefficients of these breeds were between 0.59 and 1.00, and they can be divided into four groups accordingly. Amplification products of five SSR markers were extracted and sequenced; the corresponding repeat loci were found, and locus mutations are mainly in the copy number of repeats and base mutations in the flanking region. PMID:24024187
Preparation and structural determination of large oligosaccharides derived from acharan sulfate

PubMed Central

Chi, Lianli; Munoz, Eva M.; Choi, Hyung Seok; Ha, Young Wan; Kim, Yeong Shik; Toida, Toshihiko; Linhardt, Robert J.

2014-01-01

The structures of a series of large oligosaccharides derived from acharan sulfate were characterized. Acharan sulfate is an unusual glycosaminoglycan isolated from the giant African snail, Achatina fulica. Oligosaccharides from decasaccharide to hexadecasaccharide were enzymatically prepared using heparin lyase II and purified. Capillary electrophoresis and gel electrophoresis confirmed the purity of these oligosaccharides. Their structures, determined by ESI-MS and NMR, were consistent with the major repeating sequence in acharan sulfate, →4)-α-d-GlcNpAc-(1→4)-α-l-IdoAp2S-(1→, terminated by 4-linked α-d-GlcNpAc residue at the reducing end and by 4,5-unsaturated pyranosyluronic acid 2-sulfate at the non-reducing end. PMID:16530176
Reverse Transcription Quantitative Polymerase Chain Reaction for Detection of and Differentiation Between RNA and DNA of HIV-1-Based Lentiviral Vectors.

PubMed

Pavlovic, Melanie; Koehler, Nina; Anton, Martina; Dinkelmeier, Anna; Haase, Maren; Stellberger, Thorsten; Busch, Ulrich; Baiker, Armin E

2017-08-01

The purpose of the described method is the detection of and differentiation between RNA and DNA of human immunodeficiency virus (HIV)-derived lentiviral vectors (LV) in cell culture supernatants and swab samples. For the analytical surveillance of genetic engineering operations, methods for the detection of the HIV-1-based LV generations are required. Furthermore, for research issues, it is important to prove the absence of LV particles for downgrading experimental settings in terms of the biosafety level. Here, a quantitative polymerase chain reaction method targeting the long terminal repeat U5 subunit and the start sequence of the packaging signal ψ is described. Numerous controls are included in order to monitor the technical procedure.
Fission yeast retrotransposon Tf1 integration is targeted to 5' ends of open reading frames.

PubMed

Behrens, R; Hayles, J; Nurse, P

2000-12-01

Target site selection of transposable elements is usually not random but involves some specificity for a DNA sequence or a DNA binding host factor. We have investigated the target site selection of the long terminal repeat-containing retrotransposon Tf1 from the fission yeast Schizosaccharomyces pombe. By monitoring induced transposition events we found that Tf1 integration sites were distributed throughout the genome. Mapping these insertions revealed that Tf1 did not integrate into open reading frames, but occurred preferentially in longer intergenic regions with integration biased towards a region 100-420 bp upstream of the translation start site. Northern blot analysis showed that transcription of genes adjacent to Tf1 insertions was not significantly changed.
Fission yeast retrotransposon Tf1 integration is targeted to 5′ ends of open reading frames

PubMed Central

Behrens, Ralf; Hayles, Jacky; Nurse, Paul

2000-01-01

Target site selection of transposable elements is usually not random but involves some specificity for a DNA sequence or a DNA binding host factor. We have investigated the target site selection of the long terminal repeat-containing retrotransposon Tf1 from the fission yeast Schizosaccharomyces pombe. By monitoring induced transposition events we found that Tf1 integration sites were distributed throughout the genome. Mapping these insertions revealed that Tf1 did not integrate into open reading frames, but occurred preferentially in longer intergenic regions with integration biased towards a region 100–420 bp upstream of the translation start site. Northern blot analysis showed that transcription of genes adjacent to Tf1 insertions was not significantly changed. PMID:11095681
The C-terminal region of Ge-1 presents conserved structural features required for P-body localization.

PubMed

Jinek, Martin; Eulalio, Ana; Lingel, Andreas; Helms, Sigrun; Conti, Elena; Izaurralde, Elisa

2008-10-01

The removal of the 5' cap structure by the DCP1-DCP2 decapping complex irreversibly commits eukaryotic mRNAs to degradation. In human cells, the interaction between DCP1 and DCP2 is bridged by the Ge-1 protein. Ge-1 contains an N-terminal WD40-repeat domain connected by a low-complexity region to a conserved C-terminal domain. It was reported that the C-terminal domain interacts with DCP2 and mediates Ge-1 oligomerization and P-body localization. To understand the molecular basis for these functions, we determined the three-dimensional crystal structure of the most conserved region of the Drosophila melanogaster Ge-1 C-terminal domain. The region adopts an all alpha-helical fold related to ARM- and HEAT-repeat proteins. Using structure-based mutants we identified an invariant surface residue affecting P-body localization. The conservation of critical surface and structural residues suggests that the C-terminal region adopts a similar fold with conserved functions in all members of the Ge-1 protein family.
Passenger comfort during terminal-area flight maneuvers. M.S. Thesis.

NASA Technical Reports Server (NTRS)

Schoonover, W. E., Jr.

1976-01-01

A series of flight experiments was conducted to obtain passenger subjective responses to closely controlled and repeatable flight maneuvers. In 8 test flights, reactions were obtained from 30 passenger subjects to a wide range of terminal-area maneuvers, including descents, turns, decelerations, and combinations thereof. Analysis of the passenger rating variance indicated that the objective of a repeatable flight passenger environment was achieved. Multiple linear regression models developed from the test data were used to define maneuver motion boundaries for specified degrees of passenger acceptance.
A novel tandem repeat sequence located on human chromosome 4p: isolation and characterization.

PubMed

Kogi, M; Fukushige, S; Lefevre, C; Hadano, S; Ikeda, J E

1997-06-01

In an effort to analyze the genomic region of the distal half of human chromosome 4p, where Huntington disease and other diseases have been mapped, we have isolated a cosmid clone (CRS447) that was likely to contain a region with specific repeat sequences. Clone CRS447 was subjected to detailed analysis, including chromosome mapping, restriction mapping, and DNA sequencing. Chromosome mapping by both a human-CHO hybrid cell panel and FISH revealed that CRS447 was predominantly located in the 4p15.1-15.3 region. CRS447 was shown to consist of tandem repeats of 4.7-kb units present on chromosome 4p. A single EcoRI unit was subcloned (pRS447), and the complete sequence was determined as 4752 nucleotides. When pRS447 was used as a probe, the number of copies of this repeat per haploid genome was estimated to be 50-70. Sequence analysis revealed that it contained two internal CA repeats and one putative ORF. Database search established that this sequence was unreported. However, two homologous STS markers were found in the database. We concluded that CRS447/pRS447 is a novel tandem repeat sequence that is mainly specific to human chromosome 4p.
Isolation of prolactin and growth hormone from the pituitary of the holostean fish Amia calva.

PubMed

Dores, R M; Noso, T; Rand-Weaver, M; Kawauchi, H

1993-06-01

Pituitaries from adult male and female Amia calva (Order Holostei) were acid extracted and fractionated by gel filtration column chromatography and reversed-phase high performance liquid chromatography. This two-step isolation procedure yielded homogeneous pools of Amia prolactin (PRL) and growth hormone (GH). The amino acid composition of both purified polypeptides was determined. Primary sequence analysis of the first 22 positions at the N-terminal of Amia PRL revealed that this region has 63% sequence identity with eel PRL-1. The N-terminal region of Amia PRL lacks the disulfide bridge which is characteristic of tetrapod PRLs. Primary sequence analysis of the first 24 positions at the N-terminal of Amia GH revealed that this region has 62% sequence identity with eel GH and 54% sequence identity with both blue shark GH and sea turtle GH. Based on N-terminal analysis, it appears that Amia PRL and GH are more closely related to teleost PRLs and GHs than they are to tetrapod PRLs and GHs.
Evolution in Action: N and C Termini of Subunits in Related T=4 Viruses Exchange Roles as Molecular Switches

PubMed Central

Speir, Jeffrey A.; Taylor, Derek J.; Natarajan, Padmaja; Pringle, Fiona M.; Ball, L. Andrew; Johnson, John E.

2010-01-01

Summary The T=4 tetravirus and T=3 nodavirus capsid proteins undergo closely similar autoproteolysis to produce the N-terminal β and C-terminal, lipophilic γ polypeptides. The γ peptides and N-termini of β also act as molecular switches that determine their quasi-equivalent capsid structures. The crystal structure of Providence virus (PrV), only the second of a tetravirus (the first was NωV), reveals conserved folds and cleavage sites, but the protein termini have completely different structures and the opposite functions of those in NωV. N-termini of β form the molecular switch in PrV, while γ peptides have this role in NωV. PrV γ peptides instead interact with packaged RNA at the particle 2-folds using a repeating sequence pattern found in only four other RNA or membrane binding proteins. The disposition of peptide termini in PrV is closely related to those in nodaviruses suggesting that PrV may be closer to the primordial T=4 particle than NωV. PMID:20541507
Errant processing and structural alterations of genomes present in a varicella-zoster virus vaccine.

PubMed Central

Vlazny, D A; Hyman, R W

1985-01-01

Five minority populations of aberrant, varicella-zoster virus (VZV)-derived genomes were identified among the encapsidated DNAs obtained from the nuclear and cytoplasmic fractions of an in vitro infection initiated with a lyophilized sample of the BIKEN VZV vaccine (strain Oka). These were (i) VZV genomes, present within nuclear but not cytoplasmic viral capsids, which had been cleaved at a specific site within the short segment and which were, therefore, 3.15 megadaltons (approximately 4% of the VZV genome length) short of full length; (ii) highly deleted, repetitive VZV genomes which contained the errant cleavage site but not the usual VZV genome terminal sequences; (iii) VZV genomes into which multiples of 1 through 5 defective genome repeat units had been inserted into a homologous site; (iv) VZV genomes with additions of 0.1 or 0.18 megadaltons of DNA at both the terminal and internal ends of the short segment; and (v) VZV DNA which had lost the HindIII restriction site at map position 0.11. Images PMID:2993670
Secretion of CyaA-PrtB and HlyA-PrtB fusion proteins in Escherichia coli: involvement of the glycine-rich repeat domain of Erwinia chrysanthemi protease B.

PubMed Central

Létoffé, S; Wandersman, C

1992-01-01

Protease B from Erwinia chrysanthemi was shown previously to have a C-terminal secretion signal located downstream of a domain that contains six glycine-rich repeats. This domain is conserved in all known bacterial proteins secreted by the signal peptide-independent pathway. The role of these repeats in the secretion process is controversial. We compared the secretion processes of various heterologous polypeptides fused either directly to the signal or separated from it by the glycine-rich domain. Although the repeats are not involved in the secretion of small truncated protease B carboxy-terminal peptides, they are required for the secretion of higher-molecular-weight fusion proteins. Secretion efficiency was also dependent on the size of the passenger polypeptide. Images PMID:1629152
Definition of RNA Polymerase II CoTC Terminator Elements in the Human Genome

PubMed Central

Nojima, Takayuki; Dienstbier, Martin; Murphy, Shona; Proudfoot, Nicholas J.; Dye, Michael J.

2013-01-01

Summary Mammalian RNA polymerase II (Pol II) transcription termination is an essential step in protein-coding gene expression that is mediated by pre-mRNA processing activities and DNA-encoded terminator elements. Although much is known about the role of pre-mRNA processing in termination, our understanding of the characteristics and generality of terminator elements is limited. Whereas promoter databases list up to 40,000 known and potential Pol II promoter sequences, fewer than ten Pol II terminator sequences have been described. Using our knowledge of the human β-globin terminator mechanism, we have developed a selection strategy for mapping mammalian Pol II terminator elements. We report the identification of 78 cotranscriptional cleavage (CoTC)-type terminator elements at endogenous gene loci. The results of this analysis pave the way for the full understanding of Pol II termination pathways and their roles in gene expression. PMID:23562152

The immunoglobulin heavy chain locus of the duck. Genomic organization and expression of D, J, and C region genes.

PubMed

Lundqvist, M L; Middleton, D L; Hazard, S; Warr, G W

2001-12-14

The region of the duck IgH locus extending from upstream of the proximal diversity (D) segment to downstream of the constant gene cluster has been cloned and mapped. A sequence contig of 48,796 base pairs established that the organization of the genes is D-J(H)-mu-alpha-upsilon. No evidence for a functional homologue (or remnant) of a delta gene was found. The alpha gene is in inverted transcriptional orientation; class switch to IgA expression thus requires inversion of the approximately 27-kilobase pair region that includes both mu and alpha genes. The secreted forms of duck alpha and mu are each encoded by 4 constant region exons, and the hydrophobic C-terminal regions of the membrane receptor forms of alpha and mu are encoded by one and two transmembrane exons, respectively. Putative switch (S) regions were identified for duck mu and upsilon by comparison with chicken Smu and Supsilon sequences and for duck alpha by comparison with mouse Salpha. The duck IgH locus is rich in complex variable number tandem repeats, which occupy approximately 60% of the sequenced region, and occur at a much higher frequency in the IgH locus than in other sequenced regions of the duck genome.
In vitro excision of adeno-associated virus DNA from recombinant plasmids: Isolation of an enzyme fraction from HeLa cells that cleaves DNA at poly(G) sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gottlieb, J.; Muzyczka, N.

1988-06-01

When circular recombinant plasmids containing adeno-associated virus (AAV) DNA sequences are transfected into human cells, the AAV provirus is rescued. Using these circular AAV plasmids as substrates, the authors isolated an enzyme fraction from HeLa cell nuclear extracts that excises intact AAV DNA in vitro from vector DNA and produces linear DNA products. The recognition signal for the enzyme is a polypurine-polypyrimidine sequence which is at least 9 residues long and rich in G·C base pairs. Such sequences are present in AAV recombinant plasmids as part of the first 15 base pairs of the AAV terminal repeat and in some cases as the result of cloning the AAV genome by G·C tailing. The isolated enzyme fraction does not have significant endonucleolytic activity on single-stranded or double-stranded DNA. Plasmid DNA that is transfected into tissue culture cells is cleaved in vivo to produce a pattern of DNA fragments similar to that seen with purified enzyme in vitro. The activity has been called endo R for rescue, and its behavior suggests that it may have a role in recombination of cellular chromosomes.
Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

PubMed

Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

2011-01-01

Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.
Survey and Analysis of Microsatellites in the Silkworm, Bombyx mori

PubMed Central

Prasad, M. Dharma; Muthulakshmi, M.; Madhu, M.; Archak, Sunil; Mita, K.; Nagaraju, J.

2005-01-01

We studied microsatellite frequency and distribution in 21.76-Mb random genomic sequences, 0.67-Mb BAC sequences from the Z chromosome, and 6.3-Mb EST sequences of Bombyx mori. We mined microsatellites of ≥15 bases of mononucleotide repeats and ≥5 repeat units of other classes of repeats. We estimated that microsatellites account for 0.31% of the genome of B. mori. Microsatellite tracts of A, AT, and ATT were the most abundant whereas their number drastically decreased as the length of the repeat motif increased. In general, tri- and hexanucleotide repeats were overrepresented in the transcribed sequences except TAA, GTA, and TGA, which were in excess in genomic sequences. The Z chromosome sequences contained shorter repeat types than the rest of the chromosomes in addition to a higher abundance of AT-rich repeats. Our results showed that base composition of the flanking sequence has an influence on the origin and evolution of microsatellites. Transitions/transversions were high in microsatellites of ESTs, whereas the genomic sequence had an equal number of substitutions and indels. The average heterozygosity value for 23 polymorphic microsatellite loci surveyed in 13 diverse silkmoth strains having 2–14 alleles was 0.54. Only 36 (18.2%) of 198 microsatellite loci were polymorphic between the two divergent silkworm populations and 10 (5%) loci revealed null alleles. The microsatellite map generated using these polymorphic markers resulted in 8 linkage groups. B. mori microsatellite loci were the most conserved in its immediate ancestor, B. mandarina, followed by the wild saturniid silkmoth, Antheraea assama. PMID:15371363
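The mining thresholds stated in this abstract (mononucleotide tracts of at least 15 bases, and at least 5 repeat units for other repeat classes) can be illustrated with a minimal sketch. It assumes perfect (uninterrupted) repeats and motif lengths of 1 to 6 bases; the function name `find_microsatellites` and its parameters are hypothetical and are not taken from the study, whose actual mining software is not described here.

```python
import re

def find_microsatellites(seq, max_motif=6, min_mono=15, min_units=5):
    """Return (start, motif, units) for perfect tandem repeats in seq.

    Thresholds follow the abstract: >= min_mono bases for mononucleotide
    tracts and >= min_units repeat units for longer motifs. Restricting
    the search to perfect repeats and to motifs of 1..max_motif bases is
    an assumption made for this illustration.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        min_rep = min_mono if k == 1 else min_units
        # One motif of length k followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "ATAT" is left to the dinucleotide pass as "AT".
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits
```

For example, a sequence containing six AT units and a run of fifteen A bases yields one dinucleotide hit and one mononucleotide hit, while four CAG units fall below the five-unit threshold and are not reported.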
Thermal and chemical denaturation of the BRCT functional module of human 53BP1.

PubMed

Thanassoulas, Angelos; Nomikos, Michail; Theodoridou, Maria; Stavros, Philemon; Mastellos, Dimitris; Nounesis, George

2011-10-01

BRCTs are protein-docking modules involved in eukaryotic DNA repair. They are characterized by low sequence homology with generally well-conserved structure organization. In a considerable number of proteins, a pair of BRCT structural repeats occurs, connected with inter-BRCT linkers, variable in length, sequence and structure. Linkers may separate and control the relative position of BRCT domains as well as protect and stabilize the hydrophobic inter-BRCT interface region. Their vital role in protein function has been demonstrated by recent findings associating missense mutations in the inter-repeat linker region of the BRCT domain of BRCA1 (BRCA1-BRCT) to hereditary breast/ovarian cancer. The interaction of 53BP1 with the core domain of the p53 tumor suppressor involves the C-terminal BRCT repeat as well as the inter-BRCT linker of the tandem BRCT domain of 53BP1 (53BP1-BRCT). High-accuracy differential scanning calorimetry (DSC) and circular dichroism (CD) have been employed to characterize the heat-induced unfolding of 53BP1-BRCT domain. The calorimetric results provide evidence for unfolding to an intermediate, only partly unfolded state, which, based on the CD results, retains the secondary structural characteristics of the native protein. A direct comparison with the corresponding thermal processes for BRCA1-BRCT and BARD1-BRCT provides evidence that the observed behavior is analogous to BRCA1-BRCT even though the two domains differ substantially in the linker structure. Moreover, chemical denaturation experiments of the untagged 53BP1-BRCT and comparison with BRCA1 and BARD1 BRCTs show that no clear association can be drawn between the structural organization of the inter-BRCT linkers and the overall stability of the BRCT domains. Copyright © 2011 Elsevier B.V. All rights reserved.
Spatio-temporal Variations of Characteristic Repeating Earthquake Sequences along the Middle America Trench in Mexico

NASA Astrophysics Data System (ADS)

Dominguez, L. A.; Taira, T.; Hjorleifsdottir, V.; Santoyo, M. A.

2015-12-01

Repeating earthquake sequences are sets of events that are thought to rupture the same area on the plate interface and thus provide nearly identical waveforms. We systematically analyzed seismic records from 2001 through 2014 to identify repeating earthquakes with highly correlated waveforms occurring along the subduction zone of the Cocos plate. Using the correlation coefficient (cc) and spectral coherency (coh) of the vertical components as selection criteria, we found a set of 214 sequences whose waveforms exceed cc≥95% and coh≥95%. Spatial clustering along the trench shows large variations in repeating earthquake activity. In particular, the rupture zone of the M8.1, 1985 earthquake shows an almost complete absence of characteristic repeating earthquakes, whereas the Guerrero Gap zone and the segment of the trench close to the Guerrero-Oaxaca border show a significantly larger number of repeating earthquake sequences. Furthermore, temporal variations associated with stress changes due to major earthquakes show episodes of unlocking and healing of the interface. Understanding the different components that control the location and recurrence time of characteristic repeating sequences is a key factor to pinpoint areas where large megathrust earthquakes may nucleate and consequently to improve the seismic hazard assessment.
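The correlation-coefficient criterion mentioned in this abstract (cc of at least 95%) can be sketched as follows. This is only an illustration under stated assumptions: the two traces are taken to be already aligned and of equal length, the coefficient is the zero-lag Pearson correlation, and the spectral-coherency test is not shown. The function names `correlation_coefficient` and `is_candidate_repeater` are hypothetical; the abstract does not describe the authors' actual processing chain.

```python
import math

def correlation_coefficient(a, b):
    """Zero-lag Pearson correlation between two equal-length traces."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("traces must have the same length (>= 2 samples)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        raise ValueError("a constant trace has no defined correlation")
    return sum(x * y for x, y in zip(da, db)) / denom

def is_candidate_repeater(a, b, threshold=0.95):
    """Apply the cc >= 0.95 criterion; coherency would be a second test."""
    return correlation_coefficient(a, b) >= threshold
```

A scaled and offset copy of a waveform passes the test because the Pearson coefficient is insensitive to amplitude and baseline, which is why such a measure suits events of similar shape but different size.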
Algorithm to find distant repeats in a single protein sequence

PubMed Central

Banerjee, Nirjhar; Sarani, Rangarajan; Ranjani, Chellamuthu Vasuki; Sowmiya, Govindaraj; Michael, Daliah; Balakrishnan, Narayanasamy; Sekar, Kanagaraj

2008-01-01

Distant repeats in protein sequences play an important role in various aspects of protein analysis. A careful analysis of the distant repeats would make it possible to establish a firm relation between the repeats and their function and three-dimensional structure during the evolutionary process. Further, it sheds light on the diversity of duplication during evolution. To this end, an algorithm has been developed to find all distant repeats in a protein sequence. The scores from the Point Accepted Mutation (PAM) matrix have been used to identify amino acid substitutions while detecting the distant repeats. Due to the biological importance of distant repeats, the proposed algorithm will be of importance to structural biologists, molecular biologists, biochemists and researchers involved in phylogenetic and evolutionary studies. PMID:19052663
Circularization of the HIV-1 genome facilitates strand transfer during reverse transcription

PubMed Central

Beerens, Nancy; Kjems, Jørgen

2010-01-01

Two obligatory DNA strand transfers take place during reverse transcription of a retroviral RNA genome. The first strand transfer involves a jump from the 5′ to the 3′ terminal repeat (R) region positioned at each end of the viral genome. The process depends on base pairing between the cDNA synthesized from the 5′ R region and the 3′ R RNA. The tertiary conformation of the viral RNA genome may facilitate strand transfer by juxtaposing the 5′ R and 3′ R sequences that are 9 kb apart in the linear sequence. In this study, RNA sequences involved in an interaction between the 5′ and 3′ ends of the HIV-1 genome were mapped by mutational analysis. This interaction appears to be mediated mainly by a sequence in the extreme 3′ end of the viral genome and in the gag open reading frame. Mutation of 3′ R sequences was found to inhibit the 5′–3′ interaction, which could be restored by a complementary mutation in the 5′ gag region. Furthermore, we find that circularization of the HIV-1 genome does not affect the initiation of reverse transcription, but stimulates the first strand transfer during reverse transcription in vitro, underscoring the functional importance of the interaction. PMID:20430859
Saccharomyces cerevisiae SSB1 protein and its relationship to nucleolar RNA-binding proteins.

PubMed Central

Jong, A Y; Clark, M W; Gilbert, M; Oehm, A; Campbell, J L

1987-01-01

To better define the function of Saccharomyces cerevisiae SSB1, an abundant single-stranded nucleic acid-binding protein, we determined the nucleotide sequence of the SSB1 gene and compared it with those of other proteins of known function. The amino acid sequence contains 293 amino acid residues and has an Mr of 32,853. There are several stretches of sequence characteristic of other eucaryotic single-stranded nucleic acid-binding proteins. At the amino terminus, residues 39 to 54 are highly homologous to a peptide in calf thymus UP1 and UP2 and a human heterogeneous nuclear ribonucleoprotein. Residues 125 to 162 constitute a fivefold tandem repeat of the sequence RGGFRG, the composition of which suggests a nucleic acid-binding site. Near the C terminus, residues 233 to 245 are homologous to several RNA-binding proteins. Of 18 C-terminal residues, 10 are acidic, a characteristic of the procaryotic single-stranded DNA-binding proteins and eucaryotic DNA- and RNA-binding proteins. In addition, examination of the subcellular distribution of SSB1 by immunofluorescence microscopy indicated that SSB1 is a nuclear protein, predominantly located in the nucleolus. Sequence homologies and the nucleolar localization make it likely that SSB1 functions in RNA metabolism in vivo, although an additional role in DNA metabolism cannot be excluded. Images PMID:2823109
The genomic sequence of ectromelia virus, the causative agent of mousepox.

PubMed

Chen, Nanhai; Danila, Maria I; Feng, Zehua; Buller, R Mark L; Wang, Chunlin; Han, Xiaosi; Lefkowitz, Elliot J; Upton, Chris

2003-12-05

Ectromelia virus is the causative agent of mousepox, an acute exanthematous disease of mouse colonies in Europe, Japan, China, and the U.S. The Moscow, Hampstead, and NIH79 strains are the most thoroughly studied with the Moscow strain being the most infectious and virulent for the mouse. In the late 1940s mousepox was proposed as a model for the study of the pathogenesis of smallpox and generalized vaccinia in humans. Studies in the last five decades from a succession of investigators have resulted in a detailed description of the virologic and pathologic disease course in genetically susceptible and resistant inbred and out-bred mice. We report the DNA sequence of the left-hand end, the predicted right-hand terminal repeat, and central regions of the genome of the Moscow strain of ectromelia virus (approximately 177,500 bp), which together with the previously sequenced right-hand end, yields a genome of 209,771 bp. We identified 175 potential genes specifying proteins of between 53 and 1924 amino acids, and 29 regions containing sequences related to genes predicted in other poxviruses, but unlikely to encode for functional proteins in ectromelia virus. The translated protein sequences were compared with the protein database for structure/function relationships, and these analyses were used to investigate poxvirus evolution and to attempt to explain at the cellular and molecular level the well-characterized features of the ectromelia virus natural life cycle.
An H2A Histone Isotype, H2ac, Associates with Telomere and Maintains Telomere Integrity

PubMed Central

Tzeng, Tsai-Yu; Lin, I-Hsuan; Hsu, Ming-Ta

2016-01-01

Telomeres are capped at the ends of eukaryotic chromosomes and are composed of TTAGGG repeats bound to the shelterin complex. Here we report that a replication-dependent histone H2A isotype, H2ac, was associated with telomeres in human cells and co-immunoprecipitates with telomere repeat factor 2 (TRF2) and protection of telomeres protein 1 (POT1), whereas other histone H2A isotypes and mutations of H2ac did not bind to telomeres or these two proteins. The amino terminal basic domain of TRF2 was necessary for the association with H2ac and for the recruitment of H2ac to telomeres. Depletion of H2ac led to loss of telomeric repeat sequences, the appearance of dysfunctional telomeres, and chromosomal instability, including chromosomal breaks and anaphase bridges, as well as accumulation of telomere-associated DNA damage factors in H2ac depleted cells. Additionally, knockdown of H2ac elicits an ATM-dependent DNA damage response at telomeres and depletion of XPF protects telomeres against H2ac-deficiency-induced G-strand overhangs loss and DNA damage response, and prevents chromosomal instability. These findings suggest that the H2A isotype, H2ac, plays an essential role in maintaining telomere functional integrity. PMID:27228173
Molecular architecture of silk fibroin of Indian golden silkmoth, Antheraea assama.

PubMed

Gupta, Adarsh K; Mita, Kazuei; Arunkumar, Kallare P; Nagaraju, Javaregowda

2015-08-03

The golden silk spun by Indian golden silkmoth Antheraea assama, is regarded for its shimmering golden luster, tenacity and value as biomaterial. This report describes the gene coding for golden silk H-fibroin (AaFhc), its expression, full-length sequence and structurally important motifs discerning the underlying genetic and biochemical factors responsible for its much sought-after properties. The coding region, with biased isocodons, encodes highly repetitious crystalline core, flanked by a pair of 5' and 3' non-repetitious ends. AaFhc mRNA expression is strictly territorial, confined to the posterior silk gland, encoding a protein of size 230 kDa, which makes homodimers making the elementary structural units of the fibrous core of the golden silk. Characteristic polyalanine repeats that make tight β-sheet crystals alternate with non-polyalanine repeats that make less orderly antiparallel β-sheets, β-turns and partial α-helices. Phylogenetic analysis of the conserved N-terminal amorphous motif and the comparative analysis of the crystalline region with other saturniid H-fibroins reveal that AaFhc has longer, numerous and relatively uniform repeat motifs with lower serine content that assume tighter β-crystals and denser packing, which are speculated to be responsible for its acclaimed properties of higher tensile strength and higher refractive index responsible for golden luster.
The SpTransformer Gene Family (Formerly Sp185/333) in the Purple Sea Urchin and the Functional Diversity of the Anti-Pathogen rSpTransformer-E1 Protein

PubMed Central

Smith, L. Courtney; Lun, Cheng Man

2017-01-01

The complex innate immune system of sea urchins is underpinned by several multigene families including the SpTransformer family (SpTrf; formerly Sp185/333) with estimates of ~50 members, although the family size is likely variable among individuals of Strongylocentrotus purpuratus. The genes are small with similar structure, are tightly clustered, and have several types of repeats in the second of two exons and that surround each gene. The density of repeats suggests that the genes are positioned within regions of genomic instability, which may be required to drive sequence diversification. The second exon encodes the mature protein and is composed of blocks of sequence called elements that are present in mosaics of defined element patterns and are the major source of sequence diversity. The SpTrf genes respond swiftly to immune challenge, but only a single gene is expressed per phagocyte. Many of the mRNAs appear to be edited and encode proteins with altered and/or missense sequence that are often truncated, of which some may be functional. The standard SpTrf protein structure is an N-terminal glycine-rich region, a central RGD motif, a histidine-rich region, and a C-terminal region. Function is predicted from a recombinant protein, rSpTransformer-E1 (rSpTrf-E1), which binds to Vibrio and Saccharomyces, but not to Bacillus, and binds tightly to lipopolysaccharide, β-1,3-glucan, and flagellin, but not to peptidoglycan. rSpTrf-E1 is intrinsically disordered but transforms to α helical structure in the presence of binding targets including lipopolysaccharide, which may underpin the characteristics of binding to multiple targets. SpTrf proteins associate with coelomocyte membranes, and rSpTrf-E1 binds specifically to phosphatidic acid (PA). When rSpTrf-E1 is bound to PA in liposome membranes, it induces morphological changes in liposomes that correlate with PA clustering and leakage of luminal contents, and it extracts or removes PA from the bilayer. 
The multitasking activities of rSpTrf-E1 infer multiple and perhaps overlapping activities for the hundreds of native SpTrf proteins that are produced by individual sea urchins. This likely generates a flexible and highly protective immune system for the sea urchin in its marine habitat that it shares with broad arrays of microbes that may be pathogens and opportunists. PMID:28713368
Characterization of a digestive carboxypeptidase from the insect pest corn earworm (Helicoverpa armigera) with novel specificity towards C-terminal glutamate residues.

PubMed

Bown, David P; Gatehouse, John A

2004-05-01

Carboxypeptidases were purified from guts of larvae of corn earworm (Helicoverpa armigera), a lepidopteran crop pest, by affinity chromatography on immobilized potato carboxypeptidase inhibitor, and characterized by N-terminal sequencing. A larval gut cDNA library was screened using probes based on these protein sequences. cDNA HaCA42 encoded a carboxypeptidase with sequence similarity to enzymes of clan MC [Barrett, A. J., Rawlings, N. D. & Woessner, J. F. (1998) Handbook of Proteolytic Enzymes. Academic Press, London.], but with a novel predicted specificity towards C-terminal acidic residues. This carboxypeptidase was expressed as a recombinant proprotein in the yeast Pichia pastoris. The expressed protein could be activated by treatment with bovine trypsin; degradation of bound pro-region, rather than cleavage of pro-region from mature protein, was the rate-limiting step in activation. Activated HaCA42 carboxypeptidase hydrolysed a synthetic substrate for glutamate carboxypeptidases (FAEE, C-terminal Glu), but did not hydrolyse substrates for carboxypeptidase A or B (FAPP or FAAK, C-terminal Phe or Lys) or methotrexate, cleaved by clan MH glutamate carboxypeptidases. The enzyme was highly specific for C-terminal glutamate in peptide substrates, with slow hydrolysis of C-terminal aspartate also observed. Glutamate carboxypeptidase activity was present in larval gut extract from H. armigera. The HaCA42 protein is the first glutamate-specific metallocarboxypeptidase from clan MC to be identified and characterized. The genome of Drosophila melanogaster contains genes encoding enzymes with similar sequences and predicted specificity, and a cDNA encoding a similar enzyme has been isolated from gut tissue in tsetse fly. We suggest that digestive carboxypeptidases with sequence similarity to the classical mammalian enzymes, but with specificity towards C-terminal glutamate, are widely distributed in insects.
Transposable element distribution, abundance and role in genome size variation in the genus Oryza.

PubMed

Zuccolo, Andrea; Sebastian, Aswathy; Talag, Jayson; Yu, Yeisoo; Kim, HyeRan; Collura, Kristi; Kudrna, Dave; Wing, Rod A

2007-08-29

The genus Oryza is composed of 10 distinct genome types, 6 diploid and 4 polyploid, and includes the world's most important food crop - rice (Oryza sativa [AA]). Genome size variation in the Oryza is more than 3-fold and ranges from 357 Mbp in Oryza glaberrima [AA] to 1283 Mbp in the polyploid Oryza ridleyi [HHJJ]. Because repetitive elements are known to play a significant role in genome size variation, we constructed random sheared small insert genomic libraries from 12 representative Oryza species and conducted a comprehensive study of the repetitive element composition, distribution and phylogeny in this genus. Particular attention was paid to the role played by the most important classes of transposable elements (Long Terminal Repeats Retrotransposons, Long interspersed Nuclear Elements, helitrons, DNA transposable elements) in shaping these genomes and in contributing to genome size variation. We identified the elements primarily responsible for the most striking genome size variation in Oryza. We demonstrated how Long Terminal Repeat retrotransposons belonging to the same families have proliferated to very different extents in various species. We also showed that the pool of Long Terminal Repeat Retrotransposons is substantially conserved and ubiquitous throughout the Oryza and so its origin is ancient and its existence predates the speciation events that originated the genus. Finally we described the peculiar behavior of repeats in the species Oryza coarctata [HHKK] whose placement in the Oryza genus is controversial. Long Terminal Repeat retrotransposons are the major component of the Oryza genomes analyzed and, along with polyploidization, are the most important contributors to the genome size variation across the Oryza genus. Two families of Ty3-gypsy elements (RIRE2 and Atlantys) account for a significant portion of the genome size variations present in the Oryza genus.
Characterization of (CA)n microsatellite repeats from large-insert clones.

PubMed

Litt, M; Browne, D

2001-05-01

The most laborious part of developing (CA)n microsatellite repeats as genetic markers is constructing DNA clones to permit determination of sequences flanking the microsatellites. When cosmids or large-insert phage clones are used as primary sources of (CA)n repeat markers, they have traditionally been subcloned into plasmid vectors such as pUC18 or M13 mp 18/19 cloning vectors to obtain fragments of suitable size for DNA sequencing. This unit presents an alternative approach whereby a set of degenerate sequencing primers that anneal directly to (CA)n microsatellites can be used to determine sequences that are inaccessible with vector-derived primers. Because the primers anneal to the repeat and not to the vector, they can be used with subclones containing inserts of several kilobases and should, in theory, always give sequence in the regions directly flanking the repeat. Degeneracy at the 3' end of each of these primers prevents elongation of primers that have annealed out-of-register.
Characterization of Toll-like receptor 3 gene in large yellow croaker, Pseudosciaena crocea.

PubMed

Huang, Xue-Na; Wang, Zhi-Yong; Yao, Cui-Luan

2011-07-01

Toll-like receptor 3 (TLR3) plays an important role in innate immune responses. In this report, the full-length cDNA sequence and genomic structure of Pseudosciaena crocea TLR3 (PcTLR3) were identified and characterized. The full-length cDNA of PcTLR3 was of 3384 bp, including a 5'-terminal untranslated region (UTR) of 65 bp, a 3'-terminal UTR of 589 bp and an open reading frame (ORF) of 2730 bp encoding a polypeptide of 909 amino acid residues. The full-length genome sequence of PcTLR3 was composed of 5721 nucleotides, including five exons and four introns. The putative PcTLR3 protein contained a signal peptide sequence, 16 leucine-rich repeat (LRR) motifs, a transmembrane region and a Toll/interleukin-1 receptor (TIR) domain. Quantitative real-time reverse transcription PCR analysis revealed a broad expression of PcTLR3 in most tissues, with the predominant expression in liver, then intestine, and the weakest expression in blood cells. The expression of PcTLR3 after injection with poly inosinic:cytidylic (I:C) and Vibrio parahemolyticus was tested in spleen, blood cells and liver. The results indicated that PcTLR3 transcripts could be induced in the three tissues by injection with poly I:C. The highest expression was in the blood cells with 43.5 times (at 6h) greater expression than in the control (p<0.05). In addition, after V. parahemolyticus challenge, a moderate up-regulation and down-regulation of PcTLR3 was found in blood cells and liver, respectively. Our results suggested that PcTLR3 might play an important role in fish's defense against both viral and bacterial infection. Copyright © 2011 Elsevier Ltd. All rights reserved.
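The cDNA lengths reported in this abstract can be checked for internal consistency with a short calculation. The sketch below assumes that the 2730 bp ORF length includes the stop codon, which the abstract does not state explicitly; the variable names are illustrative only.

```python
# Lengths (bp) as reported in the abstract.
utr5, orf, utr3 = 65, 2730, 589

# The three segments should add up to the full-length cDNA.
total = utr5 + orf + utr3

# Assumption: the ORF length includes one terminal stop codon,
# so the number of encoded residues is the codon count minus one.
codons, remainder = divmod(orf, 3)
residues = codons - 1

print(total, codons, remainder, residues)
```

Under that assumption the segments sum to the stated 3384 bp, the ORF is a whole number of codons, and the residue count matches the 909 amino acids given in the abstract.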
Comparison between TRF2 and TRF1 of their telomeric DNA-bound structures and DNA-binding activities

PubMed Central

Hanaoka, Shingo; Nagadoi, Aritaka; Nishimura, Yoshifumi

2005-01-01

Mammalian telomeres consist of long tandem arrays of double-stranded telomeric TTAGGG repeats packaged by the telomeric DNA-binding proteins TRF1 and TRF2. Both contain a similar C-terminal Myb domain that mediates sequence-specific binding to telomeric DNA. In a DNA complex of TRF1, only the single Myb-like domain consisting of three helices can bind specifically to double-stranded telomeric DNA. TRF2 also binds to double-stranded telomeric DNA. Although the DNA binding mode of TRF2 is likely identical to that of TRF1, TRF2 plays an important role in the t-loop formation that protects the ends of telomeres. Here, to clarify the details of the double-stranded telomeric DNA-binding modes of TRF1 and TRF2, we determined the solution structure of the DNA-binding domain of human TRF2 bound to telomeric DNA; it consists of three helices, and like TRF1, the third helix recognizes TAGGG sequence in the major groove of DNA with the N-terminal arm locating in the minor groove. However, small but significant differences are observed; in contrast to the minor groove recognition of TRF1, in which an arginine residue recognizes the TT sequence, a lysine residue of TRF2 interacts with the TT part. We examined the telomeric DNA-binding activities of both DNA-binding domains of TRF1 and TRF2 and found that TRF1 binds more strongly than TRF2. Based on the structural differences of both domains, we created several mutants of the DNA-binding domain of TRF2 with stronger binding activities compared to the wild-type TRF2. PMID:15608118
Proteolytic processing of the vitellogenin precursor in the boll weevil, Anthonomus grandis.

PubMed

Heilmann, L J; Trewitt, P M; Kumaran, A K

1993-01-01

The soluble proteins of the eggs of the coleopteran insect Anthonomus grandis Boheman, the cotton boll weevil, consist almost entirely of two vitellin types with M(r)s of 160,000 and 47,000. We sequenced their N-terminal ends and one internal cyanogen bromide fragment of the large vitellin and compared these sequences with the deduced amino acid sequence from the vitellogenin gene. The results suggest that both the boll weevil vitellin proteins are products of the proteolytic cleavage of a single precursor protein. The smaller 47,000 M(r) vitellin protein is derived from the N-terminal portion of the precursor adjacent to an 18 amino acid signal peptide. The cleavage site between the large and small vitellins at amino acid 362 is adjacent to a pentapeptide sequence containing two pairs of arginine residues. Comparison of the boll weevil sequences with limited known sequences from the single 180,000 M(r) honey bee protein show that the honey bee vitellin N-terminal exhibits sequence homology to the N-terminal of the 47,000 M(r) boll weevil vitellin. Treatment of the vitellins with an N-glycosidase results in a decrease in molecular weight of both proteins, from 47,000 to 39,000 and from 160,000 to 145,000, indicating that about 10-15% of the molecular weight of each vitellin consists of N-linked carbohydrate. The molecular weight of the deglycosylated large vitellin is smaller than that predicted from the gene sequence, indicating possible further proteolytic processing at the C-terminal of that protein.
Structural analysis of the rDNA intergenic spacer of Brassica nigra: evolutionary divergence of the spacers of the three diploid Brassica species.

PubMed

Bhatia, S; Singh Negi, M; Lakshmikumaran, M

1996-11-01

EcoRI restriction of the B. nigra rDNA recombinants, isolated from a lambda genomic library, showed that the 3.9-kb fragment corresponded to the Intergenic Spacer (IGS), which was sequenced and found to be 3,928 bp in size. Sequence and dot-matrix analyses showed that the organization of the B. nigra rDNA IGS was typical of most rDNA spacers, consisting of a central repetitive region and flanking unique sequences on either side. The repetitive region was composed of two repeat families, RF 'A' and RF 'B'. The B. nigra RF 'A' consisted of a tandem array of three full-length copies of a 106-bp sequence element. RF 'B' was composed of 66 tandemly repeated elements. Each 'B' element was only 21 bp in size and this is the smallest repeat unit identified in plant rDNA to date. The putative transcription initiation site (TIS) was identified as nucleotide position 3,110. Based on the sequence analysis it was suggested that the present organization of the repeat families was generated by successive cycles of deletions and amplifications and was being maintained by homogenization processes such as gene conversion and crossing-over. A detailed comparison of the rDNA IGS sequences of the three diploid Brassica species, namely B. nigra, B. campestris, and B. oleracea, was carried out. First, comparisons revealed that B. campestris and B. oleracea were close to each other as the repeat families in both showed high sequence homology between each other. Second, the repeat elements in both the species were organized in an interspersed manner. Third, a 52-bp sequence, present just downstream of the repeats in B. campestris, was found to be identical to the B. oleracea repeats, thereby suggesting a common progenitor. On the other hand, in B. nigra no interspersion pattern of organization of repeats was observed. Further, the B. nigra RF 'A' was identified as distinct from the repeat families of B. campestris and B. oleracea. Based on this analysis, it was suggested that during speciation B. campestris and B. oleracea evolved in one lineage whereas B. nigra diverged into a separate lineage. The comparative analysis of the IGS helped in identifying not only conserved ancestral sequence motifs of possible functional significance such as promoters and enhancers, but also sequences which showed variation between the three diploid species and were therefore identified as species-specific sequences.
Amino terminal sequence of heavy and light chains from ratfish immunoglobulin.

PubMed

De Ioannes, A E; Aguila, H L

1989-01-01

The ratfish, Callorhinchus callorhinchus, a representative of the Holocephali, has a natural serum hemagglutinin (Mr 960,000), composed of heavy (Mr 71,000), light (Mr 22,500), and J (Mr 16,000) chains. To approach the mechanisms that generate diversity at this level of evolution, the amino terminal sequence of the heavy and light chains was determined by automated microsequencing. The chains are unblocked and have modest internal sequence heterogeneity. The heavy chains show sequence similarity with the terminal region of the heavy chain from the horned shark, Heterodontus francisci, and other species. In contrast to the heavy chain, the ratfish light chains display low sequence similarity with their shark kappa counterparts. However, their similarity with the variable region of the chicken lambda light chains is about 75%.
A Comprehensive Genetic Study of Streptococcal Immunoglobulin A1 Proteases: Evidence for Recombination within and between Species

PubMed Central

Poulsen, Knud; Reinholdt, Jesper; Jespersgaard, Christina; Boye, Kit; Brown, Thomas A.; Hauge, Majbritt; Kilian, Mogens

1998-01-01

An analysis of 13 immunoglobulin A1 (IgA1) protease genes (iga) of strains of Streptococcus pneumoniae, Streptococcus oralis, Streptococcus mitis, and Streptococcus sanguis was carried out to obtain information on the structure, polymorphism, and phylogeny of this specific protease, which enables bacteria to evade functions of the predominant Ig isotype on mucosal surfaces. The analysis included cloning and sequencing of iga genes from S. oralis and S. mitis biovar 1, sequencing of an additional seven iga genes from S. sanguis biovars 1 through 4, and restriction fragment length polymorphism (RFLP) analyses of iga genes of another 10 strains of S. mitis biovar 1 and 6 strains of S. oralis. All 13 genes sequenced had the potential of encoding proteins with molecular masses of approximately 200 kDa containing the sequence motif HEMTH and an E residue 20 amino acids downstream, which are characteristic of Zn metalloproteinases. In addition, all had a typical gram-positive cell wall anchor motif, LPNTG, which, in contrast to such motifs in other known streptococcal and staphylococcal proteins, was located in their N-terminal parts. Repeat structures showing variation in number and sequence were present in all strains and may be of relevance to the immunogenicities of the enzymes. Protease activities in cultures of the streptococcal strains were associated with species of different molecular masses ranging from 130 to 200 kDa, suggesting posttranslational processing possibly as a result of autoproteolysis at post-proline peptide bonds in the N-terminal parts of the molecules. Comparison of deduced amino acid sequences revealed a 94% similarity between S. oralis and S. mitis IgA1 proteases and a 75 to 79% similarity between IgA1 proteases of these species and those of S. pneumoniae and S. sanguis, respectively. Combined with the results of RFLP analyses using different iga gene fragments as probes, the results of nucleotide sequence comparisons provide evidence of horizontal transfer of iga gene sequences among individual strains of S. sanguis as well as among S. mitis and the two species S. pneumoniae and S. oralis. While iga genes of S. sanguis and S. oralis were highly homogeneous, the genes of S. pneumoniae and S. mitis showed extensive polymorphism reflected in different degrees of antigenic diversity. PMID:9423856
A Novel Terminal-Repeat Retrotransposon in Miniature (TRIM) Is Massively Expressed in Echinococcus multilocularis Stem Cells

PubMed Central

Koziol, Uriel; Radio, Santiago; Smircich, Pablo; Zarowiecki, Magdalena; Fernández, Cecilia; Brehm, Klaus

2015-01-01

Taeniid cestodes (including the human parasites Echinococcus spp. and Taenia solium) have very few mobile genetic elements (MGEs) in their genome, despite lacking a canonical PIWI pathway. The MGEs of these parasites are virtually unexplored, and nothing is known about their expression and silencing. In this work, we report the discovery of a novel family of small nonautonomous long terminal repeat retrotransposons (also known as terminal-repeat retrotransposons in miniature, TRIMs) which we have named ta-TRIM (taeniid TRIM). ta-TRIMs are only the second family of TRIM elements discovered in animals, and are likely the result of convergent reductive evolution in different taxonomic groups. These elements originated at the base of the taeniid tree and have expanded during taeniid diversification, including after the divergence of closely related species such as Echinococcus multilocularis and Echinococcus granulosus. They are massively expressed in larval stages, from a small proportion of full-length copies and from isolated terminal repeats that show transcriptional read-through into downstream regions, generating novel noncoding RNAs and transcriptional fusions to coding genes. In E. multilocularis, ta-TRIMs are specifically expressed in the germinative cells (the somatic stem cells) during asexual reproduction of metacestode larvae. This would provide a developmental mechanism for insertion of ta-TRIMs into cells that will eventually generate the adult germ line. Future studies of active and inactive ta-TRIM elements could give the first clues on MGE silencing mechanisms in cestodes. PMID:26133390
Identification of a conserved branched RNA structure that functions as a factor-independent terminator.

PubMed

Johnson, Christopher M; Chen, Yuqing; Lee, Heejin; Ke, Ailong; Weaver, Keith E; Dunny, Gary M

2014-03-04

Anti-Q is a small RNA encoded on pCF10, an antibiotic resistance plasmid of Enterococcus faecalis, which negatively regulates conjugation of the plasmid. In this study we sought to understand how Anti-Q is generated relative to larger transcripts of the same operon. We found that Anti-Q folds into a branched structure that functions as a factor-independent terminator. In vitro and in vivo, termination is dependent on the integrity of this structure as well as the presence of a 3' polyuridine tract, but is not dependent on other downstream sequences. In vitro, terminated transcripts are released from RNA polymerase after synthesis. In vivo, a mutant with reduced termination efficiency demonstrated loss of tight control of conjugation function. A search of bacterial genomes revealed the presence of sequences that encode Anti-Q-like RNA structures. In vitro and in vivo experiments demonstrated that one of these functions as a terminator. This work reveals a previously unappreciated flexibility in the structure of factor-independent terminators and identifies a mechanism for generation of functional small RNAs; it should also inform annotation of bacterial sequence features, such as terminators, functional sRNAs, and operons.
Identification of a conserved branched RNA structure that functions as a factor-independent terminator

PubMed Central

Johnson, Christopher M.; Chen, Yuqing; Lee, Heejin; Ke, Ailong; Weaver, Keith E.; Dunny, Gary M.

2014-01-01

Anti-Q is a small RNA encoded on pCF10, an antibiotic resistance plasmid of Enterococcus faecalis, which negatively regulates conjugation of the plasmid. In this study we sought to understand how Anti-Q is generated relative to larger transcripts of the same operon. We found that Anti-Q folds into a branched structure that functions as a factor-independent terminator. In vitro and in vivo, termination is dependent on the integrity of this structure as well as the presence of a 3′ polyuridine tract, but is not dependent on other downstream sequences. In vitro, terminated transcripts are released from RNA polymerase after synthesis. In vivo, a mutant with reduced termination efficiency demonstrated loss of tight control of conjugation function. A search of bacterial genomes revealed the presence of sequences that encode Anti-Q–like RNA structures. In vitro and in vivo experiments demonstrated that one of these functions as a terminator. This work reveals a previously unappreciated flexibility in the structure of factor-independent terminators and identifies a mechanism for generation of functional small RNAs; it should also inform annotation of bacterial sequence features, such as terminators, functional sRNAs, and operons. PMID:24550474
The crystal structure of a partial mouse Notch-1 ankyrin domain: Repeats 4 through 7 preserve an ankyrin fold

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lubman, Olga Y.; Kopan, Raphael; Waksman, Gabriel

Folding and stability of proteins containing ankyrin repeats (ARs) is of great interest because they mediate numerous protein-protein interactions involved in a wide range of regulatory cellular processes. Notch, an ankyrin domain containing protein, signals by converting a transcriptional repression complex into an activation complex. The Notch ANK domain is essential for Notch function and contains seven ARs. Here, we present the 2.2 Å crystal structure of ARs 4-7 from mouse Notch 1 (m1ANK). These C-terminal repeats were resistant to degradation during crystallization, and their secondary and tertiary structures are maintained in the absence of repeats 1-3. The crystallized fragment adopts a typical ankyrin fold including the poorly conserved seventh AR, as seen in the Drosophila Notch ANK domain (dANK). The structural preservation and stability of the C-terminal repeats shed a new light onto the mechanism of hetero-oligomeric assembly during Notch-mediated transcriptional activation.
Refunctionalization of the ancient rice blast disease resistance gene Pit by the recruitment of a retrotransposon as a promoter.

PubMed

Hayashi, Keiko; Yoshida, Hitoshi

2009-02-01

The plant genome contains a large number of disease resistance (R) genes that have evolved through diverse mechanisms. Here, we report that a long terminal repeat (LTR) retrotransposon contributed to the evolution of the rice blast resistance gene Pit. Pit confers race-specific resistance against the fungal pathogen Magnaporthe grisea, and is a member of the nucleotide-binding site leucine-rich repeat (NBS-LRR) family of R genes. Compared with the non-functional allele Pit(Npb), the functional allele Pit(K59) contains four amino acid substitutions, and has the LTR retrotransposon Renovator inserted upstream. Pathogenesis assays using chimeric constructs carrying the various regions of Pit(K59) and Pit(Npb) suggest that amino acid substitutions might have a potential effect in Pit resistance; more importantly, the upregulated promoter activity conferred by the Renovator sequence is essential for Pit function. Our data suggest that transposon-mediated transcriptional activation may play an important role in the refunctionalization of additional 'sleeping' R genes in the plant genome.
Telomere maintenance in liquid crystalline chromosomes of dinoflagellates.

PubMed

Fojtová, Miloslava; Wong, Joseph T Y; Dvorácková, Martina; Yan, Kosmo T H; Sýkorová, Eva; Fajkus, Jirí

2010-10-01

The organisation of dinoflagellate chromosomes is exceptional among eukaryotes. Their genomes are the largest in the Eukarya domain, chromosomes lack histones and may exist in liquid crystalline state. Therefore, the study of the structural and functional properties of dinoflagellate chromosomes is of high interest. In this work, we have analysed the telomeres and telomerase in two Dinoflagellata species, Karenia papilionacea and Crypthecodinium cohnii. Active telomerase, synthesising exclusively Arabidopsis-type telomere sequences, was detected in cell extracts. The terminal position of TTTAGGG repeats was determined by in situ hybridisation and BAL31 digestion methods and provides evidence for the linear characteristic of dinoflagellate chromosomes. The length of telomeric tracts, 25-80 kb, is the largest among unicellular eukaryotic organisms to date. Both the presence of long arrays of perfect telomeric repeats at the ends of dinoflagellate chromosomes and the existence of active telomerase as the primary tool for their high-fidelity maintenance demonstrate the general importance of these structures throughout eukaryotes. We conclude that whilst chromosomes of dinoflagellates are unique in many aspects of their structure and composition, their telomere maintenance follows the most common scenario.
Synaptic Targeting and Function of SAPAPs Mediated by Phosphorylation-Dependent Binding to PSD-95 MAGUKs.

PubMed

Zhu, Jinwei; Zhou, Qingqing; Shang, Yuan; Li, Hao; Peng, Mengjuan; Ke, Xiao; Weng, Zhuangfeng; Zhang, Rongguang; Huang, Xuhui; Li, Shawn S C; Feng, Guoping; Lu, Youming; Zhang, Mingjie

2017-12-26

The PSD-95/SAPAP/Shank complex functions as the major scaffold in orchestrating the formation and plasticity of the post-synaptic densities (PSDs). We previously demonstrated that the exquisitely specific SAPAP/Shank interaction is critical for Shank synaptic targeting and Shank-mediated synaptogenesis. Here, we show that the PSD-95/SAPAP interaction, SAPAP synaptic targeting, and SAPAP-mediated synaptogenesis require phosphorylation of the N-terminal repeat sequences of SAPAPs. The atomic structure of the PSD-95 guanylate kinase (GK) in complex with a phosphor-SAPAP repeat peptide, together with biochemical studies, reveals the molecular mechanism underlying the phosphorylation-dependent PSD-95/SAPAP interaction, and it also provides an explanation of a PSD-95 mutation found in patients with intellectual disabilities. Guided by the structural data, we developed potent non-phosphorylated GK inhibitory peptides capable of blocking the PSD-95/SAPAP interaction and interfering with PSD-95/SAPAP-mediated synaptic maturation and strength. These peptides are genetically encodable for investigating the functions of the PSD-95/SAPAP interaction in vivo. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.
Structure of a designed, right-handed coiled-coil tetramer containing all biological amino acids

PubMed Central

Sales, Mark; Plecs, Joseph J.; Holton, James M.; Alber, Tom

2007-01-01

The previous design of an unprecedented family of two-, three-, and four-helical, right-handed coiled coils utilized nonbiological amino acids to efficiently pack spaces in the oligomer cores. Here we show that a stable, right-handed parallel tetrameric coiled coil, called RH4B, can be designed entirely using biological amino acids. The X-ray crystal structure of RH4B was determined to 1.1 Å resolution using a designed metal binding site to coordinate a single Yb2+ ion per 33-amino acid polypeptide chain. The resulting experimental phases were particularly accurate, and the experimental electron density map provided an especially clear, unbiased view of the molecule. The RH4B structure closely matched the design, with equivalent core rotamers and an overall root-mean-square deviation for the N-terminal repeat of the tetramer of 0.24 Å. The clarity and resolution of the electron density map, however, revealed alternate rotamers and structural differences between the three sequence repeats in the molecule. These results suggest that the RH4B structure populates an unanticipated variety of structures. PMID:17766380
Structure of a designed, right-handed coiled-coil tetramer containing all biological amino acids.

PubMed

Sales, Mark; Plecs, Joseph J; Holton, James M; Alber, Tom

2007-10-01

The previous design of an unprecedented family of two-, three-, and four-helical, right-handed coiled coils utilized nonbiological amino acids to efficiently pack spaces in the oligomer cores. Here we show that a stable, right-handed parallel tetrameric coiled coil, called RH4B, can be designed entirely using biological amino acids. The X-ray crystal structure of RH4B was determined to 1.1 Angstrom resolution using a designed metal binding site to coordinate a single Yb(2+) ion per 33-amino acid polypeptide chain. The resulting experimental phases were particularly accurate, and the experimental electron density map provided an especially clear, unbiased view of the molecule. The RH4B structure closely matched the design, with equivalent core rotamers and an overall root-mean-square deviation for the N-terminal repeat of the tetramer of 0.24 Angstrom. The clarity and resolution of the electron density map, however, revealed alternate rotamers and structural differences between the three sequence repeats in the molecule. These results suggest that the RH4B structure populates an unanticipated variety of structures.
Deletion Mutagenesis Downstream of the 5′ Long Terminal Repeat of Human Immunodeficiency Virus Type 1 Is Compensated for by Point Mutations in both the U5 Region and gag Gene

PubMed Central

Liang, Chen; Rong, Liwei; Russell, Rodney S.; Wainberg, Mark A.

2000-01-01

We have studied the role of an RNA region at nucleotides (nt) +200 to +233, just downstream of the 5′ long terminal repeat, in encapsidation of human immunodeficiency virus type 1 genomic RNA. Three deletion mutations, namely, BH-D0, BH-D1, and BH-D2, were generated to eliminate sequences at positions nt +200 to +219, +200 to +226, and +200 to +233. The result in each case was decreased levels of packaging of viral RNA into the mutated viruses, with the BH-D2 virus being the most severely affected. Consistently, all three deletions resulted in impaired viral infectiousness and the BH-D2 mutation showed the most dramatic impact in this regard. Further analysis revealed additional defects in Gag precursor processing and in the extension efficiency of the tRNA3Lys primer in reverse transcription reactions performed with these mutated viruses. To shed further light on the function of these deleted sequences in viral replication, the mutated viruses were cultured in MT-2 cells over prolonged periods to enable them to reacquire wild-type replication kinetics. Sequencing of the reverted viruses revealed point mutations in both the noncoding region and the gag gene. In the case of the BH-D0 revertant, two mutations were observed at positions G112A in the U5 region, termed M1, and T24I in the nucleocapsid protein, termed MNC, respectively. Either of these two mutations was able to confer wild-type replication capacity on BH-D0. In the case of BH-D1, each of the M1 mutations, a mutation termed M2, i.e., C227T, just downstream of the primer binding site, a mutation termed MP2 (T12I) in the p2 protein, and the MNC mutation were observed. A combination of either M1 and M2 or MP2 and MNC was able to rescue BH-D1. In the case of the BH-D2 deletion-containing viruses, three point mutations, i.e., M1, MP2, and MNC, were observed and the presence of all three was required to restore viral replication to wild-type levels. PMID:10864634
Clustered regularly interspaced short palindromic repeats (CRISPRs) for the genotyping of bacterial pathogens.

PubMed

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2009-01-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) are DNA sequences composed of a succession of repeats (23- to 47-bp long) separated by unique sequences called spacers. Polymorphism can be observed in different strains of a species and may be used for genotyping. We describe protocols and bioinformatics tools that allow the identification of CRISPRs from sequenced genomes, their comparison, and their component determination (the direct repeats and the spacers). A schematic representation of the spacer organization can be produced, allowing an easy comparison between strains.
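The repeat/spacer decomposition mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' protocol or software: it assumes the direct repeat is already known and matches it exactly, whereas real arrays often contain degenerate repeat copies. The function name and the toy 23-bp repeat and spacers are hypothetical.

```python
def split_crispr_array(sequence: str, repeat: str) -> list[str]:
    """Return the spacers lying between consecutive exact copies of `repeat`."""
    # Collect the start of every non-overlapping exact repeat copy.
    positions = []
    start = sequence.find(repeat)
    while start != -1:
        positions.append(start)
        start = sequence.find(repeat, start + len(repeat))

    # A spacer is whatever lies between the end of one repeat copy
    # and the start of the next.
    return [
        sequence[left + len(repeat):right]
        for left, right in zip(positions, positions[1:])
    ]


# Toy data (hypothetical): a 23-bp repeat flanking two unique spacers.
toy_repeat = "GTTCACTGCCGTACAGGCAGCTT"
toy_spacers = ["ACGTACGTAA", "TTGGCCAATT"]
toy_array = toy_repeat + toy_repeat.join(toy_spacers) + toy_repeat

print(split_crispr_array(toy_array, toy_repeat))
```

Comparing the ordered spacer lists of two strains position by position is one simple way to see where their arrays have gained or lost spacers, which is the kind of polymorphism used for genotyping.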
Mapping Simple Repeated DNA Sequences in Heterochromatin of Drosophila Melanogaster

PubMed Central

Lohe, A. R.; Hilliker, A. J.; Roberts, P. A.

1993-01-01

Heterochromatin in Drosophila has unusual genetic, cytological and molecular properties. Highly repeated DNA sequences (satellites) are the principal component of heterochromatin. Using probes from cloned satellites, we have constructed a chromosome map of 10 highly repeated, simple DNA sequences in heterochromatin of mitotic chromosomes of Drosophila melanogaster. Despite extensive sequence homology among some satellites, chromosomal locations could be distinguished by stringent in situ hybridizations for each satellite. Only two of the localizations previously determined using gradient-purified bulk satellite probes are correct. Eight new satellite localizations are presented, providing a megabase-level chromosome map of one-quarter of the genome. Five major satellites each exhibit a multichromosome distribution, and five minor satellites hybridize to single sites on the Y chromosome. Satellites closely related in sequence are often located near one another on the same chromosome. About 80% of Y chromosome DNA is composed of nine simple repeated sequences, in particular (AAGAC)(n) (8 Mb), (AAGAG)(n) (7 Mb) and (AATAT)(n) (6 Mb). Similarly, more than 70% of the DNA in chromosome 2 heterochromatin is composed of five simple repeated sequences. We have also generated a high resolution map of satellites in chromosome 2 heterochromatin, using a series of translocation chromosomes whose breakpoints in heterochromatin were ordered by N-banding. Finally, staining and banding patterns of heterochromatic regions are correlated with the locations of specific repeated DNA sequences. The basis for the cytochemical heterogeneity in banding appears to depend exclusively on the different satellite DNAs present in heterochromatin. PMID:8375654
The unique C- and N-terminal sequences of Metallothionein isoform 3 mediate growth inhibition and Vectorial active transport in MCF-7 cells.

PubMed

Voels, Brent; Wang, Liping; Sens, Donald A; Garrett, Scott H; Zhang, Ke; Somji, Seema

2017-05-25

The 3rd isoform of the metallothionein (MT3) gene family has been shown to be overexpressed in most ductal breast cancers. A previous study has shown that the stable transfection of MCF-7 cells with the MT3 gene inhibits cell growth. The goal of the present study was to determine the role of the unique C-terminal and N-terminal sequences of MT3 on phenotypic properties and gene expression profiles of MCF-7 cells. MCF-7 cells were transfected with various metallothionein gene constructs which contain the insertion or the removal of the unique MT3 C- and N-terminal domains. Global gene expression analysis was performed on the MCF-7 cells containing the various constructs and the expression of the unique C- and N-terminal domains of MT3 was correlated to phenotypic properties of the cells. The results of the present study demonstrate that the C-terminal sequence of MT3, in the absence of the N-terminal sequence, induces dome formation in MCF-7 cells, which in cell cultures is the phenotypic manifestation of a cell's ability to perform vectorial active transport. Global gene expression analysis demonstrated that the increased expression of the GAGE gene family correlated with dome formation. Expression of the C-terminal domain induced GAGE gene expression, whereas the N-terminal domain inhibited GAGE gene expression, and the inhibitory effect of the N-terminal domain was dominant over the effect of the C-terminal domain of MT3. Transfection with the metallothionein 1E gene increased the expression of GAGE genes. In addition, both the C- and the N-terminal sequences of the MT3 gene had growth inhibitory properties, which correlated to an increased expression of the interferon alpha-inducible protein 6. Our study shows that the C-terminal domain of MT3 confers dome formation in MCF-7 cells and the presence of this domain induces expression of the GAGE family of genes. The differential effects of MT3 and metallothionein 1E on the expression of GAGE genes suggest unique roles of these genes in the development and progression of breast cancer. The finding that interferon alpha-inducible protein 6 expression is associated with the ability of MT3 to inhibit growth needs further investigation.
Identification of the centromeric repeat in the threespine stickleback fish (Gasterosteus aculeatus).

PubMed

Cech, Jennifer N; Peichel, Catherine L

2015-12-01

Centromere sequences exist as gaps in many genome assemblies due to their repetitive nature. Here we take an unbiased approach utilizing centromere protein A (CENP-A) chromatin immunoprecipitation followed by high-throughput sequencing to identify the centromeric repeat sequence in the threespine stickleback fish (Gasterosteus aculeatus). A 186-bp, AT-rich repeat was validated as centromeric using both fluorescence in situ hybridization (FISH) and immunofluorescence combined with FISH (IF-FISH) on interphase nuclei and metaphase spreads. This repeat hybridizes strongly to the centromere on all chromosomes, with the exception of weak hybridization to the Y chromosome. Together, our work provides the first validated sequence information for the threespine stickleback centromere.
Evaluation of electrical test conditions in MIL-M-38510 slash sheets

NASA Astrophysics Data System (ADS)

Sandgren, K.

1980-08-01

The adequacy of MIL-M-38510 slash sheet requirements for electrical test conditions in an automated test environment was evaluated. Military temperature range commercial devices of 13 types from 6 manufacturers were purchased. Software for testing these devices and for varying the test conditions was written for the Tektronix S-3260 test system. The devices were tested to evaluate the effects of pin-condition settling time, measurement sequence of the same and different D-C parameters, temperature sequence, differently defined temperature ambients, variable measurement conditions, sequence of time measurements, pin-application sequence, and undesignated pin condition ambiguity. An alternative to current tri-state enable and disable time measurements is proposed; S-3260 'open' and 'ground' conditions are characterized; and suggestions for changes in MIL-M-38510 slash sheet specifications and MIL-STD-883 test methods are proposed, both to correct errors and ambiguities and to facilitate the gathering of repeatable data on automated test equipment. Data obtained showed no sensitivity to measurement or temperature sequence nor to temperature ambient, provided that test times were not excessive. V_ICP tests and some low current measurements required allowance for a pin condition settling time because of the test system speed. Some pin condition application sequences yielded incorrect measurements. Undefined terminal conditions of output pins were found to affect I_OS and propagation delay time measurements. Truth table test results varied with test frequency and V_IL for low-power Schottky devices.
Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution

PubMed Central

2012-01-01

Background Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species, however higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis. PMID:23020678
Repeatless and repeat-based centromeres in potato: implications for centromere evolution.

PubMed

Gong, Zhiyun; Wu, Yufeng; Koblízková, Andrea; Torres, Giovana A; Wang, Kai; Iovene, Marina; Neumann, Pavel; Zhang, Wenli; Novák, Petr; Buell, C Robin; Macas, Jirí; Jiang, Jiming

2012-09-01

Centromeres in most higher eukaryotes are composed of long arrays of satellite repeats. By contrast, most newly formed centromeres (neocentromeres) do not contain satellite repeats and instead include DNA sequences representative of the genome. An unknown question in centromere evolution is how satellite repeat-based centromeres evolve from neocentromeres. We conducted a genome-wide characterization of sequences associated with CENH3 nucleosomes in potato (Solanum tuberosum). Five potato centromeres (Cen4, Cen6, Cen10, Cen11, and Cen12) consisted primarily of single- or low-copy DNA sequences. No satellite repeats were identified in these five centromeres. At least one transcribed gene was associated with CENH3 nucleosomes. Thus, these five centromeres structurally resemble neocentromeres. By contrast, six potato centromeres (Cen1, Cen2, Cen3, Cen5, Cen7, and Cen8) contained megabase-sized satellite repeat arrays that are unique to individual centromeres. The satellite repeat arrays likely span the entire functional cores of these six centromeres. At least four of the centromeric repeats were amplified from retrotransposon-related sequences and were not detected in Solanum species closely related to potato. The presence of two distinct types of centromeres, coupled with the boom-and-bust cycles of centromeric satellite repeats in Solanum species, suggests that repeat-based centromeres can rapidly evolve from neocentromeres by de novo amplification and insertion of satellite repeats in the CENH3 domains.
Repeatless and Repeat-Based Centromeres in Potato: Implications for Centromere Evolution

PubMed Central

Gong, Zhiyun; Wu, Yufeng; Koblížková, Andrea; Torres, Giovana A.; Wang, Kai; Iovene, Marina; Neumann, Pavel; Zhang, Wenli; Novák, Petr; Buell, C. Robin; Macas, Jiří; Jiang, Jiming

2012-01-01

Centromeres in most higher eukaryotes are composed of long arrays of satellite repeats. By contrast, most newly formed centromeres (neocentromeres) do not contain satellite repeats and instead include DNA sequences representative of the genome. An unknown question in centromere evolution is how satellite repeat-based centromeres evolve from neocentromeres. We conducted a genome-wide characterization of sequences associated with CENH3 nucleosomes in potato (Solanum tuberosum). Five potato centromeres (Cen4, Cen6, Cen10, Cen11, and Cen12) consisted primarily of single- or low-copy DNA sequences. No satellite repeats were identified in these five centromeres. At least one transcribed gene was associated with CENH3 nucleosomes. Thus, these five centromeres structurally resemble neocentromeres. By contrast, six potato centromeres (Cen1, Cen2, Cen3, Cen5, Cen7, and Cen8) contained megabase-sized satellite repeat arrays that are unique to individual centromeres. The satellite repeat arrays likely span the entire functional cores of these six centromeres. At least four of the centromeric repeats were amplified from retrotransposon-related sequences and were not detected in Solanum species closely related to potato. The presence of two distinct types of centromeres, coupled with the boom-and-bust cycles of centromeric satellite repeats in Solanum species, suggests that repeat-based centromeres can rapidly evolve from neocentromeres by de novo amplification and insertion of satellite repeats in the CENH3 domains. PMID:22968715

Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

DOEpatents

Weier, H.U.G.; Gray, J.W.

1995-06-27

A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.
Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

DOEpatents

Weier, Heinz-Ulrich G.; Gray, Joe W.

1995-01-01

A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.
Gene expression of galectin-9/ecalectin, a potent eosinophil chemoattractant, and/or the insertional isoform in human colorectal carcinoma cell lines and detection of frame-shift mutations for protein sequence truncations in the second functional lectin domain.

PubMed

Lahm, H; Hoeflich, A; Andre, S; Sordat, B; Kaltner, H; Wolf, E; Gabius, H J

2000-09-01

The family of Ca2+-independent galactoside-binding lectins with the beta-strand topology of the jelly-roll, referred to as galectins, is known to mediate and modulate a variety of cellular activities. Their functional versatility explains the current interest in monitoring their expression in cancer research, so far primarily focused on galectin-1 and -3. Tandem-repeat-type galectin-9 and its (most probably) allelic variant ecalectin, a potent eosinophil chemoattractant, are known to be human leukocyte products. We show by RT-PCR with primers specific for both that their mRNA is expressed in 17 of 21 human colorectal cancer lines. As also indicated by restriction analysis, in addition to the expected transcript of 571 bp an otherwise identical isoform coding for a 32-amino acid extension of the link peptide was detected. Positive cell lines differentially expressed either one (7 lines) or both transcripts (10 lines). Sequence analysis of RT-PCR products, performed in four cases, allowed the standard transcript to be assigned to ecalectin in the case of SW480 cells and detected two point mutations in the insert of the link peptide-coding sequence in WiDr and Colo205. Furthermore, this analysis identified the insertion of a single nucleotide into the coding sequence generating a frame-shift mutation, an event which has so far not been reported for any galectin. This alteration, encountered in both transcripts of the WiDr line and the isoform transcript of Colo205 cells, will most likely truncate the protein part within the second (C-terminal) carbohydrate recognition domain. Our results thus reveal the presence of mRNA for a galectin-9-isoform or a potent eosinophil chemoattractant (ecalectin) or a truncated version thereof with preserved N-terminal carbohydrate recognition domain in established human colon cancer cell lines.
De novo identification of highly diverged protein repeats by probabilistic consistency.

PubMed

Biegert, A; Söding, J

2008-03-15

An estimated 25% of all eukaryotic proteins contain repeats, which underlines the importance of duplication for evolving new protein functions. Internal repeats often correspond to structural or functional units in proteins. Methods capable of identifying diverged repeated segments or domains at the sequence level can therefore assist in predicting domain structures, inferring hypotheses about function and mechanism, and investigating the evolution of proteins from smaller fragments. We present HHrepID, a method for the de novo identification of repeats in protein sequences. It is able to detect the sequence signature of structural repeats in many proteins that have not yet been known to possess internal sequence symmetry, such as outer membrane beta-barrels. HHrepID uses HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs. In contrast to a previous method, the new method (1) generates a multiple alignment of repeats; (2) utilizes the transitive nature of homology through a novel merging procedure with fully probabilistic treatment of alignments; (3) improves alignment quality through an algorithm that maximizes the expected accuracy; (4) is able to identify different kinds of repeats within complex architectures by a probabilistic domain boundary detection method and (5) improves sensitivity through a new approach to assess statistical significance. Server: http://toolkit.tuebingen.mpg.de/hhrepid; Executables: ftp://ftp.tuebingen.mpg.de/pub/protevo/HHrepID
Diversity and structure of PIF/Harbinger-like elements in the genome of Medicago truncatula

PubMed Central

Grzebelus, Dariusz; Lasota, Slawomir; Gambin, Tomasz; Kucherov, Gregory; Gambin, Anna

2007-01-01

Background Transposable elements constitute a significant fraction of plant genomes. The PIF/Harbinger superfamily includes DNA transposons (class II elements) carrying terminal inverted repeats and producing a 3 bp target site duplication upon insertion. The presence of an ORF coding for the DDE/DDD transposase, required for transposition, is characteristic of the autonomous PIF/Harbinger-like elements. Based on the above features, PIF/Harbinger-like elements were identified in several plant genomes and divided into several evolutionary lineages. Availability of a significant portion of Medicago truncatula genomic sequence allowed for mining PIF/Harbinger-like elements, starting from a single previously described element MtMaster. Results Twenty two putative autonomous, i.e. carrying an ORF coding for TPase and complete terminal inverted repeats, and 67 non-autonomous PIF/Harbinger-like elements were found in the genome of M. truncatula. They were divided into five families, MtPH-A5, MtPH-A6, MtPH-D, MtPH-E, and MtPH-M, corresponding to three previously identified and two new lineages. The largest families, MtPH-A6 and MtPH-M, were further divided into four and three subfamilies, respectively. Non-autonomous elements were usually direct deletion derivatives of the putative autonomous element; however, other types of rearrangements, including inversions and nested insertions, were also observed. An interesting structural characteristic – the presence of 60 bp tandem repeats – was observed in a group of elements of subfamily MtPH-A6-4. Some families could be related to miniature inverted repeat elements (MITEs). The presence of empty loci (RESites), paralogous to those flanking the identified transposable elements, both autonomous and non-autonomous, as well as the presence of transposon insertion related size polymorphisms, confirmed that some of the mined elements were capable of transposition.
Conclusion The population of PIF/Harbinger-like elements in the genome of M. truncatula is diverse. A detailed intra-family comparison of the elements' structure proved that they proliferated in the genome generally following the model of abortive gap repair. However, the presence of tandem repeats facilitated more pronounced rearrangements of the element internal regions. The insertion polymorphism of the MtPH elements and related MITE families in different populations of M. truncatula, if further confirmed experimentally, could be used as a source of molecular markers complementary to other marker systems. PMID:17996080
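The two structural hallmarks named in this abstract, terminal inverted repeats (TIRs) and a 3 bp target site duplication (TSD), can be checked with a few lines of code. The following is only a minimal sketch under stated assumptions: the function names (`reverse_complement`, `tir_length`, `has_tsd`) and the example sequences are hypothetical, only perfect (mismatch-free) TIRs are measured, and this is not the mining procedure used by the authors.

```python
# Hypothetical helper names; illustrative only, not the authors' pipeline.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tir_length(element: str) -> int:
    """Length of the perfect terminal inverted repeat of `element`.

    The 5' end is compared base by base with the reverse complement of
    the 3' end; counting stops at the first mismatch or at mid-element.
    """
    element = element.upper()
    rc = reverse_complement(element)
    length = 0
    for a, b in zip(element[: len(element) // 2], rc):
        if a != b:
            break
        length += 1
    return length


def has_tsd(left_flank: str, right_flank: str, size: int = 3) -> bool:
    """True if the last `size` bases of the left flank equal the first
    `size` bases of the right flank (a direct target site duplication)."""
    if len(left_flank) < size or len(right_flank) < size:
        return False
    return left_flank[-size:].upper() == right_flank[:size].upper()


# Made-up element: an 8 bp TIR around a short internal region.
example_element = "GGGCATTC" + "AAAAA" + "GAATGCCC"
print(tir_length(example_element))
print(has_tsd("ACGTTAA", "TAACGT"))
```

Real elements usually carry imperfect TIRs, so a practical scan would tolerate a few mismatches; that extension is left out here to keep the sketch short.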
Bifunctional Anti-Huntingtin Proteasome-Directed Intrabodies Mediate Efficient Degradation of Mutant Huntingtin Exon 1 Protein Fragments

PubMed Central

Butler, David C.; Messer, Anne

2011-01-01

Huntington's disease (HD) is a fatal autosomal dominant neurodegenerative disorder caused by a trinucleotide (CAG)n repeat expansion in the coding sequence of the huntingtin gene, and an expanded polyglutamine (>37Q) tract in the protein. This results in misfolding and accumulation of huntingtin protein (htt), formation of neuronal intranuclear and cytoplasmic inclusions, and neuronal dysfunction/degeneration. Single-chain Fv antibodies (scFvs), expressed as intrabodies that bind htt and prevent aggregation, show promise as immunotherapeutics for HD. Intrastriatal delivery of anti-N-terminal htt scFv-C4 using an adeno-associated virus vector (AAV2/1) significantly reduces the size and number of aggregates in HDR6/1 transgenic mice; however, this protective effect diminishes with age and time after injection. We therefore explored enhancing intrabody efficacy via fusions to heterologous functional domains. Proteins containing a PEST motif are often targeted for proteasomal degradation and generally have a short half-life. In ST14A cells, fusion of the C-terminal PEST region of mouse ornithine decarboxylase (mODC) to scFv-C4 reduces htt exon 1 protein fragments with 72 glutamine repeats (httex1-72Q) by ∼80–90% when compared to scFv-C4 alone. Proteasomal targeting was verified by either scrambling the mODC-PEST motif, or via proteasomal inhibition with epoxomicin. For these constructs, the proteasomal degradation of the scFv intrabody proteins themselves was reduced <25% by the addition of the mODC-PEST motif, with or without antigens. The remaining intrabody levels were amply sufficient to target N-terminal httex1-72Q protein fragment turnover. Critically, scFv-C4-PEST prevents aggregation and toxicity of httex1-72Q fragments at significantly lower doses than scFv-C4. Fusion of the mODC-PEST motif to intrabodies is a valuable general approach to specifically target toxic antigens to the proteasome for degradation. PMID:22216210
Detecting and Characterizing Repeating Earthquake Sequences During Volcanic Eruptions

NASA Astrophysics Data System (ADS)

Tepp, G.; Haney, M. M.; Wech, A.

2017-12-01

A major challenge in volcano seismology is forecasting eruptions. Repeating earthquake sequences often precede volcanic eruptions or lava dome activity, providing an opportunity for short-term eruption forecasting. Automatic detection of these sequences can lead to timely eruption notification and aid in continuous monitoring of volcanic systems. However, repeating earthquake sequences may also occur after eruptions or along with magma intrusions that do not immediately lead to an eruption. This additional challenge requires a better understanding of the processes involved in producing these sequences to distinguish those that are precursory. Calculation of the inverse moment rate and concepts from the material failure forecast method can lead to such insights. The temporal evolution of the inverse moment rate is observed to differ for precursory and non-precursory sequences, and multiple earthquake sequences may occur concurrently. These observations suggest that sequences may occur in different locations or through different processes. We developed an automated repeating earthquake sequence detector and near real-time alarm to send alerts when an in-progress sequence is identified. Near real-time inverse moment rate measurements can further improve our ability to forecast eruptions by allowing for characterization of sequences. We apply the detector to eruptions of two Alaskan volcanoes: Bogoslof in 2016-2017 and Redoubt Volcano in 2009. The Bogoslof eruption produced almost 40 repeating earthquake sequences between its start in mid-December 2016 and early June 2017, 21 of which preceded an explosive eruption, and 2 sequences in the months before eruptive activity. Three of the sequences occurred after the implementation of the alarm in late March 2017 and successfully triggered alerts. The nearest seismometers to Bogoslof are over 45 km away, requiring a detector that can work with few stations and a relatively low signal-to-noise ratio. 
During the Redoubt eruption, earthquake sequences were observed in the months leading up to the eruptive activity beginning in March 2009 as well as immediately preceding 7 of the 19 explosive events. In contrast to Bogoslof, Redoubt has a local monitoring network which allows for better detection and more detailed analysis of the repeating earthquake sequences.
Isolation, propagation, genome analysis and epidemiology of HKU1 betacoronaviruses

PubMed Central

Shrivastava, Susmita; Berglund, Andrew; Qian, Zhaohui; Góes, Luiz Gustavo Bentim; Halpin, Rebecca A.; Fedorova, Nadia; Ransier, Amy; Weston, Philip A.; Durigon, Edison Luiz; Jerez, José Antonio; Robinson, Christine C.; Town, Christopher D.; Holmes, Kathryn V.

2014-01-01

From 1 January 2009 to 31 May 2013, 15 287 respiratory specimens submitted to the Clinical Virology Laboratory at the Children’s Hospital Colorado were tested for human coronavirus RNA by reverse transcription-PCR. Human coronaviruses HKU1, OC43, 229E and NL63 co-circulated during each of the respiratory seasons but with significant year-to-year variability, and cumulatively accounted for 7.4–15.6 % of all samples tested during the months of peak activity. A total of 79 (0.5 % prevalence) specimens were positive for human betacoronavirus HKU1 RNA. Genotypes HKU1 A and B were both isolated from clinical specimens and propagated on primary human tracheal–bronchial epithelial cells cultured at the air–liquid interface and were neutralized in vitro by human intravenous immunoglobulin and by polyclonal rabbit antibodies to the spike glycoprotein of HKU1. Phylogenetic analysis of the deduced amino acid sequences of seven full-length genomes of Colorado HKU1 viruses and the spike glycoproteins from four additional HKU1 viruses from Colorado and three from Brazil demonstrated remarkable conservation of these sequences with genotypes circulating in Hong Kong and France. Within genotype A, all but one of the Colorado HKU1 sequences formed a unique subclade defined by three amino acid substitutions (W197F, F613Y and S752F) in the spike glycoprotein and exhibited a unique signature in the acidic tandem repeat in the N-terminal region of the nsp3 subdomain. Elucidating the function of and mechanisms responsible for the formation of these varying tandem repeats will increase our understanding of the replication process and pathogenicity of HKU1 and potentially of other coronaviruses. PMID:24394697
The interaction of polyglutamine peptides with lipid membranes is regulated by flanking sequences associated with huntingtin.

PubMed

Burke, Kathleen A; Kauffman, Karlina J; Umbaugh, C Samuel; Frey, Shelli L; Legleiter, Justin

2013-05-24

Huntington disease (HD) is caused by an expanded polyglutamine (poly(Q)) repeat near the N terminus of the huntingtin (htt) protein. Expanded poly(Q) facilitates formation of htt aggregates, eventually leading to deposition of cytoplasmic and intranuclear inclusion bodies containing htt. Flanking sequences directly adjacent to the poly(Q) domain, such as the first 17 amino acids on the N terminus (Nt17) and the polyproline (poly(P)) domain on the C-terminal side of the poly(Q) domain, heavily influence aggregation. Additionally, htt interacts with a variety of membraneous structures within the cell, and Nt17 is implicated in lipid binding. To investigate the interaction between htt exon1 and lipid membranes, a combination of in situ atomic force microscopy, Langmuir trough techniques, and vesicle permeability assays were used to directly monitor the interaction of a variety of synthetic poly(Q) peptides with different combinations of flanking sequences (KK-Q35-KK, KK-Q35-P10-KK, Nt17-Q35-KK, and Nt17-Q35-P10-KK) on model membranes and surfaces. Each peptide aggregated on mica, predominately forming extended, fibrillar aggregates. In contrast, poly(Q) peptides that lacked the Nt17 domain did not appreciably aggregate on or insert into lipid membranes. Nt17 facilitated the interaction of peptides with lipid surfaces, whereas the poly(P) region enhanced this interaction. The aggregation of Nt17-Q35-P10-KK on the lipid bilayer closely resembled that of a htt exon1 construct containing 35 repeat glutamines. Collectively, this data suggests that the Nt17 domain plays a critical role in htt binding and aggregation on lipid membranes, and this lipid/htt interaction can be further modulated by the presence of the poly(P) domain.
The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats

PubMed Central

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2007-01-01

Background In Archaea and Bacteria, the repeated elements called CRISPRs for "clustered regularly interspaced short palindromic repeats" are believed to participate in the defence against viruses. Short sequences called spacers are stored in-between repeated elements. In the current model, motifs comprising spacers and repeats may target an invading DNA and lead to its degradation through a proposed mechanism similar to RNA interference. Analysis of intra-species polymorphism shows that new motifs (one spacer and one repeated element) are added in a polarised fashion. Although their principal characteristics have been described, a lot remains to be discovered on the way CRISPRs are created and evolve. As new genome sequences become available it appears necessary to develop automated scanning tools to make available CRISPR-related information and to facilitate additional investigations. Description We have produced a program, CRISPRFinder, which identifies CRISPRs and extracts the repeated and unique sequences. Using this software, a database is constructed which is automatically updated monthly from newly released genome sequences. Additional tools were created to allow the alignment of flanking sequences in search for similarities between different loci and to build dictionaries of unique sequences. To date, almost six hundred CRISPRs have been identified in 475 published genomes. Two Archaea out of thirty-seven and about half of Bacteria do not possess a CRISPR. Fine analysis of repeated sequences strongly supports the current view that new motifs are added at one end of the CRISPR adjacent to the putative promoter. Conclusion It is hoped that availability of a public database, regularly updated and which can be queried on the web, will help in further dissecting and understanding CRISPR structure and flanking sequences evolution. Subsequent analyses of the intra-species CRISPR polymorphism will be facilitated by CRISPRFinder and the dictionary creator.
CRISPRdb is accessible at PMID:17521438
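The idea of extracting spacers and building a dictionary of them can be illustrated with a toy example. This sketch assumes the repeat sequence of a locus is already known and occurs as exact copies; the function names (`extract_spacers`, `spacer_dictionary`) and the sequences are hypothetical, and the code does not reproduce how CRISPRFinder itself detects repeats.

```python
from collections import Counter
from typing import Dict, Iterable, List


def extract_spacers(locus: str, repeat: str) -> List[str]:
    """Return the sequences lying between consecutive exact copies of
    `repeat` in `locus`. Flanking sequence outside the first and last
    repeat is discarded."""
    parts = locus.upper().split(repeat.upper())
    # parts[0] and parts[-1] are the flanks, not spacers.
    return parts[1:-1]


def spacer_dictionary(loci: Iterable[str], repeat: str) -> Dict[str, int]:
    """Count how often each distinct spacer occurs across several loci."""
    counts: Counter = Counter()
    for locus in loci:
        counts.update(extract_spacers(locus, repeat))
    return dict(counts)


# Made-up repeat and loci for illustration only.
REPEAT = "GTTTTAGA"
locus_a = "AAA" + REPEAT + "CCCTGA" + REPEAT + "GGATCC" + REPEAT + "TTT"
locus_b = "T" + REPEAT + "CCCTGA" + REPEAT + "A"

print(extract_spacers(locus_a, REPEAT))
print(spacer_dictionary([locus_a, locus_b], REPEAT))
```

Because new motifs are reported to be added at one end of the array, the order of the returned spacer list carries information in its own right; the dictionary discards that order and keeps only counts.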
Assembly of a phased diploid Candida albicans genome facilitates allele-specific measurements and provides a simple model for repeat and indel structure

PubMed Central

2013-01-01

Background Candida albicans is a ubiquitous opportunistic fungal pathogen that afflicts immunocompromised human hosts. With rare and transient exceptions the yeast is diploid, yet despite its clinical relevance the respective sequences of its two homologous chromosomes have not been completely resolved. Results We construct a phased diploid genome assembly by deep sequencing a standard laboratory wild-type strain and a panel of strains homozygous for particular chromosomes. The assembly has 700-fold coverage on average, allowing extensive revision and expansion of the number of known SNPs and indels. This phased genome significantly enhances the sensitivity and specificity of allele-specific expression measurements by enabling pooling and cross-validation of signal across multiple polymorphic sites. Additionally, the diploid assembly reveals pervasive and unexpected patterns in allelic differences between homologous chromosomes. Firstly, we see striking clustering of indels, concentrated primarily in the repeat sequences in promoters. Secondly, both indels and their repeat-sequence substrate are enriched near replication origins. Finally, we reveal an intimate link between repeat sequences and indels, which argues that repeat length is under selective pressure for most eukaryotes. This connection is described by a concise one-parameter model that explains repeat-sequence abundance in C. albicans as a function of the indel rate, and provides a general framework to interpret repeat abundance in species ranging from bacteria to humans. Conclusions The phased genome assembly and insights into repeat plasticity will be valuable for better understanding allele-specific phenomena and genome evolution. PMID:24025428
Unrelated sequences at the 5' end of mouse LINE-1 repeated elements define two distinct subfamilies.

PubMed Central

Wincker, P; Jubier-Maurin, V; Roizès, G

1987-01-01

Some full length members of the mouse long interspersed repeated DNA family L1Md have been shown to be associated at their 5' end with a variable number of tandem repetitions, the A repeats, that have been suggested to be transcription controlling elements. We report that the other type of repeat, named F, found at the 5' end of a few L1 elements is also an integral part of full length L1 copies. Sequencing shows that the F repeats are GC rich, and organized in tandem. The L1 copies associated with either A or F repeats can be correlated with two different subsets of L1 sequences distinguished by a series of variant nucleotides specific to each and by unassociated but frequent restriction sites. These findings suggest that sequence replacement has occurred at least once in 5' of L1Md, and is related to the generation of specific subfamilies. Images PMID:3684566
Plant chromosomes from end to end: telomeres, heterochromatin and centromeres.

PubMed

Lamb, Jonathan C; Yu, Weichang; Han, Fangpu; Birchler, James A

2007-04-01

Recent evidence indicates that heterochromatin in plants is composed of heterogeneous sequences, which are usually composed of transposable elements or tandem repeat arrays. These arrays are associated with chromatin modifications that produce a closed configuration that limits transcription. Centromere sequences in plants are usually composed of tandem repeat arrays that are homogenized across the genome. Analysis of such arrays in closely related taxa suggests a rapid turnover of the repeat unit that is typical of a particular species. In addition, two lines of evidence for an epigenetic component of centromere specification have been reported, namely an example of a neocentromere formed over sequences without the typical repeat array and examples of centromere inactivation. Although the telomere repeat unit is quite prevalent in the plant kingdom, unusual repeats have been found in some families. Recently, it was demonstrated that the introduction of telomere sequences into plants cells causes truncation of the chromosomes, and that this technique can be used to produce artificial chromosome platforms.
SSRscanner: a program for reporting distribution and exact location of simple sequence repeats.

PubMed

Anwar, Tamanna; Khan, Asad U

2006-02-20

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found at longer lengths (> 6 repeating units) of tracts. Most of the computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to report the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search repeats of any length and produce outputs with their exact position on the chromosome and their frequency of occurrence in the sequence. This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com.
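Reporting the exact position of each SSR, as described above, can be sketched with a backreference regular expression. This is an illustrative Python sketch, not the SSRscanner program (which is written in PERL and not reproduced here); the function name `find_ssrs`, its parameters, and the default threshold of six copies are assumptions chosen for the example.

```python
import re
from typing import List, Tuple


def find_ssrs(
    seq: str, min_unit: int = 1, max_unit: int = 3, min_repeats: int = 6
) -> List[Tuple[int, str, int]]:
    """Return (1-based start, repeat unit, number of copies) per SSR tract.

    `min_repeats` is an assumed threshold, not a value taken from SSRscanner.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # One unit of `unit_len` bases followed by at least
        # (min_repeats - 1) further copies of the same unit.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves built from a shorter motif
            # (for example "AA"), which are reported at the shorter length.
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            copies = (match.end() - match.start()) // unit_len
            hits.append((match.start() + 1, unit, copies))
    return sorted(hits)


# Made-up sequence: seven copies of CA, then a run of eight A.
example = "GG" + "CA" * 7 + "TT" + "A" * 8 + "C"
for start, unit, copies in find_ssrs(example):
    print(start, unit, copies)
```

The frequency of each motif follows directly from the returned list, for instance by counting tuples per repeat unit.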
The mitochondrial genome of the quiet-calling katydids, Xizicus fascipes (Orthoptera: Tettigoniidae: Meconematinae).

PubMed

Yang, Ming Ru; Zhou, Zhi Jun; Chang, Yan Lin; Zhao, Le Hong

2012-08-01

To help determine whether the typical arthropod arrangement was a synapomorphy for the whole Tettigoniidae, we sequenced the mitochondrial genome (mitogenome) of the quiet-calling katydids, Xizicus fascipes (Orthoptera: Tettigoniidae: Meconematinae). The 16,166-bp nucleotide sequences of X. fascipes mitogenome contains the typical gene content, gene order, base composition, and codon usage found in arthropod mitogenomes. As a whole, the X. fascipes mitogenome contains a lower A+T content (70.2%) found in the complete orthopteran mitogenomes determined to date. All protein-coding genes started with a typical ATN codon. Ten of the 13 protein-coding genes have a complete termination codon, but the remaining three genes (COIII, ND5 and ND4) terminate with incomplete T. All tRNAs have the typical clover-leaf structure of mitogenome tRNA, except for tRNA(Ser(AGN)), in which lengthened anticodon stem (9 bp) with a bulged nucleotide in the middle, an unusual T-stem (6 bp in contrast to the normal 5 bp), a mini DHU arm (2 bp) and no connector nucleotides. In the A+T-rich region, two (TA)n conserved blocks that were previously described in Ensifera and two 150-bp tandem repeats plus a partial copy of the composed at 61 bp of the beginning were present. Phylogenetic analysis found: i) the monophyly of Conocephalinae was interrupted by Elimaea cheni from Phaneropterinae; and ii) Meconematinae was the most basal group among these five subfamilies.
A highly conserved N-terminal sequence for teleost vitellogenin with potential value to the biochemistry, molecular biology and pathology of vitellogenesis

USGS Publications Warehouse

Folmar, L.D.; Denslow, N.D.; Wallace, R.A.; LaFleur, G.; Gross, T.S.; Bonomelli, S.; Sullivan, C.V.

1995-01-01

N-terminal amino acid sequences for vitellogenin (Vtg) from six species of teleost fish (striped bass, mummichog, pinfish, brown bullhead, medaka, yellow perch and the sturgeon) are compared with published N-terminal Vtg sequences for the lamprey, clawed frog and domestic chicken. Striped bass and mummichog had 100% identical amino acids between positions 7 and 21, while pinfish, brown bullhead, sturgeon, lamprey, Xenopus and chicken were 87%, 93%, 60%, 47%, 47-60% (for four transcripts) and 40% identical, respectively, with striped bass for the same positions. Partial sequences obtained for medaka and yellow perch were 100% identical between positions 5 to 10. The potential utility of this conserved sequence for studies on the biochemistry, molecular biology and pathology of vitellogenesis is discussed.
Definition of RNA polymerase II CoTC terminator elements in the human genome.

PubMed

Nojima, Takayuki; Dienstbier, Martin; Murphy, Shona; Proudfoot, Nicholas J; Dye, Michael J

2013-04-25

Mammalian RNA polymerase II (Pol II) transcription termination is an essential step in protein-coding gene expression that is mediated by pre-mRNA processing activities and DNA-encoded terminator elements. Although much is known about the role of pre-mRNA processing in termination, our understanding of the characteristics and generality of terminator elements is limited. Whereas promoter databases list up to 40,000 known and potential Pol II promoter sequences, fewer than ten Pol II terminator sequences have been described. Using our knowledge of the human β-globin terminator mechanism, we have developed a selection strategy for mapping mammalian Pol II terminator elements. We report the identification of 78 cotranscriptional cleavage (CoTC)-type terminator elements at endogenous gene loci. The results of this analysis pave the way for the full understanding of Pol II termination pathways and their roles in gene expression. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
Unusually long-lived pause required for regulation of a Rho-dependent transcription terminator.

PubMed

Hollands, Kerry; Sevostiyanova, Anastasia; Groisman, Eduardo A

2014-05-13

Up to half of all transcription termination events in bacteria rely on the RNA-dependent helicase Rho. However, the nucleic acid sequences that promote Rho-dependent termination remain poorly characterized. Defining the molecular determinants that confer Rho-dependent termination is especially important for understanding how such terminators can be regulated in response to specific signals. Here, we identify an extraordinarily long-lived pause at the site where Rho terminates transcription in the 5'-leader region of the Mg(2+) transporter gene mgtA in Salmonella enterica. We dissect the sequence elements required for prolonged pausing in the mgtA leader and establish that the remarkable longevity of this pause is required for a riboswitch to stimulate Rho-dependent termination in the mgtA leader region in response to Mg(2+) availability. Unlike Rho-dependent terminators described previously, where termination occurs at multiple pause sites, there is a single site of transcription termination directed by Rho in the mgtA leader. Our data suggest that Rho-dependent termination events that are subject to regulation may require elements distinct from those operating at constitutive Rho-dependent terminators.
A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

PubMed Central

2010-01-01

Background Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. 
These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the repeat may be disseminated by HGT and intra-genomic shuffling. Conclusions We describe novel features of PARCELs (Palindromic Amphipathic Repeat Coding ELements), a set of widely distributed repeat protein domains and coding sequences that were likely acquired through HGT by diverse unicellular microbes, further mobilized and diversified within genomes, and co-opted for expression in the membrane proteome of some taxa. Disseminated by multiple gene-centric vehicles, ORFs harboring these elements enhance accessory gene pools as part of the "mobilome" connecting genomes of various clades, in taxa sharing common niches. PMID:20626840
A TALE-inspired computational screen for proteins that contain approximate tandem repeats.

PubMed

Perycz, Malgorzata; Krwawicz, Joanna; Bochtler, Matthias

2017-01-01

TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen.
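The screening idea in this abstract (approximate tandem repeats of 30 to 43 residues, with variability confined to a few positions) can be illustrated with a minimal sketch. It is not the authors' PERL pipeline: the self-identity score, the function names, and the toy 34-residue unit (loosely modelled on a TALE repeat, with hypothetical variable residues at positions 12 and 13) are assumptions made for illustration, and the sketch assumes the first repeat starts at the first residue.

```python
def offset_identity(seq, period):
    """Fraction of positions i where seq[i] == seq[i + period]."""
    n = len(seq) - period
    if n <= 0:
        return 0.0
    matches = sum(1 for i in range(n) if seq[i] == seq[i + period])
    return matches / n


def best_period(seq, min_period=30, max_period=43):
    """Return (period, identity) for the best-scoring period in the range."""
    scores = [(offset_identity(seq, p), p) for p in range(min_period, max_period + 1)]
    # Highest identity wins; ties go to the shorter period.
    identity, period = max(scores, key=lambda s: (s[0], -s[1]))
    return period, identity


def variable_columns(seq, period):
    """1-based repeat positions whose residue differs between copies.

    Assumes the first copy starts at the first residue of seq.
    """
    copies = [seq[i:i + period] for i in range(0, len(seq) - period + 1, period)]
    return [j + 1 for j in range(period) if len({c[j] for c in copies}) > 1]


def toy_tale_like(variable_pairs):
    """Build a hypothetical repeat array: one 34-residue unit per pair."""
    return "".join("LTPEQVVAIAS" + pair + "GGKQALETVQRLLPVLCQAHG"
                   for pair in variable_pairs)


if __name__ == "__main__":
    seq = toy_tale_like(["HD", "NG", "NI", "NN"])
    period, identity = best_period(seq)
    print(period, round(identity, 3), variable_columns(seq, period))
```

A real screen would also need to handle insertions and deletions between copies, and the additional scoring criteria listed in the abstract (nuclear localization, secondary structure, covariation) are outside the scope of this sketch.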

A TALE-inspired computational screen for proteins that contain approximate tandem repeats

PubMed Central

Krwawicz, Joanna

2017-01-01

TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen. PMID:28617832
Structure and Misfolding of the Flexible Tripartite Coiled-Coil Domain of Glaucoma-Associated Myocilin

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hill, Shannon E.; Nguyen, Elaine; Donegan, Rebecca K.

2017-11-01

Glaucoma-associated myocilin is a member of the olfactomedins, a protein family involved in neuronal development and human diseases. Molecular studies of the myocilin N-terminal coiled coil demonstrate a unique tripartite architecture: a Y-shaped parallel dimer-of-dimers with distinct tetramer and dimer regions. The structure of the dimeric C-terminal 7-heptad repeats elucidates an unexpected repeat pattern involving inter-strand stabilization by oppositely charged residues. Molecular dynamics simulations reveal an alternate accessible conformation in which the terminal inter-strand disulfide limits the extent of unfolding and results in a kinked configuration. By inference, full-length myocilin is also branched, with two pairs of C-terminal olfactomedin domains. Selected variants within the N-terminal region alter the apparent quaternary structure of myocilin but do so without compromising stability or causing aggregation. In addition to increasing our structural knowledge of naturally occurring extracellular coiled coils and biomedically important olfactomedins, this work broadens the scope of protein misfolding in the pathogenesis of myocilin-associated glaucoma.
Structure and Misfolding of the Flexible Tripartite Coiled-Coil Domain of Glaucoma-Associated Myocilin.

PubMed

Hill, Shannon E; Nguyen, Elaine; Donegan, Rebecca K; Patterson-Orazem, Athéna C; Hazel, Anthony; Gumbart, James C; Lieberman, Raquel L

2017-11-07

Glaucoma-associated myocilin is a member of the olfactomedins, a protein family involved in neuronal development and human diseases. Molecular studies of the myocilin N-terminal coiled coil demonstrate a unique tripartite architecture: a Y-shaped parallel dimer-of-dimers with distinct tetramer and dimer regions. The structure of the dimeric C-terminal 7-heptad repeats elucidates an unexpected repeat pattern involving inter-strand stabilization by oppositely charged residues. Molecular dynamics simulations reveal an alternate accessible conformation in which the terminal inter-strand disulfide limits the extent of unfolding and results in a kinked configuration. By inference, full-length myocilin is also branched, with two pairs of C-terminal olfactomedin domains. Selected variants within the N-terminal region alter the apparent quaternary structure of myocilin but do so without compromising stability or causing aggregation. In addition to increasing our structural knowledge of naturally occurring extracellular coiled coils and biomedically important olfactomedins, this work broadens the scope of protein misfolding in the pathogenesis of myocilin-associated glaucoma. Copyright © 2017 Elsevier Ltd. All rights reserved.
Repetitive part of the banana (Musa acuminata) genome investigated by low-depth 454 sequencing.

PubMed

Hribová, Eva; Neumann, Pavel; Matsumoto, Takashi; Roux, Nicolas; Macas, Jirí; Dolezel, Jaroslav

2010-09-16

Bananas and plantains (Musa spp.) are grown in more than a hundred tropical and subtropical countries and provide staple food for hundreds of millions of people. They are seed-sterile crops propagated clonally and this makes them vulnerable to a rapid spread of devastating diseases and at the same time hampers breeding improved cultivars. Although the socio-economic importance of bananas and plantains cannot be overestimated, they remain outside the focus of major research programs. This slows down the study of nuclear genome and the development of molecular tools to facilitate banana improvement. In this work, we report on the first thorough characterization of the repeat component of the banana (M. acuminata cv. 'Calcutta 4') genome. Analysis of almost 100 Mb of sequence data (0.15× genome coverage) permitted partial sequence reconstruction and characterization of repetitive DNA, making up about 30% of the genome. The results showed that the banana repeats are predominantly made of various types of Ty1/copia and Ty3/gypsy retroelements representing 16 and 7% of the genome respectively. On the other hand, DNA transposons were found to be rare. In addition to new families of transposable elements, two new satellite repeats were discovered and found useful as cytogenetic markers. To help in banana sequence annotation, a specific Musa repeat database was created, and its utility was demonstrated by analyzing the repeat composition of 62 genomic BAC clones. A low-depth 454 sequencing of banana nuclear genome provided the largest amount of DNA sequence data available until now for Musa and permitted reconstruction of most of the major types of DNA repeats. The information obtained in this study improves the knowledge of the long-range organization of banana chromosomes, and provides sequence resources needed for repeat masking and annotation during the Musa genome sequencing project. 
It also provides sequence data for isolation of DNA markers to be used in genetic diversity studies and in marker-assisted selection.
Repetitive part of the banana (Musa acuminata) genome investigated by low-depth 454 sequencing

PubMed Central

2010-01-01

Background Bananas and plantains (Musa spp.) are grown in more than a hundred tropical and subtropical countries and provide staple food for hundreds of millions of people. They are seed-sterile crops propagated clonally and this makes them vulnerable to a rapid spread of devastating diseases and at the same time hampers breeding improved cultivars. Although the socio-economic importance of bananas and plantains cannot be overestimated, they remain outside the focus of major research programs. This slows down the study of nuclear genome and the development of molecular tools to facilitate banana improvement. Results In this work, we report on the first thorough characterization of the repeat component of the banana (M. acuminata cv. 'Calcutta 4') genome. Analysis of almost 100 Mb of sequence data (0.15× genome coverage) permitted partial sequence reconstruction and characterization of repetitive DNA, making up about 30% of the genome. The results showed that the banana repeats are predominantly made of various types of Ty1/copia and Ty3/gypsy retroelements representing 16 and 7% of the genome respectively. On the other hand, DNA transposons were found to be rare. In addition to new families of transposable elements, two new satellite repeats were discovered and found useful as cytogenetic markers. To help in banana sequence annotation, a specific Musa repeat database was created, and its utility was demonstrated by analyzing the repeat composition of 62 genomic BAC clones. Conclusion A low-depth 454 sequencing of banana nuclear genome provided the largest amount of DNA sequence data available until now for Musa and permitted reconstruction of most of the major types of DNA repeats. The information obtained in this study improves the knowledge of the long-range organization of banana chromosomes, and provides sequence resources needed for repeat masking and annotation during the Musa genome sequencing project. 
It also provides sequence data for isolation of DNA markers to be used in genetic diversity studies and in marker-assisted selection. PMID:20846365
The LRRK2 G2385R variant is a partial loss-of-function mutation that affects synaptic vesicle trafficking through altered protein interactions.

PubMed

Carrion, Maria Dolores Perez; Marsicano, Silvia; Daniele, Federica; Marte, Antonella; Pischedda, Francesca; Di Cairano, Eliana; Piovesana, Ester; von Zweydorf, Felix; Kremmer, Elisabeth; Gloeckner, Christian Johannes; Onofri, Franco; Perego, Carla; Piccoli, Giovanni

2017-07-14

Mutations in the Leucine-rich repeat kinase 2 gene (LRRK2) are associated with familial Parkinson's disease (PD). LRRK2 protein contains several functional domains, including protein-protein interaction domains at its N- and C-termini. In this study, we analyzed the functional features attributed to LRRK2 by its N- and C-terminal domains. We combined TIRF microscopy and synaptopHluorin assay to visualize synaptic vesicle trafficking. We found that N- and C-terminal domains have opposite impact on synaptic vesicle dynamics. Biochemical analysis demonstrated that different proteins are bound at the two extremities, namely β3-Cav2.1 at N-terminus part and β-Actin and Synapsin I at C-terminus domain. A sequence variant (G2385R) harboured within the C-terminal WD40 domain increases the risk for PD. Complementary biochemical and imaging approaches revealed that the G2385R variant alters strength and quality of LRRK2 interactions and increases fusion of synaptic vesicles. Our data suggest that the G2385R variant behaves like a loss-of-function mutation that mimics activity-driven events. Impaired scaffolding capabilities of mutant LRRK2 resulting in perturbed vesicular trafficking may arise as a common pathophysiological denominator through which different LRRK2 pathological mutations cause disease.
Identification of MICA alleles with a long Leu-repeat in the transmembrane region and no cytoplasmic tail due to a frameshift-deletion in exon 4.

PubMed

Obuchi, N; Takahashi, M; Nouchi, T; Satoh, M; Arimura, T; Ueda, K; Akai, J; Ota, M; Naruse, T; Inoko, H; Numano, F; Kimura, A

2001-06-01

MHC class I chain-related gene A (MICA) is located close to HLA-B gene and expressed in epithelial cells. The MICA gene is reported to be highly polymorphic as are the classical class I genes. To further assess the polymorphism in the MICA gene, we analyzed a total of 60 HLA-homozygous cells for the sequences spanning exons 2-6. In the analysis, four new MICA alleles were identified and six variations were recognized in exon 6. MICA*017, which was identified in three HLA-B57 homozygous cells (DBB, DEM and WIN), differed from MICA*002 in exon 3 and had a guanine deletion at the 3' end of exon 4. MICA*015 identified in an HLA-B45 homozygous cell (OMW) also had the same deletion that causes a frameshift mutation resulting in complete change of the transmembrane region and premature termination in the cytoplasmic tail; these alleles have a long hydrophobic leucine-rich region instead of the alanine repeat in the transmembrane region and terminate at the second position in the cytoplasmic domain. The frameshift deletion was found only in HLA-B45- or -B57-positive panels tested, suggesting a strong linkage disequilibrium between the deletion and B45 or B57. MICA*048, which was different in exon 5 from MICA*008, was identified in an HLA-B61 homozygous cell (TA21), while MICA*00901 identified in HLA-B51 homozygous cells (LUY and KT2) was distinguished from MICA*009 by exon 6.
Zinc finger protein designed to target 2-long terminal repeat junctions interferes with human immunodeficiency virus integration.

PubMed

Sakkhachornphop, Supachai; Barbas, Carlos F; Keawvichit, Rassamee; Wongworapat, Kanlaya; Tayapiwatana, Chatchai

2012-09-01

Integration of the human immunodeficiency virus type 1 (HIV-1) genome into the host chromosome is a vital step in the HIV life cycle. The highly conserved cytosine-adenine (CA) dinucleotide sequence immediately upstream of the cleavage site is crucial for integrase (IN) activity. As this viral enzyme has an important role early in the HIV-1 replication cycle, interference with the IN substrate has become an attractive strategy for therapeutic intervention. We demonstrated that a designed zinc finger protein (ZFP) fused to green fluorescent protein (GFP) targets the 2-long terminal repeat (2-LTR) circle junctions of HIV-1 DNA with nanomolar affinity. We report now that 2LTRZFP-GFP stably transduced into 293T cells interfered with the expression of vesicular stomatitis virus glycoprotein (VSV-G)-pseudotyped lentiviral red fluorescent protein (RFP), as shown by the suppression of RFP expression. We also used a third-generation lentiviral vector and pCEP4 expression vector to deliver the 2LTRZFP-GFP transgene into human T-lymphocytic cells, and a stable cell line for long-term expression studies was selected for HIV-1 challenge. HIV-1 integration and replication were inhibited as measured by Alu-gag real-time PCR and p24 antigen assay. In addition, the molecular activity of 2LTRZFP-GFP was evaluated in peripheral blood mononuclear cells. The results were confirmed by Alu-gag real-time PCR for integration interference. We suggest that the expression of 2LTRZFP-GFP limited viral integration on intracellular immunization, and that it has potential for use in HIV gene therapy in the future.
Long terminal repeat retrotransposons of Oryza sativa

PubMed Central

McCarthy, Eugene M; Liu, Jingdong; Lizhi, Gao; McDonald, John F

2002-01-01

Background Long terminal repeat (LTR) retrotransposons constitute a major fraction of the genomes of higher plants. For example, retrotransposons comprise more than 50% of the maize genome and more than 90% of the wheat genome. LTR retrotransposons are believed to have contributed significantly to the evolution of genome structure and function. The genome sequencing of selected experimental and agriculturally important species is providing an unprecedented opportunity to view the patterns of variation existing among the entire complement of retrotransposons in complete genomes. Results Using a new data-mining program, LTR_STRUC, (LTR retrotransposon structure program), we have mined the GenBank rice (Oryza sativa) database as well as the more extensive (259 Mb) Monsanto rice dataset for LTR retrotransposons. Almost two-thirds (37) of the 59 families identified consist of copia-like elements, but gypsy-like elements outnumber copia-like elements by a ratio of approximately 2:1. At least 17% of the rice genome consists of LTR retrotransposons. In addition to the ubiquitous gypsy- and copia-like classes of LTR retrotransposons, the rice genome contains at least two novel families of unusually small, non-coding (non-autonomous) LTR retrotransposons. Conclusions Each of the major clades of rice LTR retrotransposons is more closely related to elements present in other species than to the other clades of rice elements, suggesting that horizontal transfer may have occurred over the evolutionary history of rice LTR retrotransposons. Like LTR retrotransposons in other species with relatively small genomes, many rice LTR retrotransposons are relatively young, indicating a high rate of turnover. PMID:12372141
Multiple conserved domains of the nucleoporin Nup124p and its orthologs Nup1p and Nup153 are critical for nuclear import and activity of the fission yeast Tf1 retrotransposon.

PubMed

Sistla, Srivani; Pang, Junxiong Vincent; Wang, Cui Xia; Balasundaram, David

2007-09-01

The nucleoporin Nup124p is a host protein required for the nuclear import of both, retrotransposon Tf1-Gag as well as the retroviral HIV-1 Vpr in fission yeast. The human nucleoporin Nup153 and the Saccharomyces cerevisiae Nup1p were identified as orthologs of Nup124p. In this study, we show that all three nucleoporins share a large FG/FXFG-repeat domain and a C-terminal peptide sequence, GRKIxxxxxRRKx, that are absolutely essential for Tf1 retrotransposition. Though the FXFG domain was essential, the FXFG repeats themselves could be eliminated without loss of retrotransposon activity, suggesting the existence of a common element unrelated to FG/FXFG motifs. The Nup124p C-terminal peptide, GRKIAVPRSRRKR, was extremely sensitive to certain single amino acid changes within stretches of the basic residues. On the basis of our comparative study of Nup124p, Nup1p, and Nup153 domains, we have developed peptides that specifically knockdown retrotransposon activity by disengaging the Tf1-Gag from its host nuclear transport machinery without any harmful consequence to the host itself. Our results imply that those domains challenged a specific pathway affecting Tf1 transposition. Although full-length Nup1p or Nup153 does not complement Nup124p, the functionality of their conserved domains with reference to Tf1 activity suggests that these three proteins evolved from a common ancestor.
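As a small illustration of the consensus notation used in this abstract, the sketch below turns GRKIxxxxxRRKx (with x standing for any residue) into a regular expression and checks it against the Nup124p peptide GRKIAVPRSRRKR quoted above. The function names and the flanking residues in the toy protein are hypothetical, and matching the consensus says nothing about activity; the sensitivity to specific substitutions reported in the abstract is not modelled here.

```python
import re

# Consensus quoted in the abstract; "x" denotes any residue.
CONSENSUS = "GRKIxxxxxRRKx"


def consensus_to_regex(consensus):
    """Translate an x-wildcard consensus into a compiled regular expression."""
    return re.compile("".join("[A-Z]" if ch == "x" else ch for ch in consensus))


def find_motif(protein, consensus=CONSENSUS):
    """Return (1-based start, matched peptide) for each consensus match."""
    pattern = consensus_to_regex(consensus)
    return [(m.start() + 1, m.group(0)) for m in pattern.finditer(protein.upper())]


if __name__ == "__main__":
    # The flanking residues MST and AA are made up for the example.
    print(find_motif("MSTGRKIAVPRSRRKRAA"))
    # A substitution inside the fixed RRK stretch no longer fits the consensus.
    print(find_motif("GRKIAVPRSRAKR"))
```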
Multiple Conserved Domains of the Nucleoporin Nup124p and Its Orthologs Nup1p and Nup153 Are Critical for Nuclear Import and Activity of the Fission Yeast Tf1 Retrotransposon

PubMed Central

Sistla, Srivani; Pang, Junxiong Vincent; Wang, Cui Xia

2007-01-01

The nucleoporin Nup124p is a host protein required for the nuclear import of both, retrotransposon Tf1-Gag as well as the retroviral HIV-1 Vpr in fission yeast. The human nucleoporin Nup153 and the Saccharomyces cerevisiae Nup1p were identified as orthologs of Nup124p. In this study, we show that all three nucleoporins share a large FG/FXFG-repeat domain and a C-terminal peptide sequence, GRKIxxxxxRRKx, that are absolutely essential for Tf1 retrotransposition. Though the FXFG domain was essential, the FXFG repeats themselves could be eliminated without loss of retrotransposon activity, suggesting the existence of a common element unrelated to FG/FXFG motifs. The Nup124p C-terminal peptide, GRKIAVPRSRRKR, was extremely sensitive to certain single amino acid changes within stretches of the basic residues. On the basis of our comparative study of Nup124p, Nup1p, and Nup153 domains, we have developed peptides that specifically knockdown retrotransposon activity by disengaging the Tf1-Gag from its host nuclear transport machinery without any harmful consequence to the host itself. Our results imply that those domains challenged a specific pathway affecting Tf1 transposition. Although full-length Nup1p or Nup153 does not complement Nup124p, the functionality of their conserved domains with reference to Tf1 activity suggests that these three proteins evolved from a common ancestor. PMID:17615301
Identification and Analysis of Mot3, a Zinc Finger Protein That Binds to the Retrotransposon Ty Long Terminal Repeat (δ) in Saccharomyces cerevisiae

PubMed Central

Madison, Jon M.; Dudley, Aimée M.; Winston, Fred

1998-01-01

Spt3 and Mot1 are two transcription factors of Saccharomyces cerevisiae that are thought to act in a related fashion to control the function of TATA-binding protein (TBP). Current models suggest that while Spt3 and Mot1 do not directly interact, they do function in a related fashion to stabilize the TBP-TATA interaction at particular promoters. Consistent with this model, certain combinations of spt3 and mot1 mutations are inviable. To identify additional proteins related to Spt3 and Mot1 functions, we screened for high-copy-number suppressors of the mot1 spt3 inviability. This screen identified a previously unstudied gene, MOT3, that encodes a zinc finger protein. We show that Mot3 binds in vitro to three sites within the retrotransposon Ty long terminal repeat (δ) sequence. One of these sites is immediately 5′ of the δ TATA region. Although a mot3 null mutation causes no strong phenotypes, it does cause some mild phenotypes, including a very modest increase in Ty mRNA levels, partial suppression of transcriptional defects caused by a mot1 mutation, and partial suppression of an spt3 mutation. These results, in conjunction with those of an independent study of Mot3 (A. Grishin, M. Rothenberg, M. A. Downs, and K. J. Blumer, Genetics, in press), suggest that this protein plays a varied role in gene expression that may be largely redundant with other factors. PMID:9528759
Optimization of sequence alignment for simple sequence repeat regions.

PubMed

Jighly, Abdulqader; Hamwieh, Aladdin; Ogbonnaya, Francis C

2011-07-20

Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, including tandem copies of specific sequences no longer than six bases, that are distributed in the genome. SSR has been used as a molecular marker because it is easy to detect and is used in a range of applications, including genetic diversity, genome mapping, and marker assisted selection. It is also very mutable because of slipping in the DNA polymerase during DNA replication. This unique mutation increases the insertion/deletion (INDELs) mutation frequency to a high ratio - more than other types of molecular markers such as single nucleotide polymorphism (SNPs). SNPs are more frequent than INDELs. Therefore, all designed algorithms for sequence alignment fit the vast majority of the genomic sequence without considering microsatellite regions, as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units which result in false evolutionary relationships. To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was developed using PERL script with a Tk graphical interface. This program is based on aligning sequences after determining the repeated units first, and the last SSR nucleotides positions. This results in a shifting process according to the inserted repeated unit type. When studying the phylogenic relations before and after applying the new algorithm, many differences in the trees were obtained by increasing the SSR length and complexity. However, less distance between different lineages had been observed after applying the new algorithm. The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different lineages. It reduces overlapping during SSR alignment, which results in a more realistic phylogenic relationship.
Structural and sequence diversity of the transposon Galileo in the Drosophila willistoni genome.

PubMed

Gonçalves, Juliana W; Valiati, Victor Hugo; Delprat, Alejandra; Valente, Vera L S; Ruiz, Alfredo

2014-09-13

Galileo is one of three members of the P superfamily of DNA transposons. It was originally discovered in Drosophila buzzatii, in which three segregating chromosomal inversions were shown to have been generated by ectopic recombination between Galileo copies. Subsequently, Galileo was identified in six of 12 sequenced Drosophila genomes, indicating its widespread distribution within this genus. Galileo is strikingly abundant in Drosophila willistoni, a neotropical species that is highly polymorphic for chromosomal inversions, suggesting a role for this transposon in the evolution of its genome. We carried out a detailed characterization of all Galileo copies present in the D. willistoni genome. A total of 191 copies, including 133 with two terminal inverted repeats (TIRs), were classified according to structure in six groups. The TIRs exhibited remarkable variation in their length and structure compared to the most complete copy. Three copies showed extended TIRs due to internal tandem repeats, the insertion of other transposable elements (TEs), or the incorporation of non-TIR sequences into the TIRs. Phylogenetic analyses of the transposase (TPase)-encoding and TIR segments yielded two divergent clades, which we termed Galileo subfamilies V and W. Target-site duplications (TSDs) in D. willistoni Galileo copies were 7- or 8-bp in length, with the consensus sequence GTATTAC. Analysis of the region around the TSDs revealed a target site motif (TSM) with a 15-bp palindrome that may give rise to a stem-loop secondary structure. There is a remarkable abundance and diversity of Galileo copies in the D. willistoni genome, although no functional copies were found. The TIRs in particular have a dynamic structure and extend in different ways, but their ends (required for transposition) are more conserved than the rest of the element. The D. willistoni genome harbors two Galileo subfamilies (V and W) that diverged ~9 million years ago and may have descended from an ancestral element in the genome. Galileo shows a significant insertion preference for a 15-bp palindromic TSM.
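As an illustration of what "palindromic" means for a DNA motif such as the TSM described above, the following is a minimal Python sketch, not the authors' method. The function names are hypothetical. A DNA palindrome is a sequence whose two arms are reverse complements of each other. Because no base is its own complement, an odd-length motif (such as a 15-bp one) can only be palindromic around an unpaired central base, so the sketch ignores the middle position when the length is odd.

```python
# Illustrative sketch only; helper names are hypothetical, not from the paper.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def arm_mismatches(seq: str) -> int:
    """Count positions where the left arm differs from the reverse
    complement of the right arm. For odd lengths the central base is
    treated as an unpaired spacer and ignored. 0 means a perfect palindrome.
    """
    s = seq.upper()
    half = len(s) // 2
    left = s[:half]
    right_rc = reverse_complement(s[len(s) - half:])
    return sum(a != b for a, b in zip(left, right_rc))


if __name__ == "__main__":
    # Even-length example: the EcoRI site is a perfect palindrome.
    print(arm_mismatches("GAATTC"))
    # Odd-length example: the TSD consensus quoted in the abstract.
    print(arm_mismatches("GTATTAC"))
```

Allowing a small number of mismatches (rather than requiring zero) is one simple way to score imperfect palindromes; the threshold would be a choice of the analyst and is not specified in the abstract.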
A Dynamic Tandem Repeat in Monocotyledons Inferred from a Comparative Analysis of Chloroplast Genomes in Melanthiaceae.

PubMed

Do, Hoang Dang Khoa; Kim, Joo-Hwan

2017-01-01

Chloroplast genomes (cpDNA) are highly valuable resources for evolutionary studies of angiosperms, since they are highly conserved, are small in size, and play critical roles in plants. Slipped-strand mispairing (SSM) was assumed to be a mechanism for generating repeat units in cpDNA. However, research on the employment of different small repeated sequences through SSM events, which may induce the accumulation of distinct types of repeats within the same region in cpDNA, has not been documented. Here, we sequenced two chloroplast genomes from the endemic species Heloniopsis tubiflora (Korea) and Xerophyllum tenax (USA) to cover the gap between molecular data and explore "hot spots" for genomic events in Melanthiaceae. Comparative analysis of 23 complete cpDNA sequences revealed that there were different stages of deletion in the rps16 region across the Melanthiaceae. Based on the partial or complete loss of the rps16 gene in cpDNA, we report for the first time potential molecular markers for recognizing two sections (Veratrum and Fuscoveratrum) of Veratrum. Melanthiaceae exhibits a significant change in the junction between large single copy and inverted repeat regions, ranging from trnH_GUG to a part of rps3. Our results show an accumulation of tandem repeats in the rpl23-ycf2 regions of cpDNAs. Small conserved sequences exist and flank tandem repeats in further observation of this region across most of the examined taxa of Liliales. Therefore, we propose three scenarios in which different small repeated sequences were used during SSM events to generate newly distinct types of repeats. Occasionally, prior to the SSM process, point mutation events and double-strand break repair occurred and induced the formation of initial repeat units, which are indispensable in the SSM process. SSM likely occurred more frequently for short repeats than for long repeat sequences in tribe Parideae (Melanthiaceae, Liliales). Collectively, these findings add new evidence of dynamic results from SSM in chloroplast genomes which can be useful for further evolutionary studies in angiosperms. Additionally, genomic events in cpDNA are potential resources for mining molecular markers in Liliales.
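To make the notion of a tandem repeat concrete, here is a minimal Python sketch of a naive exact-match scan for back-to-back copies of a short unit. It is not the method used in the study; the function name, the unit-length range, and the minimum copy number are all assumptions chosen for illustration.

```python
# Illustrative sketch only; name and default thresholds are assumptions.
from typing import List, Tuple


def find_tandem_repeats(
    seq: str, min_unit: int = 2, max_unit: int = 6, min_copies: int = 3
) -> List[Tuple[int, str, int]]:
    """Return (start, unit, copies) for exact tandem repeats in seq.

    At each position the longest qualifying repeat array is kept
    (the shorter unit wins ties), and the scan resumes after that array.
    """
    s = seq.upper()
    n = len(s)
    hits: List[Tuple[int, str, int]] = []
    i = 0
    while i < n:
        best = None
        for unit in range(min_unit, max_unit + 1):
            motif = s[i:i + unit]
            if len(motif) < unit:
                break  # not enough sequence left for this unit length
            copies = 1
            while s[i + copies * unit:i + (copies + 1) * unit] == motif:
                copies += 1
            span = copies * unit
            if copies >= min_copies and (
                best is None or span > best[2] * len(best[1])
            ):
                best = (i, motif, copies)
        if best is not None:
            hits.append(best)
            i += len(best[1]) * best[2]  # skip past the reported array
        else:
            i += 1
    return hits


if __name__ == "__main__":
    print(find_tandem_repeats("GGATATATCC"))
```

Real repeat finders also tolerate mismatches and indels between copies; this exact-match version only shows the basic idea of counting consecutive copies of a unit.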
Molecular and bioinformatic analysis of the FB-NOF transposable element.

PubMed

Badal, Martí; Portela, Anna; Xamena, Noel; Cabré, Oriol

2006-04-12

The Drosophila melanogaster transposable element FB-NOF is known to play a role in genome plasticity through the generation of all sorts of genomic rearrangements. Moreover, several insertional mutants due to FB mobilizations have been reported. Its structure and sequence, however, have been poorly studied, mainly as a consequence of the long, complex and repetitive sequence of FB inverted repeats. This repetitive region is composed of several 154 bp blocks, each with five almost identical repeats. In this paper, we report the sequencing, with high precision and reliability, of the 2 kb long FB inverted repeats of a complete FB-NOF element. This achievement was made possible by a new map of the FB repetitive region, which unambiguously identifies each repeat with new features that can be used as landmarks. With this new view of the element, a list of FB-NOF elements in D. melanogaster genomic clones has been compiled, improving on previous work that used only bioinformatic algorithms. The availability of many FB and FB-NOF sequences allowed an analysis of the FB insertion sequences, which showed no sequence specificity but a preference for A/T-rich sequences. The position of NOF within FB is also studied, revealing that it is always located after a second repeat in a random block. With the results of this analysis, we propose a model of transposition in which NOF jumps from FB to FB, using an unidentified transposase enzyme that should specifically recognize the second repeat end of the FB blocks.
The repetitive landscape of the chicken genome

PubMed Central

Wicker, Thomas; Robertson, Jon S.; Schulze, Stefan R.; Feltus, F. Alex; Magrini, Vincent; Morrison, Jason A.; Mardis, Elaine R.; Wilson, Richard K.; Peterson, Daniel G.; Paterson, Andrew H.; Ivarie, Robert

2005-01-01

Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high, middle, and low-copy fractions of the chicken genome. Sequencing high-copy DNA of chicken to about 2.7× coverage of its estimated sequence complexity led to the initial identification of several new repeat families, which were then used for a survey of the newly released first draft of the complete chicken genome. The analysis provided insight into the diversity and biology of known repeat structures such as CR1 and CNM, for which only limited sequence data had previously been available. Cot sequence data also resulted in the identification of four novel repeats (Birddawg, Hitchcock, Kronos, and Soprano), two new subfamilies of CR1 repeats, and many elements absent from the chicken genome assembly. Multiple autonomous elements were found for a novel Mariner-like transposon, Galluhop, in addition to nonautonomous deletion derivatives. Phylogenetic analysis of the high-copy repeats CR1, Galluhop, and Birddawg provided insight into two distinct genome dispersion strategies. This study also exemplifies the power of the CBCS method to create representative databases for the repetitive fractions of genomes for which only limited sequence data is available. PMID:15256510
Mechanism of transcription termination by RNA polymerase III utilizes a nontemplate-strand sequence-specific signal element

PubMed Central

Arimbasseri, Aneeshkumar G.; Maraia, Richard J.

2015-01-01

Understanding the mechanism of transcription termination by a eukaryotic RNA polymerase (RNAP) has been limited by lack of a characterizable intermediate that reflects transition from an elongation complex to a true termination event. While other multisubunit RNAPs require multipartite cis-signals and/or ancillary factors to mediate pausing and release of the nascent transcript from the clutches of these enzymes, RNAP III does so with precision and efficiency on a simple oligo(dT) tract, independent of other cis-elements or trans-factors. We report an RNAP III pre-termination complex that reveals termination mechanisms controlled by sequence-specific elements in the non-template strand. Furthermore, the TFIIF-like RNAP III subunit C37 is required for this function of the non-template strand signal. The results reveal the RNAP III terminator as an information-rich control element. While the template strand promotes destabilization via a weak oligo(rU:dA) hybrid, the non-template strand provides distinct sequence-specific destabilizing information through interactions with the C37 subunit. PMID:25959395
The N-terminal sequence of albumin Redhill, a variant of human serum albumin.

PubMed

Hutchinson, D W; Matejtschuk, P

1985-12-02

Albumin Redhill, a variant human albumin, has been isolated by fast protein liquid chromatofocusing. The N-terminal sequence of this protein corresponded to that of albumin A except that one additional arginine residue was attached to the N-terminus.
