repeat motif present: Topics by Science.gov

Sample records for repeat motif present

Modeling protein homopolymeric repeats: possible polyglutamine structural motifs for Huntington's disease.

PubMed

Lathrop, R H; Casale, M; Tobias, D J; Marsh, J L; Thompson, L M

1998-01-01

We describe a prototype system (Poly-X) for assisting an expert user in modeling protein repeats. Poly-X reduces the large number of degrees of freedom required to specify a protein motif in complete atomic detail. The result is a small number of parameters that are easily understood by, and under the direct control of, a domain expert. The system was applied to the polyglutamine (poly-Q) repeat in the first exon of huntingtin, the gene implicated in Huntington's disease. We present four poly-Q structural motifs: two poly-Q beta-sheet motifs (parallel and antiparallel) that constitute plausible alternatives to a similar previously published poly-Q beta-sheet motif, and two novel poly-Q helix motifs (alpha-helix and pi-helix). To our knowledge, helical forms of polyglutamine have not been proposed before. The motifs suggest that there may be several plausible aggregation structures for the intranuclear inclusion bodies which have been found in diseased neurons, and may help in the effort to understand the structural basis for Huntington's disease.
Conservation of the Human Integrin-Type Beta-Propeller Domain in Bacteria

PubMed Central

Chouhan, Bhanupratap; Denesyuk, Alexander; Heino, Jyrki; Johnson, Mark S.; Denessiouk, Konstantin

2011-01-01

Integrins are heterodimeric cell-surface receptors with key functions in cell-cell and cell-matrix adhesion. Integrin α and β subunits are present throughout the metazoans, but it is unclear whether the subunits predate the origin of multicellular organisms. Several component domains have been detected in bacteria, one of which, a specific 7-bladed β-propeller domain, is a unique feature of the integrin α subunits. Here, we describe a structure-derived motif, which incorporates key features of each blade from the X-ray structures of human αIIbβ3 and αVβ3, includes elements of the FG-GAP/Cage and Ca2+-binding motifs, and is specific only for the metazoan integrin domains. Separately, we searched for the metazoan integrin type β-propeller domains among all available sequences from bacteria and unicellular eukaryotic organisms, requiring that each candidate incorporate seven repeats, corresponding to the seven blades of the β-propeller domain, and that the newly found structure-derived motif exist in every repeat. As a result, among 47 available genomes of unicellular eukaryotes we could not find a single instance of seven repeats with the motif. Several sequences contained three repeats, a predicted transmembrane segment, and a short cytoplasmic motif associated with some integrins, but otherwise differ from the metazoan integrin α subunits. Among the available bacterial sequences, we found five examples containing seven sequential metazoan integrin-specific motifs within the seven repeats. The motifs differ in having one Ca2+-binding site per repeat, whereas metazoan integrins have three or four sites. The bacterial sequences are more conserved in terms of motif conservation and loop length, suggesting that the structure is more regular and compact than those example structures from human integrins. Although the bacterial examples are not full-length integrins, the full-length metazoan-type 7-bladed β-propeller domains are present, and sometimes two tandem copies are found.
PMID:22022374
Sequences characterization of microsatellite DNA sequences in Pacific abalone ( Haliotis discus hannai)

NASA Astrophysics Data System (ADS)

Li, Qi; Akihiro, Kijima

2007-01-01

A microsatellite-enriched library was constructed using the magnetic bead hybridization selection method, and the microsatellite DNA sequences were analyzed in Pacific abalone Haliotis discus hannai. Three hundred and fifty white colonies were screened using a PCR-based technique, and 84 clones were identified as potentially containing a microsatellite repeat motif. The 84 clones were sequenced, and 42 microsatellites and 4 minisatellites with a minimum of five repeats were found (13.1% of white colonies screened). Besides the CA motif contained in the oligoprobe, we also found 16 other types of microsatellite repeats, including a dinucleotide repeat, two tetranucleotide repeats, twelve pentanucleotide repeats and a hexanucleotide repeat. According to Weber (1990), the microsatellite sequences obtained could be categorized structurally into perfect repeats (73.3%), imperfect repeats (13.3%), and compound repeats (13.4%). Among the microsatellite repeats, relatively short arrays (<20 repeats) were most abundant, accounting for 75.0%. The longest microsatellite had 48 repeats, and the average number of repeats was 13.4. The data on the composition and length distribution of microsatellites obtained in the present study can be useful for choosing the repeat motifs for microsatellite isolation in other abalone species.
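As an illustration of the "minimum of five repeats" criterion in this abstract, the following is a minimal sketch of how perfect tandem repeats could be located in a clone sequence with a regular expression. The function name, the unit-length bounds and the example sequence are assumptions made for illustration; the authors' actual screening procedure is not described here, and imperfect or compound repeats are not handled.

```python
import re

def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    Hypothetical helper: unit lengths and thresholds are illustrative
    defaults, not values taken from the study.
    """
    seq = seq.upper()
    # Lazy unit match plus a backreference: the shortest unit that is
    # repeated at least `min_repeats` times in a row is reported.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

# Toy sequence: eight CA repeats flanked by unrelated bases.
print(find_microsatellites("TTG" + "CA" * 8 + "GGT"))
```

Because the unit is matched lazily, a run such as CACACACA... is reported with the unit CA rather than CACA.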
Gibbs motif sampling: detection of bacterial outer membrane protein repeats.

PubMed Central

Neuwald, A. F.; Liu, J. S.; Lawrence, C. E.

1995-01-01

The detection and alignment of locally conserved regions (motifs) in multiple sequences can provide insight into protein structure, function, and evolution. A new Gibbs sampling algorithm is described that detects motif-encoding regions in sequences and optimally partitions them into distinct motif models; this is illustrated using a set of immunoglobulin fold proteins. When applied to sequences sharing a single motif, the sampler can be used to classify motif regions into related submodels, as is illustrated using helix-turn-helix DNA-binding proteins. Other statistically based procedures are described for searching a database for sequences matching motifs found by the sampler. When applied to a set of 32 very distantly related bacterial integral outer membrane proteins, the sampler revealed that they share a subtle, repetitive motif. Although BLAST (Altschul SF et al., 1990, J Mol Biol 215:403-410) fails to detect significant pairwise similarity between any of the sequences, the repeats present in these outer membrane proteins, taken as a whole, are highly significant (based on a generally applicable statistical test for motifs described here). Analysis of bacterial porins with known trimeric beta-barrel structure and related proteins reveals a similar repetitive motif corresponding to alternating membrane-spanning beta-strands. These beta-strands occur on the membrane interface (as opposed to the trimeric interface) of the beta-barrel. The broad conservation and structural location of these repeats suggests that they play important functional roles. PMID:8520488
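For readers unfamiliar with Gibbs motif sampling, here is a minimal sketch of the basic single-motif site sampler (one occurrence per sequence). It is not the algorithm of this paper, which additionally partitions regions into several motif models and handles repeats within a sequence; the function name, pseudocount and iteration count are assumptions chosen for illustration.

```python
import random
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def gibbs_site_sampler(seqs, width, iters=200, seed=0, pseudo=1.0):
    """Sample one motif start per sequence; returns a list of start positions.

    Assumes every sequence is at least `width` residues long.
    """
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    background = Counter("".join(seqs))
    total = sum(background.values())
    n_sym = len(ALPHABET)
    for _ in range(iters):
        for i, s in enumerate(seqs):
            # Column counts from the current sites of all other sequences.
            cols = [Counter() for _ in range(width)]
            for j, (t, st) in enumerate(zip(seqs, starts)):
                if j != i:
                    for k in range(width):
                        cols[k][t[st + k]] += 1
            n = len(seqs) - 1
            # Weight each candidate start by its motif/background odds.
            weights = []
            for st in range(len(s) - width + 1):
                w = 1.0
                for k in range(width):
                    a = s[st + k]
                    p_motif = (cols[k][a] + pseudo) / (n + pseudo * n_sym)
                    p_bg = (background[a] + pseudo) / (total + pseudo * n_sym)
                    w *= p_motif / p_bg
                weights.append(w)
            # Draw a new start for the held-out sequence.
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts
```

The sampler is stochastic, so different seeds can settle on different alignments; in practice several runs are compared.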
DNA motifs determining the accuracy of repeat duplication during CRISPR adaptation in Haloarcula hispanica

PubMed Central

Wang, Rui; Li, Ming; Gong, Luyao; Hu, Songnian; Xiang, Hua

2016-01-01

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) acquire new spacers to generate adaptive immunity in prokaryotes. During spacer integration, the leader-preceded repeat is always accurately duplicated, leading to speculations of a repeat-length ruler. Here in Haloarcula hispanica, we demonstrate that the accurate duplication of its 30-bp repeat requires two conserved mid-repeat motifs, AACCC and GTGGG. The AACCC motif was essential and needed to be ∼10 bp downstream from the leader-repeat junction site, where duplication consistently started. Interestingly, repeat duplication terminated sequence-independently and usually with a specific distance from the GTGGG motif, which seemingly served as an anchor site for a molecular ruler. Accordingly, altering the spacing between the two motifs led to an aberrant duplication size (29, 31, 32 or 33 bp). We propose the adaptation complex may recognize these mid-repeat elements to enable measuring the repeat DNA for spacer integration. PMID:27085805
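The spacing argument in this abstract can be made concrete with a small sketch that locates the two motifs in a repeat and reports the gap between them. The 30-bp sequence below is made up for illustration and is NOT the real H. hispanica repeat; the function name is also hypothetical.

```python
def motif_offsets(repeat, first="AACCC", second="GTGGG"):
    """Return (offset of first motif, offset of second motif, gap in bp).

    The gap is the number of bases between the end of `first` and the
    start of `second`. Returns None if either motif is missing.
    """
    i = repeat.find(first)
    if i < 0:
        return None
    j = repeat.find(second, i + len(first))
    if j < 0:
        return None
    return i, j, j - (i + len(first))

# Made-up 30-bp repeat (illustrative only, not the H. hispanica sequence).
toy = "GTTTCAGACGAACCCTTGTGGGATTGAAGC"
print(len(toy), motif_offsets(toy))

# Inserting one base between the motifs widens the gap by one.
mutated = toy[:15] + "A" + toy[15:]
print(len(mutated), motif_offsets(mutated))
```

In the toy sequence the first motif starts 10 bp into the repeat, echoing the roughly 10 bp distance from the leader-repeat junction mentioned in the abstract.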
The MiiA motif is a common marker present in polytopic surface proteins of oral and urinary tract invasive bacteria.

PubMed

Martín-Galiano, Antonio J

2017-04-01

Many surface virulence factors of bacterial pathogens show mosaicism and confounding phylogenetic origin. The Streptococcus gordonii platelet-binding GspB protein, the Streptococcus sanguinis SrpA adhesin and the Streptococcus pneumoniae DiiA protein, share an imperfect 27-residue motif. Given the disparate domain architectures of these proteins and its association to invasive disease, this motif was named MiiA from Multiarchitecture invasion-involved motif A. MiiA is predicted to adopt a beta-sheet folding, probably related to the Ig-like fold, with a symmetrical positioning of two conserved aspartic residues. A specific hidden Markov model profiling MiiA was built, which specifically detected the motif in proteins from 58 species, mainly in cell-wall proteins from Gram-positive bacteria. These proteins contained one to ten MiiA motifs, which were embedded within larger repeat units of 70-82 residues. MiiA motifs combined to other domains and elements such as coiled-coils and low-complexity regions. The species carrying MiiA-proteins included commensals from the urogenital tract and the oral cavity, which can cause opportunistic endocarditis and sepsis. Intra-protein MiiA repeats showed a complex mixture of orthologal, paralogal and inter-species relationships, suggestive of a multistep origin. Presence of these repeats in proteins involved in oligosaccharide recognition and lifestyle of species suggest a putative function for MiiA repeats in sugars binding, probably those present in receptors of epithelial and blood cells. MiiA modules appear to have been transferred horizontally between species co-habiting in the same niche to create their own MiiA-containing determinants. The present work provides a global study and a catalog of potential MiiA virulence factors that should be analyzed experimentally. Copyright © 2017 Elsevier B.V. All rights reserved.
Unitary circular code motifs in genomes of eukaryotes.

PubMed

El Soufi, Karim; Michel, Christian J

A set X of 20 trinucleotides was identified in genes of bacteria, eukaryotes, plasmids and viruses, which has on average the highest occurrence in reading frame compared to its two shifted frames (Michel, 2015; Arquès and Michel, 1996). This set X has an interesting mathematical property as X is a circular code (Arquès and Michel, 1996). Thus, the motifs from this circular code X, called X motifs, have the property to always retrieve, synchronize and maintain the reading frame in genes. The origin of this circular code X in genes has been an open problem since its discovery in 1996. Here, we first show that the unitary circular codes (UCC), i.e. sets of one word, generate unitary circular code motifs (UCC motifs), i.e. a concatenation of the same motif (simple repeats) leading to low complexity DNA. Three classes of UCC motifs are studied here: repeated dinucleotides (D+ motifs), repeated trinucleotides (T+ motifs) and repeated tetranucleotides (T+ motifs). Thus, the D+, T+ and T+ motifs make it possible to retrieve, synchronize and maintain a frame modulo 2, modulo 3 and modulo 4, respectively, and their shifted frames (1 modulo 2; 1 and 2 modulo 3; 1, 2 and 3 modulo 4 according to the C2, C3 and C4 properties, respectively) in the DNA sequences. The statistical distribution of the D+, T+ and T+ motifs is analyzed in the genomes of eukaryotes. A UCC motif and its complementary UCC motif have the same distribution in the eukaryotic genomes. Furthermore, a UCC motif and its complementary UCC motif have increasing occurrences contrary to their number of hydrogen bonds, very significant with the T+ motifs. The longest D+, T+ and T+ motifs in the studied eukaryotic genomes are also given. Surprisingly, a scarcity of repeated trinucleotides (T+ motifs) in the large eukaryotic genomes is observed compared to the D+ and T+ motifs. This result has been investigated and may be explained by two outcomes. Repeated trinucleotides (T+ motifs) are identified in the X motifs of low composition (cardinality less than 10) in the genomes of eukaryotes. Furthermore, identical trinucleotide pairs of the circular code X are preferentially used in the gene sequences of eukaryotes. These two results suggest that the unitary circular codes of trinucleotides may have been involved in the formation of the trinucleotide circular code X. Indeed, repeated trinucleotides in the X motifs in the genomes of eukaryotes may represent an intermediary evolution from repeated trinucleotides of cardinality 1 (T+ motifs) in the genomes of eukaryotes up to the X motifs of cardinality 20 in the gene sequences of eukaryotes. Copyright © 2017 Elsevier B.V. All rights reserved.
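A unitary circular code motif as described in this abstract is simply one word concatenated with itself, so the "longest motif" statistic can be sketched as a search for the longest tandem run of a given unit, together with the frame (start position modulo the unit length) in which it lies. This is a minimal sketch under that reading; the function name is hypothetical and it does not reproduce the authors' genome-scale analysis.

```python
import re

def longest_unit_run(seq, unit):
    """Return (repeat_count, start, frame) of the longest tandem run of `unit`.

    frame is start modulo len(unit); (0, -1, -1) is returned if the unit
    never occurs. A lookahead is used so that runs starting inside an
    earlier, shorter match are still examined.
    """
    best = (0, -1, -1)
    pattern = re.compile("(?=((?:%s)+))" % re.escape(unit))
    for m in pattern.finditer(seq):
        count = len(m.group(1)) // len(unit)
        if count > best[0]:
            best = (count, m.start(), m.start() % len(unit))
    return best

# Repeated dinucleotide: three AC units starting at position 2 (frame 0).
print(longest_unit_run("GGACACACGTACAC", "AC"))
# Repeated trinucleotide whose longest run starts in frame 2.
print(longest_unit_run("ACACAACAACA", "ACA"))
```

The lookahead matters for units that overlap themselves, such as ACA: a plain non-overlapping scan would consume the first ACA and miss the longer run beginning at its last base.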
Slipped-strand mispairing at noncontiguous repeats in Poecilia reticulata: a model for minisatellite birth.

PubMed Central

Taylor, J S; Breden, F

2000-01-01

The standard slipped-strand mispairing (SSM) model for the formation of variable number tandem repeats (VNTRs) proposes that a few tandem repeats, produced by chance mutations, provide the "raw material" for VNTR expansion. However, this model is unlikely to explain the formation of VNTRs with long motifs (e.g., minisatellites), because the likelihood of a tandem repeat forming by chance decreases rapidly as the length of the repeat motif increases. Phylogenetic reconstruction of the birth of a mitochondrial (mt) DNA minisatellite in guppies suggests that VNTRs with long motifs can form as a consequence of SSM at noncontiguous repeats. VNTRs formed in this manner have motifs longer than the noncontiguous repeat originally formed by chance and are flanked by one unit of the original, noncontiguous repeat. SSM at noncontiguous repeats can therefore explain the birth of VNTRs with long motifs and the "imperfect" or "short direct" repeats frequently observed adjacent to both mtDNA and nuclear VNTRs. PMID:10880490
Topological characteristics of helical repeat proteins.

PubMed

Groves, M R; Barford, D

1999-06-01

The recent elucidation of protein structures based upon repeating amino acid motifs, including the armadillo motif, the HEAT motif and tetratricopeptide repeats, reveals that they belong to the class of helical repeat proteins. These proteins share the common property of being assembled from tandem repeats of an alpha-helical structural unit, creating extended superhelical structures that are ideally suited to create a protein recognition interface.
A Genome-Wide Survey of the Microsatellite Content of the Globe Artichoke Genome and the Development of a Web-Based Database

PubMed Central

Portis, Ezio; Portis, Flavio; Valente, Luisa; Moglia, Andrea; Barchi, Lorenzo; Lanteri, Sergio; Acquadro, Alberto

2016-01-01

The recently acquired genome sequence of globe artichoke (Cynara cardunculus var. scolymus) has been used to catalog the genome’s content of simple sequence repeat (SSR) markers. More than 177,000 perfect SSRs were revealed, equivalent to an overall density across the genome of 244.5 SSRs/Mbp, but some 224,000 imperfect SSRs were also identified. About 21% of these SSRs were complex (two stretches of repeats separated by <100 nt). Some 73% of the SSRs were composed of dinucleotide motifs. The SSRs were categorized for the numbers of repeats present, their overall length and were allocated to their linkage group. A total of 4,761 perfect and 6,583 imperfect SSRs were present in 3,781 genes (14.11% of the total), corresponding to an overall density across the gene space of 32.5 and 44.9 SSRs/Mbp for perfect and imperfect motifs, respectively. A putative function has been assigned, using the gene ontology approach, to the set of genes harboring at least one SSR. The same search parameters were applied to reveal the SSR content of 14 other plant species for which genome sequence is available. Certain species-specific SSR motifs were identified, along with a hexa-nucleotide motif shared only with the other two Compositae species (sunflower (Helianthus annuus) and horseweed (Conyza canadensis)) included in the study. Finally, a database, called “Cynara cardunculus MicroSatellite DataBase” (CyMSatDB) was developed to provide a searchable interface to the SSR data. CyMSatDB facilitates the retrieval of SSR markers, as well as suggested forward and reverse primers, on the basis of genomic location, genomic vs genic context, perfect vs imperfect repeat, motif type, motif sequence and repeat number. The SSR markers were validated via an in silico based PCR analysis adopting two available assembled transcriptomes, derived from contrasting globe artichoke accessions, as templates. PMID:27648830
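The density figures in this abstract follow from dividing an SSR count by the length of sequence surveyed, in Mbp. The sketch below shows that arithmetic and uses it in reverse as a consistency check, reading the gene-space densities as 32.5 and 44.9 SSRs/Mbp. The function names are hypothetical, and the implied spans are back-calculated here, not reported by the authors.

```python
def ssr_density(n_ssrs, span_bp):
    """SSRs per Mbp for `n_ssrs` found in `span_bp` base pairs."""
    return n_ssrs / (span_bp / 1e6)

def implied_span_mbp(n_ssrs, density_per_mbp):
    """Sequence length in Mbp implied by a count and a reported density."""
    return n_ssrs / density_per_mbp

# Gene space: perfect and imperfect SSR counts with their densities.
perfect = implied_span_mbp(4761, 32.5)
imperfect = implied_span_mbp(6583, 44.9)
print(round(perfect, 1), round(imperfect, 1))

# Whole genome: "more than 177,000" perfect SSRs at 244.5 SSRs/Mbp.
print(round(implied_span_mbp(177_000, 244.5), 1))
```

Both gene-space figures imply nearly the same span (about 146.5 Mbp), which is what one would expect if the two densities were computed over the same gene space.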
Non-B DB v2.0: a database of predicted non-B DNA-forming motifs and its associated tools.

PubMed

Cer, Regina Z; Donohue, Duncan E; Mudunuri, Uma S; Temiz, Nuri A; Loss, Michael A; Starner, Nathan J; Halusa, Goran N; Volfovsky, Natalia; Yi, Ming; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M

2013-01-01

The non-B DB, available at http://nonb.abcc.ncifcrf.gov, catalogs predicted non-B DNA-forming sequence motifs, including Z-DNA, G-quadruplex, A-phased repeats, inverted repeats, mirror repeats, direct repeats and their corresponding subsets: cruciforms, triplexes and slipped structures, in several genomes. Version 2.0 of the database revises and re-implements the motif discovery algorithms to better align with accepted definitions and thresholds for motifs, expands the non-B DNA-forming motifs coverage by including short tandem repeats and adds key visualization tools to compare motif locations relative to other genomic annotations. Non-B DB v2.0 extends the ability for comparative genomics by including re-annotation of the five organisms reported in non-B DB v1.0, human, chimpanzee, dog, macaque and mouse, and adds seven additional organisms: orangutan, rat, cow, pig, horse, platypus and Arabidopsis thaliana. Additionally, the non-B DB v2.0 provides an overall improved graphical user interface and faster query performance.
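The repeat classes named in this abstract differ only in how the second arm relates to the first: identical (direct), reversed (mirror), or reverse-complemented (inverted). The following minimal sketch classifies a pair of equal-length arms accordingly. It is an illustration of the definitions only; the function names are hypothetical and the database's own spacer and length thresholds are not reproduced.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(s):
    """Reverse complement of an uppercase DNA string."""
    return s.translate(_COMPLEMENT)[::-1]

def classify_arms(left, right):
    """Return the list of repeat classes linking two arms on one strand."""
    labels = []
    if right == left:
        labels.append("direct")      # same sequence again
    if right == left[::-1]:
        labels.append("mirror")      # same bases in reverse order
    if right == revcomp(left):
        labels.append("inverted")    # reverse complement (hairpin-capable)
    return labels

print(classify_arms("GATTC", "GATTC"))
print(classify_arms("GATTC", "CTTAG"))
print(classify_arms("GATTC", "GAATC"))
```

A list is returned because some arms satisfy more than one definition, for example when an arm is its own reverse.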
Structural and biophysical properties of h-FANCI ARM repeat protein.

PubMed

Siddiqui, Mohd Quadir; Choudhary, Rajan Kumar; Thapa, Pankaj; Kulkarni, Neha; Rajpurohit, Yogendra S; Misra, Hari S; Gadewal, Nikhil; Kumar, Satish; Hasan, Syed K; Varma, Ashok K

2017-11-01

Fanconi anemia complementation group I (FANCI) protein facilitates DNA interstrand cross-link (ICL) repair and plays a crucial role in genomic integrity. FANCI is a 1328-amino-acid protein that contains armadillo (ARM) repeats and an EDGE motif at the C-terminus. ARM repeats are functionally diverse, evolutionarily conserved domains that play a pivotal role in protein-protein and protein-DNA interactions. Considering the importance of ARM repeats, we used a comprehensive in silico and in vitro approach to examine their folding pattern. Size exclusion chromatography, dynamic light scattering (DLS) and glutaraldehyde crosslinking studies suggest that the FANCI ARM repeat exists as a monomer as well as in oligomeric forms. Circular dichroism (CD) and fluorescence spectroscopy results demonstrate that the protein has predominantly α-helices and a well-folded tertiary structure. DNA binding was analysed using an electrophoretic mobility shift assay with autoradiography. Temperature-dependent CD, fluorescence spectroscopy and DLS studies indicated that the protein unfolds and starts forming oligomers from 30°C. The existence of a stable portion within the FANCI ARM repeat was examined using limited proteolysis and mass spectrometry. Normal mode analysis, molecular dynamics and principal component analysis demonstrated that the helix-turn-helix (HTH) motif present in the ARM repeat is highly dynamic and has anti-correlated motion. Furthermore, the FANCI ARM repeat has an HTH structural motif that binds to double-stranded DNA.
Ligand binding by repeat proteins: natural and designed

PubMed Central

Grove, Tijana Z; Cortajarena, Aitziber L; Regan, Lynne

2012-01-01

Repeat proteins contain tandem arrays of small structural motifs. As a consequence of this architecture, they adopt non-globular, extended structures that present large, highly specific surfaces for ligand binding. Here we discuss recent advances toward understanding the functional role of this unique modular architecture. We showcase specific examples of natural repeat proteins interacting with diverse ligands and also present examples of designed repeat protein–ligand interactions. PMID:18602006
Methods for sequencing GC-rich and CCT repeat DNA templates

DOEpatents

Robinson, Donna L.

2007-02-20

The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.
Motif mismatches in microsatellites: insights from genome-wide investigation among 20 insect species.

PubMed

Behura, Susanta K; Severson, David W

2015-02-01

We present a detailed genome-wide comparative study of motif mismatches of microsatellites among 20 insect species representing five taxonomic orders. The results show that varying proportions (∼15-46%) of microsatellites identified in these species are imperfect in motif structure, and that they also vary in chromosomal distribution within genomes. It was observed that the genomic abundance of imperfect repeats is significantly associated with the length and number of motif mismatches of microsatellites. Furthermore, microsatellites with a higher number of mismatches tend to have lower abundance in the genome, suggesting that sequence heterogeneity of repeat motifs is a key determinant of genomic abundance of microsatellites. This relationship seems to be a general feature of microsatellites even in unrelated species such as yeast, roundworm, mouse and human. We provide a mechanistic explanation of the evolutionary link between motif heterogeneity and genomic abundance of microsatellites by examining the patterns of motif mismatches and allele sequences of single-nucleotide polymorphisms identified within microsatellite loci. Using Drosophila Reference Genetic Panel data, we further show that pattern of allelic variation modulates motif heterogeneity of microsatellites, and provide estimates of allele age of specific imperfect microsatellites found within protein-coding genes. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
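To make "motif mismatches" concrete, the sketch below counts the positions at which a microsatellite tract departs from a perfect tiling of its motif. It is a minimal sketch assuming substitutions only: an insertion or deletion would shift the frame and inflate the count. The function name and example tract are illustrative, and this is not the authors' detection pipeline.

```python
def motif_mismatches(tract, motif):
    """Count positions where `tract` differs from a perfect repeat of `motif`.

    Substitution-only comparison; indels are not modelled.
    """
    # Tile the motif so it is at least as long as the tract, then trim.
    perfect = (motif * (len(tract) // len(motif) + 1))[:len(tract)]
    return sum(a != b for a, b in zip(tract, perfect))

# A (CA)5 tract with one A->G substitution in the third unit.
print(motif_mismatches("CACACGCACA", "CA"))
```

A tract with zero mismatches is a perfect repeat in the sense used in the abstract; any positive count marks it as imperfect.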
Comparison of simple sequence repeats in 19 Archaea.

PubMed

Trivedi, S

2006-12-05

All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.
SSR allelic variation in almond (Prunus dulcis Mill.).

PubMed

Xie, Hua; Sui, Yi; Chang, Feng-Qi; Xu, Yong; Ma, Rong-Cai

2006-01-01

Sixteen SSR markers including eight EST-SSR and eight genomic SSRs were used for genetic diversity analysis of 23 Chinese and 15 international almond cultivars. EST- and genomic SSR markers previously reported in species of Prunus, mainly peach, proved to be useful for almond genetic analysis. DNA sequences of 117 alleles of six of the 16 SSR loci were analysed to reveal sequence variation among the 38 almond accessions. For the four SSR loci with AG/CT repeats, no insertions or deletions were observed in the flanking regions of the 98 alleles sequenced. Allelic size variation of these loci resulted exclusively from differences in the structures of repeat motifs, which involved interruptions or occurrences of new motif repeats in addition to varying number of AG/CT repeats. Some alleles had a high number of uninterrupted repeat motifs, indicating that SSR mutational patterns differ among alleles at a given SSR locus within the almond species. Allelic homoplasy was observed in the SSR loci because of base substitutions, interruptions or compound repeat motifs. Substitutions in the repeat regions were found at two SSR loci, suggesting that point mutations operate on SSRs and hinder the further SSR expansion by introducing repeat interruptions to stabilize SSR loci. Furthermore, it was shown that some potential point mutations in the flanking regions are linked with new SSR repeat motif variation in almond and peach.
More Lessons from a Master Teacher! Frank Wachowiak.

ERIC Educational Resources Information Center

Morris, Jimmy

1989-01-01

Presents two printmaking lessons for children inspired by master art teacher, Frank Wachowiak. "Repeated Motifs and Designs" uses vegetables and found objects to make prints emphasizing repeat patterns. "Fish Under the Sea" uses white liquid glue to make line prints with strong linear compositions. (LS)
A naturally occurring, noncanonical GTP aptamer made of simple tandem repeats

PubMed Central

Curtis, Edward A; Liu, David R

2014-01-01

Recently, we used in vitro selection to identify a new class of naturally occurring GTP aptamer called the G motif. Here we report the discovery and characterization of a second class of naturally occurring GTP aptamer, the “CA motif.” The primary sequence of this aptamer is unusual in that it consists entirely of tandem repeats of CA-rich motifs as short as three nucleotides. Several active variants of the CA motif aptamer lack the ability to form consecutive Watson-Crick base pairs in any register, while others consist of repeats containing only cytidine and adenosine residues, indicating that noncanonical interactions play important roles in its structure. The circular dichroism spectrum of the CA motif aptamer is distinct from that of A-form RNA and other major classes of nucleic acid structures. Bioinformatic searches indicate that the CA motif is absent from most archaeal and bacterial genomes, but occurs in at least 70 percent of approximately 400 eukaryotic genomes examined. These searches also uncovered several phylogenetically conserved examples of the CA motif in rodent (mouse and rat) genomes. Together, these results reveal the existence of a second class of naturally occurring GTP aptamer whose sequence requirements, like that of the G motif, are not consistent with those of a canonical secondary structure. They also indicate a new and unexpected potential biochemical activity of certain naturally occurring tandem repeats. PMID:24824832
In-silico mining, type and frequency analysis of genic microsatellites of finger millet (Eleusine coracana (L.) Gaertn.): a comparative genomic analysis of NBS-LRR regions of finger millet with rice.

PubMed

Kalyana Babu, B; Pandey, Dinesh; Agrawal, P K; Sood, Salej; Kumar, Anil

2014-05-01

In recent years, the increased availability of DNA sequences has made it possible to develop and explore SSR markers derived from expressed sequence tags (ESTs). In the present study, a total of 1956 ESTs of finger millet were used to determine microsatellite type, distribution and frequency, and a total of 545 primer pairs were developed from them. Thirty-two EST sequences had more than two microsatellites and 1357 sequences did not have any SSR repeats. The most frequent type of repeat was the trimeric motif, followed by the dimeric motif and then tetra-, hexa- and penta-repeat motifs. The most common dimer repeat motif was GA, and among trimeric SSRs it was CGG. The EST sequences of the NBS-LRR region of finger millet and rice showed higher synteny and were found at nearly the same positions on the rice chromosome map. A total of eight out of 15 EST-based SSR primers were polymorphic among the selected resistant and susceptible finger millet genotypes. The primer FMBLEST5 was able to differentiate them into resistant and susceptible genotypes. The alleles specific to the resistant and susceptible genotypes were sequenced using the ABI 3130XL genetic analyzer; they showed similarity to NBS-LRR regions of rice and finger millet and contained the characteristic kinase-2 and kinase-3a motifs of plant R-genes belonging to the NBS-LRR class. The in silico and comparative analysis showed that the genes responsible for blast resistance can be identified, mapped and further introgressed through molecular breeding approaches to enhance blast resistance in finger millet.

Evolution of the tRNALeu (UAA) Intron and Congruence of Genetic Markers in Lichen-Symbiotic Nostoc

PubMed Central

Kaasalainen, Ulla; Olsson, Sanna; Rikkinen, Jouko

2015-01-01

The group I intron interrupting the tRNALeu UAA gene (trnL) is present in most cyanobacterial genomes as well as in the plastids of many eukaryotic algae and all green plants. In lichen symbiotic Nostoc, the P6b stem-loop of trnL intron always involves one of two different repeat motifs, either Class I or Class II, both with unresolved evolutionary histories. Here we attempt to resolve the complex evolution of the two different trnL P6b region types. Our analysis indicates that the Class II repeat motif most likely appeared first and that independent and unidirectional shifts to the Class I motif have since taken place repeatedly. In addition, we compare our results with those obtained with other genetic markers and find strong evidence of recombination in the 16S rRNA gene, a marker widely used in phylogenetic studies on Bacteria. The congruence of the different genetic markers is successfully evaluated with the recently published software Saguaro, which has not previously been utilized in comparable studies. PMID:26098760
Insights on genome size evolution from a miniature inverted repeat transposon driving a satellite DNA.

PubMed

Scalvenzi, Thibault; Pollet, Nicolas

2014-12-01

The genome size in eukaryotes does not correlate well with the number of genes they contain. We can observe this so-called C-value paradox in amphibian species. By analyzing an amphibian genome we asked how repetitive DNA can impact genome size and architecture. We describe here our discovery of a Tc1/mariner miniature inverted-repeat transposon family present in Xenopus frogs. These transposons named miDNA4 are unique since they contain a satellite DNA motif. We found that miDNA4 measured 331 bp, contained 25 bp long inverted terminal repeat sequences and a sequence motif of 119 bp present as a unique copy or as an array of 2-47 copies. We characterized the structure, dynamics, impact and evolution of the miDNA4 family and its satellite DNA in Xenopus frog genomes. This led us to propose a model for the evolution of these two repeated sequences and how they can synergize to increase genome size. Copyright © 2014 Elsevier Inc. All rights reserved.
Massive GGAAs in genomic repetitive sequences serve as a nuclear reservoir of NF-κB.

PubMed

Wu, Jian; Wang, Qiao; Dai, Wei; Wang, Wei; Yue, Ming; Wang, Jinke

2018-04-13

Nuclear factor κB (NF-κB) is a DNA-binding transcription factor. Characterizing its genomic binding sites is crucial for understanding its gene regulatory function and mechanism in cells. This study characterized the binding sites of NF-κB RelA/p65 in the tumor necrosis factor-α (TNFα) stimulated HeLa cells by a precise chromatin immunoprecipitation-sequencing (ChIP-seq). The results revealed that NF-κB binds nontraditional motifs (nt-motifs) containing conserved GGAA quadruplet. Moreover, nt-motifs mainly distribute in the peaks nearby centromeres that contain a larger number of repetitive elements such as satellite, simple repeats and short interspersed nuclear elements (SINEs). This intracellular binding pattern was then confirmed by the in vitro detection, indicating that NF-κB dimers can bind the nontraditional κB (nt-κB) sites with low affinity. However, this binding hardly activates transcription. This study thus deduced that NF-κB binding nt-motifs may realize functions other than gene regulation as NF-κB binding traditional motifs (t-motifs). To test this deduction, many ChIP-seq data of other cell lines were then analyzed. The results indicate that NF-κB binding nt-motifs is also widely present in other cells. The ChIP-seq data analysis also revealed that nt-motifs more widely distribute in the peaks with low-fold enrichment. Importantly, it was also found that NF-κB binding nt-motifs is mainly present in the resting cells, whereas NF-κB binding t-motifs is mainly present in the stimulated cells. Astonishingly, no known function was enriched by the gene annotation of nt-motif peaks. Based on these results, this study proposed that the nt-κB sites that extensively distribute in larger numbers of repeat elements function as a nuclear reservoir of NF-κB. The nuclear NF-κB proteins stored at nt-κB sites in the resting cells may be recruited to the t-κB sites for regulating its target genes upon stimulation.
Copyright © 2018 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Ltd. All rights reserved.
Crystal structure of yeast allantoicase reveals a repeated jelly roll motif.

PubMed

Leulliot, Nicolas; Quevillon-Cheruel, Sophie; Sorel, Isabelle; Graille, Marc; Meyer, Philippe; Liger, Dominique; Blondeau, Karine; Janin, Joël; van Tilbeurgh, Herman

2004-05-28

Allantoicase (EC 3.5.3.4) catalyzes the conversion of allantoate into ureidoglycolate and urea, one of the final steps in the degradation of purines to urea. The mechanism of most enzymes involved in this pathway, which has been known for a long time, is unknown. In this paper we describe the three-dimensional crystal structure of the yeast allantoicase determined at a resolution of 2.6 Å by single anomalous diffraction. This constitutes the first structure for an enzyme of this pathway. The structure reveals a repeated jelly roll beta-sheet motif, also present in proteins of unrelated biochemical function. Allantoicase has a hexameric arrangement in the crystal (dimer of trimers). Analysis of the protein sequence against the structural data reveals the presence of two totally conserved surface patches, one on each jelly roll motif. The hexameric packing concentrates these patches into conserved pockets that probably constitute the active site.
The structure of the protein phosphatase 2A PR65/A subunit reveals the conformation of its 15 tandemly repeated HEAT motifs.

PubMed

Groves, M R; Hanlon, N; Turowski, P; Hemmings, B A; Barford, D

1999-01-08

The PR65/A subunit of protein phosphatase 2A serves as a scaffolding molecule to coordinate the assembly of the catalytic subunit and a variable regulatory B subunit, generating functionally diverse heterotrimers. Mutations of the beta isoform of PR65 are associated with lung and colon tumors. The crystal structure of the PR65/Aalpha subunit, at 2.3 Å resolution, reveals the conformation of its 15 tandemly repeated HEAT sequences, degenerate motifs of approximately 39 amino acids present in a variety of proteins, including huntingtin and importin beta. Individual motifs are composed of a pair of antiparallel alpha helices that assemble in a mainly linear, repetitive fashion to form an elongated molecule characterized by a double layer of alpha helices. Left-handed rotations at three interrepeat interfaces generate a novel left-hand superhelical conformation. The protein interaction interface is formed from the intrarepeat turns that are aligned to form a continuous ridge.
Top surface blade residues and the central channel water molecules are conserved in every repeat of the integrin-like β-propeller structures.

PubMed

Denesyuk, Alexander; Denessiouk, Konstantin; Johnson, Mark S

2018-02-01

An integrin-like β-propeller domain contains seven repeats of a four-stranded antiparallel β-sheet motif (blades). Previously we described a 3D structural motif within each blade of the integrin-type β-propeller. Here, we show unique structural links that join different blades of the β-propeller structure, which together with the structural motif for a single blade are repeated in a β-propeller to provide the functional top face of the barrel, found to be involved in protein-protein interactions and substrate recognition. We compare functional top face diagrams of the integrin-type β-propeller domain and two non-integrin type β-propeller domains of virginiamycin B lyase and WD Repeat-Containing Protein 5. Copyright © 2017 Elsevier Inc. All rights reserved.
A Large Complement of the Predicted Arabidopsis ARM Repeat Proteins Are Members of the U-Box E3 Ubiquitin Ligase Family

PubMed Central

Mudgil, Yashwanti; Shiu, Shin-Han; Stone, Sophia L.; Salt, Jennifer N.; Goring, Daphne R.

2004-01-01

The Arabidopsis genome was searched to identify predicted proteins containing armadillo (ARM) repeats, a motif known to mediate protein-protein interactions in a number of different animal proteins. Using domain database predictions and models generated in this study, 108 Arabidopsis proteins were identified that contained a minimum of two ARM repeats with the majority of proteins containing four to eight ARM repeats. Clustering analysis showed that the 108 predicted Arabidopsis ARM repeat proteins could be divided into multiple groups with wide differences in their domain compositions and organizations. Interestingly, 41 of the 108 Arabidopsis ARM repeat proteins contained a U-box, a motif present in a family of E3 ligases, and these proteins represented the largest class of Arabidopsis ARM repeat proteins. In 14 of these U-box/ARM repeat proteins, there was also a novel conserved domain identified in the N-terminal region. Based on the phylogenetic tree, representative U-box/ARM repeat proteins were selected for further study. RNA-blot analyses revealed that these U-box/ARM proteins are expressed in a variety of tissues in Arabidopsis. In addition, the selected U-box/ARM proteins were found to be functional E3 ubiquitin ligases. Thus, these U-box/ARM proteins represent a new family of E3 ligases in Arabidopsis. PMID:14657406
A large complement of the predicted Arabidopsis ARM repeat proteins are members of the U-box E3 ubiquitin ligase family.

PubMed

Mudgil, Yashwanti; Shiu, Shin-Han; Stone, Sophia L; Salt, Jennifer N; Goring, Daphne R

2004-01-01

The Arabidopsis genome was searched to identify predicted proteins containing armadillo (ARM) repeats, a motif known to mediate protein-protein interactions in a number of different animal proteins. Using domain database predictions and models generated in this study, 108 Arabidopsis proteins were identified that contained a minimum of two ARM repeats with the majority of proteins containing four to eight ARM repeats. Clustering analysis showed that the 108 predicted Arabidopsis ARM repeat proteins could be divided into multiple groups with wide differences in their domain compositions and organizations. Interestingly, 41 of the 108 Arabidopsis ARM repeat proteins contained a U-box, a motif present in a family of E3 ligases, and these proteins represented the largest class of Arabidopsis ARM repeat proteins. In 14 of these U-box/ARM repeat proteins, there was also a novel conserved domain identified in the N-terminal region. Based on the phylogenetic tree, representative U-box/ARM repeat proteins were selected for further study. RNA-blot analyses revealed that these U-box/ARM proteins are expressed in a variety of tissues in Arabidopsis. In addition, the selected U-box/ARM proteins were found to be functional E3 ubiquitin ligases. Thus, these U-box/ARM proteins represent a new family of E3 ligases in Arabidopsis.
High-Pressure NMR and SAXS Reveals How Capping Modulates Folding Cooperativity of the pp32 Leucine-rich Repeat Protein.

PubMed

Zhang, Yi; Berghaus, Melanie; Klein, Sean; Jenkins, Kelly; Zhang, Siwen; McCallum, Scott A; Morgan, Joel E; Winter, Roland; Barrick, Doug; Royer, Catherine A

2018-04-27

Many repeat proteins contain capping motifs, which serve to shield the hydrophobic core from solvent and maintain structural integrity. While the role of capping motifs in enhancing the stability and structural integrity of repeat proteins is well documented, their contribution to folding cooperativity is not. Here we examined the role of capping motifs in defining the folding cooperativity of the leucine-rich repeat protein, pp32, by monitoring the pressure- and urea-induced unfolding of an N-terminal capping motif (N-cap) deletion mutant, pp32-∆N-cap, and a C-terminal capping motif destabilization mutant pp32-Y131F/D146L, using residue-specific NMR and small-angle X-ray scattering. Destabilization of the C-terminal capping motif resulted in higher cooperativity for the unfolding transition compared to wild-type pp32, as these mutations render the stability of the C-terminus similar to that of the rest of the protein. In contrast, deletion of the N-cap led to strong deviation from two-state unfolding. In both urea- and pressure-induced unfolding, residues in repeats 1-3 of pp32-ΔN-cap lost their native structure first, while the C-terminal half was more stable. The residue-specific free energy changes in all regions of pp32-ΔN-cap were larger in urea compared to high pressure, indicating a less cooperative destabilization by pressure. Moreover, in contrast to complete structural disruption of pp32-ΔN-cap at high urea concentration, its pressure unfolded state remained compact. The contrasting effects of the capping motifs on folding cooperativity arise from the differential local stabilities of pp32, whereas the contrasting effects of pressure and urea on the pp32-ΔN-cap variant arise from their distinct mechanisms of action. Copyright © 2018 Elsevier Ltd. All rights reserved.
Design of a bioactive small molecule that targets the myotonic dystrophy type 1 RNA via an RNA motif-ligand database and chemical similarity searching.

PubMed

Parkesh, Raman; Childs-Disney, Jessica L; Nakamori, Masayuki; Kumar, Amit; Wang, Eric; Wang, Thomas; Hoskins, Jason; Tran, Tuan; Housman, David; Thornton, Charles A; Disney, Matthew D

2012-03-14

Myotonic dystrophy type 1 (DM1) is a triplet repeating disorder caused by expanded CTG repeats in the 3'-untranslated region of the dystrophia myotonica protein kinase (DMPK) gene. The transcribed repeats fold into an RNA hairpin with multiple copies of a 5'CUG/3'GUC motif that binds the RNA splicing regulator muscleblind-like 1 protein (MBNL1). Sequestration of MBNL1 by expanded r(CUG) repeats causes splicing defects in a subset of pre-mRNAs including the insulin receptor, the muscle-specific chloride ion channel, sarco(endo)plasmic reticulum Ca(2+) ATPase 1, and cardiac troponin T. Based on these observations, the development of small-molecule ligands that target specifically expanded DM1 repeats could be of use as therapeutics. In the present study, chemical similarity searching was employed to improve the efficacy of pentamidine and Hoechst 33258 ligands that have been shown previously to target the DM1 triplet repeat. A series of in vitro inhibitors of the RNA-protein complex were identified with low micromolar IC(50)'s, which are >20-fold more potent than the query compounds. Importantly, a bis-benzimidazole identified from the Hoechst query improves DM1-associated pre-mRNA splicing defects in cell and mouse models of DM1 (when dosed with 1 mM and 100 mg/kg, respectively). Since Hoechst 33258 was identified as a DM1 binder through analysis of an RNA motif-ligand database, these studies suggest that lead ligands targeting RNA with improved biological activity can be identified by using a synergistic approach that combines analysis of known RNA-ligand interactions with chemical similarity searching.
Design of a Bioactive Small Molecule that Targets the Myotonic Dystrophy Type 1 RNA Via an RNA Motif-Ligand Database & Chemical Similarity Searching

PubMed Central

Parkesh, Raman; Childs-Disney, Jessica L.; Nakamori, Masayuki; Kumar, Amit; Wang, Eric; Wang, Thomas; Hoskins, Jason; Tran, Tuan; Housman, David; Thornton, Charles A.; Disney, Matthew D.

2012-01-01

Myotonic dystrophy type 1 (DM1) is a triplet repeating disorder caused by expanded CTG repeats in the 3′ untranslated region of the dystrophia myotonica protein kinase (DMPK) gene. The transcribed repeats fold into an RNA hairpin with multiple copies of a 5′CUG/3′GUC motif that binds the RNA splicing regulator muscleblind-like 1 protein (MBNL1). Sequestration of MBNL1 by expanded r(CUG) repeats causes splicing defects in a subset of pre-mRNAs including the insulin receptor, the muscle-specific chloride ion channel, Sarco(endo)plasmic reticulum Ca2+ ATPase 1 (Serca1/Atp2a1), and cardiac troponin T (cTNT). Based on these observations, the development of small molecule ligands that target specifically expanded DM1 repeats could serve as therapeutics. In the present study, computational screening was employed to improve the efficacy of pentamidine and Hoechst 33258 ligands that have been shown previously to target the DM1 triplet repeat. A series of inhibitors of the RNA-protein complex with low micromolar IC50’s, which are >20-fold more potent than the query compounds, were identified. Importantly, a bis-benzimidazole identified from the Hoechst query improves DM1-associated pre-mRNA splicing defects in cell and mouse models of DM1 (when dosed with 1 mM and 100 mg/kg, respectively). Since Hoechst 33258 was identified as a DM1 binder through analysis of an RNA motif-ligand database, these studies suggest that lead ligands targeting RNA with improved biological activity can be identified by using a synergistic approach that combines analysis of known RNA-ligand interactions with virtual screening. PMID:22300544
Non-B DB: a database of predicted non-B DNA-forming motifs in mammalian genomes.

PubMed

Cer, Regina Z; Bruce, Kevin H; Mudunuri, Uma S; Yi, Ming; Volfovsky, Natalia; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M

2011-01-01

Although the capability of DNA to form a variety of non-canonical (non-B) structures has long been recognized, the overall significance of these alternate conformations in biology has only recently become accepted en masse. In order to provide access to genome-wide locations of these classes of predicted structures, we have developed non-B DB, a database integrating annotations and analysis of non-B DNA-forming sequence motifs. The database provides the most complete list of alternative DNA structure predictions available, including Z-DNA motifs, quadruplex-forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine•pyrimidine) tracts that have been associated with disease. The database has been built using the latest releases of the human, chimp, dog, macaque and mouse genomes, so that the results can be compared directly with other data sources. In order to make the data interpretable in a genomic context, features such as genes, single-nucleotide polymorphisms and repetitive elements (SINE, LINE, etc.) have also been incorporated. The database is accessed through query pages that produce results with links to the UCSC browser and a GBrowse-based genomic viewer. It is freely accessible at http://nonb.abcc.ncifcrf.gov.
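The inverted, mirror and direct repeat classes listed in the abstract above differ only in how the second arm of the repeat relates to the first: an identical copy (direct), the same bases read backwards (mirror), or the reverse complement (inverted). A minimal Python sketch of that distinction follows. The function name `find_repeats`, the exact-match requirement, and the `arm` and `max_spacer` parameters are assumptions made for illustration; they are not the search criteria used by non-B DB.

```python
# Illustrative only: classifies exact two-arm repeats in a DNA string.
# Arm length and spacer limit are hypothetical parameters, not non-B DB's rules.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_repeats(seq, arm=6, max_spacer=10):
    """Return (kind, first_arm_start, second_arm_start) for each exact hit."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - arm + 1):
        first = seq[i:i + arm]
        targets = {
            "direct": first,                        # same sequence again
            "mirror": first[::-1],                  # same strand, reversed
            "inverted": reverse_complement(first),  # reverse complement
        }
        # The second arm starts after the first arm plus 0..max_spacer bases.
        for spacer in range(max_spacer + 1):
            j = i + arm + spacer
            if j + arm > len(seq):
                break
            second = seq[j:j + arm]
            for kind, target in targets.items():
                if second == target:
                    hits.append((kind, i, j))
    return hits


if __name__ == "__main__":
    for kind, i, j in find_repeats("GGGAACTTTGTTCCC", arm=6, max_spacer=5):
        print(kind, i, j)
```

A palindromic arm such as AAGCTT equals its own reverse complement, so a single pair of arms can be reported under more than one class; a real annotation pipeline would need a rule for resolving such overlaps.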
Slide-and-exchange mechanism for rapid and selective transport through the nuclear pore complex.

PubMed

Raveh, Barak; Karp, Jerome M; Sparks, Samuel; Dutta, Kaushik; Rout, Michael P; Sali, Andrej; Cowburn, David

2016-05-03

Nucleocytoplasmic transport is mediated by the interaction of transport factors (TFs) with disordered phenylalanine-glycine (FG) repeats that fill the central channel of the nuclear pore complex (NPC). However, the mechanism by which TFs rapidly diffuse through multiple FG repeats without compromising NPC selectivity is not yet fully understood. In this study, we build on our recent NMR investigations showing that FG repeats are highly dynamic, flexible, and rapidly exchanging among TF interaction sites. We use unbiased long timescale all-atom simulations on the Anton supercomputer, combined with extensive enhanced sampling simulations and NMR experiments, to characterize the thermodynamic and kinetic properties of FG repeats and their interaction with a model transport factor. Both the simulations and experimental data indicate that FG repeats are highly dynamic random coils, lack intrachain interactions, and exhibit significant entropically driven resistance to spatial confinement. We show that the FG motifs reversibly slide in and out of multiple TF interaction sites, transitioning rapidly between a strongly interacting state and a weakly interacting state, rather than undergoing a much slower transition between strongly interacting and completely noninteracting (unbound) states. In the weakly interacting state, FG motifs can be more easily displaced by other competing FG motifs, providing a simple mechanism for rapid exchange of TF/FG motif contacts during transport. This slide-and-exchange mechanism highlights the direct role of the disorder within FG repeats in nucleocytoplasmic transport, and resolves the apparent conflict between the selectivity and speed of transport.
Multiple intermediates on the energy landscape of a 15-HEAT-repeat protein

PubMed Central

Tsytlonok, Maksym; Craig, Patricio O.; Sivertsson, Elin; Serquera, David; Perrett, Sarah; Best, Robert B.; Wolynes, Peter G.; Itzhaki, Laura S.

2014-01-01

Repeat proteins are a special class of modular, non-globular proteins composed of small structural motifs arrayed to form elongated architectures and stabilised solely by short-range contacts. We find a remarkable complexity in the unfolding of the large HEAT repeat protein PR65/A. In contrast to what has been seen for small repeat proteins in which unfolding propagates from one end, the HEAT array of PR65/A ruptures at multiple distant sites, leading to intermediate states with non-contiguous folded subdomains. Kinetic analysis allows us to define a network of intermediates and to delineate the pathways that connect them. There is a dominant sequence of unfolding, reflecting a non-uniform distribution of stability across the repeat array; however the unfolding of certain intermediates is competitive, leading to parallel pathways. Theoretical models accounting for the heterogeneous contact density in the folded structure are able to rationalize the variation in stability across the array. This variation in stability also suggests how folding may direct function in a large repeat protein: The stability distribution enables certain regions to present rigid motifs for molecular recognition while affording others flexibility to broaden the search area as in a fly-casting mechanism. Thus PR65/A uses the two ends of the repeat array to bind diverse partners and thereby coordinate the dephosphorylation of many different substrates and of multiple sites within hyperphosphorylated substrates. PMID:24120762
Evolution of genes and repeats in the Nimrod superfamily.

PubMed

Somogyi, Kálmán; Sipos, Botond; Pénzes, Zsolt; Kurucz, Eva; Zsámboki, János; Hultmark, Dan; Andó, István

2008-11-01

The recently identified Nimrod superfamily is characterized by the presence of a special type of EGF repeat, the NIM repeat, located right after a typical CCXGY/W amino acid motif. On the basis of structural features, nimrod genes can be divided into three types. The proteins encoded by Draper-type genes have an EMI domain at the N-terminal part and only one copy of the NIM motif, followed by a variable number of EGF-like repeats. The products of Nimrod B-type and Nimrod C-type genes (including the eater gene) have different kinds of N-terminal domains, and lack EGF-like repeats but contain a variable number of NIM repeats. Draper and Nimrod C-type (but not Nimrod B-type) proteins carry a transmembrane domain. Several members of the superfamily were claimed to function as receptors in phagocytosis and/or binding of bacteria, which indicates an important role in the cellular immunity and the elimination of apoptotic cells. In this paper, the evolution of the Nimrod superfamily is studied with various methods on the level of genes and repeats. A hypothesis is presented in which the NIM repeat, along with the EMI domain, emerged by structural reorganizations at the end of an EGF-like repeat chain, suggesting a mechanism for the formation of novel types of repeats. The analyses revealed diverse evolutionary patterns in the sequences containing multiple NIM repeats. Although in the Nimrod B and Nimrod C proteins these repeats show characteristics of independent evolution, many internal NIM repeats in Eater sequences seem to have undergone concerted evolution. An analysis of the nimrod genes has been performed using phylogenetic and other methods and an evolutionary scenario of the origin and diversification of the Nimrod superfamily is proposed. Our study presents an intriguing example of how the evolution of multigene families may contribute to the complexity of the innate immune response.
Complexity of the 5' Untranslated Region of EIF4A3, a Critical Factor for Craniofacial and Neural Development.

PubMed

Hsia, Gabriella S P; Musso, Camila M; Alvizi, Lucas; Brito, Luciano A; Kobayashi, Gerson S; Pavanello, Rita C M; Zatz, Mayana; Gardham, Alice; Wakeling, Emma; Zechi-Ceide, Roseli M; Bertola, Debora; Passos-Bueno, Maria Rita

2018-01-01

Repeats in coding and non-coding regions have increasingly been associated with many human genetic disorders, such as Richieri-Costa-Pereira syndrome (RCPS). RCPS, mostly characterized by midline cleft mandible, Robin sequence and limb defects, is an autosomal-recessive acrofacial dysostosis mainly reported in Brazilian patients. This disorder is caused by decreased levels of EIF4A3, mostly due to an increased number of repeats at the EIF4A3 5'UTR. EIF4A3 5'UTR alleles are CG-rich and vary in size and organization of three types of motifs. An exclusive allelic pattern was identified among affected individuals, in which the CGCA-motif is the most prevalent, herein referred to as the "disease-associated CGCA-20nt motif." The origin of the pathogenic alleles containing the disease-associated motif, as well as the functional effects of the 5'UTR motifs on EIF4A3 expression, to date, are entirely unknown. Here, we characterized 43 different EIF4A3 5'UTR alleles in a cohort of 380 unaffected individuals. We identified eight heterozygous unaffected individuals harboring the disease-associated CGCA-20nt motif and our haplotype analyses indicate that there is more than one haplotype associated with RCPS. The combined analysis of number, motif organization and haplotypic diversity, as well as the observation of two apparently distinct haplotypes associated with the disease-associated CGCA-20nt motif, suggest that the RCPS alleles might have arisen from independent unequal crossing-over events between ancient alleles at least twice. Moreover, we have shown that the number and sequence of motifs in the 5'UTR region are associated with EIF4A3 repression, which is not mediated by CpG methylation. In conclusion, this study has shown that the large number of repeats in EIF4A3 does not represent a dynamic mutation and RCPS can arise in any population harboring alleles with the CGCA-20nt motif. We also provided further evidence that EIF4A3 5'UTR is a regulatory region and the size and sequence type of the repeats at 5'UTR may contribute to clinical variability in RCPS.
A set of tetra-nucleotide core motif SSR markers for efficient identification of potato (Solanum tuberosum) cultivars.

PubMed

Kishine, Masahiro; Tsutsumi, Katsuji; Kitta, Kazumi

2017-12-01

Simple sequence repeat (SSR) is a popular tool for individual fingerprinting. The long-core motif (e.g. tetra-, penta-, and hexa-nucleotide) simple sequence repeats (SSRs) are preferred because they make it easier to separate and distinguish neighbor alleles. In the present study, a new set of 8 tetra-nucleotide SSRs in potato (Solanum tuberosum) is reported. By using these 8 markers, 72 out of 76 cultivars obtained from Japan and the United States were clearly discriminated, while two pairs, both of which arose from natural variation, showed identical profiles. The combined probability of identity between two random cultivars for the set of 8 SSR markers was estimated to be 1.10 × 10⁻⁸, confirming the usefulness of the proposed SSR markers for fingerprinting analyses of potato.
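To show how a combined probability of identity across several markers can become very small, here is a minimal Python sketch. It uses one common textbook form of the statistic for a single diploid locus under Hardy-Weinberg assumptions and multiplies the per-locus values, which assumes independent loci. The abstract does not say which estimator the authors used, and cultivated potato is usually tetraploid, so this should be read as an illustration of the arithmetic rather than a reproduction of the reported value. The allele frequencies are invented.

```python
# Illustrative only: diploid, Hardy-Weinberg probability of identity (PI).
# Allele frequencies below are made up; they are not data from the study.
from itertools import combinations
from math import prod


def probability_of_identity(freqs):
    """PI for one locus: chance two random individuals share a genotype."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def combined_pi(loci):
    """Product of per-locus PI values, assuming independent loci."""
    return prod(probability_of_identity(freqs) for freqs in loci)


if __name__ == "__main__":
    # Eight hypothetical loci, each with five equally frequent alleles.
    loci = [[0.2] * 5 for _ in range(8)]
    print(f"combined PI = {combined_pi(loci):.3e}")
```

Because the per-locus values multiply, each additional informative marker lowers the combined value, which is why a panel of only 8 markers can reach a very low probability of a chance match.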
Human telomeric DNA: G-quadruplex, i-motif and Watson–Crick double helix

PubMed Central

Phan, Anh Tuân; Mergny, Jean-Louis

2002-01-01

Human telomeric DNA composed of (TTAGGG/CCCTAA)n repeats may form a classical Watson–Crick double helix. Each individual strand is also prone to quadruplex formation: the G-rich strand may adopt a G-quadruplex conformation involving G-quartets whereas the C-rich strand may fold into an i-motif based on intercalated C·C+ base pairs. Using an equimolar mixture of the telomeric oligonucleotides d[AGGG(TTAGGG)3] and d[(CCCTAA)3CCCT], we defined which structures existed and which would be the predominant species under a variety of experimental conditions. Under near-physiological conditions of pH, temperature and salt concentration, telomeric DNA was predominantly in a double-helix form. However, at lower pH values or higher temperatures, the G-quadruplex and/or the i-motif efficiently competed with the duplex. We also present kinetic and thermodynamic data for duplex association and for G-quadruplex/i-motif unfolding. PMID:12409451
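The two oligonucleotides named in the abstract above, d[AGGG(TTAGGG)3] and d[(CCCTAA)3CCCT], can be written out and checked for complementarity in a few lines. The Python sketch below expands the bracket notation by hand and counts the runs of three or more G (or C) bases that G-quadruplex and i-motif folding rely on. The variable and function names are illustrative and not taken from the paper.

```python
# Illustrative only: expands the two telomeric oligonucleotides from the
# abstract and checks that they are exact reverse complements.
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


# d[AGGG(TTAGGG)3]: G-rich strand
g_strand = "AGGG" + "TTAGGG" * 3
# d[(CCCTAA)3CCCT]: C-rich strand
c_strand = "CCCTAA" * 3 + "CCCT"

# Runs of three or more G (or C) bases in each strand.
g_tracts = re.findall(r"G{3,}", g_strand)
c_tracts = re.findall(r"C{3,}", c_strand)

if __name__ == "__main__":
    print(g_strand, len(g_strand))
    print(c_strand, len(c_strand))
    print("complementary:", reverse_complement(g_strand) == c_strand)
    print("G-tracts:", len(g_tracts), "C-tracts:", len(c_tracts))
```

Both strands are 22 nucleotides long and fully complementary, so an equimolar mixture can form either one duplex or two separately folded single strands, which is the competition the study measures.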
A relational extension of the notion of motifs: application to the common 3D protein substructures searching problem.

PubMed

Pisanti, Nadia; Soldano, Henry; Carpentier, Mathilde; Pothier, Joel

2009-12-01

The geometrical configurations of atoms in protein structures can be viewed as approximate relations among them. Then, finding similar common substructures within a set of protein structures belongs to a new class of problems that generalizes that of finding repeated motifs. The novelty lies in the addition of constraints on the motifs in terms of relations that must hold between pairs of positions of the motifs. We will hence denote them as relational motifs. For this class of problems, we present an algorithm that is a suitable extension of the KMR paradigm and, in particular, of the KMRC as it uses a degenerate alphabet. Our algorithm contains several improvements that become especially useful when, as is required for relational motifs, the inference is made by partially overlapping shorter motifs rather than concatenating them. The efficiency, correctness and completeness of the algorithm are ensured by several non-trivial properties that are proven in this paper. The algorithm has been applied in the important field of protein common 3D substructure searching. The methods implemented have also been tested on several examples of protein families such as serine proteases, globins and cytochromes P450. The detected motifs have been compared to those found by multiple structural alignments methods.
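For readers unfamiliar with the KMR (Karp-Miller-Rosenberg) paradigm that the abstract above builds on, the following Python sketch shows only the basic idea on plain strings: positions are labelled by the substring of length a that starts there, and labels for length a + b (with b ≤ a) are derived by pairing the label at position i with the label at position i + b. This is the classical exact-match version, not the relational or degenerate-alphabet extension described in the paper, and the function name `repeated_kmers` is an assumption for illustration.

```python
# Illustrative only: classical KMR-style label refinement on a plain string.
# It does not implement the paper's relational-motif or KMRC extensions.
from collections import defaultdict


def repeated_kmers(seq, k):
    """Map each length-k substring that occurs at least twice to its starts."""
    if k < 1 or k > len(seq):
        return {}
    # Length-1 labels: rank of each character in the alphabet.
    ranks = {c: r for r, c in enumerate(sorted(set(seq)))}
    labels = [ranks[c] for c in seq]
    length = 1
    while length < k:
        # Two positions agree on length + step characters iff they agree on
        # `length` characters at i and at i + step (with step <= length).
        step = min(length, k - length)
        pairs = [
            (labels[i], labels[i + step])
            for i in range(len(seq) - length - step + 1)
        ]
        ranks = {p: r for r, p in enumerate(sorted(set(pairs)))}
        labels = [ranks[p] for p in pairs]
        length += step
    groups = defaultdict(list)
    for pos, label in enumerate(labels):
        groups[label].append(pos)
    return {seq[g[0]:g[0] + k]: g for g in groups.values() if len(g) > 1}


if __name__ == "__main__":
    print(repeated_kmers("ABCABCAB", 3))
```

Allowing the step to be smaller than the current length means longer motifs are inferred from partially overlapping shorter ones, which is the mode of inference the abstract says relational motifs require.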

Solution structure and base pair opening kinetics of the i-motif dimer of d(5mCCTTTACC): a noncanonical structure with possible roles in chromosome stability.

PubMed

Nonin, S; Phan, A T; Leroy, J L

1997-09-15

Repetitive cytosine-rich DNA sequences have been identified in telomeres and centromeres of eukaryotic chromosomes. These sequences play a role in maintaining chromosome stability during replication and may be involved in chromosome pairing during meiosis. The C-rich repeats can fold into an 'i-motif' structure, in which two parallel-stranded duplexes with hemiprotonated C.C+ pairs are intercalated. Previous NMR studies of naturally occurring repeats have produced poor NMR spectra. This led us to investigate oligonucleotides, based on natural sequences, to produce higher quality spectra and thus provide further information as to the structure and possible biological function of the i-motif. NMR spectroscopy has shown that d(5mCCTTTACC) forms an i-motif dimer of symmetry-related and intercalated folded strands. The high-definition structure is computed on the basis of the build-up rates of 29 intraresidue and 35 interresidue nuclear Overhauser effect (NOE) connectivities. The i-motif core includes intercalated interstrand C.C+ pairs stacked in the order 2*.8/1.7*/1*.7/2.8* (where one strand is distinguished by an asterisk and the numbers relate to the base positions within the repeat). The TTTA sequences form two loops which span the two wide grooves on opposite sides of the i-motif core; the i-motif core is extended at both ends by the stacking of A6 onto C2.C8+. The lifetimes of pairs C2.C8+ and 5mC1.C7+ are 1 ms and 1 s, respectively, at 15 degrees C. Anomalous exchange properties of the T3 imino proton indicate hydrogen bonding to A6 N7 via a water bridge. The d(5mCCTTTTCC) deoxyoligonucleotide, in which position 6 is occupied by a thymidine instead of an adenine, also forms a symmetric i-motif dimer. However, in this structure the two TTTT loops are located on the same side of the i-motif core and the C.C+ pairs are formed by equivalent cytidines stacked in the order 8*.8/1.1*/7*.7/2.2*. 
Oligodeoxynucleotides containing two C-rich repeats can fold and dimerize into an i-motif. The change of folding topology resulting from the substitution of a single nucleoside emphasizes the influence of the loop residues on the i-motif structure formed by two folded strands.
Gene Isolation Using Degenerate Primers Targeting Protein Motif: A Laboratory Exercise

ERIC Educational Resources Information Center

Yeo, Brandon Pei Hui; Foong, Lian Chee; Tam, Sheh May; Lee, Vivian; Hwang, Siaw San

2018-01-01

Structures and functions of protein motifs are widely included in many biology-based course syllabi. However, little emphasis is placed on linking this knowledge to applications in biotechnology to enhance the learning experience. Here, the conserved motifs of nucleotide binding site-leucine rich repeats (NBS-LRR) proteins, successfully used for the…
Rationally designed small molecules targeting the RNA that causes myotonic dystrophy type 1 are potently bioactive.

PubMed

Childs-Disney, Jessica L; Hoskins, Jason; Rzuczek, Suzanne G; Thornton, Charles A; Disney, Matthew D

2012-05-18

RNA is an important drug target, but it is difficult to design or discover small molecules that modulate RNA function. In the present study, we report that rationally designed, modularly assembled small molecules that bind the RNA that causes myotonic dystrophy type 1 (DM1) are potently bioactive in cell culture models. DM1 is caused when an expansion of r(CUG) repeats, or r(CUG)(exp), is present in the 3' untranslated region (UTR) of the dystrophia myotonica protein kinase (DMPK) mRNA. r(CUG)(exp) folds into a hairpin with regularly repeating 5'CUG/3'GUC motifs and sequesters muscleblind-like 1 protein (MBNL1). A variety of defects are associated with DM1, including (i) formation of nuclear foci, (ii) decreased translation of DMPK mRNA due to its nuclear retention, and (iii) pre-mRNA splicing defects due to inactivation of MBNL1, which controls the alternative splicing of various pre-mRNAs. Previously, modularly assembled ligands targeting r(CUG)(exp) were designed using information in an RNA motif-ligand database. These studies showed that a bis-benzimidazole (H) binds the 5'CUG/3'GUC motif in r(CUG)(exp). Therefore, we designed multivalent ligands to bind multiple copies of this motif in r(CUG)(exp) simultaneously. Herein, we report that the designed compounds improve DM1-associated defects, including translational and pre-mRNA splicing defects and the formation of nuclear foci. These studies may establish a foundation to exploit other RNA targets in genomic sequence.
Ensemble characterization of an intrinsically disordered FG-Nup peptide and its F>A mutant in DMSO-d6.

PubMed

Reid, Korey M; Sunanda, Punnepalli; Raghothama, S; Krishnan, V V

2017-11-01

Intrinsically disordered proteins (IDPs) lack a well-defined 3D structure under physiological conditions, yet the inherent disorder, represented by an ensemble of conformations, plays a critical role in many cellular and regulatory processes. Nucleoporins, or Nups, are the proteins found in the nuclear pore complex (NPC). The central pore of the NPC is occupied by Nups, which have phenylalanine-glycine domain repeats and are intrinsically disordered, and therefore are termed FG-Nups. These FG-domain repeats differ in cohesiveness, ranging from least (FG) to most (GLFG) cohesive. The designed FG-Nup is a 25 AA model peptide containing a noncohesive FG-motif flanked by two cohesive GLFG-motifs (WT peptide). Complete NMR-based ensemble characterization of this peptide, along with a control peptide with an F>A substitution (MU peptide), is discussed. Ensemble characterization of the NMR-determined models suggests that neither peptide has a consistent secondary structure and that both remain disordered. Nonetheless, the role of cohesive elements mediated by the GLFG motifs is evident in the WT ensemble of structures, which are more compact than those of the MU peptide. The approach presented here allows an alternate way to investigate the specific roles of distinct amino acid motifs that translate into the long-range organization of the ensemble of structures and, in general, the nature of IDPs. © 2017 Wiley Periodicals, Inc.
Identification and Characterization of Functionally Critical, Conserved Motifs in the Internal Repeats and N-terminal Domain of Yeast Translation Initiation Factor 4B (yeIF4B)

PubMed Central

Zhou, Fujun; Walker, Sarah E.; Mitchell, Sarah F.; Lorsch, Jon R.; Hinnebusch, Alan G.

2014-01-01

eIF4B has been implicated in attachment of the 43 S preinitiation complex (PIC) to mRNAs and scanning to the start codon. We recently determined that the internal seven repeats (of ∼26 amino acids each) of Saccharomyces cerevisiae eIF4B (yeIF4B) compose the region most critically required to enhance mRNA recruitment by 43 S PICs in vitro and stimulate general translation initiation in yeast. Moreover, although the N-terminal domain (NTD) of yeIF4B contributes to these activities, the RNA recognition motif is dispensable. We have now determined that only two of the seven internal repeats are sufficient for wild-type (WT) yeIF4B function in vivo when all other domains are intact. However, three or more repeats are needed in the absence of the NTD or when the functions of eIF4F components are compromised. We corroborated these observations in the reconstituted system by demonstrating that yeIF4B variants with only one or two repeats display substantial activity in promoting mRNA recruitment by the PIC, whereas additional repeats are required at lower levels of eIF4A or when the NTD is missing. These findings indicate functional overlap among the 7-repeats and NTD domains of yeIF4B and eIF4A in mRNA recruitment. Interestingly, only three highly conserved positions in the 26-amino acid repeat are essential for function in vitro and in vivo. Finally, we identified conserved motifs in the NTD and demonstrate functional overlap of two such motifs. These results provide a comprehensive description of the critical sequence elements in yeIF4B that support eIF4F function in mRNA recruitment by the PIC. PMID:24285537
Ankyrin repeats of ANKRA2 recognize a PxLPxL motif on the 3M syndrome protein CCDC8.

PubMed

Nie, Jianyun; Xu, Chao; Jin, Jing; Aka, Juliette A; Tempel, Wolfram; Nguyen, Vivian; You, Linya; Weist, Ryan; Min, Jinrong; Pawson, Tony; Yang, Xiang-Jiao

2015-04-07

Peptide motifs are often used for protein-protein interactions. We have recently demonstrated that ankyrin repeats of ANKRA2 and the paralogous bare lymphocyte syndrome transcription factor RFXANK recognize PxLPxL/I motifs shared by megalin, three histone deacetylases, and RFX5. We show here that CCDC8 is a major partner of ANKRA2 but not RFXANK in cells. The CCDC8 gene is mutated in 3M syndrome, a short-stature disorder with additional facial and skeletal abnormalities. Two other genes mutated in this syndrome encode CUL7 and OBSL1. While CUL7 is a ubiquitin ligase and OBSL1 associates with the cytoskeleton, little is known about CCDC8. Binding and structural analyses reveal that the ankyrin repeats of ANKRA2 recognize a PxLPxL motif at the C-terminal region of CCDC8. The N-terminal part interacts with OBSL1 to form a CUL7 ligase complex. These results link ANKRA2 unexpectedly to 3M syndrome and suggest novel regulatory mechanisms for histone deacetylases and RFX7. Copyright © 2015 Elsevier Ltd. All rights reserved.
Isolation and characterization of microsatellite loci in the intertidal sponge Halichondria panicea

USGS Publications Warehouse

Knowlton, Anne L.; Pierson, Barbara J.; Talbot, S.L.; Highsmith, Ray C.

2003-01-01

GA- and CA-enriched genomic libraries were constructed for the intertidal sponge Halichondria panicea. Unique repeat motifs identified varied from the expected simple dinucleotide repeats to more complex repeat units. All sequences tended to be highly repetitive but did not necessarily contain the targeted motifs. Seven microsatellite loci were evaluated on sponges from the clone source population. All seven were polymorphic with 5.43 ± 0.92 mean number of alleles. Six of the seven loci that could be resolved had mean heterozygosities of 0.14–0.68. The loci identified here will be useful for population studies.
Molecular cloning and characterization of sea bass (Dicentrarchus labrax, L.) calreticulin.

PubMed

Pinto, Rute D; Moreira, Ana R; Pereira, Pedro J B; dos Santos, Nuno M S

2013-06-01

Mammalian calreticulin (CRT) is a key molecular chaperone and regulator of Ca(2+) homeostasis in the endoplasmic reticulum (ER), and is also implicated in a variety of physiological/pathological processes outside the ER. Importantly, it is involved in assembly of MHC class I molecules. In this work, the sea bass (Dicentrarchus labrax) CRT (Dila-CRT) gene and cDNA have been isolated and characterized. The mature protein retains two conserved motifs, three structural/functional domains (N, P and C), three type 1 and 2 motifs repeated in tandem, a conserved pair of cysteines and an ER-retention motif. It is a single-copy gene composed of 9 exons. Dila-CRT three-dimensional homology models are consistent with the structural features described for mammalian molecules. Together, these results support a highly conserved structure of CRT through evolution. Moreover, the present data provide information that will allow further studies on sea bass CRT involvement in immunity and in particular class I antigen presentation. Copyright © 2013 Elsevier Ltd. All rights reserved.
Pyrene functionalized molecular beacon with pH-sensitive i-motif in a loop.

PubMed

Dembska, Anna; Juskowiak, Bernard

2015-01-01

In this work, we present a spectral characterization of a pH-sensitive system, which combines the i-motif properties with the spatially sensitive fluorescence signal of pyrene molecules attached to hairpin ends. The excimer production (fluorescence max. ∼480 nm) by pyrene labels at the ends of the molecular beacon is driven by pH-dependent i-motif formation in the loop. To illustrate the performance and reversible operation of our systems, we performed experiments with repeated pH cycling between pH values of 7.5±0.3 and 6.5±0.3. The sensor gives an analytical response in excimer-monomer switching mode in a narrow pH range (1.5 pH units) and exhibits high pH resolution (0.1 pH unit). Copyright © 2015 Elsevier B.V. All rights reserved.
The proliferation marker pKi-67 becomes masked to MIB-1 staining after expression of its tandem repeats.

PubMed

Schmidt, Mirko H H; Broll, Rainer; Bruch, Hans-Peter; Duchrow, Michael

2002-11-01

The Ki-67 antigen, pKi-67, is one of the most commonly used markers of proliferating cells. The protein can only be detected in dividing cells (G(1)-, S-, G(2)-, and M-phase) but not in quiescent cells (G(0)). The standard antibody to detect pKi-67 is MIB-1, which detects the so-called 'Ki-67 motif' FKELF in 9 of the protein's 16 tandem repeats. To investigate the function of these repeats we expressed three of them in an inducible gene expression system in HeLa cells. Surprisingly, addition of a nuclear localization sequence led to a complete absence of signal in the nuclei of MIB-1-stained cells. At the same time antibodies directed against different epitopes of pKi-67 did not fail to detect the protein. We conclude that the overexpression of the 'Ki-67 motif', which is present in the repeats, can lead to inability of MIB-1 to detect its antigen as demonstrated in adenocarcinoma tissue samples. Thereafter, in order to prevent the underestimation of Ki-67 proliferation indices in MIB-1-labeled preparations, additional antibodies (for example, MIB-21) should be used. Additionally, we could show in a mammalian two-hybrid assay that recombinant pKi-67 repeats are capable of self-associating with endogenous pKi-67. Speculating that the tandem repeats are intimately involved in its protein-protein interactions, this offers new insights in how access to these repeats is regulated by pKi-67 itself.
Classification of proteins with shared motifs and internal repeats in the ECOD database

PubMed Central

Kinch, Lisa N.; Liao, Yuxing

2016-01-01

Proteins and their domains evolve by a set of events commonly including the duplication and divergence of small motifs. The presence of short repetitive regions in domains has generally constituted a difficult case for structural domain classifications and their hierarchies. We developed the Evolutionary Classification Of protein Domains (ECOD) in part to implement a new schema for the classification of these types of proteins. Here we document the ways in which ECOD classifies proteins with small internal repeats, widespread functional motifs, and assemblies of small domain‐like fragments in its evolutionary schema. We illustrate the ways in which the structural genomics project impacted the classification and characterization of new structural domains and sequence families over the decade. PMID:26833690
Maternal lineages of peach genotypes

USDA-ARS's Scientific Manuscript database

Simple sequence repeats (SSRs) in chloroplast genomes are useful markers to determine maternal lineages. The SSR mining results revealed that most chloroplast SSRs among three Prunus chloroplast genomes were conserved in locations and motif types, but polymorphic in motif and/or amplicon lengths. Fi...
Ciliate telomerase RNA loop IV nucleotides promote hierarchical RNP assembly and holoenzyme stability.

PubMed

Robart, Aaron R; O'Connor, Catherine M; Collins, Kathleen

2010-03-01

Telomerase adds simple-sequence repeats to chromosome 3' ends to compensate for the loss of repeats with each round of genome replication. To accomplish this de novo DNA synthesis, telomerase uses a template within its integral RNA component. In addition to providing the template, the telomerase RNA subunit (TER) also harbors nontemplate motifs that contribute to the specialized telomerase catalytic cycle of reiterative repeat synthesis. Most nontemplate TER motifs function through linkage with the template, but in ciliate and vertebrate telomerases, a stem-loop motif binds telomerase reverse transcriptase (TERT) and reconstitutes full activity of the minimal recombinant TERT+TER RNP, even when physically separated from the template. Here, we resolve the functional requirements for this motif of ciliate TER in physiological RNP context using the Tetrahymena thermophila p65-TER-TERT core RNP reconstituted in vitro and the holoenzyme reconstituted in vivo. Contrary to expectation based on assays of the minimal recombinant RNP, we find that none of a panel of individual loop IV nucleotide substitutions impacts the profile of telomerase product synthesis when reconstituted as physiological core RNP or holoenzyme RNP. However, loop IV nucleotide substitutions do variably reduce assembly of TERT with the p65-TER complex in vitro and reduce the accumulation and stability of telomerase RNP in endogenous holoenzyme context. Our results point to a unifying model of a conformational activation role for this TER motif in the telomerase RNP enzyme.
Molecular cloning and characterization of an Hsp90/70 organizing protein gene from Frankliniella occidentalis (Insecta: Thysanoptera, Thripidae).

PubMed

Li, Hong-Bo; Du, Yu-Zhou

2013-05-15

The heat shock 90/70 organizing protein (Hop), also known as Sti-1 (stress-induced protein-1), is a co-chaperone that usually mediates the interaction of Hsp90 and Hsp70 and has been extensively characterized in mammals and plants. However, its role in insects remains unknown. In the present study, we isolated and characterized a Hop homologue gene from Frankliniella occidentalis (Fohop). Fohop contains a 1659 bp ORF encoding a protein of 552 amino acids with a calculated molecular mass of approximately 62.25 kDa, which displays a reasonable degree of identity with the known Hops and shares several canonical motifs, including three tetratricopeptide repeat motif domains (TPR1, TPR2A and TPR2B) and two aspartic acid-proline (DP) repeat motifs (DP1 and DP2). As in other hops, Fohop contains introns, but their number and position are quite variable. The mRNA expression patterns indicated that Fohop was constitutively expressed throughout the developmental stages, but was clearly upregulated by heat stress in both larvae and adults. Our studies imply that Hop, like other Hsps, may play an important role in the heat shock response of F. occidentalis. Copyright © 2013 Elsevier B.V. All rights reserved.
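The ORF length, protein length and molecular mass quoted in the abstract above can be cross-checked with simple arithmetic. The sketch below assumes that the 1659 bp figure includes one terminal stop codon (the abstract does not say so explicitly); the function names are illustrative and not taken from the study.

```python
def coding_codons(orf_bp, includes_stop=True):
    """Return the number of amino-acid-encoding codons in an ORF.

    Assumption: when includes_stop is True, the last codon is a stop codon
    and does not contribute a residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


def mean_residue_mass_da(mass_kda, n_residues):
    """Average mass per residue in daltons for a protein of the given size."""
    return mass_kda * 1000.0 / n_residues


# Figures from the abstract: 1659 bp ORF, 552 residues, ~62.25 kDa.
print(coding_codons(1659))                          # 552
print(round(mean_residue_mass_da(62.25, 552), 1))   # 112.8
```

Under that assumption the three numbers agree with one another: 553 codons minus the stop codon gives 552 residues, and the average residue mass comes out near 113 Da.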
Analysis of sequence repeats of proteins in the PDB.

PubMed

Mary Rajathei, David; Selvaraj, Samuel

2013-12-01

Internal repeats in protein sequences play a significant role in the evolution of protein structure and function. Applications of different bioinformatics tools help in the identification and characterization of these repeats. In the present study, we analyzed sequence repeats in a non-redundant set of proteins available in the Protein Data Bank (PDB). We used RADAR for detecting internal repeats in a protein, PDBeFOLD for assessing structural similarity, PDBsum for finding functional involvement and Pfam for domain assignment of the repeats in a protein. Through the analysis of sequence repeats, we found that the identity of the sequence repeats falls in the range of 20-40% and that the superimposed structures of most of the sequence repeats maintain a similar overall fold. Analysis of sequence repeats at the functional level reveals that most of the sequence repeats are involved in the function of the protein through functionally involved residues in the repeat regions. We also found that sequence repeats in single- and two-domain proteins often contained conserved sequence motifs for the function of the domain. Copyright © 2013 Elsevier Ltd. All rights reserved.
The Contribution of Short Repeats of Low Sequence Complexity to Large Conifer Genomes

Treesearch

A. Schmidt; R.L. Doudrick; J.S. Heslop-Harrison; T. Schmidt

2000-01-01

Abstract: The abundance and genomic organization of six simple sequence repeats, consisting of di-, tri-, and tetranucleotide sequence motifs, and a minisatellite repeat have been analyzed in different gymnosperms by Southern hybridization. Within the gymnosperm genomes investigated, the abundance and genomic organization of micro- and...
The role of collagen charge clusters in the modulation of matrix metalloproteinase activity.

PubMed

Lauer, Janelle L; Bhowmick, Manishabrata; Tokmina-Roszyk, Dorota; Lin, Yan; Van Doren, Steven R; Fields, Gregg B

2014-01-24

Members of the matrix metalloproteinase (MMP) family selectively cleave collagens in vivo. Several substrate structural features that direct MMP collagenolysis have been identified. The present study evaluated the role of charged residue clusters in the regulation of MMP collagenolysis. A series of 10 triple-helical peptide (THP) substrates were constructed in which either Lys-Gly-Asp or Gly-Asp-Lys motifs replaced Gly-Pro-Hyp (where Hyp is 4-hydroxy-L-proline) repeats. The stabilities of THPs containing the two different motifs were analyzed, and kinetic parameters for substrate hydrolysis by six MMPs were determined. A general trend for virtually all enzymes was that, as Gly-Asp-Lys motifs were moved from the extreme N and C termini to the interior next to the cleavage site sequence, kcat/Km values increased. Additionally, all Gly-Asp-Lys THPs were as good or better substrates than the parent THP in which Gly-Asp-Lys was not present. In turn, the Lys-Gly-Asp THPs were also always better substrates than the parent THP, but the magnitude of the difference was considerably less compared with the Gly-Asp-Lys series. Of the MMPs tested, MMP-2 and MMP-9 most greatly favored the presence of charged residues with preference for the Gly-Asp-Lys series. Lys-Gly-(Asp/Glu) motifs are more commonly found near potential MMP cleavage sites than Gly-(Asp/Glu)-Lys motifs. As Lys-Gly-Asp is not as favored by MMPs as Gly-Asp-Lys, the Lys-Gly-Asp motif appears advantageous over the Gly-Asp-Lys motif by preventing unwanted MMP hydrolysis. More specifically, the lack of Gly-Asp-Lys clusters may diminish potential MMP-2 and MMP-9 collagenolytic activity. The present study indicates that MMPs have interactions spanning the P23-P23' subsites of collagenous substrates.
Versatile communication strategies among tandem WW domain repeats

PubMed Central

Dodson, Emma Joy; Fishbain-Yoskovitz, Vered; Rotem-Bamberger, Shahar

2015-01-01

Interactions mediated by short linear motifs in proteins play major roles in regulation of cellular homeostasis since their transient nature allows for easy modulation. We are still far from a full understanding and appreciation of the complex regulation patterns that can be, and are, achieved by this type of interaction. The fact that many linear-motif-binding domains occur in tandem repeats in proteins indicates that their mutual communication is used extensively to obtain complex integration of information toward regulatory decisions. This review is an attempt to overview, and classify, different ways by which two and more tandem repeats cooperate in binding to their targets, in the well-characterized family of WW domains and their corresponding polyproline ligands. PMID:25710931
Identification and characterization of tandem repeats in exon III of dopamine receptor D4 (DRD4) genes from different mammalian species.

PubMed

Larsen, Svend Arild; Mogensen, Line; Dietz, Rune; Baagøe, Hans Jørgen; Andersen, Mogens; Werge, Thomas; Rasmussen, Henrik Berg

2005-12-01

In this study we have identified and characterized dopamine receptor D4 (DRD4) exon III tandem repeats in 33 publicly available nucleotide sequences from different mammalian species. We found that the tandem repeat in canids could be described in a novel and simple way, namely, as a structure composed of 15- and 12-bp modules. Tandem repeats composed of 18-bp modules were found in sequences from the horse, zebra, onager, and donkey, Asiatic bear, polar bear, common raccoon, dolphin, harbor porpoise, and domestic cat. Several of these sequences have been analyzed previously without a tandem repeat being found. In the domestic cow and gray seal we identified tandem repeats composed of 36-bp modules, each consisting of two closely related 18-bp basic units. A tandem repeat consisting of 9-bp modules was identified in sequences from mink and ferret. In the European otter we detected an 18-bp tandem repeat, while a tandem repeat consisting of 27-bp modules was identified in a sequence from European badger. Both these tandem repeats were composed of 9-bp basic units, which were closely related to the 9-bp repeat modules identified in the mink and ferret. Tandem repeats could not be identified in sequences from rodents. All tandem repeats possessed a high GC content with a strong bias for C. On phylogenetic analysis of the tandem repeats, evolutionarily related species were clustered into the same groups. The degree of conservation of the tandem repeats varied significantly between species. The deduced amino acid sequences of most of the tandem repeats exhibited a high propensity for disorder. This was also the case with an amino acid sequence of the human DRD4 exon III tandem repeat, which was included in the study for comparative purposes. We identified proline-containing motifs for SH3 and WW domain binding proteins, potential phosphorylation sites, PDZ domain binding motifs, and FHA domain binding motifs in the amino acid sequences of the tandem repeats.
The numbers of potential functional sites varied pronouncedly between species. Our observations provide a platform for future studies of the architecture and evolution of the DRD4 exon III tandem repeat, and they suggest that differences in the structure of this tandem repeat contribute to specialization and generation of diversity in receptor function.
Transcriptional activity of the homopurine-homopyrimidine repeat of the c-Ki-ras promoter is independent of its H-forming potential.

PubMed Central

Raghu, G; Tevosian, S; Anant, S; Subramanian, K N; George, D L; Mirkin, S M

1994-01-01

The mouse c-Ki-ras protooncogene promoter contains an unusual DNA element consisting of a 27 bp-long homopurine-homopyrimidine mirror repeat (H-motif) adjacent to a d(C-G)5 repeat. We have previously shown that in vitro these repeats may adopt H and Z conformations, respectively, causing nuclease and chemical hypersensitivity. Here we have studied the functional role of these DNA stretches using fine deletion analysis of the promoter and a transient transcription assay in vivo. We found that while the H-motif is responsible for approximately half of the promoter activity in both mouse and human cell lines, the Z-forming sequence exhibits little, if any, such activity. Mutational changes introduced within the homopurine-homopyrimidine stretch showed that its sequence integrity, rather than its H-forming potential, is responsible for its effect on transcription. Electrophoretic mobility shift assays revealed that the putative H-motif tightly binds several nuclear proteins, one of which is likely to be transcription factor Sp1, as determined by competition experiments. Southwestern hybridization studies detected two major proteins specifically binding to the H-motif: a 97 kD protein which presumably corresponds to Sp1 and another protein of 60 kD in human and 64 kD in mouse cells. We conclude that the homopurine-homopyrimidine stretch is required for full transcriptional activity of the c-Ki-ras promoter and at least two distinct factors, Sp1 and an unidentified protein, potentially contribute to the positive effect on transcription. PMID:8078760

Characterization and evolution of the mitochondrial DNA control region in hornbills (Bucerotiformes).

PubMed

Delport, Wayne; Ferguson, J Willem H; Bloomer, Paulette

2002-06-01

We determined the mitochondrial DNA control region sequences of six Bucerotiformes. Hornbills have the typical avian gene order and their control region is similar to other avian control regions in that it is partitioned into three domains: two variable domains that flank a central conserved domain. Two characteristics of the hornbill control region sequence differ from that of other birds. First, domain I is AT rich as opposed to AC rich, and second, the control region is approximately 500 bp longer than that of other birds. Both these deviations from typical avian control region sequence are explainable on the basis of repeat motifs in domain I of the hornbill control region. The repeat motifs probably originated from a duplication of CSB-1 as has been determined in chicken, quail, and snowgoose. Furthermore, the hornbill repeat motifs probably arose before the divergence of hornbills from each other but after the divergence of hornbills from other avian taxa. The mitochondrial control region of hornbills is suitable for both phylogenetic and population studies, with domains I and II probably more suited to population and phylogenetic analyses, respectively.
Modeling of DNA local parameters predicts encrypted architectural motifs in Xenopus laevis ribosomal gene promoter.

PubMed

Roux-Rouquie, M; Marilley, M

2000-09-15

We have modeled local DNA sequence parameters to search for DNA architectural motifs involved in transcription regulation and promotion within the Xenopus laevis ribosomal gene promoter and the intergenic spacer (IGS) sequences. The IGS was found to be shaped into distinct topological domains. First, intrinsic bends split the IGS into domains of common but different helical features. Local parameters at inter-domain junctions exhibit a high variability with respect to intrinsic curvature, bendability and thermal stability. Secondly, the repeated sequence blocks of the IGS exhibit right-handed supercoiled structures which could be related to their enhancer properties. Thirdly, the gene promoter presents both inherent curvature and minor groove narrowing which may be viewed as motifs of a structural code for protein recognition and binding. Such pre-existing deformations could simply be remodeled during the binding of the transcription complex. Alternatively, these deformations could pre-shape the promoter in such a way that further remodeling is facilitated. Mutations shown to abolish promoter curvature as well as intrinsic minor groove narrowing, in a variant which maintained full transcriptional activity, bring circumstantial evidence for structurally-preorganized motifs in relation to transcription regulation and promotion. Using well documented X. laevis rDNA regulatory sequences we showed that computer modeling may be of invaluable assistance in assessing encrypted architectural motifs. The evidence of these DNA topological motifs with respect to the concept of structural code is discussed.
Presence of the canonical TTAGG insect telomeric repeat in the Tenthredinidae (Symphyta) suggests its ancestral nature in the order Hymenoptera.

PubMed

Gokhman, Vladimir E; Kuznetsova, Valentina G

2018-06-01

Telomeric repeats in two members of the sawfly family Tenthredinidae (Hymenoptera), namely, Tenthredo omissa (Förster, 1844) and Taxonus agrorum (Fallén, 1808) (both have n = 10), were studied using fluorescence in situ hybridization (FISH). Chromosomes of both species were demonstrated to contain the canonical TTAGG insect telomeric repeat, which constitutes the first report of the (TTAGG)n telomeric motif for the Tenthredinidae as well as for the clade Eusymphyta and the suborder Symphyta in general. Taken together with the presence of this repeat in many other Holometabola as well as in the hymenopteran families Formicidae and Apidae from the suborder Apocrita, these results collectively suggest the ancestral nature of the (TTAGG)n telomeric motif in the Hymenoptera as well as its subsequent loss within the clade Unicalcarida and independent reappearance in ants and bees. If this is true, the loss of the TTAGG repeat can be considered as a synapomorphy of the corresponding clade.
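The study above detects the repeat by FISH, not by sequence analysis. As an in silico analogue only, the following Python sketch shows how a tandem run of a telomeric motif such as TTAGG could be located in a sequence string. The function name, the `min_copies` parameter and the toy sequence are all hypothetical, and only the given strand is scanned (the reverse complement CCTAA is not considered).

```python
import re


def longest_tandem_run(seq, motif="TTAGG", min_copies=2):
    """Return the copy number of the longest uninterrupted tandem run of motif.

    Returns 0 if no run with at least min_copies copies is present.
    """
    # Match min_copies or more back-to-back exact copies of the motif.
    pattern = re.compile(r"(?:%s){%d,}" % (re.escape(motif), min_copies))
    runs = (m.group(0) for m in pattern.finditer(seq.upper()))
    best = max(runs, key=len, default="")
    return len(best) // len(motif)


# Toy sequence (made up for illustration): a 6-copy run and a 2-copy run.
toy = "ACGT" + "TTAGG" * 6 + "CATG" + "TTAGG" * 2
print(longest_tandem_run(toy))  # 6
```

Exact matching is a deliberate simplification; real telomeric arrays may contain degenerate copies that this pattern would split into separate runs.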
Low abundance of microsatellite repeats in the genome of the Brown-headed Cowbird (Molothrus ater)

USGS Publications Warehouse

Longmire, Jonathan L.; Hahn, D.C.; Roach, J.L.

1999-01-01

A cosmid library made from brown-headed cowbird (Molothrus ater) DNA was examined for representation of 17 distinct microsatellite motifs including all possible mono-, di-, and trinucleotide microsatellites, and the tetranucleotide repeat (GATA)n. The overall density of microsatellites within cowbird DNA was found to be one repeat per 89 kb and the frequency of the most abundant motif, (AGC)n, was once every 382 kb. The abundance of microsatellites within the cowbird genome is estimated to be reduced approximately 15-fold compared to humans. The reduced frequency of microsatellites seen in this study is consistent with previous observations indicating reduced numbers of microsatellites and other interspersed repeats in avian DNA. In addition to providing new information concerning the abundance of microsatellites within an avian genome, these results provide useful insights for selecting cloning strategies that might be used in the development of locus-specific microsatellite markers for avian studies.
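The figure of 17 motifs in the abstract above is consistent with a standard way of counting microsatellite motifs, in which motifs that differ only by cyclic rotation or reverse complementation are treated as the same repeat, and motifs that are just repeats of a shorter unit are excluded. The Python sketch below enumerates motifs under that assumed equivalence rule; the abstract does not state the counting rule, and the function names are illustrative.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(motif):
    """Smallest string among all rotations of the motif and of its reverse complement."""
    revcomp = motif.translate(COMPLEMENT)[::-1]
    forms = []
    for s in (motif, revcomp):
        forms.extend(s[i:] + s[:i] for i in range(len(s)))
    return min(forms)


def is_primitive(motif):
    """True if the motif is not a repetition of a shorter unit (e.g. 'AA' or 'ACAC')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def distinct_motifs(k):
    """Sorted canonical representatives of all primitive motifs of length k."""
    motifs = ("".join(p) for p in product("ACGT", repeat=k))
    return sorted({canonical(m) for m in motifs if is_primitive(m)})


counts = {k: len(distinct_motifs(k)) for k in (1, 2, 3)}
print(counts)                     # {1: 2, 2: 4, 3: 10}
print(sum(counts.values()) + 1)   # 17, adding the single tetranucleotide (GATA)n
```

Under this rule the most abundant motif reported above, (AGC)n, is the canonical representative of the class that also contains (CAG)n, (GCA)n, (CTG)n, (TGC)n and (GCT)n.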
E-motif formed by extrahelical cytosine bases in DNA homoduplexes of trinucleotide and hexanucleotide repeats

PubMed Central

Pan, Feng; Zhang, Yuan; Man, Viet Hoang; Roland, Christopher

2018-01-01

Atypical DNA secondary structures play an important role in expandable trinucleotide repeat (TR) and hexanucleotide repeat (HR) diseases. The cytosine mismatches in C-rich homoduplexes and hairpin stems are weakly bonded; experiments show that for certain sequences these may flip out of the helix core, forming an unusual structure termed an ‘e-motif’. We have performed molecular dynamics simulations of C-rich TR and HR DNA homoduplexes in order to characterize the conformations, stability and dynamics of formation of the e-motif, where the mismatched cytosines symmetrically flip out in the minor groove, pointing their base moieties towards the 5′-direction in each strand. TRs have two non-equivalent reading frames, (GCC)n and (CCG)n; while HRs have three: (CCCGGC)n, (CGGCCC)n, (CCCCGG)n. We define three types of pseudo basepair steps related to the mismatches and show that the e-motif is only stable in (GCC)n and (CCCGGC)n homoduplexes due to the favorable stacking of pseudo GpC steps (whose nature depends on whether TRs or HRs are involved) and the formation of hydrogen bonds between the mismatched cytosine at position i and the cytosine (TRs) or guanine (HRs) at position i − 2 along the same strand. We also characterize the extended e-motif, where all mismatched cytosines are extruded, their extra-helical stacking additionally stabilizing the homoduplexes. PMID:29190385
Inheritance patterns of ATCCT repeat interruptions in spinocerebellar ataxia type 10 (SCA10) expansions.

PubMed

Landrian, Ivette; McFarland, Karen N; Liu, Jilin; Mulligan, Connie J; Rasmussen, Astrid; Ashizawa, Tetsuo

2017-01-01

Spinocerebellar ataxia type 10 (SCA10), an autosomal dominant cerebellar ataxia disorder, is caused by a non-coding ATTCT microsatellite repeat expansion in the ataxin 10 gene. In a subset of SCA10 families, the 5'-end of the repeat expansion contains a complex sequence of penta- and heptanucleotide interruption motifs which is followed by a pure tract of tandem ATCCT repeats of unknown length at its 3'-end. Intriguingly, expansions that carry these interruption motifs correlate with an epileptic seizure phenotype and are unstable despite the theory that interruptions are expected to stabilize expanded repeats. To examine the apparent contradiction of unstable, interruption-positive SCA10 expansion alleles and to determine whether the instability originates outside of the interrupted region, we sequenced approximately 1 kb of the 5'-end of SCA10 expansions using the ATCCT-PCR product in individuals across multiple generations from four SCA10 families. We found that the greatest instability within this region occurred in paternal transmissions of the allele in stretches of pure ATTCT motifs while the intervening interrupted sequences were stable. Overall, the ATCCT interruption changes by only one to three repeat units and therefore cannot account for the instability across the length of the disease allele. We conclude that the AT-rich interruptions locally stabilize the SCA10 expansion at the 5'-end but do not completely abolish instability across the entire span of the expansion. In addition, analysis of the interruption alleles across these families supports a parsimonious single origin of the mutation with a shared distant ancestor.
The presence of the ancestral insect telomeric motif in kissing bugs (Triatominae) rules out the hypothesis of its loss in evolutionarily advanced Heteroptera (Cimicomorpha)

PubMed Central

Pita, Sebastián; Panzera, Francisco; Mora, Pablo; Vela, Jesús; Palomeque, Teresa; Lorite, Pedro

2016-01-01

Next-generation sequencing data analysis on Triatoma infestans Klug, 1834 (Heteroptera, Cimicomorpha, Reduviidae) revealed the presence of the ancestral insect (TTAGG)n telomeric motif in its genome. Fluorescence in situ hybridization confirms that the chromosomes bear this telomeric sequence at their ends. Furthermore, the motif was estimated to make up about 0.03% of the total genome, so that the average telomere length at each chromosomal end is almost 18 kb. We also detected the (TTAGG)n telomeric repeat in mitotic and meiotic chromosomes of three other species of Triatominae: Triatoma dimidiata Latreille, 1811, Dipetalogaster maxima Uhler, 1894, and Rhodnius prolixus Ståhl, 1859. This is the first report of the (TTAGG)n telomeric repeat in the infraorder Cimicomorpha, contradicting the currently accepted hypothesis that evolutionarily recent heteropterans lack this ancestral insect telomeric sequence. PMID:27830050
Multiple TPR motifs characterize the Fanconi anemia FANCG protein.

PubMed

Blom, Eric; van de Vrugt, Henri J; de Vries, Yne; de Winter, Johan P; Arwert, Fré; Joenje, Hans

2004-01-05

The genome protection pathway that is defective in patients with Fanconi anemia (FA) is controlled by at least eight genes, including BRCA2. A key step in the pathway involves the monoubiquitylation of FANCD2, which critically depends on a multi-subunit nuclear 'core complex' of at least six FANC proteins (FANCA, -C, -E, -F, -G, and -L). Except for FANCL, which has WD40 repeats and a RING finger domain, no significant domain structure has so far been recognized in any of the core complex proteins. By using a homology search strategy comparing the human FANCG protein sequence with its ortholog sequences in Oryzias latipes (Japanese rice fish) and Danio rerio (zebrafish) we identified at least seven tetratricopeptide repeat motifs (TPRs) covering a major part of this protein. TPRs are degenerate 34-amino acid repeat motifs which function as scaffolds mediating protein-protein interactions, often found in multiprotein complexes. In four out of five TPR motifs tested (TPR1, -2, -5, and -6), targeted missense mutagenesis disrupting the motifs at the critical position 8 of each TPR caused complete or partial loss of FANCG function. Loss of function was evident from failure of the mutant proteins to complement the cellular FA phenotype in FA-G lymphoblasts, which was correlated with loss of binding to FANCA. Although the TPR4 mutant fully complemented the cells, it showed a reduced interaction with FANCA, suggesting that this TPR may also be of functional importance. The recognition of FANCG as a typical TPR protein predicts this protein to play a key role in the assembly and/or stabilization of the nuclear FA protein core complex.
A comprehensive analysis of three Asiatic black bear mitochondrial genomes (subspecies ussuricus, formosanus and mupinensis), with emphasis on the complete mtDNA sequence of Ursus thibetanus ussuricus (Ursidae).

PubMed

Hwang, Dae-Sik; Ki, Jang-Seu; Jeong, Dong-Hyuk; Kim, Bo-Hyun; Lee, Bae-Keun; Han, Sang-Hoon; Lee, Jae-Seong

2008-08-01

In the present paper, we describe the mitochondrial genome sequence of the Asiatic black bear (Ursus thibetanus ussuricus), with particular emphasis on the control region (CR), and compare it with other mitochondrial genomes to examine molecular relationships among the bears. The mitochondrial genome sequence of U. thibetanus ussuricus was 16,700 bp in size with mostly conserved structures (e.g. 13 protein-coding genes, two rRNA genes, 22 tRNA genes). The CR consisted of several typical conserved domains such as F, E, D, and C boxes, and a conserved sequence block. Nucleotide sequences and the repeated motifs in the CR differed among the bear species, and their copy numbers also varied among populations, even within F1 generations of U. thibetanus ussuricus. Comparative analyses showed that the CR D1 region was highly informative for discrimination within the bear family. These findings suggest that nucleotide sequences of both repeated motifs and CR D1 in the bear family are good markers for species discrimination.
Site-specific Isopeptide Bridge Tethering of Chimeric gp41 N-terminal Heptad Repeat Helical Trimers for the Treatment of HIV-1 Infection

PubMed Central

Wang, Chao; Li, Xue; Yu, Fei; Lu, Lu; Jiang, Xifeng; Xu, Xiaoyu; Wang, Huixin; Lai, Wenqing; Zhang, Tianhong; Zhang, Zhenqing; Ye, Ling; Jiang, Shibo; Liu, Keliang

2016-01-01

Peptides derived from the N-terminal heptad repeat (NHR) of HIV-1 gp41 can be potent inhibitors against viral entry when presented in a nonaggregating trimeric coiled-coil conformation via the introduction of exogenous trimerization motifs and intermolecular disulfide bonds. We recently discovered that crosslinking isopeptide bridges within the de novo helical trimers added exceptional resistance to unfolding. Herein, we attempted to optimize (CCIZN17)3, a representative disulfide bond-stabilized chimeric NHR-trimer, by incorporating site-specific interhelical isopeptide bonds as the redox-sensitive disulfide surrogate. In this process, we systematically examined the effect of isopeptide bond position and molecular sizes of auxiliary trimeric coiled-coil motif and NHR fragments on the antiviral potency of these NHR-trimers. Pleasingly, (IZ14N24N)3 possessed promising inhibitory activity against HIV-1 infection and markedly increased proteolytic stability relative to its disulfide-tethered counterpart, suggesting good potential for further development as an effective antiviral agent for treatment of HIV-1 infection. PMID:27562370
The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats

PubMed Central

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2007-01-01

Background: In Archaea and Bacteria, the repeated elements called CRISPRs for "clustered regularly interspaced short palindromic repeats" are believed to participate in the defence against viruses. Short sequences called spacers are stored in-between repeated elements. In the current model, motifs comprising spacers and repeats may target an invading DNA and lead to its degradation through a proposed mechanism similar to RNA interference. Analysis of intra-species polymorphism shows that new motifs (one spacer and one repeated element) are added in a polarised fashion. Although their principal characteristics have been described, a lot remains to be discovered on the way CRISPRs are created and evolve. As new genome sequences become available it appears necessary to develop automated scanning tools to make available CRISPR-related information and to facilitate additional investigations. Description: We have produced a program, CRISPRFinder, which identifies CRISPRs and extracts the repeated and unique sequences. Using this software, a database is constructed which is automatically updated monthly from newly released genome sequences. Additional tools were created to allow the alignment of flanking sequences in search for similarities between different loci and to build dictionaries of unique sequences. To date, almost six hundred CRISPRs have been identified in 475 published genomes. Two Archaea out of thirty-seven and about half of Bacteria do not possess a CRISPR. Fine analysis of repeated sequences strongly supports the current view that new motifs are added at one end of the CRISPR adjacent to the putative promoter. Conclusion: It is hoped that availability of a public database, regularly updated and which can be queried on the web, will help in further dissecting and understanding CRISPR structure and flanking sequences evolution. Subsequent analyses of the intra-species CRISPR polymorphism will be facilitated by CRISPRFinder and the dictionary creator.
CRISPRdb is accessible at PMID:17521438
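The idea of extracting "repeated and unique sequences" and building "dictionaries of unique sequences" can be sketched in a few lines of Python. This is not CRISPRFinder's algorithm, which the abstract does not describe; it is a minimal sketch that assumes the repeat is already known and occurs as exact copies, which real CRISPR arrays do not always satisfy. All names (`extract_spacers`, `build_spacer_dictionary`, the locus labels) and all sequences are made up for illustration.

```python
from collections import defaultdict

def extract_spacers(array_seq, repeat):
    """Return the spacers lying between consecutive exact copies of `repeat`."""
    parts = array_seq.upper().split(repeat.upper())
    # parts[0] and parts[-1] are the flanking sequences outside the array.
    return [p for p in parts[1:-1] if p]

def build_spacer_dictionary(loci):
    """Map each unique spacer to the list of locus names that contain it."""
    dictionary = defaultdict(list)
    for name, (array_seq, repeat) in loci.items():
        for spacer in extract_spacers(array_seq, repeat):
            if name not in dictionary[spacer]:
                dictionary[spacer].append(name)
    return dict(dictionary)

# Made-up repeat, far shorter than real CRISPR repeats, to keep the example readable.
REPEAT = "GTTCCGA"
loci = {
    "locus_1": ("AAAA" + REPEAT + "ACGTACGT" + REPEAT + "TTGGCCAA" + REPEAT + "CCCC", REPEAT),
    "locus_2": ("TTTT" + REPEAT + "TTGGCCAA" + REPEAT + "GGATCCTA" + REPEAT + "AAAA", REPEAT),
}
spacer_dict = build_spacer_dictionary(loci)
```

A spacer shared between loci (here `TTGGCCAA`) then shows up as a single dictionary key with more than one locus attached, which is the kind of cross-locus comparison the abstract's dictionary tool is meant to support.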
Regulatory elements involved in tax-mediated transactivation of the HTLV-I LTR.

PubMed

Seeler, J S; Muchardt, C; Podar, M; Gaynor, R B

1993-10-01

HTLV-I is the etiologic agent of adult T-cell leukemia. In this study, we investigated the regulatory elements and cellular transcription factors which function in modulating HTLV-I gene expression in response to the viral transactivator protein, tax. Transfection experiments into Jurkat cells of a variety of site-directed mutants in the HTLV-I LTR indicated that each of the three motifs A, B, and C within the 21-bp repeats, the binding sites for the Ets family of proteins, and the TATA box all influenced the degree of tax-mediated activation. Tax is also able to activate gene expression of other viral and cellular promoters. Tax activation of the IL-2 receptor and the HIV-1 LTR is mediated through NF-kappa B motifs. Interestingly, sequences in the 21-bp repeat B and C motifs contain significant homology with NF-kappa B regulatory elements. We demonstrated that an NF-kappa B binding protein, PRDII-BF1, but not the rel protein, bound to the B and C motifs in the 21-bp repeat. PRDII-BF1 was also able to stimulate activation of HTLV-I gene expression by tax. The role of the Ets proteins in modulating tax activation was also studied. Ets 1 but not Ets 2 was capable of increasing the degree of tax activation of the HTLV-I LTR. These results suggest that tax activates gene expression by either direct or indirect interaction with several cellular transcription factors that bind to the HTLV-I LTR.
Identification of a "glycine-loop"-like coiled structure in the 34 AA Pro,Gly,Met repeat domain of the biomineral-associated protein, PM27.

PubMed

Wustman, Brandon A; Santos, Rudolpho; Zhang, Bo; Evans, John Spencer

2002-12-05

Fracture resistance in biomineralized structures has been linked to the presence of proteins, some of which possess sequences that are associated with elastic behavior. One such protein superfamily, the Pro,Gly-rich sea urchin intracrystalline spicule matrix proteins, form protein-protein supramolecular assemblies that modify the microstructure and fracture-resistant properties of the calcium carbonate mineral phase within embryonic sea urchin spicules and adult sea urchin spines. In this report, we detail the identification of a repetitive keratin-like "glycine-loop"- or coil-like structure within the 34-AA (AA: amino acid) N-terminal domain, (PGMG)(8)PG, of the spicule matrix protein, PM27. The identification of this repetitive structural motif was accomplished using two capped model peptides: a 9-AA sequence, GPGMGPGMG, and a 34-AA peptide representing the entire motif. Using CD, NMR spectrometry, and molecular dynamics simulated annealing/minimization simulations, we have determined that the 9-AA model peptide adopts a loop-like structure at pH 7.4. The structure of the 34-AA polypeptide resembles a coil structure consisting of repeating loop motifs that do not exhibit long-range ordering. Given that loop structures have been associated with protein elastic behavior and protein motion, it is plausible that the 34-AA Pro,Gly,Met repeat sequence motif in PM27 represents a putative elastic or mobile domain. Copyright 2002 Wiley Periodicals, Inc.
Modeling of DNA local parameters predicts encrypted architectural motifs in Xenopus laevis ribosomal gene promoter

PubMed Central

Roux-Rouquie, Magali; Marilley, Monique

2000-01-01

We have modeled local DNA sequence parameters to search for DNA architectural motifs involved in transcription regulation and promotion within the Xenopus laevis ribosomal gene promoter and the intergenic spacer (IGS) sequences. The IGS was found to be shaped into distinct topological domains. First, intrinsic bends split the IGS into domains of common but different helical features. Local parameters at inter-domain junctions exhibit a high variability with respect to intrinsic curvature, bendability and thermal stability. Secondly, the repeated sequence blocks of the IGS exhibit right-handed supercoiled structures which could be related to their enhancer properties. Thirdly, the gene promoter presents both inherent curvature and minor groove narrowing which may be viewed as motifs of a structural code for protein recognition and binding. Such pre-existing deformations could simply be remodeled during the binding of the transcription complex. Alternatively, these deformations could pre-shape the promoter in such a way that further remodeling is facilitated. Mutations shown to abolish promoter curvature as well as intrinsic minor groove narrowing, in a variant which maintained full transcriptional activity, bring circumstantial evidence for structurally-preorganized motifs in relation to transcription regulation and promotion. Using well documented X.laevis rDNA regulatory sequences we showed that computer modeling may be of invaluable assistance in assessing encrypted architectural motifs. The evidence of these DNA topological motifs with respect to the concept of structural code is discussed. PMID:10982860
Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

PubMed

Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

2014-01-01

Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551
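The breakdown of SSRs by repeat-unit length, and the pairing of a motif with its reverse complement in notation such as AAG/CTT, can be illustrated with a short Python sketch. The abstract does not say which SSR-mining tool or thresholds were used, so the minimum repeat counts in `MIN_UNITS`, the function names and the test sequence below are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def canonical_motif(motif):
    """Smallest string among all rotations of the motif and of its reverse complement."""
    reverse_complement = motif.translate(COMPLEMENT)[::-1]
    candidates = []
    for m in (motif, reverse_complement):
        candidates.extend(m[i:] + m[:i] for i in range(len(m)))
    return min(candidates)

# Hypothetical minimum repeat counts per unit length (not taken from the paper).
MIN_UNITS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}
NAMES = {2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}

def find_ssrs(seq):
    """Return (class name, canonical motif, number of units) for each SSR found."""
    seq = seq.upper()
    hits = []
    for size, min_units in MIN_UNITS.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_units - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves repeats of a shorter motif, e.g. "AGAG".
            if any(unit == unit[:k] * (size // k) for k in range(1, size) if size % k == 0):
                continue
            hits.append((NAMES[size], canonical_motif(unit), len(match.group(0)) // size))
    return hits

# Made-up test sequence with an (AAG)6, a (CTT)5 and an (AG)7 run.
example = "CCGTC" + "AAG" * 6 + "GCTAG" + "CTT" * 5 + "ACGGT" + "AG" * 7 + "TCCGA"
ssrs = find_ssrs(example)
class_counts = Counter(name for name, _, _ in ssrs)
motif_counts = Counter(motif for _, motif, _ in ssrs)
```

Because `canonical_motif` treats a motif, its rotations and its reverse complement as one class, the (AAG)n and (CTT)n runs are counted under the same key, mirroring the AAG/CTT grouping in the abstract. Dividing each entry of `class_counts` by the total number of SSRs would give percentages of the kind reported above.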
Placement of molecules in (not out of) the cell

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dauter, Zbigniew, E-mail: dauter@anl.gov

2013-01-01

The importance of presenting macromolecular structures in unified, standard ways is discussed. To uniquely describe a crystal structure, it is sufficient to specify the crystal unit cell and symmetry, and describe the unique structural motif which is repeated by the space-group symmetry throughout the whole crystal. It is somewhat arbitrary how such a unique motif can be defined and positioned with respect to the unit-cell origin. As a result of such freedom, some isomorphous structures are presented in the Protein Data Bank in different locations and appear as if they have different atomic coordinates, despite being completely equivalent structurally. This may easily confuse those users of the PDB who are less familiar with crystallographic symmetry transformations. It would therefore be beneficial for the community of PDB users to introduce standard rules for locating crystal structures of macromolecules in the unit cells of various space groups.
Leucine-rich Repeats of Bacterial Surface Proteins Serve as Common Pattern Recognition Motifs of Human Scavenger Receptor gp340

PubMed Central

Loimaranta, Vuokko; Hytönen, Jukka; Pulliainen, Arto T.; Sharma, Ashu; Tenovuo, Jorma; Strömberg, Nicklas; Finne, Jukka

2009-01-01

Scavenger receptors are innate immune molecules recognizing and inducing the clearance of non-host as well as modified host molecules. To recognize a wide pattern of invading microbes, many scavenger receptors bind to common pathogen-associated molecular patterns, such as lipopolysaccharides and lipoteichoic acids. Similarly, the gp340/DMBT1 protein, a member of the human scavenger receptor cysteine-rich protein family, displays a wide ligand repertoire. The peptide motif VEVLXXXXW derived from its scavenger receptor cysteine-rich domains is involved in some of these interactions, but most of the recognition mechanisms are unknown. In this study, we used mass spectrometry sequencing, gene inactivation, and recombinant proteins to identify Streptococcus pyogenes protein Spy0843 as a recognition receptor of gp340. Antibodies against Spy0843 are shown to protect against S. pyogenes infection, but no function or host receptor has been identified for the protein. Spy0843 belongs to the leucine-rich repeat (Lrr) family of eukaryotic and prokaryotic proteins. Experiments with truncated forms of the recombinant proteins confirmed that the Lrr region is needed in the binding of Spy0843 to gp340. The same motif of two other Lrr proteins, LrrG from the Gram-positive S. agalactiae and BspA from the Gram-negative Tannerella forsythia, also mediated binding to gp340. Moreover, inhibition of Spy0843 binding occurred with peptides containing the VEVLXXXXW motif, but also peptides devoid of the XXXXW motif inhibited binding of Lrr proteins. These results thus suggest that the conserved Lrr motif in bacterial proteins serves as a novel pattern recognition motif for unique core peptides of human scavenger receptor gp340. PMID:19465482
Analysis decorating design on Perahu Buatan Barat, the Malay traditional boat by using frieze pattern

NASA Astrophysics Data System (ADS)

Wahab, Mohd Rohaizat Abdul; Ramli, Zuliskandar; Zakaria, Ros Mahwati Ahmad; Samad, Mohammad Anis Abdul

2017-01-01

Boat building is one of the traditional skills mastered by Malay craftsmen. The decoration on the Perahu Buatan Barat, the Malay traditional boat, is one of the distinctive features of traditional boat production on the East Coast of Malaysia. In the Malay boat-building tradition, each plank is given a specific name based on the line of planks it belongs to. One line, called 'papan tarik' or 'papan cantik', is usually decorated with paintings in a variety of motifs and patterns running from the bow to the stern of the boat. The motifs, usually taken from the surrounding environment, including flora and fauna, are painted repeatedly but in differing formations. The aim of this study is to identify the motifs and to analyze their formation using the mathematical method of frieze patterns.

Conserved DNA motifs in the type II-A CRISPR leader region.

PubMed

Van Orden, Mason J; Klein, Peter; Babu, Kesavan; Najar, Fares Z; Rajan, Rakhi

2017-01-01

The Clustered Regularly Interspaced Short Palindromic Repeats associated (CRISPR-Cas) systems consist of RNA-protein complexes that provide bacteria and archaea with sequence-specific immunity against bacteriophages, plasmids, and other mobile genetic elements. Bacteria and archaea become immune to phage or plasmid infections by inserting short pieces of the intruder DNA (spacer) site-specifically into the leader-repeat junction in a process called adaptation. Previous studies have shown that parts of the leader region, especially the 3' end of the leader, are indispensable for adaptation. However, a comprehensive analysis of leader ends remains absent. Here, we have analyzed the leader, repeat, and Cas proteins from 167 type II-A CRISPR loci. Our results indicate two distinct conserved DNA motifs at the 3' leader end: ATTTGAG (noted previously in the CRISPR1 locus of Streptococcus thermophilus DGCC7710) and a newly defined CTRCGAG, associated with the CRISPR3 locus of S. thermophilus DGCC7710. A third group with a very short CG DNA conservation at the 3' leader end is observed mostly in lactobacilli. Analysis of the repeats and Cas proteins revealed clustering of these CRISPR components that mirrors the leader motif clustering, in agreement with the coevolution of CRISPR-Cas components. Based on our analysis of the type II-A CRISPR loci, we implicate leader end sequences that could confer site-specificity for the adaptation-machinery in the different subsets of type II-A CRISPR loci. PMID:28392985
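The motif CTRCGAG uses the IUPAC ambiguity code R, which stands for a purine (A or G). As a minimal sketch of how leader sequences might be sorted by their 3' end motif, the Python below expands IUPAC codes into a regular expression and tests the end of each leader. It assumes the motif occupies exactly the last seven nucleotides of the leader, which the abstract does not state, and it lumps the third (short CG) group into "other" because its exact position is not given. The function names and test sequences are made up.

```python
import re

# Standard IUPAC nucleotide codes (subset sufficient for this example).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

def iupac_to_regex(motif):
    """Translate an IUPAC motif such as CTRCGAG into a regular expression."""
    return "".join(IUPAC[base] for base in motif.upper())

# The two conserved 3' leader-end motifs named in the abstract.
LEADER_END_MOTIFS = ("ATTTGAG", "CTRCGAG")

def classify_leader_end(leader):
    """Return the motif found at the 3' end of `leader`, or "other"."""
    leader = leader.upper()
    for motif in LEADER_END_MOTIFS:
        # Assumption: the motif sits flush against the leader-repeat junction.
        if re.search(iupac_to_regex(motif) + "$", leader):
            return motif
    return "other"

# Made-up leader fragments, written 5' to 3'.
groups = [classify_leader_end(s) for s in
          ("ggtacATTTGAG", "TTACTACGAG", "TTACTGCGAG", "TTACTCCGAG")]
```

Both `CTACGAG` and `CTGCGAG` fall into the CTRCGAG group because R matches either purine, whereas `CTCCGAG` does not.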
Identification and Analysis of Novel Amino-Acid Sequence Repeats in Bacillus anthracis str. Ames Proteome Using Computational Tools

PubMed Central

Hemalatha, G. R.; Rao, D. Satyanarayana; Guruprasad, L.

2007-01-01

We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region comprising less than 55-amino-acid residues that occur more than once in the protein sequence and sometimes present in tandem. A “domain” corresponds to a conserved region with greater than 55-amino-acid residues and may be present as single or multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. A repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure. PMID:17538688
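The length rule in this abstract (under 55 residues for a repeat, over 55 for a domain) can be checked against the listed elements with a few lines of Python. The lengths below are copied from the abstract; the function names are made up, and the sketch applies only the length criterion, ignoring the additional requirement that a repeat occur more than once in a protein. A length of exactly 55 is not covered by either definition, so it is reported separately.

```python
# Element lengths in amino-acid residues, as listed in the abstract.
ELEMENT_LENGTHS = {
    "PxV": 57, "FxF": 122, "YEFF": 111, "IMxxH": 109, "VxxT": 103,
    "ExW": 84, "NTGFIG": 104, "NxGK": 36, "VYV": 95, "KEWE": 75,
    "AFL": 59, "RIDVK": 53, "AGQF": 41, "GSAL": 42,
}

def classify(length, threshold=55):
    """Classify an element by length alone, following the abstract's definitions."""
    if length < threshold:
        return "repeat"
    if length > threshold:
        return "domain"
    return "unclassified"  # exactly 55 residues falls between the two definitions

def tally(lengths, threshold=55):
    """Count how many elements fall into each class."""
    counts = {"repeat": 0, "domain": 0, "unclassified": 0}
    for length in lengths.values():
        counts[classify(length, threshold)] += 1
    return counts

counts = tally(ELEMENT_LENGTHS)
repeats = sorted(name for name, n in ELEMENT_LENGTHS.items() if classify(n) == "repeat")
```

Applied to the listed lengths, the rule reproduces the four repeats and ten domains stated in the first sentence of the abstract.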
Molecular architecture of silk fibroin of Indian golden silkmoth, Antheraea assama.

PubMed

Gupta, Adarsh K; Mita, Kazuei; Arunkumar, Kallare P; Nagaraju, Javaregowda

2015-08-03

The golden silk spun by Indian golden silkmoth Antheraea assama, is regarded for its shimmering golden luster, tenacity and value as biomaterial. This report describes the gene coding for golden silk H-fibroin (AaFhc), its expression, full-length sequence and structurally important motifs discerning the underlying genetic and biochemical factors responsible for its much sought-after properties. The coding region, with biased isocodons, encodes highly repetitious crystalline core, flanked by a pair of 5' and 3' non-repetitious ends. AaFhc mRNA expression is strictly territorial, confined to the posterior silk gland, encoding a protein of size 230 kDa, which makes homodimers making the elementary structural units of the fibrous core of the golden silk. Characteristic polyalanine repeats that make tight β-sheet crystals alternate with non-polyalanine repeats that make less orderly antiparallel β-sheets, β-turns and partial α-helices. Phylogenetic analysis of the conserved N-terminal amorphous motif and the comparative analysis of the crystalline region with other saturniid H-fibroins reveal that AaFhc has longer, numerous and relatively uniform repeat motifs with lower serine content that assume tighter β-crystals and denser packing, which are speculated to be responsible for its acclaimed properties of higher tensile strength and higher refractive index responsible for golden luster.
[Comparative analysis of clustered regularly interspaced short palindromic repeats (CRISPRs) loci in the genomes of halophilic archaea].

PubMed

Zhang, Fan; Zhang, Bing; Xiang, Hua; Hu, Songnian

2009-11-01

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a widespread system that provides acquired resistance against phages in bacteria and archaea. Here we aim to analyze CRISPR genome-wide in the extremely halophilic archaea whose whole genome sequences are currently available. We used bioinformatics methods including alignment, conservation analysis, GC content and RNA structure prediction to analyze the CRISPR structures of 7 haloarchaeal genomes. We identified the CRISPR structures in 5 halophilic archaea and revealed a conserved palindromic motif in the flanking regions of these CRISPR structures. In addition, we found that the repeat sequences of large CRISPR structures in halophilic archaea were greatly conserved, and two types of predicted RNA secondary structures derived from the repeat sequences were likely determined by the fourth base of the repeat sequence. Our results support the proposal that the leader sequence may function as a recognition site by having palindromic structures in flanking regions, and the stem-loop secondary structure formed by repeat sequences may function in mediating the interaction between foreign genetic elements and CAS-encoded proteins.
Fragile DNA Motifs Trigger Mutagenesis at Distant Chromosomal Loci in Saccharomyces cerevisiae

PubMed Central

Saini, Natalie; Zhang, Yu; Nishida, Yuri; Sheng, Ziwei; Choudhury, Shilpa; Mieczkowski, Piotr; Lobachev, Kirill S.

2013-01-01

DNA sequences capable of adopting non-canonical secondary structures have been associated with gross-chromosomal rearrangements in humans and model organisms. Previously, we have shown that long inverted repeats that form hairpin and cruciform structures and triplex-forming GAA/TTC repeats induce the formation of double-strand breaks which trigger genome instability in yeast. In this study, we demonstrate that breakage at both inverted repeats and GAA/TTC repeats is augmented by defects in DNA replication. Increased fragility is associated with increased mutation levels in the reporter genes located as far as 8 kb from both sides of the repeats. The increase in mutations was dependent on the presence of inverted or GAA/TTC repeats and activity of the translesion polymerase Polζ. Mutagenesis induced by inverted repeats also required Sae2 which opens hairpin-capped breaks and initiates end resection. The amount of breakage at the repeats is an important determinant of mutations as a perfect palindromic sequence with inherently increased fragility was also found to elevate mutation rates even in replication-proficient strains. We hypothesize that the underlying mechanism for mutagenesis induced by fragile motifs involves the formation of long single-stranded regions in the broken chromosome, invasion of the undamaged sister chromatid for repair, and faulty DNA synthesis employing Polζ. These data demonstrate that repeat-mediated breaks pose a dual threat to eukaryotic genome integrity by inducing chromosomal aberrations as well as mutations in flanking genes. PMID:23785298
Mining and gene ontology based annotation of SSR markers from expressed sequence tags of Humulus lupulus

PubMed Central

Singh, Swati; Gupta, Sanchita; Mani, Ashutosh; Chaturvedi, Anoop

2012-01-01

Humulus lupulus, commonly known as hops, is a member of the family Moraceae. Many projects are currently underway, leading to the accumulation of voluminous genomic and expressed sequence tag (EST) sequences in public databases. The genetically characterized domains in these databases are limited due to the non-availability of reliable molecular markers. A large amount of EST sequence data is available for hops. In the present study, simple sequence repeat (SSR) markers extracted from EST data are used as molecular markers for genetic characterization. A total of 25,495 EST sequences were examined and assembled to get full-length sequences. The highest frequency was shown by mononucleotide SSR motifs (60.44% in contigs and 62.16% in singletons), whereas the lowest frequencies were observed for hexanucleotide SSRs in contigs (0.09%) and pentanucleotide SSRs in singletons (0.12%). The most frequent trinucleotide motif (GAA) codes for glutamic acid, while AT/TA was the most frequent dinucleotide repeat. Flanking primer pairs were designed in silico for the SSR-containing sequences. Functional categorization of SSR-containing sequences was done through gene ontology terms such as biological process, cellular component and molecular function. PMID:22368382
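The SSR mining step described in this abstract can be illustrated with a minimal Python sketch that scans a sequence for perfect tandem repeats of 1 to 6 nt motifs. The abstract does not name the tool or the thresholds used, so the function name and the minimum copy numbers below are assumptions chosen only for illustration.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect SSRs of 1-6 nt motifs."""
    # Hypothetical minimum copy numbers per motif length; not taken from the study.
    if min_repeats is None:
        min_repeats = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # One motif of length k followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "ATAT".
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

For example, on a made-up sequence with six GAA copies starting at position 3 and seven AT copies starting at position 26, the function reports those two loci and nothing else. Imperfect or compound repeats would need a more tolerant search than this regular expression provides.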
Structural basis of DNA sequence recognition by the response regulator PhoP in Mycobacterium tuberculosis.

PubMed

He, Xiaoyuan; Wang, Liqin; Wang, Shuishu

2016-04-15

The transcriptional regulator PhoP is an essential virulence factor in Mycobacterium tuberculosis, and it presents a target for the development of new anti-tuberculosis drugs and attenuated tuberculosis vaccine strains. PhoP binds to DNA as a highly cooperative dimer by recognizing direct repeats of 7-bp motifs with a 4-bp spacer. To elucidate the PhoP-DNA binding mechanism, we determined the crystal structure of the PhoP-DNA complex. The structure revealed a tandem PhoP dimer that bound to the direct repeat. The surprising tandem arrangement of the receiver domains allowed the four domains of the PhoP dimer to form a compact structure, accounting for the strict requirement of a 4-bp spacer and the highly cooperative binding of the dimer. The PhoP-DNA interactions exclusively involved the effector domain. The sequence-recognition helix made contact with the bases of the 7-bp motif in the major groove, and the wing interacted with the adjacent minor groove. The structure provides a starting point for the elucidation of the mechanism by which PhoP regulates the virulence of M. tuberculosis and guides the design of screening platforms for PhoP inhibitors.
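The binding-site architecture described in this abstract (two direct copies of a 7-bp motif separated by a 4-bp spacer) lends itself to a simple sequence scan. The sketch below is a hedged illustration only: the function name, the mismatch parameter and the test sequence are hypothetical, and the abstract does not give the actual PhoP consensus motif.

```python
def direct_repeats(seq, motif_len=7, spacer=4, max_mismatch=0):
    """Find windows where a motif_len-bp word recurs after a fixed spacer.

    Returns (start, first_copy, second_copy, mismatches) tuples.
    """
    seq = seq.upper()
    offset = motif_len + spacer  # distance between the starts of the two copies
    hits = []
    for i in range(len(seq) - offset - motif_len + 1):
        first = seq[i:i + motif_len]
        second = seq[i + offset:i + offset + motif_len]
        mismatches = sum(x != y for x, y in zip(first, second))
        if mismatches <= max_mismatch:
            hits.append((i, first, second, mismatches))
    return hits
```

Raising max_mismatch lets the scan pick up imperfect repeats, at the cost of also reporting windows shifted by one base around a true site, so hits would then need to be merged or ranked by mismatch count.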
The novel Tau mutation G335S: clinical, neuropathological and molecular characterization.

PubMed

Spina, Salvatore; Murrell, Jill R; Yoshida, Hirotaka; Ghetti, Bernardino; Bermingham, Niamh; Sweeney, Brian; Dlouhy, Stephen R; Crowther, R Anthony; Goedert, Michel; Keohane, Catherine

2007-04-01

Mutations in Tau cause the inherited neurodegenerative disease, frontotemporal dementia and Parkinsonism linked to chromosome 17 (FTDP-17). Known coding region mutations cluster in the microtubule-binding region, where they alter the ability of tau to promote microtubule assembly. Depending on the tau isoforms, this region consists of three or four imperfect repeats of 31 or 32 amino acids, each of which contains a characteristic and invariant PGGG motif. Here, we report the novel G335S mutation, which changes the PGGG motif of the third tau repeat to PGGS, in an individual who developed social withdrawal, emotional bluntness and stereotypic behavior at age 22, followed by disinhibition, hyperorality and ideomotor apraxia. Abundant tau-positive inclusions were present in neurons and glia in the frontotemporal cortex, hippocampus and brainstem. Sarkosyl-insoluble tau showed paired helical and straight filaments, as well as more irregular rope-like filaments. The pattern of pathological tau bands was like that of Alzheimer disease. Experimentally, the G335S mutation resulted in a greatly reduced ability of tau to promote microtubule assembly, while having no significant effect on heparin-induced assembly of recombinant tau into filaments.
Development and characterization of simple sequence repeats for Bipolaris sorokiniana and cross transferability to related species

USDA-ARS's Scientific Manuscript database

Simple sequence repeats (SSR) markers were developed from a small insert genomic library for Bipolaris sorokiniana, a mitosporic fungal pathogen that causes spot blotch and root rot in switchgrass. About 59% of sequenced clones (n=384) harbored various SSR motifs. After eliminating the redundant seq...
Are the TTAGG and TTAGGG telomeric repeats phylogenetically conserved in aculeate Hymenoptera?

NASA Astrophysics Data System (ADS)

Menezes, Rodolpho S. T.; Bardella, Vanessa B.; Cabral-de-Mello, Diogo C.; Lucena, Daercio A. A.; Almeida, Eduardo A. B.

2017-10-01

Although the (TTAGG)n telomeric repeat is thought to be the ancestral DNA motif of telomeres in insects, it was repeatedly lost within some insect orders. Notably, parasitoid hymenopterans and the social wasp Metapolybia decorata (Gribodo) lack the (TTAGG)n sequence, but the motif has been observed in other representatives of Hymenoptera, such as different ant species and the honeybee. These findings raise the question of whether the insect telomeric repeat is phylogenetically predominant in Hymenoptera. Thus, we evaluated the occurrence of both the (TTAGG)n sequence and the vertebrate telomere sequence (TTAGGG)n using dot-blotting hybridization in 25 aculeate species of Hymenoptera. Our results revealed the absence of the (TTAGG)n sequence in all tested species, raising the number of hymenopteran families lacking this telomeric sequence to 13 out of the 15 families tested so far. The (TTAGGG)n sequence was not observed in any tested species. Based on our data and compiled information, we suggest that the (TTAGG)n sequence was putatively lost in the ancestor of Apocrita with at least two subsequent independent regains (in Formicidae and Apidae).
NF-Y Binding Site Architecture Defines a C-Fos Targeted Promoter Class

PubMed Central

Haubrock, Martin; Hartmann, Fabian; Wingender, Edgar

2016-01-01

ChIP-seq experiments detect the chromatin occupancy of known transcription factors in a genome-wide fashion. The comparisons of several species-specific ChIP-seq libraries done for different transcription factors have revealed a complex combinatorial and context-specific co-localization behavior for the identified binding regions. In this study we have investigated human derived ChIP-seq data to identify common cis-regulatory principles for the human transcription factor c-Fos. We found that in four different cell lines, c-Fos targeted proximal and distal genomic intervals show prevalences for either AP-1 motifs or CCAAT boxes as known binding motifs for the transcription factor NF-Y, and thereby act in a mutually exclusive manner. For proximal regions of co-localized c-Fos and NF-YB binding, we gathered evidence that a characteristic configuration of repeating CCAAT motifs may be responsible for attracting c-Fos, probably provided by a nearby AP-1 bound enhancer. Our results suggest a novel regulatory function of NF-Y in gene-proximal regions. Specific CCAAT dimer repeats bound by the transcription factor NF-Y define this novel cis-regulatory module. Based on this behavior we propose a new enhancer promoter interaction model based on AP-1 motif defined enhancers which interact with CCAAT-box characterized promoter regions. PMID:27517874
Comparative genomics of pyridoxal 5′-phosphate-dependent transcription factor regulons in Bacteria

PubMed Central

Suvorova, Inna A.

2016-01-01

The MocR-subfamily transcription factors (MocR-TFs) characterized by the GntR-family DNA-binding domain and aminotransferase-like sensory domain are broadly distributed among certain lineages of Bacteria. Characterized MocR-TFs bind pyridoxal 5′-phosphate (PLP) and control transcription of genes involved in PLP, gamma aminobutyric acid (GABA) and taurine metabolism via binding specific DNA operator sites. To identify putative target genes and DNA binding motifs of MocR-TFs, we performed comparative genomics analysis of over 250 bacterial genomes. The reconstructed regulons for 825 MocR-TFs comprise structural genes from over 200 protein families involved in diverse biological processes. Using the genome context and metabolic subsystem analysis we tentatively assigned functional roles for 38 out of 86 orthologous groups of studied regulators. Most of these MocR-TF regulons are involved in PLP metabolism, as well as utilization of GABA, taurine and ectoine. The remaining studied MocR-TF regulators presumably control genes encoding enzymes involved in reduction/oxidation processes, various transporters and PLP-dependent enzymes, for example aminotransferases. Predicted DNA binding motifs of MocR-TFs are generally similar in each orthologous group and are characterized by two to four repeated sequences. Identified motifs were classified according to their structures. Motifs with direct and/or inverted repeat symmetry constitute the majority of inferred DNA motifs, suggesting preferable TF dimerization in head-to-tail or head-to-head configuration. The obtained genomic collection of in silico reconstructed MocR-TF motifs and regulons in Bacteria provides a basis for future experimental characterization of molecular mechanisms for various regulators in this family. PMID:28348826
Base-Pairing Energies of Protonated Nucleoside Base Pairs of dCyd and m5dCyd: Implications for the Stability of DNA i-Motif Conformations

NASA Astrophysics Data System (ADS)

Yang, Bo; Rodgers, M. T.

2015-08-01

Hypermethylation of cytosine in expanded (CCG)n•(CGG)n trinucleotide repeats results in Fragile X syndrome, the most common cause of inherited mental retardation. The (CCG)n•(CGG)n repeats adopt i-motif conformations that are preferentially stabilized by base-pairing interactions of protonated base pairs of cytosine. Here we investigate the effects of 5-methylation and the sugar moiety on the base-pairing energies (BPEs) of protonated cytosine base pairs by examining protonated nucleoside base pairs of 2'-deoxycytidine (dCyd) and 5-methyl-2'-deoxycytidine (m5dCyd) using threshold collision-induced dissociation techniques. 5-Methylation of a single or both cytosine residues leads to very small change in the BPE. However, the accumulated effect may be dramatic in diseased state trinucleotide repeats where many methylated base pairs may be present. The BPEs of the protonated nucleoside base pairs examined here significantly exceed those of Watson-Crick dGuo•dCyd and neutral dCyd•dCyd base pairs, such that these base-pairing interactions provide the major forces responsible for stabilization of DNA i-motif conformations. Compared with isolated protonated nucleobase pairs of cytosine and 1-methylcytosine, the 2'-deoxyribose sugar produces an effect similar to the 1-methyl substituent, and leads to a slight decrease in the BPE. These results suggest that the base-pairing interactions may be slightly weaker in nucleic acids, but that the extended backbone is likely to exert a relatively small effect on the total BPE. The proton affinity (PA) of m5dCyd is also determined by competitive analysis of the primary dissociation pathways that occur in parallel for the protonated (m5dCyd)H+(dCyd) nucleoside base pair and the absolute PA of dCyd previously reported.
Cyclin-dependent kinase (CDK) phosphorylation destabilizes somatic Wee1 via multiple pathways

PubMed Central

Watanabe, Nobumoto; Arai, Harumi; Iwasaki, Jun-ichi; Shiina, Masaaki; Ogata, Kazuhiro; Hunter, Tony; Osada, Hiroyuki

2005-01-01

At the onset of M phase, the activity of somatic Wee1 (Wee1A), the inhibitory kinase for cyclin-dependent kinase (CDK), is down-regulated primarily through proteasome-dependent degradation after ubiquitination by the E3 ubiquitin ligase SCFβ-TrCP. The F-box protein β-TrCP (β-transducin repeat-containing protein), the substrate recognition component of the ubiquitin ligase, binds to its substrates through a conserved binding motif (phosphodegron) containing two phosphoserines, DpSGXXpS. Although Wee1A lacks this motif, phosphorylation of serines 53 and 123 (S53 and S123) of Wee1A by polo-like kinase 1 (Plk1) and CDK, respectively, are required for binding to β-TrCP. The sequence surrounding phosphorylated S53 (DpSAFQE) is similar to the conserved β-TrCP-binding motif; however, the role of S123 phosphorylation (EEGFGSSpSPVK) in β-TrCP binding was not elucidated. In the present study, we show that phosphorylation of S123 (pS123) by CDK promoted the binding of Wee1A to β-TrCP through three independent mechanisms. The pS123 not only directly interacted with basic residues in the WD40 repeat domain of β-TrCP but also primed phosphorylation by two independent protein kinases, Plk1 and CK2 (formerly casein kinase 2), to create two phosphodegrons on Wee1A. In the case of Plk1, S123 phosphorylation created a polo box domain-binding motif (SpSP) on Wee1A to accelerate phosphorylation of S53 by Plk1. CK2 could phosphorylate S121, but only if S123 was phosphorylated first, thereby generating the second β-TrCP-binding site (EEGFGpS121). Using a specific inhibitor of CK2, we showed that the phosphorylation-dependent degradation of Wee1A is important for the proper onset of mitosis. PMID:16085715
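The DpSGXXpS consensus quoted in this abstract can be turned into a simple scan for candidate sites in a protein sequence. This is a rough sketch under stated assumptions: a plain amino acid sequence carries no phosphorylation information, so the pattern below matches only the unphosphorylated backbone D-S-G-X-X-S, and the function name and test sequence are hypothetical.

```python
import re

# Unphosphorylated form of the DpSGXXpS consensus: D, S, G, any two residues, S.
DEGRON = re.compile(r"DSG..S")

def candidate_degrons(protein):
    """Return (start, matched_segment) for each non-overlapping candidate site."""
    return [(m.start(), m.group(0)) for m in DEGRON.finditer(protein.upper())]
```

Consistent with the statement that Wee1A lacks the canonical motif, the S53 context quoted in the abstract (DSAFQE without the phospho mark) is not matched by this pattern. A match is only a candidate; whether the serines are actually phosphorylated has to be established experimentally.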
ATM activation and its recruitment to damaged DNA require binding to the C terminus of Nbs1.

PubMed

You, Zhongsheng; Chahwan, Charly; Bailis, Julie; Hunter, Tony; Russell, Paul

2005-07-01

ATM has a central role in controlling the cellular responses to DNA damage. It and other phosphoinositide 3-kinase-related kinases (PIKKs) have giant helical HEAT repeat domains in their amino-terminal regions. The functions of these domains in PIKKs are not well understood. ATM activation in response to DNA damage appears to be regulated by the Mre11-Rad50-Nbs1 (MRN) complex, although the exact functional relationship between the MRN complex and ATM is uncertain. Here we show that two pairs of HEAT repeats in fission yeast ATM (Tel1) interact with an FXF/Y motif at the C terminus of Nbs1. This interaction resembles nucleoporin FXFG motif binding to HEAT repeats in importin-beta. Budding yeast Nbs1 (Xrs2) appears to have two FXF/Y motifs that interact with Tel1 (ATM). In Xenopus egg extracts, the C terminus of Nbs1 recruits ATM to damaged DNA, where it is subsequently autophosphorylated. This interaction is essential for ATM activation. A C-terminal 147-amino-acid fragment of Nbs1 that has the Mre11- and ATM-binding domains can restore ATM activation in an Nbs1-depleted extract. We conclude that an interaction between specific HEAT repeats in ATM and the C-terminal FXF/Y domain of Nbs1 is essential for ATM activation. We propose that conformational changes in the MRN complex that occur upon binding to damaged DNA are transmitted through the FXF/Y-HEAT interface to activate ATM. This interaction also retains active ATM at sites of DNA damage.
Characterization of carotenoid hydroxylase gene promoter in Haematococcus pluvialis.

PubMed

Meng, C X; Wei, W; Su, Z-L; Qin, S

2006-10-01

Astaxanthin, a high-value ketocarotenoid, is mainly used in fish aquaculture. It also has potential in human health because its antioxidant capacity is higher than that of beta-carotene and vitamin E. The unicellular green alga Haematococcus pluvialis is known to accumulate astaxanthin in response to environmental stresses, such as high light intensity and salt stress. Carotenoid hydroxylase plays a key role in astaxanthin biosynthesis in H. pluvialis. In this paper, we report the characterization of a promoter-like region (-378 to -22 bp) of the carotenoid hydroxylase gene by cloning, sequence analysis and functional verification of its 919 bp 5'-flanking region in H. pluvialis. The 5'-flanking region was characterized using the micro-particle bombardment method and transient expression of the LacZ reporter gene. Results of sequence analysis showed that the 5'-flanking region might have putative cis-acting elements, such as ABA (abscisic acid)-responsive element (ABRE), C-repeat/dehydration responsive element (C-repeat/DRE), ethylene-responsive element (ERE), heat-shock element (HSE), wound-responsive element (WUN-motif), gibberellin-responsive element (P-box), MYB-binding site (MBS) etc., except for typical TATA and CCAAT boxes. Results of the 5' deletion constructs and beta-galactosidase assays revealed that the region with the highest promoter-like activity might lie from -378 to -22 bp and that some negative regulatory elements might lie in the region from -919 to -378 bp. Results of site-directed mutagenesis of a putative C-repeat/DRE and an ABRE-like motif in the promoter-like region (-378 to -22 bp) indicated that the putative C-repeat/DRE and ABRE-like motif might be important for expression of the carotenoid hydroxylase gene.
Experimental definition of a clustered regularly interspaced short palindromic duplicon in Escherichia coli.

PubMed

Goren, Moran G; Yosef, Ido; Auster, Oren; Qimron, Udi

2012-10-12

We analyzed sequences of newly inserted repeats in an Escherichia coli CRISPR (clustered regularly interspaced short palindromic repeats) array in vivo and showed that a base previously thought to belong to the repeat is actually derived from a protospacer. Based on further experimental results, we propose to use the term "duplicon" for a repeated sequence in a CRISPR array that serves as a template for a new duplicon. Our findings suggest the possibility of redrawing the borders between repeats, spacers, and protospacer adjacent motifs.
Genome-wide analysis of tandem repeats in plants and green algae

Treesearch

Zhixin Zhao; Cheng Guo; Sreeskandarajan Sutharzan; Pei Li; Craig Echt; Jie Zhang; Chun Liang

2014-01-01

Tandem repeats (TRs) extensively exist in the genomes of prokaryotes and eukaryotes. Based on the sequenced genomes and gene annotations of 31 plant and algal species in Phytozome version 8.0 (http://www.phytozome.net/), we examined TRs in a genome-wide scale, characterized their distributions and motif features, and explored their putative biological functions. Among...
Cross-species transferability and mapping of genomic and cDNA SSRs in pines

Treesearch

D. Chagne; P. Chaumeil; A. Ramboer; C. Collada; A. Guevara; M. T. Cervera; G. G. Vendramin; V. Garcia; J-M. Frigerio; Craig Echt; T. Richardson; Christophe Plomion

2004-01-01

Two unigene datasets of Pinus taeda and Pinus pinaster were screened to detect di-, tri- and tetranucleotide repeated motifs using the SSRIT script. A total of 419 simple sequence repeats (SSRs) were identified, from which only 12.8% overlapped between the two sets. The position of the SSRs within the coding sequence were predicted...

Validation of a screening tool for the rapid and reliable detection of CGG trinucleotide repeat expansions in FMR1.

PubMed

Basehore, Monica J; Marlowe, Natalia M; Jones, Julie R; Behlendorf, Deborah E; Laver, Thomas A; Friez, Michael J

2012-06-01

Most individuals with intellectual disability and/or autism are tested for Fragile X syndrome at some point in their lifetime. Greater than 99% of individuals with Fragile X have an expanded CGG trinucleotide repeat motif in the promoter region of the FMR1 gene, and diagnostic testing involves determining the size of the CGG repeat as well as methylation status when an expansion is present. Using a previously described triplet repeat-primed polymerase chain reaction, we have performed additional validation studies using two cohorts with previous diagnostic testing results available for comparison purposes. The first cohort (n=88) consisted of both males and females and had a high percentage of abnormal samples, while the second cohort (n=624) consisted of only females and was not enriched for expansion mutations. Data from each cohort were completely concordant with the results previously obtained during the course of diagnostic testing. This study further demonstrates the utility of using laboratory-developed triplet repeat-primed FMR1 testing in a clinical setting.
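As a purely computational illustration of what "determining the size of the CGG repeat" means when a sequence read is available, the sketch below reports the longest uninterrupted run of CGG triplets. The study itself relies on triplet repeat-primed PCR rather than on sequence scanning, so this is not the validated method; the function name and the example read are hypothetical.

```python
import re

def longest_cgg_run(seq):
    """Number of CGG copies in the longest uninterrupted (CGG)n run, 0 if none."""
    runs = re.findall(r"(?:CGG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)
```

The function deliberately returns a copy number rather than a clinical category, since the abstract does not state the size thresholds used to classify an allele as expanded.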
A Legionella pneumophila collagen-like protein encoded by a gene with a variable number of tandem repeats is involved in the adherence and invasion of host cells.

PubMed

Vandersmissen, Liesbeth; De Buck, Emmy; Saels, Veerle; Coil, David A; Anné, Jozef

2010-05-01

Legionella pneumophila is a Gram-negative, facultative intracellular pathogen and the causative agent of Legionnaires' disease, a severe pneumonia in humans. Analysis of the Legionella sequenced genomes revealed a gene with a variable number of tandem repeats (VNTRs), whose number varies between strains. We examined the strain distribution of this gene among a collection of 108 clinical, environmental and hot spring serotype I strains. Twelve variants were identified, but no correlation was observed between the number of repeat units and clinical and environmental strains. The encoded protein contains the C-terminal consensus motif of outer membrane proteins and has a large region of collagen-like repeats that is encoded by the VNTR region. We have therefore annotated this protein Lcl for Legionella collagen-like protein. Lcl was shown to contribute to the adherence and invasion of host cells and it was demonstrated that the number of repeat units present in lcl had an influence on these adhesion characteristics.
Structural Basis of PP2A Inhibition by Small t Antigen

PubMed Central

Cho, Uhn Soo; Morrone, Seamus; Sablina, Anna A; Arroyo, Jason D; Hahn, William C; Xu, Wenqing

2007-01-01

The SV40 small t antigen (ST) is a potent oncoprotein that perturbs the function of protein phosphatase 2A (PP2A). ST directly interacts with the PP2A scaffolding A subunit and alters PP2A activity by displacing regulatory B subunits from the A subunit. We have determined the crystal structure of full-length ST in complex with PP2A A subunit at 3.1 Å resolution. ST consists of an N-terminal J domain and a C-terminal unique domain that contains two zinc-binding motifs. Both the J domain and second zinc-binding motif interact with the intra-HEAT-repeat loops of HEAT repeats 3–7 of the A subunit, which overlaps with the binding site of the PP2A B56 subunit. Intriguingly, the first zinc-binding motif is in a position that may allow it to directly interact with and inhibit the phosphatase activity of the PP2A catalytic C subunit. These observations provide a structural basis for understanding the oncogenic functions of ST. PMID:17608567
MSDB: A Comprehensive Database of Simple Sequence Repeats

PubMed Central

Avvaru, Akshay Kumar; Saxena, Saketh; Mishra, Rakesh Kumar

2017-01-01

Microsatellites, also known as Simple Sequence Repeats (SSRs), are short tandem repeats of 1–6 nt motifs present in all genomes, particularly eukaryotes. Besides their usefulness as genome markers, SSRs have been shown to perform important regulatory functions, and variations in their length at coding regions are linked to several disorders in humans. Microsatellites show a taxon-specific enrichment in eukaryotic genomes, and some may be functional. MSDB (Microsatellite Database) is a collection of >650 million SSRs from 6,893 species including Bacteria, Archaea, Fungi, Plants, and Animals. This database is by far the most exhaustive resource to access and analyze SSR data of multiple species. In addition to exploring data in a customizable tabular format, users can view and compare the data of multiple species simultaneously using our interactive plotting system. MSDB is developed using the Django framework and MySQL. It is freely available at http://tdb.ccmb.res.in/msdb. PMID:28854643
Characterization of Spindle Checkpoint Kinase Mps1 Reveals Domain with Functional and Structural Similarities to Tetratricopeptide Repeat Motifs of Bub1 and BubR1 Checkpoint Kinases

PubMed Central

Lee, Semin; Thebault, Philippe; Freschi, Luca; Beaufils, Sylvie; Blundell, Tom L.; Landry, Christian R.; Bolanos-Garcia, Victor M.; Elowe, Sabine

2012-01-01

Kinetochore targeting of the mitotic kinases Bub1, BubR1, and Mps1 has been implicated in efficient execution of their functions in the spindle checkpoint, the self-monitoring system of the eukaryotic cell cycle that ensures chromosome segregation occurs with high fidelity. In all three kinases, kinetochore docking is mediated by the N-terminal region of the protein. Deletions within this region result in checkpoint failure and chromosome segregation defects. Here, we use an interdisciplinary approach that includes biophysical, biochemical, cell biological, and bioinformatics methods to study the N-terminal region of human Mps1. We report the identification of a tandem repeat of the tetratricopeptide repeat (TPR) motif in the N-terminal kinetochore binding region of Mps1, with close homology to the tandem TPR motif of Bub1 and BubR1. Phylogenetic analysis indicates that TPR Mps1 was acquired after the split between deuterostomes and protostomes, as it is distinguishable in chordates and echinoderms. Overexpression of TPR Mps1 resulted in decreased efficiency of both chromosome alignment and mitotic arrest, likely through displacement of endogenous Mps1 from the kinetochore and decreased Mps1 catalytic activity. Taken together, our multidisciplinary strategy provides new insights into the evolution, structural organization, and function of the Mps1 N-terminal region. PMID:22187426
Characterization of spindle checkpoint kinase Mps1 reveals domain with functional and structural similarities to tetratricopeptide repeat motifs of Bub1 and BubR1 checkpoint kinases.

PubMed

Lee, Semin; Thebault, Philippe; Freschi, Luca; Beaufils, Sylvie; Blundell, Tom L; Landry, Christian R; Bolanos-Garcia, Victor M; Elowe, Sabine

2012-02-17

Kinetochore targeting of the mitotic kinases Bub1, BubR1, and Mps1 has been implicated in efficient execution of their functions in the spindle checkpoint, the self-monitoring system of the eukaryotic cell cycle that ensures chromosome segregation occurs with high fidelity. In all three kinases, kinetochore docking is mediated by the N-terminal region of the protein. Deletions within this region result in checkpoint failure and chromosome segregation defects. Here, we use an interdisciplinary approach that includes biophysical, biochemical, cell biological, and bioinformatics methods to study the N-terminal region of human Mps1. We report the identification of a tandem repeat of the tetratricopeptide repeat (TPR) motif in the N-terminal kinetochore binding region of Mps1, with close homology to the tandem TPR motif of Bub1 and BubR1. Phylogenetic analysis indicates that TPR Mps1 was acquired after the split between deuterostomes and protostomes, as it is distinguishable in chordates and echinoderms. Overexpression of TPR Mps1 resulted in decreased efficiency of both chromosome alignment and mitotic arrest, likely through displacement of endogenous Mps1 from the kinetochore and decreased Mps1 catalytic activity. Taken together, our multidisciplinary strategy provides new insights into the evolution, structural organization, and function of the Mps1 N-terminal region.
Two synthetic Sp1-binding sites functionally substitute for the 21-base-pair repeat region to activate simian virus 40 growth in CV-1 cells.

PubMed Central

Lednicky, J; Folk, W R

1992-01-01

The 21-bp repeat region of simian virus 40 (SV40) activates viral transcription and DNA replication and contains binding sites for many cellular proteins, including Sp1, LSF, ETF, Ap2, Ap4, GT-1B, H16, and p53, and for the SV40 large tumor antigen. We have attempted to reduce the complexity of this region while maintaining its growth-promoting capacity. Deletion of the 21-bp repeat region from the SV40 genome delays the expression of viral early proteins and DNA replication and reduces virus production in CV-1 cells. Replacement of the 21-bp repeat region with two copies of DNA sequence motifs bound with high affinities by Sp1 promotes SV40 growth in CV-1 cells to nearly wild-type levels, but substitution by motifs bound less avidly by Sp1 or bound by other activator proteins does not restore growth. This indicates that Sp1 or a protein with similar sequence specificity is primarily responsible for the function of the 21-bp repeat region. We speculate about how Sp1 activates both SV40 transcription and DNA replication. PMID:1328672
Evaluation of the Pattern of EPIYA Motifs in the Helicobacter pylori cagA Gene of Patients with Gastritis and Gastric Adenocarcinoma from the Brazilian Amazon Region

PubMed Central

Vilar e Silva, Adenielson; Junior, Mario Ribeiro da Silva; Vinagre, Ruth Maria Dias Ferreira; Santos, Kemper Nunes; da Costa, Renata Aparecida Andrade; Fecury, Amanda Alves; Quaresma, Juarez Antônio Simões; Martins, Luisa Caricio

2014-01-01

Helicobacter pylori is associated with the development of different diseases. The clinical outcome of infection may be associated with the cagA bacterial genotype. The aim of this study was to determine the EPIYA patterns of strains isolated from patients with gastritis and gastric adenocarcinoma and correlate these patterns with the histopathological features. Gastric biopsy samples were selected from 384 patients infected with H. pylori, including 194 with chronic gastritis and 190 with gastric adenocarcinoma. The presence of the cagA gene and the EPIYA motif was determined by PCR. The cagA gene was more prevalent in patients with gastric cancer and was associated with a higher degree of inflammation, neutrophil activity, and development of intestinal metaplasia. The number of EPIYA-C repeats showed a significant association with an increased risk of gastric carcinoma (OR = 3.79, 95% CI = 1.92–7.46, and P = 0.002). A larger number of EPIYA-C motifs was also associated with intestinal metaplasia. In the present study, infection with H. pylori strains harboring more than one EPIYA-C motif in the cagA gene was associated with the development of intestinal metaplasia and gastric adenocarcinoma but not with neutrophil activity or degree of inflammation. PMID:26904732
Development of expressed sequence tag-simple sequence repeat markers for genetic characterization and population structure analysis of Praxelis clematidea (Asteraceae).

PubMed

Wang, Q Z; Huang, M; Downie, S R; Chen, Z X

2016-05-23

Invasive plants tend to spread aggressively in new habitats and an understanding of their genetic diversity and population structure is useful for their management. In this study, expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed for the invasive plant species Praxelis clematidea (Asteraceae) from 5548 Stevia rebaudiana (Asteraceae) expressed sequence tags (ESTs). A total of 133 microsatellite-containing ESTs (2.4%) were identified, of which 56 (42.1%) were hexanucleotide repeat motifs and 50 (37.6%) were trinucleotide repeat motifs. Of the 24 primer pairs designed from these 133 ESTs, 7 (29.2%) resulted in significant polymorphisms. The number of alleles per locus ranged from 5 to 9. The relatively high genetic diversity (H = 0.2667, I = 0.4212, and P = 100%) of P. clematidea was related to high gene flow (Nm = 1.4996) among populations. The coefficient of population differentiation (GST = 0.2500) indicated that most genetic variation occurred within populations. A Mantel test suggested that there was significant correlation between genetic distance and geographical distribution (r = 0.3192, P = 0.012). These results further support the transferability of EST-SSR markers between closely related genera of the same family.
Chlorovirus Skp1-binding ankyrin repeat protein interplay and mimicry of cellular ubiquitin ligase machinery.

PubMed

Noel, Eric A; Kang, Ming; Adamec, Jiri; Van Etten, James L; Oyler, George A

2014-12-01

The ubiquitin-proteasome system is targeted by many viruses that have evolved strategies to redirect host ubiquitination machinery. Members of the genus Chlorovirus are proposed to share an ancestral lineage with a broader group of related viruses, nucleo-cytoplasmic large DNA viruses (NCLDV). Chloroviruses encode an Skp1 homolog and ankyrin repeat (ANK) proteins. Several chlorovirus-encoded ANK repeats contain C-terminal domains characteristic of cellular F-boxes or related NCLDV chordopox PRANC (pox protein repeats of ankyrin at C-terminal) domains. These observations suggested that this unique combination of Skp1 and ANK repeat proteins might form complexes analogous to the cellular Skp1-Cul1-F-box (SCF) ubiquitin ligase complex. We identified two ANK proteins from the prototypic chlorovirus Paramecium bursaria chlorella virus-1 (PBCV-1) that functioned as binding partners for the virus-encoded Skp1, proteins A682L and A607R. These ANK proteins had a C-terminal Skp1 interactional motif that functioned similarly to cellular F-box domains. A C-terminal motif of ANK protein A682L binds Skp1 proteins from widely divergent species. Yeast two-hybrid analyses using serial domain deletion constructs confirmed the C-terminal localization of the Skp1 interactional motif in PBCV-1 A682L. ANK protein A607R represents an ANK family with one member present in all 41 sequenced chloroviruses. A comprehensive phylogenetic analysis of these related ANK and viral Skp1 proteins suggested partnered function tailored to the host alga or common ancestral heritage. Here, we show protein-protein interaction between corresponding family clusters of virus-encoded ANK and Skp1 proteins from three chlorovirus types. Collectively, our results indicate that chloroviruses have evolved complementing Skp1 and ANK proteins that mimic cellular SCF-associated proteins. Viruses have evolved ways to direct ubiquitination events in order to create environments conducive to their replication. 
As reported in the manuscript, the large chloroviruses encode several components involved in the SCF ubiquitin ligase complex including a viral Skp1 homolog. Studies on how chloroviruses manipulate their host algal ubiquitination system will provide insights toward viral protein mimicry, substrate recognition, and key interactive domains controlling selective protein degradation. These findings may also further understanding of the evolution of other large DNA viruses, like poxviruses, that are reported to share the same monophyly lineage as chloroviruses. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Correlation between fibroin amino acid sequence and physical silk properties.

PubMed

Fedic, Robert; Zurovec, Michal; Sehnal, Frantisek

2003-09-12

The fiber properties of lepidopteran silk depend on the amino acid repeats that interact during H-fibroin polymerization. The aim of our research was to relate repeat composition to insect biology and fiber strength. Representative regions of the H-fibroin genes were sequenced and analyzed in three pyralid species: wax moth (Galleria mellonella), European flour moth (Ephestia kuehniella), and Indian meal moth (Plodia interpunctella). The amino acid repeats are species-specific, evidently a diversification of an ancestral region of 43 residues, and include three types of regularly dispersed motifs: modifications of GSSAASAA sequence, stretches of tripeptides GXZ where X and Z represent bulky residues, and sequences similar to PVIVIEE. No concatenations of GX dipeptide or alanine, which are typical for Bombyx silkworms and Antheraea silk moths, respectively, were found. Despite different repeat structure, the silks of G. mellonella and E. kuehniella exhibit similar tensile strength as the Bombyx and Antheraea silks. We suggest that in these latter two species, variations in the repeat length obstruct repeat alignment, but sufficiently long stretches of iterated residues get superposed to interact. In the pyralid H-fibroins, interactions of the widely separated and diverse motifs depend on the precision of repeat matching; silk is strong in G. mellonella and E. kuehniella, with 2-3 types of long homogeneous repeats, and nearly 10 times weaker in P. interpunctella, with seven types of shorter erratic repeats. The high proportion of large amino acids in the H-fibroin of pyralids has probably evolved in connection with the spinning habit of caterpillars that live in protective silk tubes and spin continuously, enlarging the tubes on one end and partly devouring the other one. The silk serves as a depot of energetically rich and essential amino acids that may be scarce in the diet.
Base-Pairing Energies of Proton-Bound Dimers and Proton Affinities of 1-Methyl-5-Halocytosines: Implications for the Effects of Halogenation on the Stability of the DNA i-Motif

NASA Astrophysics Data System (ADS)

Yang, Bo; Wu, R. R.; Rodgers, M. T.

2015-09-01

(CCG)n•(CGG)n trinucleotide repeats have been found to be associated with fragile X syndrome, the most widespread inherited cause of mental retardation in humans. The (CCG)n•(CGG)n repeats adopt i-motif conformations that are preferentially stabilized by base-pairing interactions of noncanonical proton-bound dimers of cytosine (C+•C). Halogenated cytosine residues are one form of DNA damage that may be important in altering the structure and stability of DNA or DNA-protein interactions and, hence, regulate gene expression. Previously, we investigated the effects of 5-halogenation and 1-methylation of cytosine on the base-pairing energies (BPEs) using threshold collision-induced dissociation (TCID) techniques. In the present study, we extend our work to include proton-bound homo- and heterodimers of cytosine, 1-methyl-5-fluorocytosine, and 1-methyl-5-bromocytosine. All modifications examined here are found to produce a decrease in the BPEs. However, the BPEs of all of the proton-bound dimers examined significantly exceed those of Watson-Crick G•C, neutral C•C base pairs, and various methylated variants such that DNA i-motif conformations should still be preserved in the presence of these modifications. The proton affinities (PAs) of the halogenated cytosines are also obtained from the experimental data by competitive analysis of the primary dissociation pathways that occur in parallel for the proton-bound heterodimers. 5-Halogenation leads to a decrease in the N3 PA of cytosine, whereas 1-methylation leads to an increase in the N3 PA. Thus, the 1-methyl-5-halocytosines exhibit PAs that are intermediate.
Ankyrin-repeat containing proteins of microbes: a conserved structure with functional diversity

PubMed Central

Al-Khodor, Souhaila; Price, Christopher T.; Kalia, Awdhesh; Kwaik, Yousef Abu

2009-01-01

Summary: The ankyrin repeat (ANK) is the most common protein-protein interaction motif in nature and predominantly found in eukaryotic proteins. The genome sequencing of various pathogenic or symbiotic bacteria and eukaryotic viruses identified numerous genes encoding ANK-containing proteins that were proposed to have been acquired from eukaryotes by horizontal gene transfer. However, the recent discovery of additional ANK-containing proteins encoded in the genomes of archaea and free-living bacteria suggests either a more ancient origin of the ANK motif or multiple convergent evolution events. Many bacterial pathogens employ various types of secretion systems to deliver ANK-containing proteins into eukaryotic cells where they mimic or manipulate various host functions. Understanding the molecular and biochemical functions of this family of proteins will enhance our understanding of important host-microbe interactions. PMID:19962898
Identification, characterization, and utilization of genome-wide simple sequence repeats to identify a QTL for acidity in apple

PubMed Central

2012-01-01

Background: Apple is an economically important fruit crop worldwide. Developing a genetic linkage map is a critical step towards mapping and cloning of genes responsible for important horticultural traits in apple. To facilitate linkage map construction, we surveyed and characterized the distribution and frequency of perfect microsatellites in assembled contig sequences of the apple genome. Results: A total of 28,538 SSRs have been identified in the apple genome, with an overall density of 40.8 SSRs per Mb. Di-nucleotide repeats are the most frequent microsatellites in the apple genome, accounting for 71.9% of all microsatellites. AT/TA repeats are the most frequent in genomic regions, accounting for 38.3% of all the G-SSRs, while AG/GA dimers prevail in transcribed sequences, and account for 59.4% of all EST-SSRs. A total set of 310 SSRs is selected to amplify eight apple genotypes. Of these, 245 (79.0%) are found to be polymorphic among cultivars and wild species tested. AG/GA motifs in genomic regions have detected more alleles and higher PIC values than AT/TA or AC/CA motifs. Moreover, AG/GA repeats are more variable than any other dimers in apple, and should be preferentially selected for studies, such as genetic diversity and linkage map construction. A total of 54 newly developed apple SSRs have been genetically mapped. Interestingly, clustering of markers with distorted segregation is observed on linkage groups 1, 2, 10, 15, and 16. A QTL responsible for malic acid content of apple fruits is detected on linkage group 8, and accounts for ~13.5% of the observed phenotypic variation. Conclusions: This study demonstrates that di-nucleotide repeats are prevalent in the apple genome and that AT/TA and AG/GA repeats are the most frequent in genomic and transcribed sequences of apple, respectively.
All SSR motifs identified in this study as well as those newly mapped SSRs will serve as valuable resources for pursuing apple genetic studies, aiding the apple breeding community in marker-assisted breeding, and for performing comparative genomic studies in Rosaceae. PMID:23039990
Targeting of Arabidopsis KNL2 to Centromeres Depends on the Conserved CENPC-k Motif in Its C Terminus.

PubMed

Sandmann, Michael; Talbert, Paul; Demidov, Dmitri; Kuhlmann, Markus; Rutten, Twan; Conrad, Udo; Lermontova, Inna

2017-01-01

KINETOCHORE NULL2 (KNL2) is involved in recognition of centromeres and in centromeric localization of the centromere-specific histone cenH3. Our study revealed a cenH3 nucleosome binding CENPC-k motif at the C terminus of Arabidopsis thaliana KNL2, which is conserved among a wide spectrum of eukaryotes. Centromeric localization of KNL2 is abolished by deletion of the CENPC-k motif and by mutating single conserved amino acids, but can be restored by insertion of the corresponding motif of Arabidopsis CENP-C. We showed by electrophoretic mobility shift assay that the C terminus of KNL2 binds DNA sequence-independently and interacts with the centromeric transcripts in vitro. Chromatin immunoprecipitation with anti-KNL2 antibodies indicated that in vivo KNL2 is preferentially associated with the centromeric repeat pAL1. Complete deletion of the CENPC-k motif did not influence its ability to interact with DNA in vitro. Therefore, we suggest that KNL2 recognizes centromeric nucleosomes, similar to CENP-C, via the CENPC-k motif and binds adjoining DNA. © 2017 American Society of Plant Biologists. All rights reserved.
Determination of the genetic diversity of vegetable soybean [Glycine max (L.) Merr.] using EST-SSR markers

PubMed Central

Zhang, Gu-wen; Xu, Sheng-chun; Mao, Wei-hua; Hu, Qi-zan; Gong, Ya-ming

2013-01-01

The development of expressed sequence tag-derived simple sequence repeats (EST-SSRs) provided a useful tool for investigating plant genetic diversity. In the present study, 22 polymorphic EST-SSRs from grain soybean were identified and used to assess the genetic diversity in 48 vegetable soybean accessions. Among the 22 EST-SSR loci, tri-nucleotides were the most abundant repeats, accounting for 50.00% of the total motifs. GAA was the most common motif among tri-nucleotide repeats, with a frequency of 18.18%. Polymorphic analysis identified a total of 71 alleles, with an average of 3.23 per locus. The polymorphism information content (PIC) values ranged from 0.144 to 0.630, with a mean of 0.386. Observed heterozygosity (Ho) values varied from 0.0196 to 1.0000, with an average of 0.6092, while the expected heterozygosity (He) values ranged from 0.1502 to 0.6840, with a mean value of 0.4616. Principal coordinate analysis and phylogenetic tree analysis indicated that the accessions could be assigned to different groups based to a large extent on their geographic distribution, and most accessions from China were clustered into the same groups. These results suggest that Chinese vegetable soybean accessions have a narrow genetic base. The results of this study indicate that EST-SSRs from grain soybean have high transferability to vegetable soybean, and that these new markers would be helpful in taxonomy, molecular breeding, and comparative mapping studies of vegetable soybean in the future. PMID:23549845
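For readers unfamiliar with the per-locus statistics quoted in the record above, the following is a minimal sketch of how expected heterozygosity (He) and polymorphism information content (PIC) are commonly computed from allele frequencies. It assumes the textbook definitions (He = 1 - sum of p_i^2; PIC as defined by Botstein et al., 1980); the abstract does not say which estimator or software the authors used, and the allele frequencies below are hypothetical, not data from the study.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    homozygous = sum(p * p for p in freqs)
    cross = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - cross


# Hypothetical locus with three alleles (frequencies must sum to 1).
example_freqs = [0.5, 0.3, 0.2]
print(round(expected_heterozygosity(example_freqs), 4))
print(round(pic(example_freqs), 4))
```

PIC is never larger than He for the same locus, which is consistent with the mean PIC (0.386) being below the mean He (0.4616) in the record above.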
Richieri-Costa-Pereira syndrome: Expanding its phenotypic and genotypic spectrum.

PubMed

Bertola, D R; Hsia, G; Alvizi, L; Gardham, A; Wakeling, E L; Yamamoto, G L; Honjo, R S; Oliveira, L A N; Di Francesco, R C; Perez, B A; Kim, C A; Passos-Bueno, M R

2018-04-01

Richieri-Costa-Pereira syndrome is a rare autosomal recessive acrofacial dysostosis that has been mainly described in Brazilian individuals. The cardinal features include Robin sequence, cleft mandible, laryngeal anomalies and limb defects. A biallelic expansion of a complex repeated motif in the 5' untranslated region of EIF4A3 has been shown to cause this syndrome, commonly with 15 or 16 repeats. The only patient with mild clinical findings harbored a 14-repeat expansion in 1 allele and a point mutation in the other allele. This proband is described here in more detail, as are his affected sister and 5 new individuals with Richieri-Costa-Pereira syndrome, including a patient from England, of African ancestry. This study has expanded the phenotype in this syndrome by the observation of microcephaly, better characterization of skeletal abnormalities, less severe phenotype with only mild facial dysmorphisms and limb anomalies, as well as the absence of cleft mandible, which is a hallmark of the syndrome. Although the most frequent mutation in this study was the recurrent 16-repeat expansion in EIF4A3, there was an overrepresentation of the 14-repeat expansion, with mild phenotypic expression, thus suggesting that the number of these motifs could play a role in phenotypic delineation. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Okada, Shoko; Weisman, Sarah; Trueman, Holly E.

Aposthonia gurneyi, an Australian webspinner species, is a primitive insect that constructs and lives in a silken tunnel which screens it from the attentions of predators. The insect spins silk threads from many tiny spines on its forelegs to weave a filmy sheet. We found that the webspinner silk fibers have a mean diameter of only 65 nm, an order of magnitude smaller than any previously reported insect silk. The purpose of such fine silk may be to reduce the metabolic cost of building the extensive tunnels. At the molecular level, the A. gurneyi silk has a predominantly beta-sheet protein structure. The most abundant clone in a cDNA library produced from the webspinner silk glands encoded a protein with extensive glycine-serine repeat regions. The GSGSGS repeat motif of the A. gurneyi silk protein is similar to the well-known GAGAGS repeat motif found in the heavy fibroin of silkworm silk, which also has beta-sheet structure. As the webspinner silk gene is unrelated to the silk gene of the phylogenetically distant silkworm, this is a striking example of convergent evolution.
Revisiting the TALE repeat.

PubMed

Deng, Dong; Yan, Chuangye; Wu, Jianping; Pan, Xiaojing; Yan, Nieng

2014-04-01

Transcription activator-like (TAL) effectors specifically bind to double stranded (ds) DNA through a central domain of tandem repeats. Each TAL effector (TALE) repeat comprises 33-35 amino acids and recognizes one specific DNA base through a highly variable residue at a fixed position in the repeat. Structural studies have revealed the molecular basis of DNA recognition by TALE repeats. Examination of the overall structure reveals that the basic building block of the TALE protein, namely a helical hairpin, is one-helix shifted from the previously defined TALE motif. Here we suggest a structure-based re-demarcation of the TALE repeat which starts with the residues that bind to the DNA backbone phosphate and concludes with the base-recognition hyper-variable residue. This new numbering system is consistent with the α-solenoid superfamily to which TALE belongs, and reflects the structural integrity of TAL effectors. In addition, it yields an integral number of TALE repeats that matches the number of bound DNA bases. We then present fifteen crystal structures of engineered dHax3 variants in complex with target DNA molecules, which elucidate the structural basis for the recognition of bases adenine (A) and guanine (G) by reported or uncharacterized TALE codes. Finally, we analyzed the sequence-structure correlation of the amino acid residues within a TALE repeat. The structural analyses reported here may advance the mechanistic understanding of TALE proteins and facilitate the design of TALEN with improved affinity and specificity.
A novel species-specific tandem repeat DNA family from Sinapis arvensis: detection of telomere-like sequences.

PubMed

Kapila, R; Das, S; Srivastava, P S; Lakshmikumaran, M

1996-08-01

DNA sequences representing a tandemly repeated DNA family of the Sinapis arvensis genome were cloned and characterized. The 700-bp tandem repeat family is represented by two clones, pSA35 and pSA52, which are 697 and 709 bp in length, respectively. Dot matrix analysis of the sequences indicates the presence of repeated elements within each monomeric unit. Sequence analysis of the repetitive region of clones pSA35 and pSA52 shows that there are several copies of a 7-bp repeat element organized in tandem. The consensus sequence of this repeat element is 5'-TTTAGGG-3'. These elements are highly mutated and the difference in length between the two clones is due to different copy numbers of these elements. The repetitive region of clone pSA35 has 26 copies of the element TTTAGGG, whereas clone pSA52 has 28 copies. The repetitive region in both clones is flanked on either side by inverted repeats that may be footprints of a transposition event. Sequence comparison indicates that the element TTTAGGG is identical to telomeric repeats present in Arabidopsis, maize, tomato, and other plants. However, Bal31 digestion kinetics indicates non-telomeric localization of the 700-bp tandem repeats. The clones represent a novel repeat family as (i) they contain telomere-like motifs as subrepeats within each unit; and (ii) they do not hybridize to related crucifers and are species-specific in nature.
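As an illustration of the subrepeat counting described in the record above, here is a minimal sketch that counts exact copies of the 7-bp consensus element 5'-TTTAGGG-3' and reports the longest uninterrupted tandem run. The function names and the toy sequence are hypothetical, and the sketch matches exact copies only; because the abstract notes that the elements are highly mutated, a real analysis of pSA35 or pSA52 would presumably need approximate matching, which is not shown here.

```python
import re


def count_copies(seq, unit="TTTAGGG"):
    """Count non-overlapping exact copies of `unit` anywhere in `seq`."""
    return len(re.findall(re.escape(unit), seq.upper()))


def longest_tandem_run(seq, unit="TTTAGGG"):
    """Return the largest number of back-to-back exact copies of `unit`."""
    pattern = re.compile("(?:%s)+" % re.escape(unit))
    best = 0
    for match in pattern.finditer(seq.upper()):
        best = max(best, len(match.group(0)) // len(unit))
    return best


# Hypothetical toy sequence: three tandem copies, a spacer, one more exact
# copy, then a mutated copy (TTTAGGA) that exact matching does not count.
toy = "ACGT" + "TTTAGGG" * 3 + "CC" + "TTTAGGG" + "TTTAGGA"
print(count_copies(toy), longest_tandem_run(toy))
```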

Functional Motifs Responsible for Human Metapneumovirus M2-2-mediated Innate Immune Evasion

PubMed Central

Chen, Yu; Deng, Xiaoling; Deng, Junfang; Zhou, Jiehua; Ren, Yuping; Liu, Shengxuan; Prusak, Deborah J.; Wood, Thomas G.; Bao, Xiaoyong

2016-01-01

Human metapneumovirus (hMPV) is a major cause of lower respiratory infection in young children. Repeated infections occur throughout life, but its immune evasion mechanisms are largely unknown. We recently found that hMPV M2-2 protein elicits immune evasion by targeting mitochondrial antiviral-signaling protein (MAVS), an antiviral signaling molecule. However, the molecular mechanisms underlying such inhibition are not known. Our mutagenesis studies revealed that PDZ-binding motifs, 29-DEMI-32 and 39-KEALSDGI-46, located in an immune inhibitory region of M2-2, are responsible for M2-2-mediated immune evasion. We also found that both motifs prevent TRAF5 and TRAF6, the MAVS downstream adaptors, from being recruited to MAVS, while the motif 39-KEALSDGI-46 also blocks TRAF3 migration to MAVS. In parallel, these TRAFs are important in activating transcription factors NF-κB and/or IRF-3 by hMPV. Our findings collectively demonstrate that M2-2 uses its PDZ motifs to launch the hMPV immune evasion through blocking the interaction of MAVS and its downstream TRAFs. PMID:27743962
Tandemly repeated sequences in mtDNA control region of whitefish, Coregonus lavaretus.

PubMed

Brzuzan, P

2000-06-01

Length variation of the mitochondrial DNA control region was observed with PCR amplification of a sample of 138 whitefish (Coregonus lavaretus). Nucleotide sequences of representative PCR products showed that the variation was due to the presence of an approximately 100-bp motif tandemly repeated two, three, or five times in the region between the conserved sequence block-3 (CSB-3) and the gene for phenylalanine tRNA. This is the first report on the tandem array composed of long repeat units in mitochondrial DNA of salmonids.
A Chromatin Insulator-Like Element in the Herpes Simplex Virus Type 1 Latency-Associated Transcript Region Binds CCCTC-Binding Factor and Displays Enhancer-Blocking and Silencing Activities

PubMed Central

Amelio, Antonio L.; McAnany, Peterjon K.; Bloom, David C.

2006-01-01

A previous study demonstrated that the latency-associated transcript (LAT) promoter and the LAT enhancer/reactivation critical region (rcr) are enriched in acetyl histone H3 (K9, K14) during herpes simplex virus type 1 (HSV-1) latency, whereas all lytic genes analyzed (ICP0, UL54, ICP4, and DNA polymerase) are not (N. J. Kubat, R. K. Tran, P. McAnany, and D. C. Bloom, J. Virol. 78:1139-1149, 2004). This suggests that the HSV-1 latent genome is organized into histone H3 (K9, K14) hyperacetylated and hypoacetylated regions corresponding to transcriptionally permissive and transcriptionally repressed chromatin domains, respectively. Such an organization implies that chromatin insulators, similar to those of cellular chromosomes, may separate distinct transcriptional domains of the HSV-1 latent genome. In the present study, we sought to identify cis elements that could partition the HSV-1 genome into distinct chromatin domains. Sequence analysis coupled with chromatin immunoprecipitation and luciferase reporter assays revealed that (i) the long and short repeats and the unique-short region of the HSV-1 genome contain clustered CTCF (CCCTC-binding factor) motifs, (ii) CTCF motif clusters similar to those in HSV-1 are conserved in other alphaherpesviruses, (iii) CTCF binds to these motifs on latent HSV-1 genomes in vivo, and (iv) a 1.5-kb region containing the CTCF motif cluster in the LAT region possesses insulator activities, specifically, enhancer blocking and silencing. The finding that CTCF, a cellular protein associated with chromatin insulators, binds to motifs on the latent genome and insulates the LAT enhancer suggests that CTCF may facilitate the formation of distinct chromatin boundaries during herpesvirus latency. PMID:16474142
Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis

PubMed Central

Zheng, Jin-shuang; Sun, Cheng-zhen; Zhang, Shu-ning; Hou, Xi-lin; Bonnema, Guusje

2016-01-01

A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention has been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we characterize the distribution of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis. PMID:27507974
ChloroSSRdb: a repository of perfect and imperfect chloroplastic simple sequence repeats (cpSSRs) of green plants

PubMed Central

Kapil, Aditi; Rai, Piyush Kant; Shanker, Asheesh

2014-01-01

Simple sequence repeats (SSRs) are regions in a DNA sequence that contain repeating motifs of length 1–6 nucleotides. These repeats are ubiquitous and are found in both coding and non-coding regions of the genome. A total of 534 complete chloroplast genome sequences (as of 18 September 2014) of Viridiplantae are available at the NCBI organelle genome resource. This provides an opportunity to mine these genomes for SSRs and store them in the form of a database. In an attempt to properly manage and retrieve chloroplastic SSRs, we designed ChloroSSRdb, which is a relational database developed using SQL Server 2008 and accessed through ASP.NET. It provides information on all three types (perfect, imperfect and compound) of SSRs. At present, ChloroSSRdb contains 124 430 mined SSRs, with the majority lying in non-coding regions. Out of these, PCR primers were designed for 118 249 SSRs. Tetranucleotide repeats (47 079) were found to be the most frequent repeat type, whereas hexanucleotide repeats (6414) were the least abundant. Additionally, in each species statistical analyses were performed to calculate relative frequency, correlation coefficient and chi-square statistics of perfect and imperfect SSRs. In accordance with the growing interest in SSR studies, ChloroSSRdb will prove to be a useful resource in developing genetic markers, phylogenetic analysis, genetic mapping, etc. Moreover, it will serve as a ready reference for mined SSRs in available chloroplast genomes of green plants. Database URL: www.compubio.in/chlorossrdb/ PMID:25380781
MSDB: A Comprehensive Database of Simple Sequence Repeats.

PubMed

Avvaru, Akshay Kumar; Saxena, Saketh; Sowpati, Divya Tej; Mishra, Rakesh Kumar

2017-06-01

Microsatellites, also known as Simple Sequence Repeats (SSRs), are short tandem repeats of 1-6 nt motifs present in all genomes, particularly eukaryotes. Besides their usefulness as genome markers, SSRs have been shown to perform important regulatory functions, and variations in their length at coding regions are linked to several disorders in humans. Microsatellites show a taxon-specific enrichment in eukaryotic genomes, and some may be functional. MSDB (Microsatellite Database) is a collection of >650 million SSRs from 6,893 species including Bacteria, Archaea, Fungi, Plants, and Animals. This database is by far the most exhaustive resource to access and analyze SSR data of multiple species. In addition to exploring data in a customizable tabular format, users can view and compare the data of multiple species simultaneously using our interactive plotting system. MSDB is developed using the Django framework and MySQL. It is freely available at http://tdb.ccmb.res.in/msdb. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
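The SSR definition shared by the database records above (tandem repeats of 1-6 nt motifs) can be illustrated with a minimal sketch. This is not the mining pipeline of ChloroSSRdb or MSDB, which the abstracts do not describe; the function name `find_perfect_ssrs` and the minimum repeat counts are assumptions chosen only for illustration, and only perfect repeats are handled.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumed, not from the source).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    start is the 0-based index of the first base of the repeat tract.
    """
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-nt motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" as a 2-mer),
            # so each tract is reported under its shortest motif only.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

A call such as `find_perfect_ssrs("GGC" + "AT" * 7 + "CCG" + "A" * 12 + "T")` reports one dinucleotide and one mononucleotide tract. Imperfect and compound SSRs, which ChloroSSRdb also catalogues, would need a more tolerant matcher.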
Structural analysis of Notch-regulating Rumi reveals basis for pathogenic mutations

DOE PAGES

Yu, Hongjun; Takeuchi, Hideyuki; Takeuchi, Megumi; ...

2016-07-18

Rumi O-glucosylates the EGF repeats of a growing list of proteins essential in metazoan development, including Notch. Rumi is essential for Notch signaling, and Rumi dysregulation is linked to several human diseases. Despite Rumi's critical roles, it is unknown how Rumi glucosylates a serine of many but not all EGF repeats. Here we report crystal structures of Drosophila Rumi as binary and ternary complexes with a folded EGF repeat and/or donor substrates. These structures provide insights into the catalytic mechanism and show that Rumi recognizes structural signatures of the EGF motif, the U-shaped consensus sequence, C-X-S-X-(P/A)-C and a conserved hydrophobic region. We found that five Rumi mutations identified in cancers and Dowling–Degos disease are clustered around the enzyme active site and adversely affect its activity. In conclusion, our study suggests that loss of Rumi activity may underlie these diseases, and the mechanistic insights may facilitate the development of modulators of Notch signaling.
Structural analysis of Notch-regulating Rumi reveals basis for pathogenic mutations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yu, Hongjun; Takeuchi, Hideyuki; Takeuchi, Megumi

Rumi O-glucosylates the EGF repeats of a growing list of proteins essential in metazoan development, including Notch. Rumi is essential for Notch signaling, and Rumi dysregulation is linked to several human diseases. Despite Rumi's critical roles, it is unknown how Rumi glucosylates a serine of many but not all EGF repeats. Here we report crystal structures of Drosophila Rumi as binary and ternary complexes with a folded EGF repeat and/or donor substrates. These structures provide insights into the catalytic mechanism and show that Rumi recognizes structural signatures of the EGF motif, the U-shaped consensus sequence, C-X-S-X-(P/A)-C and a conserved hydrophobic region. We found that five Rumi mutations identified in cancers and Dowling–Degos disease are clustered around the enzyme active site and adversely affect its activity. In conclusion, our study suggests that loss of Rumi activity may underlie these diseases, and the mechanistic insights may facilitate the development of modulators of Notch signaling.
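The consensus sequence quoted in the Rumi abstract, C-X-S-X-(P/A)-C, can be read as a simple pattern in which X stands for any residue. The sketch below scans a one-letter protein sequence for it; the function name `find_consensus_sites` is hypothetical and the code is not taken from the study. A match only flags a candidate serine, since the abstract states that Rumi also recognizes structural features of the folded EGF repeat.

```python
import re

# C-X-S-X-(P/A)-C with X read as any amino acid (one-letter codes).
# The lookahead lets overlapping candidate sites be reported as well.
CONSENSUS = re.compile(r"(?=(C[A-Z]S[A-Z][PA]C))")

def find_consensus_sites(protein):
    """Return (serine_position, matched_segment) pairs; positions are 1-based."""
    protein = protein.upper()
    # m.start() is the 0-based index of the first cysteine; the serine sits
    # two residues later, so its 1-based position is m.start() + 3.
    return [(m.start() + 3, m.group(1)) for m in CONSENSUS.finditer(protein)]
```

For example, `find_consensus_sites("GGCASGPCKK")` flags the serine at position 5, while the same sequence with T in place of P yields no match.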
Screening of repetitive motifs inside the genome of the flat oyster (Ostrea edulis): Transposable elements and short tandem repeats.

PubMed

Vera, Manuel; Bello, Xabier; Álvarez-Dios, Jose-Antonio; Pardo, Belen G; Sánchez, Laura; Carlsson, Jens; Carlsson, Jeanette E L; Bartolomé, Carolina; Maside, Xulio; Martinez, Paulino

2015-12-01

The flat oyster (Ostrea edulis) is one of the most appreciated molluscs in Europe, but its production has been greatly reduced by the parasite Bonamia ostreae. Here, new generation genomic resources were used to analyse the repetitive fraction of the oyster genome, with the aim of developing molecular markers to face this main oyster production challenge. The resulting oyster database consists of two sets of 10,318 and 7159 unique contigs (4.8 Mbp and 6.8 Mbp in total length) representing the oyster's genome (WG) and haemocyte transcriptome (HT), respectively. A total of 1083 sequences were identified as TE-derived, which corresponded to 4.0% of WG and 1.1% of HT. They were clustered into 142 homology groups, most of which were assigned to the Penelope order of retrotransposons, and to the Helitron and TIR DNA-transposons. Simple repeats and rRNA pseudogenes also made a significant contribution to the oyster's genome (0.5% and 0.3% of WG and HT, respectively). The most frequent short tandem repeats identified in WG were tetranucleotide motifs, whereas trinucleotide motifs were the most frequent in HT. Forty identified microsatellite loci, 20 from each database, were selected for technical validation. Success was much lower among WG than HT microsatellites (15% vs 55%), which could reflect higher variation in anonymous regions interfering with primer annealing. All microsatellites developed adjusted to Hardy-Weinberg proportions and represent a useful tool to support future breeding programmes and to manage genetic resources of natural flat oyster beds. Copyright © 2015 Elsevier B.V. All rights reserved.
Genome-Wide Characterization and Linkage Mapping of Simple Sequence Repeats in Mei (Prunus mume Sieb. et Zucc.)

PubMed Central

Sun, Lidan; Yang, Weiru; Zhang, Qixiang; Cheng, Tangren; Pan, Huitang; Xu, Zongda; Zhang, Jie; Chen, Chuguang

2013-01-01

Because of its popularity as an ornamental plant in East Asia, mei (Prunus mume Sieb. et Zucc.) has received increasing attention in genetic and genomic research with the recent shotgun sequencing of its genome. Here, we performed the genome-wide characterization of simple sequence repeats (SSRs) in the mei genome and detected a total of 188,149 SSRs occurring at a frequency of 794 SSR/Mb. Mononucleotide repeats were the most common type of SSR in genomic regions, followed by di- and tetranucleotide repeats. Most of the SSRs in coding sequences (CDS) were composed of tri- or hexanucleotide repeat motifs, but mononucleotide repeats were always the most common in intergenic regions. Genome-wide comparison of SSR patterns among the mei, strawberry (Fragaria vesca), and apple (Malus×domestica) genomes showed mei to have the highest density of SSRs, slightly higher than that of strawberry (608 SSR/Mb) and almost twice as high as that of apple (398 SSR/Mb). Mononucleotide repeats were the dominant SSR motifs in the three Rosaceae species. Using 144 SSR markers, we constructed a 670 cM-long linkage map of mei delimited into eight linkage groups (LGs), with an average marker distance of 5 cM. Seventy one scaffolds covering about 27.9% of the assembled mei genome were anchored to the genetic map, depending on which the macro-colinearity between the mei genome and Prunus T×E reference map was identified. The framework map of mei constructed provides a first step into subsequent high-resolution genetic mapping and marker-assisted selection for this ornamental species. PMID:23555708
Molecular structure and chromosome distribution of three repetitive DNA families in Anemone hortensis L. (Ranunculaceae).

PubMed

Mlinarec, Jelena; Chester, Mike; Siljak-Yakovlev, Sonja; Papes, Drazena; Leitch, Andrew R; Besendorfer, Visnja

2009-01-01

The structure, abundance and location of repetitive DNA sequences on chromosomes can characterize the nature of higher plant genomes. Here we report on three new repeat DNA families isolated from Anemone hortensis L.; (i) AhTR1, a family of satellite DNA (stDNA) composed of a 554-561 bp long EcoRV monomer; (ii) AhTR2, a stDNA family composed of a 743 bp long HindIII monomer and; (iii) AhDR, a repeat family composed of a 945 bp long HindIII fragment that exhibits some sequence similarity to Ty3/gypsy-like retroelements. Fluorescence in-situ hybridization (FISH) to metaphase chromosomes of A. hortensis (2n = 16) revealed that both AhTR1 and AhTR2 sequences co-localized with DAPI-positive AT-rich heterochromatic regions. AhTR1 sequences occur at intercalary DAPI bands while AhTR2 sequences occur at 8-10 terminally located heterochromatic blocks. In contrast AhDR sequences are dispersed over all chromosomes as expected of a Ty3/gypsy-like element. AhTR2 and AhTR1 repeat families include polyA- and polyT-tracks, AT/TA-motifs and a pentanucleotide sequence (CAAAA) that may have consequences for chromatin packing and sequence homogeneity. AhTR2 repeats also contain TTTAGGG motifs and degenerate variants. We suggest that they arose by interspersion of telomeric repeats with subtelomeric repeats, before hybrid unit(s) amplified through the heterochromatic domain. The three repetitive DNA families together occupy approximately 10% of the A. hortensis genome. Comparative analyses of eight Anemone species revealed that the divergence of the A. hortensis genome was accompanied by considerable modification and/or amplification of repeats.
Structural and Functional Characterization of an Archaeal Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)-associated Complex for Antiviral Defense (CASCADE)*

PubMed Central

Lintner, Nathanael G.; Kerou, Melina; Brumfield, Susan K.; Graham, Shirley; Liu, Huanting; Naismith, James H.; Sdano, Matthew; Peng, Nan; She, Qunxin; Copié, Valérie; Young, Mark J.; White, Malcolm F.; Lawrence, C. Martin

2011-01-01

In response to viral infection, many prokaryotes incorporate fragments of virus-derived DNA into loci called clustered regularly interspaced short palindromic repeats (CRISPRs). The loci are then transcribed, and the processed CRISPR transcripts are used to target invading viral DNA and RNA. The Escherichia coli “CRISPR-associated complex for antiviral defense” (CASCADE) is central in targeting invading DNA. Here we report the structural and functional characterization of an archaeal CASCADE (aCASCADE) from Sulfolobus solfataricus. Tagged Csa2 (Cas7) expressed in S. solfataricus co-purifies with Cas5a-, Cas6-, Csa5-, and Cas6-processed CRISPR-RNA (crRNA). Csa2, the dominant protein in aCASCADE, forms a stable complex with Cas5a. Transmission electron microscopy reveals a helical complex of variable length, perhaps due to substoichiometric amounts of other CASCADE components. A recombinant Csa2-Cas5a complex is sufficient to bind crRNA and complementary ssDNA. The structure of Csa2 reveals a crescent-shaped structure unexpectedly composed of a modified RNA-recognition motif and two additional domains present as insertions in the RNA-recognition motif. Conserved residues indicate potential crRNA- and target DNA-binding sites, and the H160A variant shows significantly reduced affinity for crRNA. We propose a general subunit architecture for CASCADE in other bacteria and Archaea. PMID:21507944
Structural and functional characterization of an archaeal clustered regularly interspaced short palindromic repeat (CRISPR)-associated complex for antiviral defense (CASCADE).

PubMed

Lintner, Nathanael G; Kerou, Melina; Brumfield, Susan K; Graham, Shirley; Liu, Huanting; Naismith, James H; Sdano, Matthew; Peng, Nan; She, Qunxin; Copié, Valérie; Young, Mark J; White, Malcolm F; Lawrence, C Martin

2011-06-17

In response to viral infection, many prokaryotes incorporate fragments of virus-derived DNA into loci called clustered regularly interspaced short palindromic repeats (CRISPRs). The loci are then transcribed, and the processed CRISPR transcripts are used to target invading viral DNA and RNA. The Escherichia coli "CRISPR-associated complex for antiviral defense" (CASCADE) is central in targeting invading DNA. Here we report the structural and functional characterization of an archaeal CASCADE (aCASCADE) from Sulfolobus solfataricus. Tagged Csa2 (Cas7) expressed in S. solfataricus co-purifies with Cas5a-, Cas6-, Csa5-, and Cas6-processed CRISPR-RNA (crRNA). Csa2, the dominant protein in aCASCADE, forms a stable complex with Cas5a. Transmission electron microscopy reveals a helical complex of variable length, perhaps due to substoichiometric amounts of other CASCADE components. A recombinant Csa2-Cas5a complex is sufficient to bind crRNA and complementary ssDNA. The structure of Csa2 reveals a crescent-shaped structure unexpectedly composed of a modified RNA-recognition motif and two additional domains present as insertions in the RNA-recognition motif. Conserved residues indicate potential crRNA- and target DNA-binding sites, and the H160A variant shows significantly reduced affinity for crRNA. We propose a general subunit architecture for CASCADE in other bacteria and Archaea.
Telobox motifs recruit CLF/SWN-PRC2 for H3K27me3 deposition via TRB factors in Arabidopsis.

PubMed

Zhou, Yue; Wang, Yuejun; Krause, Kristin; Yang, Tingting; Dongus, Joram A; Zhang, Yijing; Turck, Franziska

2018-05-01

Polycomb repressive complexes (PRCs) control organismic development in higher eukaryotes through epigenetic gene repression. PRC proteins do not contain DNA-binding domains, thus prompting questions regarding how PRCs find their target loci. Here we present genome-wide evidence of PRC2 recruitment by telomere-repeat-binding factors (TRBs) through telobox-related motifs in Arabidopsis. A triple trb1-2, trb2-1, and trb3-2 (trb1/2/3) mutant with a developmental phenotype and a transcriptome strikingly similar to those of strong PRC2 mutants showed redistribution of trimethyl histone H3 Lys27 (H3K27me3) marks and lower H3K27me3 levels, which were correlated with derepression of TRB1-target genes. TRB1-3 physically interacted with the PRC2 proteins CLF and SWN. A SEP3 reporter gene with a telobox mutation showed ectopic expression, which was correlated with H3K27me3 depletion, whereas tethering TRB1 to the mutated cis element partially restored repression. We propose that telobox-related motifs recruit PRC2 through the interaction between TRBs and CLF/SWN, a mechanism essential for H3K27me3 deposition at a subset of target genes.
TCOF1 gene encodes a putative nucleolar phosphoprotein that exhibits mutations in Treacher Collins Syndrome throughout its coding region.

PubMed

Wise, C A; Chiang, L C; Paznekas, W A; Sharma, M; Musy, M M; Ashley, J A; Lovett, M; Jabs, E W

1997-04-01

Treacher Collins Syndrome (TCS) is the most common of the human mandibulofacial dysostosis disorders. Recently, a partial TCOF1 cDNA was identified and shown to contain mutations in TCS families. Here we present the entire exon/intron genomic structure and the complete coding sequence of TCOF1. TCOF1 encodes a low complexity protein of 1,411 amino acids, whose predicted protein structure reveals repeated motifs that mirror the organization of its exons. These motifs are shared with nucleolar trafficking proteins in other species and are predicted to be highly phosphorylated by casein kinase. Consistent with this, the full-length TCOF1 protein sequence also contains putative nuclear and nucleolar localization signals. Throughout the open reading frame, we detected an additional eight mutations in TCS families and several polymorphisms. We postulate that TCS results from defects in a nucleolar trafficking protein that is critically required during human craniofacial development.
TCOF1 gene encodes a putative nucleolar phosphoprotein that exhibits mutations in Treacher Collins Syndrome throughout its coding region

PubMed Central

Wise, Carol A.; Chiang, Lydia C.; Paznekas, William A.; Sharma, Mridula; Musy, Maurice M.; Ashley, Jennifer A.; Lovett, Michael; Jabs, Ethylin W.

1997-01-01

Treacher Collins Syndrome (TCS) is the most common of the human mandibulofacial dysostosis disorders. Recently, a partial TCOF1 cDNA was identified and shown to contain mutations in TCS families. Here we present the entire exon/intron genomic structure and the complete coding sequence of TCOF1. TCOF1 encodes a low complexity protein of 1,411 amino acids, whose predicted protein structure reveals repeated motifs that mirror the organization of its exons. These motifs are shared with nucleolar trafficking proteins in other species and are predicted to be highly phosphorylated by casein kinase. Consistent with this, the full-length TCOF1 protein sequence also contains putative nuclear and nucleolar localization signals. Throughout the open reading frame, we detected an additional eight mutations in TCS families and several polymorphisms. We postulate that TCS results from defects in a nucleolar trafficking protein that is critically required during human craniofacial development. PMID:9096354
Effector prediction in host-pathogen interaction based on a Markov model of a ubiquitous EPIYA motif

PubMed Central

2010-01-01

Background: Effector secretion is a common strategy of pathogens in mediating host-pathogen interaction. Eight EPIYA-motif containing effectors have recently been discovered in six pathogens. Once these effectors enter host cells through type III/IV secretion systems (T3SS/T4SS), tyrosine in the EPIYA motif is phosphorylated, which triggers effectors binding other proteins to manipulate host-cell functions. The objectives of this study are to evaluate the distribution pattern of EPIYA motif in broad biological species, to predict potential effectors with EPIYA motif, and to suggest roles and biological functions of potential effectors in host-pathogen interactions. Results: A hidden Markov model (HMM) of five amino acids was built for the EPIYA-motif based on the eight known effectors. Using this HMM to search the non-redundant protein database containing 9,216,047 sequences, we obtained 107,231 sequences with at least one EPIYA motif occurrence and 3115 sequences with multiple repeats of the EPIYA motif. Although the EPIYA motif exists among broad species, it is significantly over-represented in some particular groups of species. For those proteins containing at least four copies of EPIYA motif, most of them are from intracellular bacteria, extracellular bacteria with T3SS or T4SS or intracellular protozoan parasites. By combining the EPIYA motif and the adjacent SH2 binding motifs (KK, R4, Tarp and Tir), we built HMMs of nine amino acids and predicted many potential effectors in bacteria and protista by the HMMs. Some potential effectors for pathogens (such as Lawsonia intracellularis, Plasmodium falciparum and Leishmania major) are suggested. Conclusions: Our study indicates that the EPIYA motif may be a ubiquitous functional site for effectors that play an important pathogenicity role in mediating host-pathogen interactions. We suggest that some intracellular protozoan parasites could secrete EPIYA-motif containing effectors through secretion systems similar to the T3SS/T4SS in bacteria. Our predicted effectors provide useful hypotheses for further studies. PMID:21143776

Genome Survey Sequencing for the Characterization of the Genetic Background of Rosa roxburghii Tratt and Leaf Ascorbate Metabolism Genes.

PubMed

Lu, Min; An, Huaming; Li, Liangliang

2016-01-01

Rosa roxburghii Tratt is an important commercial horticultural crop in China that is recognized for its nutritional and medicinal values. In spite of its economic significance, genomic information on this rose species is currently unavailable. In the present research, a genome survey of R. roxburghii was carried out using next-generation sequencing (NGS) technologies. A total of 30.29 Gb of sequence data was obtained by HiSeq 2500 sequencing, and the estimated genome size of R. roxburghii was 480.97 Mb, with a guanine plus cytosine (GC) content calculated to be 38.63%. All of these reads were assembled, and a total of 627,554 contigs with an N50 length of 1.484 kb and 335,902 scaffolds with a total length of 409.36 Mb were obtained. Transposable element (TE) sequences of 90.84 Mb, which comprised 29.20% of the genome, and 167,859 simple sequence repeats (SSRs) were identified from the scaffolds. Among these, the mono- (66.30%), di- (25.67%), and tri- (6.64%) nucleotide repeats contributed nearly 99% of the SSRs, and the sequence motifs AG/CT (28.81%) and GAA/TTC (14.76%) were the most abundant among the dinucleotide and trinucleotide repeat motifs, respectively. Genome analysis predicted a total of 22,721 genes with an average length of 2311.52 bp, an average exon length of 228.15 bp, and an average intron length of 401.18 bp. Eleven genes putatively involved in ascorbate metabolism were identified, and their expression in R. roxburghii leaves was validated by quantitative real-time PCR (qRT-PCR). This is the first report of genome-wide characterization of this rose species.
β-hairpin-mediated nucleation of polyglutamine amyloid formation

PubMed Central

Kar, Karunakar; Hoop, Cody L.; Drombosky, Kenneth W.; Baker, Matthew A.; Kodali, Ravindra; Arduini, Irene; van der Wel, Patrick C. A.; Horne, W. Seth; Wetzel, Ronald

2013-01-01

The conformational preferences of polyglutamine (polyQ) sequences are of major interest because of their central importance in the expanded CAG repeat diseases that include Huntington’s disease (HD). Here we explore the response of various biophysical parameters to the introduction of β-hairpin motifs within polyQ sequences. These motifs (trpzip, disulfide, D-Pro-Gly, Coulombic attraction, L-Pro-Gly) enhance formation rates and stabilities of amyloid fibrils with degrees of effectiveness well-correlated with their known abilities to enhance β-hairpin formation in other peptides. These changes led to decreases in the critical nucleus for amyloid formation from a value of n* = 4 for a simple, unbroken Q23 sequence to approximate unitary n* values for similar length polyQs containing β-hairpin motifs. At the same time, the morphologies, secondary structures, and bioactivities of the resulting fibrils were essentially unchanged from simple polyQ aggregates. In particular, the signature pattern of SSNMR 13C Gln resonances that appears to be unique to polyQ amyloid is replicated exactly in fibrils from a β-hairpin polyQ. Importantly, while β-hairpin motifs do produce enhancements in the equilibrium constant for nucleation in aggregation reactions, these Kn* values remain quite low (~ 10−10) and there is no evidence for significant embellishment of β-structure within the monomer ensemble. The results indicate an important role for β-turns in the nucleation mechanism and structure of polyQ amyloid and have implications for the nature of the toxic species in expanded CAG repeat diseases. PMID:23353826
De novo transcriptome sequencing reveals a considerable bias in the incidence of simple sequence repeats towards the downstream of 'Pre-miRNAs' of black pepper.

PubMed

Joy, Nisha; Asha, Srinivasan; Mallika, Vijayan; Soniya, Eppurathu Vasudevan

2013-01-01

Next generation sequencing has an advantage on transformational development of species with limited available sequence data as it helps to decode the genome and transcriptome. We carried out de novo sequencing using Illumina HiSeq™ 2000 to generate the first leaf transcriptome of black pepper (Piper nigrum L.), an important spice variety native to South India and also grown in other tropical regions. Despite the economic and biochemical importance of pepper, a scientifically rigorous study at the molecular level is far from complete due to lack of sufficient sequence information and the cytological complexity of its genome. The 55 million raw reads obtained, when assembled using the Trinity program, generated 2,23,386 contigs and 1,28,157 unigenes. Reports suggest that the repeat-rich genomic regions give rise to small non-coding functional RNAs. MicroRNAs (miRNAs) are the most abundant type of non-coding regulatory RNAs. In spite of the widespread research on miRNAs, little is known about the hair-pin precursors of miRNAs bearing Simple Sequence Repeats (SSRs). We used the array of transcripts generated for the in silico prediction and detection of '43 pre-miRNA candidates bearing different types of SSR motifs'. The analysis identified 3913 different types of SSR motifs with an average of one SSR per 3.04 MB of the transcriptome. About 0.033% of the transcriptome constituted 'pre-miRNA candidates bearing SSRs'. The abundance, type and distribution of SSR motifs studied across the hair-pin miRNA precursors showed a significant bias in the position of SSRs towards the downstream of predicted 'pre-miRNA candidates'. The catalogue of transcripts identified, together with the demonstration of reliable existence of SSRs in the miRNA precursors, permits future opportunities for understanding the genetic mechanism of black pepper and likely functions of 'tandem repeats' in miRNAs.
Microsatellites for Lindera species

Treesearch

Craig S. Echt; D. Deemer; T.L. Kubisiak; C.D. Nelson

2006-01-01

Microsatellite markers were developed for conservation genetic studies of Lindera melissifolia (pondberry), a federally endangered shrub of southern bottomland ecosystems. Microsatellite sequences were obtained from DNA libraries that were enriched for the (AC)n simple sequence repeat motif. From 35 clone sequences, 20 primer...
Leucine zipper motif in RRS1 is crucial for the regulation of Arabidopsis dual resistance protein complex RPS4/RRS1

PubMed Central

Narusaka, Mari; Toyoda, Kazuhiro; Shiraishi, Tomonori; Iuchi, Satoshi; Takano, Yoshitaka; Shirasu, Ken; Narusaka, Yoshihiro

2016-01-01

Arabidopsis thaliana leucine-rich repeat-containing (NLR) proteins RPS4 and RRS1, known as dual resistance proteins, confer resistance to multiple pathogen isolates, such as the bacterial pathogens Pseudomonas syringae and Ralstonia solanacearum and the fungal pathogen Colletotrichum higginsianum. RPS4 is a typical Toll/interleukin 1 Receptor (TIR)-type NLR, whereas RRS1 is an atypical TIR-NLR that contains a leucine zipper (LZ) motif and a C-terminal WRKY domain. RPS4 and RRS1 are localised near each other in a head-to-head orientation. In this study, direct mutagenesis of the C-terminal LZ motif in RRS1 caused an autoimmune response and stunting in the mutant. Co-immunoprecipitation analysis indicated that full-length RPS4 and RRS1 are physically associated with one another. Furthermore, virus-induced gene silencing experiments showed that hypersensitive-like cell death triggered by RPS4/LZ motif-mutated RRS1 depends on EDS1. In conclusion, we suggest that the RRS1-LZ motif is crucial for the regulation of the RPS4/RRS1 complex. PMID:26750751
Mutations in repeating structural motifs of tropomyosin cause gain of function in skeletal muscle myopathy patients

PubMed Central

Marston, Steven; Memo, Massimiliano; Messer, Andrew; Papadaki, Maria; Nowak, Kristen; McNamara, Elyshia; Ong, Royston; El-Mezgueldi, Mohammed; Li, Xiaochuan; Lehman, William

2013-01-01

The congenital myopathies include a wide spectrum of clinically, histologically and genetically variable neuromuscular disorders many of which are caused by mutations in genes for sarcomeric proteins. Some congenital myopathy patients have a hypercontractile phenotype. Recent functional studies demonstrated that ACTA1 K326N and TPM2 ΔK7 mutations were associated with hypercontractility that could be explained by increased myofibrillar Ca2+ sensitivity. A recent structure of the complex of actin and tropomyosin in the relaxed state showed that both these mutations are located in the actin–tropomyosin interface. Tropomyosin is an elongated molecule with a 7-fold repeated motif of around 40 amino acids corresponding to the 7 actin monomers it interacts with. Actin binds to tropomyosin electrostatically at two points, through Asp25 and through a cluster of amino acids that includes Lys326, mutated in the gain-of-function mutation. Asp25 interacts with tropomyosin K6, next to K7 that was mutated in the other gain-of-function mutation. We identified four tropomyosin motifs interacting with Asp25 (K6-K7, K48-K49, R90-R91 and R167-K168) and three E-E/D-K/R motifs interacting with Lys326 (E139, E181 and E218), and we predicted that the known skeletal myopathy mutations ΔK7, ΔK49, R91G, ΔE139, K168E and E181K would cause a gain of function. Tests by an in vitro motility assay confirmed that these mutations increased Ca2+ sensitivity, while mutations not in these motifs (R167H, R244G) decreased Ca2+ sensitivity. The work reported here explains the molecular mechanism for 6 out of 49 known disease-causing mutations in the TPM2 and TPM3 genes, derived from structural data of the actin–tropomyosin interface. PMID:23886664
[SSR loci information analysis in transcriptome of Andrographis paniculata].

PubMed

Li, Jun-Ren; Chen, Xiu-Zhen; Tang, Xiao-Ting; He, Rui; Zhan, Ruo-Ting

2018-06-01

To study the SSR loci information and develop molecular markers, a total of 43 683 Unigenes in the transcriptome of Andrographis paniculata were used to explore SSRs. The distribution frequency of SSRs and the basic characteristics of repeat motifs were analyzed using MicroSAtellite software; SSR primers were designed with Primer 3.0 software and then validated by PCR. Moreover, the gene function analysis of SSR Unigenes was obtained by Blast. The results showed that 14 135 SSR loci were found in the transcriptome of A. paniculata, distributed in 9 973 Unigenes with a distribution frequency of 32.36%. Di-nucleotide and tri-nucleotide repeats were the main types, accounting for 75.54% of all SSRs. The repeat motifs AT/AT and CCG/CGG were the predominant repeat types of di-nucleotide and tri-nucleotide repeats, respectively. A total of 4 740 pairs of SSR primers with the potential to produce polymorphism were designed for marker development. Ten of 20 randomly picked primer pairs produced fragments of the expected molecular size. The gene functions of Unigenes containing SSRs were mostly related to the basic metabolism of A. paniculata. The SSR markers in the transcriptome of A. paniculata show rich types, strong specificity and high potential for polymorphism, which will benefit candidate gene mining and marker-assisted breeding. Copyright© by the Chinese Pharmaceutical Association.
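The counts in the Andrographis paniculata abstract allow a quick check of what the 32.36% "distribution frequency" measures. The sketch below is an inference from the reported figures, not a calculation described by the authors, and the variable names are illustrative. The value appears to match SSR loci per Unigene rather than the share of Unigenes that contain an SSR.

```python
# Figures as reported in the abstract.
total_unigenes = 43683
ssr_loci = 14135
ssr_unigenes = 9973

# SSR loci per 100 Unigenes: this reproduces the reported 32.36%.
loci_per_unigene_pct = round(100 * ssr_loci / total_unigenes, 2)

# Share of Unigenes carrying at least one SSR: a different, smaller figure.
unigene_share_pct = round(100 * ssr_unigenes / total_unigenes, 2)

print(loci_per_unigene_pct, unigene_share_pct)
```

The same two-ratio check is useful when comparing SSR frequencies across transcriptome studies, since the two definitions can differ by several percentage points.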
Myricetin Reduces Toxic Level of CAG Repeats RNA in Huntington's Disease (HD) and Spino Cerebellar Ataxia (SCAs).

PubMed

Khan, Eshan; Tawani, Arpita; Mishra, Subodh Kumar; Verma, Arun Kumar; Upadhyay, Arun; Kumar, Mohit; Sandhir, Rajat; Mishra, Amit; Kumar, Amit

2018-01-19

Huntington's disease (HD) is a neurodegenerative disorder that is caused by abnormal expansion of CAG repeats in the HTT gene. The transcribed mutant RNA contains expanded CAG repeats that translate into a mutant huntingtin protein. This expanded CAG repeat also causes mis-splicing of pre-mRNA due to sequestration of muscle blind like-1 splicing factor (MBNL1), and thus both of these elicit the pathogenesis of HD. Targeting the onset as well as progression of HD by small molecules could be a potent therapeutic approach. We have screened a set of small molecules to target this transcript and found Myricetin, a flavonoid, as a lead molecule that interacts with the CAG motif and thus prevents the translation of mutant huntingtin protein as well as sequestration of MBNL1. Here, we report the first solution structure of the complex formed between Myricetin and RNA containing the 5'CAG/3'GAC motif. Myricetin interacts with this RNA via base stacking at the AA mismatch. Moreover, Myricetin was also found reducing the proteo-toxicity generated due to the aggregation of polyglutamine, and further, its supplementation also improves neurobehavioral deficits in the HD mouse model. Our study provides the structural and mechanistic basis of Myricetin as an effective therapeutic candidate for HD and other polyQ related disorders.
Molecular Phylogenetic Analysis of Archaeal Intron-Containing Genes Coding for rRNA Obtained from a Deep-Subsurface Geothermal Water Pool

PubMed Central

Takai, Ken; Horikoshi, Koki

1999-01-01

Molecular phylogenetic analysis of a naturally occurring microbial community in a deep-subsurface geothermal environment indicated that the phylogenetic diversity of the microbial population in the environment was extremely limited and that only hyperthermophilic archaeal members closely related to Pyrobaculum were present. All archaeal ribosomal DNA sequences contained intron-like sequences, some of which had open reading frames with repeated homing-endonuclease motifs. The sequence similarity analysis and the phylogenetic analysis of these homing endonucleases suggested the possible phylogenetic relationship among archaeal rRNA-encoded homing endonucleases. PMID:10584021
Edge usage, motifs, and regulatory logic for cell cycling genetic networks

NASA Astrophysics Data System (ADS)

Zagorski, M.; Krzywicki, A.; Martin, O. C.

2013-01-01

The cell cycle is a tightly controlled process, yet it shows marked differences across species. Which of its structural features follow solely from the ability to control gene expression? We tackle this question in silico by examining the ensemble of all regulatory networks which satisfy the constraint of producing a given sequence of gene expressions. We focus on three cell cycle profiles coming from baker's yeast, fission yeast, and mammals. First, we show that the networks in each of the ensembles use just a few interactions that are repeatedly reused as building blocks. Second, we find an enrichment in network motifs that is similar in the two yeast cell cycle systems investigated. These motifs do not have autonomous functions, yet they reveal a regulatory logic for cell cycling based on a feed-forward cascade of activating interactions.
Divergent Synthesis of Chondroitin Sulfate Disaccharides and Identification of Sulfate Motifs that Inhibit Triple Negative Breast Cancer

NASA Astrophysics Data System (ADS)

Wei Poh, Zhong; Heng Gan, Chin; Lee, Eric J.; Guo, Suxian; Yip, George W.; Lam, Yulin

2015-09-01

Glycosaminoglycans (GAGs) regulate many important physiological processes. A pertinent issue to address is whether GAGs encode important functional information via introduction of position specific sulfate groups in the GAG structure. However, procurement of pure, homogenous GAG motifs to probe the “sulfation code” is a challenging task due to isolation difficulty and structural complexity. To this end, we devised a versatile synthetic strategy to obtain all the 16 theoretically possible sulfation patterns in the chondroitin sulfate (CS) repeating unit; these include rare but potentially important sulfated motifs which have not been isolated earlier. Biological evaluation indicated that CS sulfation patterns had differing effects for different breast cancer cell types, and the greatest inhibitory effect was observed for the most aggressive, triple negative breast cancer cell line MDA-MB-231.
Factors affecting genotyping success in giant panda fecal samples.

PubMed

Zhu, Ying; Liu, Hong-Yi; Yang, Hai-Qiong; Li, Yu-Dong; Zhang, He-Min

2017-01-01

Fecal samples play an important role in giant panda conservation studies. Optimal preservation conditions and choice of microsatellites for giant panda fecal samples have not been established. In this study, we evaluated the effect of four factors (namely, storage type (ethanol (EtOH), EtOH −20 °C, 2-step storage medium, DMSO/EDTA/Tris/salt buffer (DETs) and frozen at −20 °C), storage time (one, three and six months), fragment length, and repeat motif of microsatellite loci) on the success rate of microsatellite amplification, allelic dropout (ADO) and false allele (FA) rates from giant panda fecal samples. Amplification success and ADO rates differed between the storage types. Freezing was inferior to the other four storage methods based on the lowest average amplification success and the highest ADO rates (P < 0.05). The highest microsatellite amplification success was obtained from either EtOH or the 2-step storage medium at three storage time points. Storage time had a negative effect on the average amplification of microsatellites, and samples stored in EtOH and the 2-step storage medium were more stable than the other three storage types. An effect of repeat motif was detected only on ADO and FA rates: the lower ADO and FA rates were obtained from tri- and tetra-nucleotide loci. We suggest that freezing should not be used for giant panda fecal preservation in microsatellite studies, and that EtOH and the 2-step storage medium should be given priority for long-term storage. We recommend candidate microsatellite loci with longer repeat motifs to ensure greater genotyping success for giant panda fecal studies.
Factors affecting genotyping success in giant panda fecal samples

PubMed Central

Zhu, Ying; Liu, Hong-Yi; Yang, Hai-Qiong; Li, Yu-Dong

2017-01-01

Fecal samples play an important role in giant panda conservation studies. Optimal preservation conditions and choice of microsatellites for giant panda fecal samples have not been established. In this study, we evaluated the effect of four factors (namely, storage type (ethanol (EtOH), EtOH −20 °C, 2-step storage medium, DMSO/EDTA/Tris/salt buffer (DETs) and frozen at −20 °C), storage time (one, three and six months), fragment length, and repeat motif of microsatellite loci) on the success rate of microsatellite amplification, allelic dropout (ADO) and false allele (FA) rates from giant panda fecal samples. Amplification success and ADO rates differed between the storage types. Freezing was inferior to the other four storage methods based on the lowest average amplification success and the highest ADO rates (P < 0.05). The highest microsatellite amplification success was obtained from either EtOH or the 2-step storage medium at three storage time points. Storage time had a negative effect on the average amplification of microsatellites, and samples stored in EtOH and the 2-step storage medium were more stable than the other three storage types. An effect of repeat motif was detected only on ADO and FA rates: the lower ADO and FA rates were obtained from tri- and tetra-nucleotide loci. We suggest that freezing should not be used for giant panda fecal preservation in microsatellite studies, and that EtOH and the 2-step storage medium should be given priority for long-term storage. We recommend candidate microsatellite loci with longer repeat motifs to ensure greater genotyping success for giant panda fecal studies. PMID:28560107
An additional function of the rough endoplasmic reticulum protein complex prolyl 3-hydroxylase 1·cartilage-associated protein·cyclophilin B: the CXXXC motif reveals disulfide isomerase activity in vitro.

PubMed

Ishikawa, Yoshihiro; Bächinger, Hans Peter

2013-11-01

Collagen biosynthesis occurs in the rough endoplasmic reticulum, and many molecular chaperones and folding enzymes are involved in this process. The folding mechanism of type I procollagen has been well characterized, and protein disulfide isomerase (PDI) has been suggested as a key player in the formation of the correct disulfide bonds in the noncollagenous carboxyl-terminal and amino-terminal propeptides. Prolyl 3-hydroxylase 1 (P3H1) forms a hetero-trimeric complex with cartilage-associated protein and cyclophilin B (CypB). This complex is a multifunctional complex acting as a prolyl 3-hydroxylase, a peptidyl prolyl cis-trans isomerase, and a molecular chaperone. Two major domains are predicted from the primary sequence of P3H1: an amino-terminal domain and a carboxyl-terminal domain corresponding to the 2-oxoglutarate- and iron-dependent dioxygenase domains similar to the α-subunit of prolyl 4-hydroxylase and lysyl hydroxylases. The amino-terminal domain contains four CXXXC sequence repeats. The primary sequence of cartilage-associated protein is homologous to the amino-terminal domain of P3H1 and also contains four CXXXC sequence repeats. However, the function of the CXXXC sequence repeats is not known. Several publications have reported that short peptides containing a CXC or a CXXC sequence show oxido-reductase activity similar to PDI in vitro. We hypothesize that CXXXC motifs have oxido-reductase activity similar to the CXXC motif in PDI. We have tested the enzyme activities on model substrates in vitro using a GCRALCG peptide and the P3H1 complex. Our results suggest that this complex could function as a disulfide isomerase in the rough endoplasmic reticulum.
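The CXXXC pattern discussed above (a cysteine, any three residues, then a cysteine) can be located in a protein sequence with a short scan. The following is only a minimal sketch: the function name `find_cxxxc` is hypothetical, and it assumes a one-letter amino acid string as input. It is not the method used by the authors.

```python
import re

def find_cxxxc(protein):
    """Return 0-based start positions of C-x(3)-C motifs, including overlapping ones."""
    # A lookahead is used so that overlapping motifs (e.g. CAAACAAAC) are all reported.
    return [m.start() for m in re.finditer(r"(?=C[A-Z]{3}C)", protein.upper())]

# The model peptide named in the abstract contains one CXXXC motif (C-RAL-C).
print(find_cxxxc("GCRALCG"))
```

Applied to the GCRALCG peptide from the abstract, the scan reports a single motif starting at the second residue.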
Intrastrand triplex DNA repeats in bacteria: a source of genomic instability

PubMed Central

Holder, Isabelle T.; Wagner, Stefanie; Xiong, Peiwen; Sinn, Malte; Frickey, Tancred; Meyer, Axel; Hartig, Jörg S.

2015-01-01

Repetitive nucleic acid sequences are often prone to form secondary structures distinct from B-DNA. Prominent examples of such structures are DNA triplexes. We observed that certain intrastrand triplex motifs are highly conserved and abundant in prokaryotic genomes. A systematic search of 5246 different prokaryotic plasmids and genomes for intrastrand triplex motifs was conducted and the results summarized in the ITxF database available online at http://bioinformatics.uni-konstanz.de/utils/ITxF/. Next we investigated biophysical and biochemical properties of a particular G/C-rich triplex motif (TM) that occurs in many copies in more than 260 bacterial genomes by CD and nuclear magnetic resonance spectroscopy as well as in vivo footprinting techniques. A characterization of putative properties and functions of these unusually frequent nucleic acid motifs demonstrated that the occurrence of the TM is associated with a high degree of genomic instability. TM-containing genomic loci are significantly more rearranged among closely related Escherichia coli strains compared to control sites. In addition, we found very high frequencies of TM motifs in certain Enterobacteria and Cyanobacteria that were previously described as genetically highly diverse. In conclusion we link intrastrand triplex motifs with the induction of genomic instability. We speculate that the observed instability might be an adaptive feature of these genomes that creates variation for natural selection to act upon. PMID:26450966
A multiplexed microsatellite fingerprinting set for hazelnut cultivar identification

USDA-ARS's Scientific Manuscript database

The objective of this study was to develop a robust and cost-effective fingerprinting set for hazelnuts using microsatellite (SSR) markers. Twenty SSRs containing repeat motifs of ≥ three nucleotides distributed throughout the hazelnut genome were screened on eight genetically diverse cultivars to a...
Short-Sequence DNA Repeats in Prokaryotic Genomes

PubMed Central

van Belkum, Alex; Scherer, Stewart; van Alphen, Loek; Verbrugh, Henri

1998-01-01

Short-sequence DNA repeat (SSR) loci can be identified in all eukaryotic and many prokaryotic genomes. These loci harbor short or long stretches of repeated nucleotide sequence motifs. DNA sequence motifs in a single locus can be identical and/or heterogeneous. SSRs are encountered in many different branches of the prokaryote kingdom. They are found in genes encoding products as diverse as microbial surface components recognizing adhesive matrix molecules and specific bacterial virulence factors such as lipopolysaccharide-modifying enzymes or adhesins. SSRs enable genetic and consequently phenotypic flexibility. SSRs function at various levels of gene expression regulation. Variations in the number of repeat units per locus or changes in the nature of the individual repeat sequences may result from recombination processes or polymerase inadequacy such as slipped-strand mispairing (SSM), either alone or in combination with DNA repair deficiencies. These rather complex phenomena can occur with relative ease, with SSM approaching a frequency of 10⁻⁴ per bacterial cell division and allowing high-frequency genetic switching. Bacteria use this random strategy to adapt their genetic repertoire in response to selective environmental pressure. SSR-mediated variation has important implications for bacterial pathogenesis and evolutionary fitness. Molecular analysis of changes in SSRs allows epidemiological studies on the spread of pathogenic bacteria. The occurrence, evolution and function of SSRs, and the molecular methods used to analyze them are discussed in the context of responsiveness to environmental factors, bacterial pathogenicity, epidemiology, and the availability of full-genome sequences for increasing numbers of microorganisms, especially those that are medically relevant. PMID:9618442
Identification of TTAGGG-binding proteins in Neurospora crassa, a fungus with vertebrate-like telomere repeats.

PubMed

Casas-Vila, Núria; Scheibe, Marion; Freiwald, Anja; Kappei, Dennis; Butter, Falk

2015-11-17

To date, telomere research in fungi has mainly focused on Saccharomyces cerevisiae and Schizosaccharomyces pombe, despite the fact that both yeasts have degenerated telomeric repeats in contrast to the canonical TTAGGG motif found in vertebrates and also several other fungi. Using label-free quantitative proteomics, we here investigate the telosome of Neurospora crassa, a fungus with canonical telomeric repeats. We show that at least six of the candidates detected in our screen are direct TTAGGG-repeat binding proteins. While three of the direct interactors (NCU03416 [ncTbf1], NCU01991 [ncTbf2] and NCU02182 [ncTay1]) feature the known myb/homeobox DNA interaction domain also found in the vertebrate telomeric factors, we additionally show that a zinc-finger protein (NCU07846) and two proteins without any annotated DNA-binding domain (NCU02644 and NCU05718) are also direct double-strand TTAGGG binders. We further find two single-strand binders (NCU02404 [ncGbp2] and NCU07735 [ncTcg1]). By quantitative label-free interactomics we identify TTAGGG-binding proteins in Neurospora crassa, suggesting candidates for telomeric factors that are supported by phylogenomic comparison with yeast species. Intriguingly, homologs in yeast species with degenerated telomeric repeats are also TTAGGG-binding proteins, e.g. in S. cerevisiae Tbf1 recognizes the TTAGGG motif found in its subtelomeres. However, there is also a subset of proteins that is not conserved. While a rudimentary core TTAGGG-recognition machinery may be conserved across yeast species, our data suggests Neurospora as an emerging model organism with unique features.
Development of unigene-derived SSR markers in cowpea (Vigna unguiculata) and their transferability to other Vigna species.

PubMed

Gupta, S K; Gopalakrishna, T

2010-07-01

Unigene sequences available in public databases provide a cost-effective and valuable source for the development of molecular markers. In this study, the identification and development of unigene-based SSR markers in cowpea (Vigna unguiculata (L.) Walp.) is presented. A total of 1071 SSRs were identified in 15 740 cowpea unigene sequences downloaded from the National Center for Biotechnology Information. The most frequent SSR motifs present in the unigenes were trinucleotides (59.7%), followed by dinucleotides (34.8%), pentanucleotides (4%), and tetranucleotides (1.5%). The copy number varied from 6 to 33 for dinucleotide, 5 to 29 for trinucleotide, 5 to 7 for tetranucleotide, and 4 to 6 for pentanucleotide repeats. Primer pairs were successfully designed for 803 SSR motifs and 102 SSR markers were finally characterized and validated. Putative function was assigned to 64.7% of the unigene SSR markers based on significant homology to reported proteins. About 31.7% of the SSRs were present in coding sequences and 68.3% in untranslated regions of the genes. About 87% of the SSRs located in the coding sequences were trinucleotide repeats. Allelic variation at 32 SSR loci produced 98 alleles in 20 cowpea genotypes. The polymorphic information content for the SSR markers varied from 0.10 to 0.83 with an average of 0.53. These unigene SSR markers showed a high rate of transferability (88%) across other Vigna species, thereby expanding their utility. Alignment of unigene sequences with soybean genomic sequences revealed the presence of introns in amplified products of some of the SSR markers. This study presents the distribution of SSRs in the expressed portion of the cowpea genome and is the first report of the development of functional unigene-based SSR markers in cowpea. These SSR markers would play an important role in molecular mapping, comparative genomics, and marker-assisted selection strategies in cowpea and other Vigna species.
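As an illustration of how perfect SSRs of the kinds counted above can be located in a sequence, here is a minimal regular-expression sketch. The function name `find_ssrs` and the minimum copy numbers in `MIN_REPEATS` are assumptions chosen to match the smallest copy numbers reported in the abstract (6 for di-, 5 for tri- and tetra-, 4 for pentanucleotides); the study's actual search tool and thresholds are not stated here.

```python
import re

# Assumed minimum copy numbers per motif length (hypothetical thresholds).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 4}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs (0-based start)."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AAA" or "ATAT").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Toy sequence with one (AT)7 and one (CCG)5 repeat.
print(find_ssrs("GG" + "AT" * 7 + "CC" + "CCG" * 5 + "TT"))
```

This sketch reports only perfect repeats on the given strand; compound or interrupted SSRs would need additional handling.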
A Hot-Spot Motif Characterizes the Interface between a Designed Ankyrin-Repeat Protein and Its Target Ligand

PubMed Central

Cheung, Luthur Siu-Lun; Kanwar, Manu; Ostermeier, Marc; Konstantopoulos, Konstantinos

2012-01-01

Nonantibody scaffolds such as designed ankyrin repeat proteins (DARPins) can be rapidly engineered to detect diverse target proteins with high specificity and offer an attractive alternative to antibodies. Using molecular simulations, we predicted that the binding interface between DARPin off7 and its ligand (maltose binding protein; MBP) is characterized by a hot-spot motif in which binding energy is largely concentrated on a few amino acids. To experimentally test this prediction, we fused MBP to a transmembrane domain to properly orient the protein into a polymer-cushioned lipid bilayer, and characterized its interaction with off7 using force spectroscopy. Using this, to our knowledge, novel technique along with surface plasmon resonance, we validated the simulation predictions and characterized the effects of select mutations on the kinetics of the off7-MBP interaction. Our integrated approach offers scientific insights on how the engineered protein interacts with the target molecule. PMID:22325262

Molecular cloning, gene expression analysis, and recombinant protein expression of novel silk proteins from larvae of a retreat-maker caddisfly, Stenopsyche marmorata.

PubMed

Bai, Xue; Sakaguchi, Mayo; Yamaguchi, Yuko; Ishihara, Shiori; Tsukada, Masuhiro; Hirabayashi, Kimio; Ohkawa, Kousaku; Nomura, Takaomi; Arai, Ryoichi

2015-08-28

Retreat-maker larvae of Stenopsyche marmorata, one of the major caddisfly species in Japan, produce silk threads and adhesives to build food capture nets and protective nests in water. Research on these underwater adhesive silk proteins potentially leads to the development of new functional biofiber materials. Recently, we identified four major S. marmorata silk proteins (Smsps), Smsp-1, Smsp-2, Smsp-3, and Smsp-4 from silk glands of S. marmorata larvae. In this study, we cloned full-length cDNAs of Smsp-2, Smsp-3, and Smsp-4 from the cDNA library of the S. marmorata silk glands to reveal the primary sequences of Smsps. Homology search results of the deduced amino acid sequences indicate that Smsp-2 and Smsp-4 are novel proteins. The Smsp-2 sequence [167 amino acids (aa)] has an array of GYD-rich repeat motifs and two (SX)4E motifs. The Smsp-4 sequence (132 aa) contains a number of GW-rich repeat motifs and three (SX)4E motifs. The Smsp-3 sequence (248 aa) exhibits high homology with fibroin light chain of other caddisflies. Gene expression analysis of Smsps by real-time PCR suggested that the gene expression of Smsp-1 and Smsp-3 was relatively stable throughout the year, whereas that of Smsp-2 and Smsp-4 varied seasonally. Furthermore, Smsps recombinant protein expression was successfully performed in Escherichia coli. The study provides new molecular insights into caddisfly aquatic silk and its potential for future applications. Copyright © 2015 Elsevier Inc. All rights reserved.
Ubiquitous presence of the hammerhead ribozyme motif along the tree of life

PubMed Central

de la Peña, Marcos; García-Robles, Inmaculada

2010-01-01

Examples of small self-cleaving RNAs embedded in noncoding regions already have been found to be involved in the control of gene expression, although their origin remains uncertain. In this work, we show the widespread occurrence of the hammerhead ribozyme (HHR) motif among genomes from the Bacteria, Chromalveolata, Plantae, and Metazoa kingdoms. Intergenic HHRs were detected in three different bacterial genomes, whereas metagenomic data from Galapagos Islands showed the occurrence of similar ribozymes that could be regarded as direct relics from the RNA world. Among eukaryotes, HHRs were detected in the genomes of three water molds as well as 20 plant species, ranging from unicellular algae to vascular plants. These HHRs were very similar to those previously described in small RNA plant pathogens and, in some cases, appeared as close tandem repetitions. A parallel situation of tandemly repeated HHR motifs was also detected in the genomes of lower metazoans from cnidarians to invertebrates, with special emphasis among hematophagous and parasitic organisms. Altogether, these findings unveil the HHR as a widespread motif in DNA genomes, which would be involved in new forms of retrotransposable elements. PMID:20705646
Electrostatics and N-glycan-mediated membrane tethering of SCUBE1 is critical for promoting bone morphogenetic protein signalling.

PubMed

Liao, Wei-Ju; Tsao, Ku-Chi; Yang, Ruey-Bing

2016-03-01

SCUBE1 (S1), a secreted and membrane-bound glycoprotein, has a modular protein structure composed of an N-terminal signal peptide sequence followed by nine epidermal growth factor (EGF)-like repeats, a spacer region and three cysteine-rich (CR) motifs with multiple potential N-linked glycosylation sites, and one CUB domain at the C-terminus. Soluble S1 is a biomarker of platelet activation but an active participant of thrombosis via its adhesive EGF-like repeats, whereas its membrane-associated form acts as a bone morphogenetic protein (BMP) co-receptor in promoting BMP signal activity. However, the mechanism responsible for the membrane tethering and the biological importance of N-glycosylation of S1 remain largely unknown. In the present study, molecular mapping analysis identified a polycationic segment (amino acids 501-550) in the spacer region required for its membrane tethering via electrostatic interactions possibly with the anionic heparan sulfate proteoglycans. Furthermore, deglycosylation by peptide N-glycosidase F treatment revealed that N-glycans within the CR motif are essential for membrane recruitment through lectin-mediated surface retention. Injection of mRNA encoding zebrafish wild-type but not N-glycan-deficient scube1 restores the expression of haematopoietic and erythroid markers (scl and gata1) in scube1-knockdown embryos. We describe novel mechanisms in targeting S1 to the plasma membrane and demonstrate that N-glycans are required for S1 functions during primitive haematopoiesis in zebrafish. © 2016 Authors; published by Portland Press Limited.
Analysis of SSR information in EST resources of sugarcane

USDA-ARS's Scientific Manuscript database

Expressed sequence tags (ESTs) offer the opportunity to exploit single, low-copy, conserved sequence motifs for the development of simple sequence repeats (SSRs). A total of 262 113 ESTs of sugarcane (Saccharum officinarum) in the NCBI database were downloaded and analyzed, which resulted in...
Motifs, modules and games in bacteria.

PubMed

Wolf, Denise M; Arkin, Adam P

2003-04-01

Global explorations of regulatory network dynamics, organization and evolution have become tractable thanks to high-throughput sequencing and molecular measurement of bacterial physiology. From these, a nascent conceptual framework is developing that views the principles of regulation in terms of motifs, modules and games. Motifs are small, repeated, and conserved biological units ranging from molecular domains to small reaction networks. They are arranged into functional modules, genetically dissectible cellular functions such as the cell cycle, or different stress responses. The dynamical functioning of modules defines the organism's strategy to survive in a game, pitting cell against cell, and cell against environment. Placing pathway structure and dynamics into an evolutionary context begins to allow discrimination between those physical and molecular features that particularize a species to its surroundings, and those that provide core physiological function. This approach promises to generate a higher level understanding of cellular design, pathway evolution and cellular bioengineering.
Motifs, modules and games in bacteria

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wolf, Denise M.; Arkin, Adam P.

2003-04-01

Global explorations of regulatory network dynamics, organization and evolution have become tractable thanks to high-throughput sequencing and molecular measurement of bacterial physiology. From these, a nascent conceptual framework is developing that views the principles of regulation in terms of motifs, modules and games. Motifs are small, repeated, and conserved biological units ranging from molecular domains to small reaction networks. They are arranged into functional modules, genetically dissectible cellular functions such as the cell cycle, or different stress responses. The dynamical functioning of modules defines the organism's strategy to survive in a game, pitting cell against cell, and cell against environment. Placing pathway structure and dynamics into an evolutionary context begins to allow discrimination between those physical and molecular features that particularize a species to its surroundings, and those that provide core physiological function. This approach promises to generate a higher level understanding of cellular design, pathway evolution and cellular bioengineering.
Isolation and analysis of a multifunctional triterpene synthase KcMS promoter region from mangrove plant kandelia candel

NASA Astrophysics Data System (ADS)

Basyuni, M.; Wati, R.; Sulistiyono, N.; Sumardi; Oku, H.; Baba, S.; Sagami, H.

2018-03-01

The Kandelia candel KcMS gene has previously been cloned and shown to encode a multifunctional triterpene synthase. In this study, the KcMS gene promoter was cloned through genome walking, sequenced, and analyzed. A 1,358 bp genomic DNA fragment of the KcMS promoter was obtained. PLACE and PlantCARE analysis of the KcMS promoter revealed regulatory elements that respond to environmental signals and are involved in the regulation of gene expression. Four kinds of hormone-responsive elements were found, namely two MeJA-responsiveness elements (CGTCA-motif and TGACG-motif), the ABRE (TACGTG) involved in abscisic acid responsiveness, the gibberellin-related GARE-motif (AAACAGA), and the TGA-element (AACGAC), an auxin-responsive element. Several elements in the KcMS promoter have been shown in other plants to be responsive to abiotic stress. These motifs were MBS (CAACTG), TC-rich repeats, and eight light-responsive elements. The KcMS promoter also contained elements involved in the activation of defense genes in plants, such as HSE (AAAAAATTC), and four circadian control elements (CAANNNNATC). The presence of multiple potential regulatory motifs suggests that KcMS may be involved in the regulation of plant tolerance to several types of stress.
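The cis-element sequences quoted in this abstract can be searched for in a promoter sequence with a simple pattern scan. The sketch below is illustrative only: `scan_promoter` and the `ELEMENTS` table are hypothetical names, the sequences for the two MeJA elements are assumed from their motif names, and the scan covers only the strand given. It does not reproduce the PLACE or PlantCARE analyses.

```python
import re

# Element sequences as quoted in the abstract; N stands for any base.
# The CGTCA/TGACG sequences are assumed from the motif names.
ELEMENTS = {
    "CGTCA-motif": "CGTCA",
    "TGACG-motif": "TGACG",
    "ABRE": "TACGTG",
    "GARE-motif": "AAACAGA",
    "TGA-element": "AACGAC",
    "MBS": "CAACTG",
    "HSE": "AAAAAATTC",
    "circadian": "CAANNNNATC",
}

def scan_promoter(seq, elements=ELEMENTS):
    """Map each element name to the 0-based positions where it occurs in seq."""
    seq = seq.upper()
    hits = {}
    for name, motif in elements.items():
        # Expand the N wildcard and use a lookahead so overlapping sites are kept.
        pattern = "(?=" + motif.replace("N", "[ACGT]") + ")"
        positions = [m.start() for m in re.finditer(pattern, seq)]
        if positions:
            hits[name] = positions
    return hits

# Toy sequence (not the KcMS promoter) containing one ABRE and one circadian element.
print(scan_promoter("GG" + "TACGTG" + "CC" + "CAAGGTTATC" + "AA"))
```

Short motifs such as these occur frequently by chance, so hits from a scan like this are candidates rather than evidence of function.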
A comprehensive characterization of simple sequence repeats in pepper genomes provides valuable resources for marker development in Capsicum.

PubMed

Cheng, Jiaowen; Zhao, Zicheng; Li, Bo; Qin, Cheng; Wu, Zhiming; Trejo-Saavedra, Diana L; Luo, Xirong; Cui, Junjie; Rivera-Bustamante, Rafael F; Li, Shuaicheng; Hu, Kailin

2016-01-07

The sequences of the full set of pepper genomes including nuclear, mitochondrial and chloroplast are now available for use. However, the overall distribution of simple sequence repeats (SSRs) in these genomes and their practical implications for molecular marker development in Capsicum have not yet been described. Here, an average of 868,047.50, 45.50 and 30.00 SSR loci were identified in the nuclear, mitochondrial and chloroplast genomes of pepper, respectively. Subsequently, systematic comparisons of various species, genome types, motif lengths, repeat numbers and classified types were executed and discussed. In addition, a local database composed of 113,500 in silico unique SSR primer pairs was built using a homemade bioinformatics workflow. As a pilot study, 65 polymorphic markers were validated among a wide collection of 21 Capsicum genotypes with allele number and polymorphic information content value per marker ranging from 2 to 6 and 0.05 to 0.64, respectively. Finally, a comparison of the clustering results with those of a previous study indicated the usability of the newly developed SSR markers. In summary, this first report on the comprehensive characterization of SSR motifs in pepper genomes and the very large set of SSR primer pairs will benefit various genetic studies in Capsicum.
A comprehensive characterization of simple sequence repeats in pepper genomes provides valuable resources for marker development in Capsicum

PubMed Central

Cheng, Jiaowen; Zhao, Zicheng; Li, Bo; Qin, Cheng; Wu, Zhiming; Trejo-Saavedra, Diana L.; Luo, Xirong; Cui, Junjie; Rivera-Bustamante, Rafael F.; Li, Shuaicheng; Hu, Kailin

2016-01-01

The sequences of the full set of pepper genomes including nuclear, mitochondrial and chloroplast are now available for use. However, the overall distribution of simple sequence repeats (SSRs) in these genomes and their practical implications for molecular marker development in Capsicum have not yet been described. Here, an average of 868,047.50, 45.50 and 30.00 SSR loci were identified in the nuclear, mitochondrial and chloroplast genomes of pepper, respectively. Subsequently, systematic comparisons of various species, genome types, motif lengths, repeat numbers and classified types were executed and discussed. In addition, a local database composed of 113,500 in silico unique SSR primer pairs was built using a homemade bioinformatics workflow. As a pilot study, 65 polymorphic markers were validated among a wide collection of 21 Capsicum genotypes with allele number and polymorphic information content value per marker ranging from 2 to 6 and 0.05 to 0.64, respectively. Finally, a comparison of the clustering results with those of a previous study indicated the usability of the newly developed SSR markers. In summary, this first report on the comprehensive characterization of SSR motifs in pepper genomes and the very large set of SSR primer pairs will benefit various genetic studies in Capsicum. PMID:26739748
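The polymorphic information content (PIC) values quoted above are commonly computed per marker from allele frequencies as PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j². The following is a minimal sketch of that standard formula; the function name `pic` is hypothetical, and the abstract does not state which software or formula variant the authors used.

```python
def pic(freqs):
    """Polymorphic information content from a list of allele frequencies summing to 1."""
    n = len(freqs)
    # Expected heterozygosity term: 1 - sum of squared allele frequencies.
    het = 1.0 - sum(p * p for p in freqs)
    # Correction term summed over all unordered allele pairs.
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return het - cross

# Two equally frequent alleles: 1 - 0.5 - 2 * 0.25 * 0.25 = 0.375.
print(pic([0.5, 0.5]))
```

A marker with a single allele gives a PIC of 0, and the value rises as alleles become more numerous and more evenly distributed.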
Modulation of the multistate folding of designed TPR proteins through intrinsic and extrinsic factors

PubMed Central

Phillips, J J; Javadi, Y; Millership, C; Main, E R G

2012-01-01

Tetratricopeptide repeats (TPRs) are a class of all alpha-helical repeat proteins that are comprised of 34-aa helix-turn-helix motifs. These stack together to form nonglobular structures that are stabilized by short-range interactions from residues close in primary sequence. Unlike globular proteins, they have few, if any, long-range nonlocal stabilizing interactions. Several studies on designed TPR proteins have shown that this modular structure is reflected in their folding, that is, modular multistate folding is observed as opposed to two-state folding. Here we show that TPR multistate folding can be suppressed to approximate two-state folding through modulation of intrinsic stability or extrinsic environmental variables. This modulation was investigated by comparing the thermodynamic unfolding under differing buffer regimes of two distinct series of consensus-designed TPR proteins, which possess different intrinsic stabilities. A total of nine proteins of differing sizes and differing consensus TPR motifs were each thermally and chemically denatured and their unfolding monitored using differential scanning calorimetry (DSC) and CD/fluorescence, respectively. Analyses of both the DSC and chemical denaturation data show that reducing the total stability of each protein and repeat units leads to observable two-state unfolding. These data highlight the intimate link between global and intrinsic repeat stability that governs whether folding proceeds by an observably two-state mechanism, or whether partial unfolding yields stable intermediate structures which retain sufficient stability to be populated at equilibrium. PMID:22170589
A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

PubMed Central

2010-01-01

Background Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. 
These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the repeat may be disseminated by HGT and intra-genomic shuffling. Conclusions We describe novel features of PARCELs (Palindromic Amphipathic Repeat Coding ELements), a set of widely distributed repeat protein domains and coding sequences that were likely acquired through HGT by diverse unicellular microbes, further mobilized and diversified within genomes, and co-opted for expression in the membrane proteome of some taxa. Disseminated by multiple gene-centric vehicles, ORFs harboring these elements enhance accessory gene pools as part of the "mobilome" connecting genomes of various clades, in taxa sharing common niches. PMID:20626840
New PAH gene promoter KLF1 and 3'-region C/EBPalpha motifs influence transcription in vitro.

PubMed

Klaassen, Kristel; Stankovic, Biljana; Kotur, Nikola; Djordjevic, Maja; Zukic, Branka; Nikcevic, Gordana; Ugrin, Milena; Spasovski, Vesna; Srzentic, Sanja; Pavlovic, Sonja; Stojiljkovic, Maja

2017-02-01

Phenylketonuria (PKU) is a metabolic disease caused by mutations in the phenylalanine hydroxylase (PAH) gene. Although the PAH genotype remains the main determinant of PKU phenotype severity, genotype-phenotype inconsistencies have been reported. In this study, we focused on unanalysed sequences in non-coding PAH gene regions to assess their possible influence on the PKU phenotype. We transiently transfected HepG2 cells with various chloramphenicol acetyl transferase (CAT) reporter constructs which included PAH gene non-coding regions. Selected non-coding regions were indicated by in silico prediction to contain transcription factor binding sites. Furthermore, electrophoretic mobility shift assay (EMSA) and supershift assays were performed to identify which transcriptional factors were engaged in the interaction. We found a novel KLF1 motif in the PAH promoter, which decreases CAT activity by 50 % in comparison to basal transcription in vitro. The cytosine at the c.-170 promoter position creates an additional binding site for the protein complex involving the KLF1 transcription factor. Moreover, we assessed for the first time the role of a multivariant variable number tandem repeat (VNTR) region located in the 3'-region of the PAH gene. We found that the VNTR3, VNTR7 and VNTR8 constructs had approximately 60 % of CAT activity. The regulation is mediated by the C/EBPalpha transcription factor, present in the protein complex binding to VNTR3. Our study highlighted two novel promoter KLF1 and 3'-region C/EBPalpha motifs in the PAH gene which decrease transcription in vitro and, thus, could be considered as PAH expression modifiers. New transcription motifs in non-coding regions will contribute to better understanding of the PKU phenotype complexity and may become important for the optimisation of PKU treatment.
A Glance at Microsatellite Motifs from 454 Sequencing Reads of Watermelon Genomic DNA

USDA-ARS's Scientific Manuscript database

A single 454 (Life Sciences Sequencing Technology) run of Charleston Gray watermelon (Citrullus lanatus var. lanatus) genomic DNA was performed and sequence data were assembled. A large scale identification of simple sequence repeat (SSR) was performed and SSR sequence data were used for the develo...
Sequence analysis reveals genomic factors affecting EST-SSR primer performance and polymorphism

USDA-ARS's Scientific Manuscript database

Search for simple sequence repeat (SSR) motifs and design of flanking primers in expressed sequence tag (EST) sequences can be easily done at a large scale using bioinformatics programs. However, failed amplification and/or detection, along with lack of polymorphism, is often seen among randomly sel...
Memory instability as a gateway to generalization

PubMed Central

2018-01-01

Our present frequently resembles our past. Patterns of actions and events repeat throughout our lives like a motif. Identifying and exploiting these patterns are fundamental to many behaviours, from creating grammar to the application of skill across diverse situations. Such generalization may be dependent upon memory instability. Following their formation, memories are unstable and able to interact with one another, allowing, at least in principle, common features to be extracted. Exploiting these common features creates generalized knowledge that can be applied across varied circumstances. Memory instability explains many of the biological and behavioural conditions necessary for generalization and offers predictions for how generalization is produced. PMID:29554094
The tripartite motif coiled-coil is an elongated antiparallel hairpin dimer.

PubMed

Sanchez, Jacint G; Okreglicka, Katarzyna; Chandrasekaran, Viswanathan; Welker, Jordan M; Sundquist, Wesley I; Pornillos, Owen

2014-02-18

Tripartite motif (TRIM) proteins make up a large family of coiled-coil-containing RING E3 ligases that function in many cellular processes, particularly innate antiviral response pathways. Both dimerization and higher-order assembly are important elements of TRIM protein function, but the atomic details of TRIM tertiary and quaternary structure have not been fully understood. Here, we present crystallographic and biochemical analyses of the TRIM coiled-coil and show that TRIM proteins dimerize by forming interdigitating antiparallel helical hairpins that position the N-terminal catalytic RING domains at opposite ends of the dimer and the C-terminal substrate-binding domains at the center. The dimer core comprises an antiparallel coiled-coil with a distinctive, symmetric pattern of flanking heptad and central hendecad repeats that appear to be conserved across the entire TRIM family. Our studies reveal how the coiled-coil organizes TRIM25 to polyubiquitylate the RIG-I/viral RNA recognition complex and how dimers of the TRIM5α protein are arranged within hexagonal arrays that recognize the HIV-1 capsid lattice and restrict retroviral replication.
The tripartite motif coiled-coil is an elongated antiparallel hairpin dimer

PubMed Central

Sanchez, Jacint G.; Okreglicka, Katarzyna; Chandrasekaran, Viswanathan; Welker, Jordan M.; Sundquist, Wesley I.; Pornillos, Owen

2014-01-01

Tripartite motif (TRIM) proteins make up a large family of coiled-coil-containing RING E3 ligases that function in many cellular processes, particularly innate antiviral response pathways. Both dimerization and higher-order assembly are important elements of TRIM protein function, but the atomic details of TRIM tertiary and quaternary structure have not been fully understood. Here, we present crystallographic and biochemical analyses of the TRIM coiled-coil and show that TRIM proteins dimerize by forming interdigitating antiparallel helical hairpins that position the N-terminal catalytic RING domains at opposite ends of the dimer and the C-terminal substrate-binding domains at the center. The dimer core comprises an antiparallel coiled-coil with a distinctive, symmetric pattern of flanking heptad and central hendecad repeats that appear to be conserved across the entire TRIM family. Our studies reveal how the coiled-coil organizes TRIM25 to polyubiquitylate the RIG-I/viral RNA recognition complex and how dimers of the TRIM5α protein are arranged within hexagonal arrays that recognize the HIV-1 capsid lattice and restrict retroviral replication. PMID:24550273
Mouse TCOF1 is expressed widely, has motifs conserved in nucleolar phosphoproteins, and maps to chromosome 18.

PubMed

Paznekas, W A; Zhang, N; Gridley, T; Jabs, E W

1997-09-08

Mutations in the human TCOF1 gene have been identified in patients with Treacher Collins Syndrome (Mandibulofacial Dysostosis), an autosomal dominant condition affecting the craniofacial region. We report the isolation of the entire mouse Tcof1 coding sequence (3960 bp) by performing a computer-based search for mouse cDNA clones homologous to TCOF1 and generating overlapping RT-PCR products from mouse RNA. Tcof1 is a 1320 amino acid protein of 135 kDa with 61.4% identity to TCOF1 and displays repeating motifs enriched for serine- and acidic amino acid-rich regions with potential phosphorylation sites and putative nuclear localization signals. Tcof1 maps to the mouse chromosome 18 region syntenic with human chromosome 5q32→q33, which contains the TCOF1 locus. Northern blot hybridization indicates Tcof1 expression is ubiquitous in adult tissues and, in the embryonic stage, is elevated at 11 dpc when the branchial arches and facial swellings are present in mouse. Our results are consistent with TCOF1 mutations leading to the Treacher Collins syndrome phenotype.
Quantitative statistical analysis of cis-regulatory sequences in ABA/VP1- and CBF/DREB1-regulated genes of Arabidopsis.

PubMed

Suzuki, Masaharu; Ketterling, Matthew G; McCarty, Donald R

2005-09-01

We have developed a simple quantitative computational approach for objective analysis of cis-regulatory sequences in promoters of coregulated genes. The program, designated MotifFinder, identifies oligo sequences that are overrepresented in promoters of coregulated genes. We used this approach to analyze promoter sequences of Viviparous1 (VP1)/abscisic acid (ABA)-regulated genes and cold-regulated genes, respectively, of Arabidopsis (Arabidopsis thaliana). We detected significantly enriched sequences in up-regulated genes but not in down-regulated genes. This result suggests that gene activation but not repression is mediated by specific and common sequence elements in promoters. The enriched motifs include several known cis-regulatory sequences as well as previously unidentified motifs. With respect to known cis-elements, we dissected the flanking nucleotides of the core sequences of Sph element, ABA response elements (ABREs), and the C repeat/dehydration-responsive element. This analysis identified the motif variants that may correlate with qualitative and quantitative differences in gene expression. While both VP1 and cold responses are mediated in part by ABA signaling via ABREs, these responses correlate with unique ABRE variants distinguished by nucleotides flanking the ACGT core. ABRE and Sph motifs are tightly associated uniquely in the coregulated set of genes showing a strict dependence on VP1 and ABA signaling. Finally, analysis of distribution of the enriched sequences revealed a striking concentration of enriched motifs in a proximal 200-base region of VP1/ABA and cold-regulated promoters. Overall, each class of coregulated genes possesses a discrete set of the enriched motifs with unique distributions in their promoters that may account for the specificity of gene regulation.
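The abstract does not describe MotifFinder's statistic, so the following is only a generic sketch of how oligo over-representation in coregulated promoters can be scored: it counts how many promoters contain each k-mer and applies a one-sided hypergeometric test. All function names, the k-mer length, the threshold `alpha`, and the toy sequences are assumptions made for illustration.

```python
from math import comb


def kmers_in(seq, k):
    """Set of distinct k-mers occurring in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hypergeom_sf(x, N, K, n):
    """P(X >= x) when n promoters are drawn from N, of which K contain the k-mer."""
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(x, min(K, n) + 1)) / total


def enriched_kmers(foreground, background, k=6, alpha=0.05):
    """Return (kmer, fg_count, total_count, p_value) for k-mers enriched in foreground."""
    fg_sets = [kmers_in(s.upper(), k) for s in foreground]
    all_sets = fg_sets + [kmers_in(s.upper(), k) for s in background]
    N, n = len(all_sets), len(fg_sets)
    results = []
    for kmer in set().union(*fg_sets):
        x = sum(kmer in s for s in fg_sets)   # coregulated promoters with the k-mer
        K = sum(kmer in s for s in all_sets)  # all promoters with the k-mer
        p = hypergeom_sf(x, N, K, n)
        if p < alpha:
            results.append((kmer, x, K, p))
    return sorted(results, key=lambda r: r[3])


if __name__ == "__main__":
    # Toy promoters (hypothetical): every foreground sequence carries ACGTGG.
    fg = ["TTACGTGGCA", "GGACGTGGTT", "CACGTGGAAA", "AAACGTGGCC"]
    bg = ["TTTTTTTTTT", "GGGGGGGGGG", "CCCCCCCCCC",
          "AAAAAAAAAA", "TGTGTGTGTG", "CACACACACA"]
    for row in enriched_kmers(fg, bg):
        print(row)
```

A real analysis would also correct for multiple testing across all k-mers and consider both strands.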
Structure and Dynamics of DNA and RNA Double Helices Obtained from the CCG and GGC Trinucleotide Repeats.

PubMed

Pan, Feng; Man, Viet Hoang; Roland, Christopher; Sagui, Celeste

2018-04-26

Expansions of both GGC and CCG sequences lead to a number of expandable, trinucleotide repeat (TR) neurodegenerative diseases. Understanding of these diseases involves, among other things, the structural characterization of the atypical DNA and RNA secondary structures. We have performed molecular dynamics simulations of (GCC)n and (GGC)n homoduplexes in order to characterize their conformations, stability, and dynamics. Each TR has two reading frames, which results in eight nonequivalent RNA/DNA homoduplexes, characterized by CpG or GpC steps between the Watson-Crick base pairs. Free energy maps for the eight homoduplexes indicate that the C-mismatches prefer anti-anti conformations, while G-mismatches prefer anti-syn conformations. Comparison between three modifications of the DNA AMBER force field shows good agreement for the mismatch free energy maps. The mismatches in DNA-GCC (but not CCG) are extrahelical, forming an extended e-motif. The mismatched duplexes exhibit characteristic sequence-dependent step twist, with strong variations in the G-rich sequences and the e-motif. The distribution of Na+ is highly localized around the mismatches, especially G-mismatches. In the e-motif, there is strong Na+ binding by two G(N7) atoms belonging to the pseudo GpC step created when cytosines are extruded and by extrahelical cytosines. Finally, we used a novel technique based on fast melting by means of an infrared laser pulse to classify the relative stability of the different DNA-CCG and -GGC homoduplexes.

CACTA-superfamily transposable element is inserted in MYB transcription factor gene of soybean line producing variegated seeds.

PubMed

Yan, Fan; Di, Shaokang; Takahashi, Ryoji

2015-08-01

The R gene of soybean, presumably encoding a MYB transcription factor, controls seed coat color. The gene consists of multiple alleles, R (black), r-m (black spots and (or) concentric streaks on brown seed), and r (brown seed). This study was conducted to determine the structure of the MYB transcription factor gene in a near-isogenic line (NIL) having r-m allele. PCR amplification of a fragment of the candidate gene Glyma.09G235100 generated a fragment of about 1 kb in the soybean cultivar Clark, whereas a fragment of about 14 kb in addition to fragments of 1 and 1.4 kb were produced in L72-2040, a Clark 63 NIL with the r-m allele. Clark 63 is a NIL of Clark with the rxp and Rps1 alleles. A DNA fragment of 13 060 bp was inserted in the intron of Glyma.09G235100 in L72-2040. The fragment had the CACTA motif at both ends, imperfect terminal inverted repeats (TIR), inverse repetition of short sequence motifs close to the 5' and 3' ends, and a duplication of three nucleotides at the site of integration, indicating that it belongs to a CACTA-superfamily transposable element. We designated the element as Tgm11. Overall nucleotide sequence, motifs of TIR, and subterminal repeats were similar to those of Tgm1 and Tgs1, suggesting that these elements comprise a family.
Development and characterization of novel EST-SSR markers and their application for genetic diversity analysis of Jerusalem artichoke (Helianthus tuberosus L.).

PubMed

Mornkham, T; Wangsomnuk, P P; Mo, X C; Francisco, F O; Gao, L Z; Kurzweil, H

2016-10-24

Jerusalem artichoke (Helianthus tuberosus L.) is a perennial tuberous plant and a traditional inulin-rich crop in Thailand. It has become the most important source of inulin and has great potential for use in chemical and food industries. In this study, expressed sequence tag (EST)-based simple sequence repeat (SSR) markers were developed from 40,362 Jerusalem artichoke ESTs retrieved from the NCBI database. Among 23,691 non-redundant identified ESTs, 1949 SSR motifs harboring 2 to 6 nucleotides with varied repeat motifs were discovered from 1676 assembled sequences. Seventy-nine primer pairs were generated from EST sequences harboring SSR motifs. Our results show that 43 primers are polymorphic for the six studied populations, while the remaining 36 were either monomorphic or failed to amplify. These 43 SSR loci exhibited a high level of genetic diversity among populations, with allele numbers varying from 2 to 7, with an average of 3.95 alleles per locus. Heterozygosity ranged from 0.096 to 0.774, with an average of 0.536; polymorphic index content ranged from 0.096 to 0.854, with an average of 0.568. Principal component analysis and neighbor-joining analysis revealed that the six populations could be divided into six clusters. Our results indicate that these newly characterized EST-SSR markers may be useful in the exploration of genetic diversity and range expansion of the Jerusalem artichoke, and in cross-species application for the genus Helianthus.
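The abstract reports heterozygosity and a polymorphism content index per marker without stating how they were computed. A minimal sketch, assuming the widely used definitions He = 1 - Σ p_i² and PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j² over allele frequencies p_i, is given below; the function names are illustrative and the study may have used different software or estimators.

```python
from itertools import combinations


def _check(freqs):
    """Reject frequency vectors that are negative or do not sum to 1."""
    if any(p < 0 for p in freqs) or abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must be non-negative and sum to 1")


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2)."""
    _check(freqs)
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    _check(freqs)
    homozygous = sum(p * p for p in freqs)
    cross = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - cross


if __name__ == "__main__":
    # Hypothetical marker with two equally frequent alleles.
    print(expected_heterozygosity([0.5, 0.5]))  # 0.5
    print(pic([0.5, 0.5]))                      # 0.375
```

Under these definitions PIC never exceeds He for the same marker, since the cross term is non-negative.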
Genetic and physical interactions between factors involved in both cell cycle progression and pre-mRNA splicing in Saccharomyces cerevisiae.

PubMed Central

Ben-Yehuda, S; Dix, I; Russell, C S; McGarvey, M; Beggs, J D; Kupiec, M

2000-01-01

The PRP17/CDC40 gene of Saccharomyces cerevisiae functions in two different cellular processes: pre-mRNA splicing and cell cycle progression. The Prp17/Cdc40 protein participates in the second step of the splicing reaction and, in addition, prp17/cdc40 mutant cells held at the restrictive temperature arrest in the G2 phase of the cell cycle. Here we describe the identification of nine genes that, when mutated, show synthetic lethality with the prp17/cdc40Delta allele. Six of these encode known splicing factors: Prp8p, Slu7p, Prp16p, Prp22p, Slt11p, and U2 snRNA. The other three, SYF1, SYF2, and SYF3, represent genes also involved in cell cycle progression and in pre-mRNA splicing. Syf1p and Syf3p are highly conserved proteins containing several copies of a repeated motif, which we term RTPR. This newly defined motif is shared by proteins involved in RNA processing and represents a subfamily of the known TPR (tetratricopeptide repeat) motif. Using two-hybrid interaction screens and biochemical analysis, we show that the SYF gene products interact with each other and with four other proteins: Isy1p, Cef1p, Prp22p, and Ntc20p. We discuss the role played by these proteins in splicing and cell cycle progression. PMID:11102353
Genetic and physical interactions between factors involved in both cell cycle progression and pre-mRNA splicing in Saccharomyces cerevisiae.

PubMed

Ben-Yehuda, S; Dix, I; Russell, C S; McGarvey, M; Beggs, J D; Kupiec, M

2000-12-01

The PRP17/CDC40 gene of Saccharomyces cerevisiae functions in two different cellular processes: pre-mRNA splicing and cell cycle progression. The Prp17/Cdc40 protein participates in the second step of the splicing reaction and, in addition, prp17/cdc40 mutant cells held at the restrictive temperature arrest in the G2 phase of the cell cycle. Here we describe the identification of nine genes that, when mutated, show synthetic lethality with the prp17/cdc40Delta allele. Six of these encode known splicing factors: Prp8p, Slu7p, Prp16p, Prp22p, Slt11p, and U2 snRNA. The other three, SYF1, SYF2, and SYF3, represent genes also involved in cell cycle progression and in pre-mRNA splicing. Syf1p and Syf3p are highly conserved proteins containing several copies of a repeated motif, which we term RTPR. This newly defined motif is shared by proteins involved in RNA processing and represents a subfamily of the known TPR (tetratricopeptide repeat) motif. Using two-hybrid interaction screens and biochemical analysis, we show that the SYF gene products interact with each other and with four other proteins: Isy1p, Cef1p, Prp22p, and Ntc20p. We discuss the role played by these proteins in splicing and cell cycle progression.
CmMDb: a versatile database for Cucumis melo microsatellite markers and other horticulture crop research.

PubMed

Bhawna; Chaduvula, Pavan K; Bonthala, Venkata S; Manjusha, Verma; Siddiq, Ebrahimali A; Polumetla, Ananda K; Prasad, Gajula M N V

2015-01-01

Cucumis melo L., a member of the Cucurbitaceae family, ranks among the highest-valued horticultural crops cultivated across the globe. Besides its economic and medicinal importance, Cucumis melo L. is a valuable resource and model system for evolutionary studies of the cucurbit family. However, only a limited number of molecular markers have been reported for Cucumis melo L. so far, which limits the pace of functional genomic research in melon and other similar horticultural crops. We developed the first whole-genome-based microsatellite DNA marker database of Cucumis melo L. and a comprehensive web resource that aids in variety identification and physical mapping of the Cucurbitaceae family. The Cucumis melo L. microsatellite database (CmMDb: http://65.181.125.102/cmmdb2/index.html) encompasses 39,072 SSR markers along with their motif repeat, motif length, motif sequence, marker ID, motif type and chromosomal locations. The database features a novel automated primer-design facility to meet the needs of wet lab researchers. CmMDb is a freely available web resource that helps researchers select the most appropriate markers for marker-assisted selection in melons and improve breeding strategies.
The Disruptive Effect of Lysozyme on the Bacterial Cell Wall Explored by an "In-Silico" Structural Outlook

ERIC Educational Resources Information Center

Primo, Emiliano D.; Otero, Lisandro H.; Ruiz, Francisco; Klinke, Sebastián; Giordano, Walter

2018-01-01

The bacterial cell wall, a structural unit of peptidoglycan polymer comprised of glycan strands consisting of a repeating disaccharide motif [N-acetylglucosamine (NAG) and N-acetylmuramylpentapeptide (NAM pentapeptide)], encases bacteria and provides structural integrity and protection. Lysozymes are enzymes that break down the bacterial cell wall…
PCR Cloning of Partial "nbs" Sequences from Grape ("Vitis aestivalis" Michx)

ERIC Educational Resources Information Center

Chang, Ming-Mei; DiGennaro, Peter; Macula, Anthony

2009-01-01

Plants defend themselves against pathogens via the expressions of disease resistance (R) genes. Many plant R gene products contain the characteristic nucleotide-binding site (NBS) and leucine-rich repeat (LRR) domains. There are highly conserved motifs within the NBS domain which could be targeted for polymerase chain reaction (PCR) cloning of R…
Cas9 specifies functional viral targets during CRISPR-Cas adaptation.

PubMed

Heler, Robert; Samai, Poulami; Modell, Joshua W; Weiner, Catherine; Goldberg, Gregory W; Bikard, David; Marraffini, Luciano A

2015-03-12

Clustered regularly interspaced short palindromic repeat (CRISPR) loci and their associated (Cas) proteins provide adaptive immunity against viral infection in prokaryotes. Upon infection, short phage sequences known as spacers integrate between CRISPR repeats and are transcribed into small RNA molecules that guide the Cas9 nuclease to the viral targets (protospacers). Streptococcus pyogenes Cas9 cleavage of the viral genome requires the presence of a 5'-NGG-3' protospacer adjacent motif (PAM) sequence immediately downstream of the viral target. It is not known whether and how viral sequences flanked by the correct PAM are chosen as new spacers. Here we show that Cas9 selects functional spacers by recognizing their PAM during spacer acquisition. The replacement of cas9 with alleles that lack the PAM recognition motif or recognize an NGGNG PAM eliminated or changed PAM specificity during spacer acquisition, respectively. Cas9 associates with other proteins of the acquisition machinery (Cas1, Cas2 and Csn2), presumably to provide PAM-specificity to this process. These results establish a new function for Cas9 in the genesis of prokaryotic immunological memory.
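The PAM requirement described above (a 5'-NGG-3' sequence immediately downstream of the target) can be illustrated with a short scan for candidate protospacers. This is a sketch only: the function name is hypothetical, the default target length of 20 nt is an assumption not stated in the abstract, and only the given strand is searched.

```python
import re


def protospacers_with_pam(seq, length=20):
    """Return (start, protospacer, pam) for every window of `length` bases
    that is immediately followed by an NGG PAM on the given strand."""
    seq = seq.upper()
    # Lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence with a short window length so the output is easy to inspect.
    for hit in protospacers_with_pam("ATCGATGGTACCAGGA", length=5):
        print(hit)
```

To cover the opposite strand, the same function can be applied to the reverse complement of the sequence.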
Epitope mapping of PR81 anti-MUC1 monoclonal antibody following PEPSCAN and phage display techniques.

PubMed

Mohammadi, Mohammad; Rasaee, Mohammad Javad; Rajabibazl, Masoumeh; Paknejad, Malihe; Zare, Mehrak; Mohammadzadeh, Sara

2007-08-01

PR81 is an anti-MUC1 monoclonal antibody (MAb) which was generated against human MUC1 mucin and reacted with breast cancerous tissue, MUC1-positive cell lines (MCF-7, BT-20, and T-47D), and a synthetic peptide including the tandem repeat sequence of MUC1. Here we characterized the binding properties of PR81 against the tandem repeat of MUC1 by two different epitope mapping techniques, namely, PEPSCAN and phage display. Epitope mapping of PR81 MAb by PEPSCAN revealed a minimal consensus binding sequence, PDTRP, which is found on MUC1 peptide as the most important epitope. Using the phage display peptide library, we identified the motif PD(T/S/G)RP as an epitope and the motif AVGLSPDGSRGV as a mimotope recognized by PR81. Results of these two methods showed that the two residues, arginine and aspartic acid, have important roles in antibody binding and threonine can be substituted by either glycine or serine. These results may be of importance in tailor-making antigens used in immunoassay.
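The degenerate epitope PD(T/S/G)RP reported above translates directly into a pattern over one-letter amino acid codes. The sketch below (function name and test peptides other than those quoted in the abstract are illustrative assumptions) also shows that the mimotope AVGLSPDGSRGV does not contain the linear motif itself, consistent with its description as a mimotope rather than an epitope.

```python
import re

# PD(T/S/G)RP: proline, aspartate, one of T/S/G, arginine, proline.
EPITOPE = re.compile(r"PD[TSG]RP")


def epitope_positions(peptide):
    """Return 0-based start positions of PD(T/S/G)RP in a peptide sequence."""
    return [m.start() for m in EPITOPE.finditer(peptide.upper())]


if __name__ == "__main__":
    for pep in ("PDTRP", "AAPDSRPGG", "AVGLSPDGSRGV"):
        print(pep, epitope_positions(pep))
```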
T:G mismatch-specific thymine-DNA glycosylase (TDG) as a coregulator of transcription interacts with SRC1 family members through a novel tyrosine repeat motif

PubMed Central

Lucey, Marie J.; Chen, Dongsheng; Lopez-Garcia, Jorge; Hart, Stephen M.; Phoenix, Fladia; Al-Jehani, Rajai; Alao, John P.; White, Roger; Kindle, Karin B.; Losson, Régine; Chambon, Pierre; Parker, Malcolm G.; Schär, Primo; Heery, David M.; Buluwela, Lakjaya; Ali, Simak

2005-01-01

Gene activation involves protein complexes with diverse enzymatic activities, some of which are involved in chromatin modification. We have shown previously that the base excision repair enzyme thymine DNA glycosylase (TDG) acts as a potent coactivator for estrogen receptor-α. To further understand how TDG acts in this context, we studied its interaction with known coactivators of nuclear receptors. We find that TDG interacts in vitro and in vivo with the p160 coactivator SRC1, with the interaction being mediated by a previously undescribed motif encoding four equally spaced tyrosine residues in TDG, each tyrosine being separated by three amino acids. This is found to interact with two motifs in SRC1 also containing tyrosine residues separated by three amino acids. Site-directed mutagenesis shows that the tyrosines encoded in these motifs are critical for the interaction. The related p160 protein TIF2 does not interact with TDG and has the altered sequence, F-X-X-X-Y, at the equivalent positions relative to SRC1. Substitution of the phenylalanines to tyrosines is sufficient to bring about interaction of TIF2 with TDG. These findings highlight a new protein–protein interaction motif based on Y-X-X-X-Y and provide new insight into the interaction of diverse proteins in coactivator complexes. PMID:16282588
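The spacing rule described above, tyrosines separated by exactly three residues, can be checked with a simple pattern. The sketch below is illustrative only: the function names and toy peptides are assumptions, not sequences from TDG, SRC1 or TIF2. It finds Y-X-X-X-Y pairs, finds runs of four equally spaced tyrosines, and mimics the F-to-Y substitution that turns F-X-X-X-Y into Y-X-X-X-Y.

```python
import re

# Lookaheads report overlapping matches, e.g. consecutive Y-X-X-X-Y pairs.
YXXXY = re.compile(r"(?=Y.{3}Y)")
FOUR_TYR = re.compile(r"(?=Y(?:.{3}Y){3})")


def yxxxy_positions(peptide):
    """0-based starts of every Y-X-X-X-Y pair."""
    return [m.start() for m in YXXXY.finditer(peptide.upper())]


def four_tyrosine_positions(peptide):
    """0-based starts of four tyrosines each separated by three residues."""
    return [m.start() for m in FOUR_TYR.finditer(peptide.upper())]


if __name__ == "__main__":
    toy = "GSYAKLYDEEYRRRYGS"          # hypothetical peptide
    print(yxxxy_positions(toy))         # pairs starting at 2, 6 and 10
    print(four_tyrosine_positions(toy)) # one four-tyrosine run starting at 2
    altered = "FAKLY"                   # hypothetical F-X-X-X-Y segment
    print(yxxxy_positions(altered), yxxxy_positions(altered.replace("F", "Y")))
```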
A structural-alphabet-based strategy for finding structural motifs across protein families

PubMed Central

Wu, Chih Yuan; Chen, Yao Chi; Lim, Carmay

2010-01-01

Proteins with insignificant sequence and overall structure similarity may still share locally conserved contiguous structural segments; i.e. structural/3D motifs. Most methods for finding 3D motifs require a known motif to search for other similar structures or functionally/structurally crucial residues. Here, without requiring a query motif or essential residues, a fully automated method for discovering 3D motifs of various sizes across protein families with different folds based on a 16-letter structural alphabet is presented. It was applied to structurally non-redundant proteins bound to DNA, RNA, obligate/non-obligate proteins as well as free DNA-binding proteins (DBPs) and proteins with known structures but unknown function. Its usefulness was illustrated by analyzing the 3D motifs found in DBPs. A non-specific motif was found with a ‘corner’ architecture that confers a stable scaffold and enables diverse interactions, making it suitable for binding not only DNA but also RNA and proteins. Furthermore, DNA-specific motifs present ‘only’ in DBPs were discovered. The motifs found can provide useful guidelines in detecting binding sites and computational protein redesign. PMID:20525797
A leucine repeat motif in AbiA is required for resistance of Lactococcus lactis to phages representing three species.

PubMed

Dinsmore, P K; O'Sullivan, D J; Klaenhammer, T R

1998-05-28

The abiA gene encodes an abortive bacteriophage infection mechanism that can protect Lactococcus species from infection by a variety of bacteriophages including three unrelated phage species. Five heptad leucine repeats suggestive of a leucine zipper motif were identified between residues 232 and 266 in the predicted amino acid sequence of the AbiA protein. The biological role of residues in the repeats was investigated by incorporating amino acid substitutions via site-directed mutagenesis. Each mutant was tested for phage resistance against three phages, phi 31, sk1, and c2, belonging to species P335, 936, and c2, respectively. The five residues that comprise the heptad repeats were designated L234, L242, A249, L256, and L263. Three single conservative mutations of leucine to valine in positions L235, L242, and L263 and a double mutation of two leucines (L235 and L242) to valines did not affect AbiA activity on any phages tested. Non-conservative single substitutions of charged amino acids for three of the leucines (L235, L242, and L256) virtually eliminated AbiA activity on all phages tested. Substitution of the alanine residue in the third repeat (A249) with a charged residue did not affect AbiA activity. Replacement of L242 with an alanine eliminated phage resistance against phi 31, but partial resistance to sk1 and c2 remained. Two single proline substitutions for leucines L242 and L263 virtually eliminated AbiA activity against all phages, indicating that the predicted alpha-helical structure of this region is important. Mutations in an adjacent region of basic amino acids had various effects on phage resistance, suggesting that these basic residues are also important for AbiA activity. This directed mutagenesis analysis of AbiA indicated that the leucine repeat structure is essential for conferring phage resistance against three species of lactococcal bacteriophages.
Antibodies against the mono-methylated arginine-glycine repeat (MMA-RG) of the Epstein-Barr virus nuclear antigen 2 (EBNA2) identify potential cellular proteins targeted in viral transformation.

PubMed

Ayoubian, Hiresh; Fröhlich, Thomas; Pogodski, Dagmar; Flatley, Andrew; Kremmer, Elisabeth; Schepers, Aloys; Feederle, Regina; Arnold, Georg J; Grässer, Friedrich A

2017-08-01

The Epstein-Barr virus is a human herpes virus with oncogenic potential. The virus-encoded nuclear antigen 2 (EBNA2) is a key mediator of viral tumorigenesis. EBNA2 features an arginine-glycine (RG) repeat at amino acids (aa) 339-354 that is essential for the transformation of lymphocytes and contains symmetrically (SDMA) and asymmetrically (ADMA) di-methylated arginine residues. The SDMA-modified EBNA2 binds the survival motor neuron protein (SMN), thus mimicking SMD3, a cellular SDMA-containing protein that interacts with SMN. Accordingly, a monoclonal antibody (mAb) specific for the SDMA-modified RG repeat of EBNA2 also binds to SMD3. With the novel mAb 19D4 we now show that EBNA2 contains mono-methylated arginine (MMA) residues within the RG repeat. Using 19D4, we immunoprecipitated and analysed by mass spectrometry cellular proteins in EBV-transformed B-cells that feature MMA motifs similar to the one in EBNA2. Among the cellular proteins identified, we confirmed by immunoprecipitation and/or Western blot analyses Aly/REF, Coilin, DDX5, FXR1, HNRNPK, LSM4, MRE11, NRIP, nucleolin, PRPF8, RBM26, SMD1 (SNRDP1) and THRAP3 proteins that are either known to contain MMA residues or feature RG repeat sequences that probably serve as methylation substrates. The identified proteins are involved in splicing, tumorigenesis, transcriptional activation, DNA stability and RNA processing or export. Furthermore, we found that several proteins involved in energy metabolism are associated with MMA-modified proteins. Interestingly, the viral EBNA1 protein that features methylated RG repeat motifs also reacted with the antibodies. Our results indicate that the region between aa 34-52 of EBNA1 contains ADMA or SDMA residues, while the region between aa 328-377 mainly contains MMA residues.
cpSRP43 Is a Novel Chaperone Specific for Light-harvesting Chlorophyll a,b-binding Proteins*

PubMed Central

Falk, Sebastian; Sinning, Irmgard

2010-01-01

The biosynthesis of most membrane proteins is directly coupled to membrane insertion, and therefore, molecular chaperones are not required. The light-harvesting chlorophyll a,b-binding proteins (LHCPs) present a prominent exception as they are synthesized in the cytoplasm, and after import into the chloroplast, they are targeted and inserted into the thylakoid membrane. Upon arrival in the stroma, LHCPs form a soluble transit complex with the chloroplast signal recognition particle (cpSRP) consisting of an SRP54 homolog and the unique cpSRP43 composed of three chromodomains and four ankyrin repeats. Here we describe that cpSRP43 alone prevents aggregation of LHCP by formation of a complex with nanomolar affinity, whereas cpSRP54 is not required for this chaperone activity. Other stromal chaperones like trigger factor cannot replace cpSRP43, which implies that LHCPs require a specific chaperone. Although cpSRP43 does not have an ATPase activity, it can dissolve aggregates of LHCPs similar to chaperones of the Hsp104/ClpB family. We show that the LHCP-cpSRP43 interaction is predominantly hydrophobic but strictly depends on an intact DPLG motif between the second and third transmembrane region. The cpSRP43 ankyrin repeats that provide the binding site for the DPLG motif are sufficient for the chaperone function, whereas the chromodomains are dispensable. Taken together, we define cpSRP43 as a highly specific chaperone for LHCPs in addition to its established function as a targeting factor for this family of membrane proteins. PMID:20498370
PASTA repeats of the protein kinase StkP interconnect cell constriction and separation of Streptococcus pneumoniae.

PubMed

Zucchini, Laure; Mercy, Chryslène; Garcia, Pierre Simon; Cluzel, Caroline; Gueguen-Chaignon, Virginie; Galisson, Frédéric; Freton, Céline; Guiral, Sébastien; Brochier-Armanet, Céline; Gouet, Patrice; Grangeasse, Christophe

2018-02-01

Eukaryotic-like serine/threonine kinases (eSTKs) with extracellular PASTA repeats are key membrane regulators of bacterial cell division. How PASTA repeats govern eSTK activation and function remains elusive. Using evolution- and structural-guided approaches combined with cell imaging, we disentangle the role of each PASTA repeat of the eSTK StkP from Streptococcus pneumoniae. While the three membrane-proximal PASTA repeats behave as interchangeable modules required for the activation of StkP independently of cell wall binding, they also control the septal cell wall thickness. In contrast, the fourth and membrane-distal PASTA repeat directs StkP localization at the division septum and encompasses a specific motif that is critical for final cell separation through interaction with the cell wall hydrolase LytB. We propose a model in which the extracellular four-PASTA domain of StkP plays a dual function in interconnecting the phosphorylation of StkP endogenous targets along with septal cell wall remodelling to allow cell division of the pneumococcus.
Identification and characterization of a NBS–LRR class resistance gene analog in Pistacia atlantica subsp. Kurdica

PubMed Central

Bahramnejad, Bahman

2014-01-01

P. atlantica subsp. Kurdica, with the local name of Baneh, is a wild medicinal plant which grows in Kurdistan, Iran. The identification of resistance gene analogs holds great promise for the development of resistant cultivars. A PCR approach with degenerate primers designed according to conserved NBS-LRR (nucleotide binding site-leucine rich repeat) regions of known disease-resistance (R) genes was used to amplify and clone homologous sequences from P. atlantica subsp. Kurdica. A DNA fragment of the expected 500-bp size was amplified. The nucleotide sequence of this amplicon was obtained through sequencing, and comparison of the predicted amino acid sequence with the amino acid sequences of known R-genes revealed significant sequence similarity. Alignment of the deduced amino acid sequence of the P. atlantica subsp. Kurdica resistance gene analog (RGA) showed strong identity, ranging from 68% to 77%, to the non-toll interleukin receptor (non-TIR) R-gene subfamily from other plants. A P-loop motif (GMMGGEGKTT), a conserved hydrophobic motif (GLPLAL), a kinase-2a motif (LLVLDDV, replaced by IAVFDDI in PAKRGA1) and a kinase-3a motif (FGPGSRIII) were present in all RGAs. A phylogenetic tree based on the deduced amino-acid sequences of PAKRGA1 and RGAs from different species indicated that they were separated into two clusters, with PAKRGA1 in cluster II. The isolated NBS analogs can eventually be used as guides to isolate numerous R-genes in Pistachio. PMID:27843981
Function of multiple Lis-Homology domain/WD-40 repeat-containing proteins in feed-forward transcriptional repression by silencing mediator for retinoic and thyroid receptor/nuclear receptor corepressor complexes.

PubMed

Choi, Hyo-Kyoung; Choi, Kyung-Chul; Kang, Hee-Bum; Kim, Han-Cheon; Lee, Yoo-Hyun; Haam, Seungjoo; Park, Hyoung-Gi; Yoon, Ho-Geun

2008-05-01

Lis-homology (LisH) motifs are involved in protein dimerization, and the discovery of the conserved N-terminal LisH domain in transducin beta-like protein 1 and its receptor (TBL1 and TBLR1) led us to examine the role of this domain in transcriptional repression. Here we show that multiple beta-transducin (WD-40) repeat-containing proteins interact to form oligomers in solution and that oligomerization depends on the presence of the LisH domain in each protein. Repression of transcription, as assayed using Gal4 fusion proteins, also depended on the presence of the LisH domain, suggesting that oligomerization is a prerequisite for efficient transcriptional repression. Furthermore, we show that the LisH domain is responsible for the binding to the hypoacetylated histone H4 tail and for stable chromatin targeting by the nuclear receptor corepressor complex. Mutations in conserved residues in the LisH motif of TBL1 and TBLR1 block histone binding, oligomerization, and transcriptional repression, supporting the functional importance of the LisH motif in transcriptional repression. Our results indicate that another WD-40 protein, TBL3, also preferentially binds to the N-terminal domain of TBL1 and TBLR1, and forms oligomers with other WD-40 proteins. Finally, we observed that the WD-40 proteins RbAp46 and RbAp48 of the sin3A corepressor complex failed to dimerize. We also found a specific interaction of UbcH/E2 with TBL1, but not with RbAp46/48. Altogether, our results thus indicate that the presence of multiple LisH/WD-40 repeat-containing proteins is exclusive to nuclear receptor corepressor/silencing mediator for retinoic and thyroid receptor complexes compared with other class 1 histone deacetylase-containing corepressor complexes.
The La-related protein 1-specific domain repurposes HEAT-like repeats to directly bind a 5'TOP sequence.

PubMed

Lahr, Roni M; Mack, Seshat M; Héroux, Annie; Blagden, Sarah P; Bousquet-Antonelli, Cécile; Deragon, Jean-Marc; Berman, Andrea J

2015-09-18

La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. These studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
The La-related protein 1-specific domain repurposes HEAT-like repeats to directly bind a 5'TOP sequence

DOE PAGES

Lahr, Roni M.; Mack, Seshat M.; Heroux, Annie; ...

2015-07-22

La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. Ultimately, these studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis.
Sequence analyses reveal that a TPR-DP module, surrounded by recombinable flanking introns, could be at the origin of eukaryotic Hop and Hip TPR-DP domains and prokaryotic GerD proteins.

PubMed

Hernández Torres, Jorge; Papandreou, Nikolaos; Chomilier, Jacques

2009-05-01

The co-chaperone Hop [heat shock protein (HSP) organising protein] is known to bind both Hsp70 and Hsp90. Hop comprises three repeats of a tetratricopeptide repeat (TPR) domain, each consisting of three TPR motifs. The first and last TPR domains are followed by a domain containing several dipeptide (DP) repeats called the DP domain. These analyses suggest that the hop genes result from successive recombination events of an ancestral TPR-DP module. From a hydrophobic cluster analysis of homologous Hop protein sequences derived from gene families, we can postulate that shifts in the open reading frames are at the origin of the present sequences. Moreover, these shifts can be related to the presence or absence of biological function. We propose to extend the family of Hop co-chaperones into the kingdom of bacteria, as several structurally related genes have been identified by hydrophobic cluster analysis. We also provide evidence of common structural characteristics between hop and hip genes, suggesting a shared precursor of ancestral TPR-DP domains.

Molecular modeling of the elastomeric properties of repeating units and building blocks of resilin, a disordered elastic protein.

PubMed

Khandaker, Md Shahriar K; Dudek, Daniel M; Beers, Eric P; Dillard, David A; Bevan, David R

2016-08-01

The mechanisms responsible for the properties of disordered elastomeric proteins are not well known. To better understand the relationship between elastomeric behavior and amino acid sequence, we investigated resilin, a disordered rubber-like protein found in specialized regions of the cuticle of insects. Resilin of Drosophila melanogaster contains Gly-rich repetitive motifs composed of the amino acid sequence PSSSYGAPGGGNGGR, which confer elastic properties to resilin. The repetitive motifs of insect resilin can be divided into smaller partially conserved building blocks: PSS, SYGAP, GGGN and GGR. Using molecular dynamics (MD) simulations, we studied the relative roles of SYGAP, and its less common variants SYSAP and TYGAP, in the elastomeric properties of resilin. Results showed that SYGAP adopts a bent structure that is one-half to one-third the end-to-end length of the other motifs having an equal number of amino acids but containing SYSAP or TYGAP substituted for SYGAP. The bent structure of SYGAP forms due to conformational freedom of glycine, and hydrogen bonding within the motif apparently plays a role in maintaining this conformation. These structural features of SYGAP result in higher extensibility compared to other motifs, which may contribute to elastic properties at the macroscopic level. Overall, the results are consistent with a role for the SYGAP building block in the elastomeric properties of these disordered proteins. What we learned from simulating the repetitive motifs of resilin may be applicable to the biology and mechanics of other elastomeric biomaterials, and may provide a deeper understanding of their unique properties. Copyright © 2016 Elsevier Ltd. All rights reserved.
Evolution of Protein Domain Repeats in Metazoa

PubMed Central

Schüler, Andreas; Bornberg-Bauer, Erich

2016-01-01

Repeats are ubiquitous elements of proteins and they play important roles in cellular function and during evolution. Repeats are, however, also notoriously difficult to capture computationally, and large-scale studies have so far had difficulty linking genetic causes, structural properties and evolutionary trajectories of protein repeats. Here we apply recently developed methods for repeat detection and analysis to a large dataset comprising over one hundred metazoan genomes. We find that repeats in larger protein families generally experience very few insertions or deletions (indels) of repeat units, but there is also a significant fraction of noteworthy volatile outliers with very high indel rates. Analysis of structural data indicates that repeats with an open structure and independently folding units are more volatile and more likely to be intrinsically disordered. Such disordered repeats are also significantly enriched in sites with a high functional potential such as linear motifs. Furthermore, the most volatile repeats have a high sequence similarity between their units. Since many volatile repeats also show signs of recombination, we conclude they are often shaped by concerted evolution. Intriguingly, many of these conserved yet volatile repeats are involved in host-pathogen interactions where they might foster fast but subtle adaptation in biological arms races. Key Words: protein evolution, domain rearrangements, protein repeats, concerted evolution. PMID:27671125
A New Protein Architecture for Processing Alkylation Damaged DNA: The Crystal Structure of DNA Glycosylase AlkD

PubMed Central

Rubinson, Emily H.; Metz, Audrey H.; O'Quin, Jami; Eichman, Brandt F.

2013-01-01

Summary: DNA glycosylases safeguard the genome by locating and excising chemically modified bases from DNA. AlkD is a recently discovered bacterial DNA glycosylase that removes positively charged methylpurines from DNA, and was predicted to adopt a protein fold distinct from other DNA repair proteins. The crystal structure of Bacillus cereus AlkD presented here shows that the protein is composed exclusively of helical HEAT-like repeats, which form a solenoid perfectly shaped to accommodate a DNA duplex on the concave surface. Structural analysis of the variant HEAT repeats in AlkD provides a rationale for how this protein scaffolding motif has been modified to bind DNA. We report 7mG excision and DNA binding activities of AlkD mutants, along with a comparison of alkylpurine DNA glycosylase structures. Together, these data provide important insight into the requirements for alkylation repair within DNA and suggest that AlkD utilizes a novel strategy to manipulate DNA in its search for alkylpurine bases. PMID:18585735
MotifNet: a web-server for network motif analysis.

PubMed

Smoly, Ilan Y; Lerman, Eugene; Ziv-Ukelson, Michal; Yeger-Lotem, Esti

2017-06-15

Network motifs are small topological patterns that recur in a network significantly more often than expected by chance. Their identification emerged as a powerful approach for uncovering the design principles underlying complex networks. However, available tools for network motif analysis typically require download and execution of computationally intensive software on a local computer. We present MotifNet, the first open-access web-server for network motif analysis. MotifNet allows researchers to analyze integrated networks, where nodes and edges may be labeled, and to search for motifs of up to eight nodes. The output motifs are presented graphically and the user can interactively filter them by their significance, number of instances, node and edge labels, and node identities, and view their instances. MotifNet also allows the user to distinguish between motifs that are centered on specific nodes and motifs that recur in distinct parts of the network. MotifNet is freely available at http://netbio.bgu.ac.il/motifnet. The website was implemented using ReactJs and supports all major browsers. The server interface was implemented in Python with data stored on a MySQL database. Contact: estiyl@bgu.ac.il or michaluz@cs.bgu.ac.il. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
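As an illustration of what a network motif instance is, the sketch below counts one classic three-node pattern, the feed-forward loop, in a small directed graph. It is not MotifNet's algorithm, which the abstract does not describe; the function name and the toy edge list are hypothetical, and the significance test against randomized networks that a real motif analysis requires is left out.

```python
from itertools import permutations

def count_feed_forward_loops(edges):
    """Count ordered node triples (a, b, c) with edges a->b, b->c and a->c."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    count = 0
    # Brute force over all ordered triples; fine for toy graphs only.
    for a, b, c in permutations(nodes, 3):
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set:
            count += 1
    return count

# Hypothetical toy network: x regulates y and z, y regulates z, z regulates w.
toy_edges = [("x", "y"), ("y", "z"), ("x", "z"), ("z", "w")]
print(count_feed_forward_loops(toy_edges))
```

Whether a count like this marks a motif depends on how often the same pattern appears in comparable random networks, which is the part that dedicated tools handle.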
Utility of EST-derived SSR in cultivated peanut (Arachis hypogaea L.) and Arachis wild species

PubMed Central

Liang, Xuanqiang; Chen, Xiaoping; Hong, Yanbin; Liu, Haiyan; Zhou, Guiyuan; Li, Shaoxiong; Guo, Baozhu

2009-01-01

Background: Lack of sufficient molecular markers hinders current genetic research in peanuts (Arachis hypogaea L.). It is necessary to develop more molecular markers for potential use in peanut genetic research. With the development of peanut EST projects, a vast amount of available EST sequence data has been generated. These data offered an opportunity to identify SSR in ESTs by data mining. Results: In this study, we investigated 24,238 ESTs for the identification and development of SSR markers. In total, 881 SSRs were identified from 780 SSR-containing unique ESTs. On average, one SSR was found per 7.3 kb of EST sequence with tri-nucleotide motifs (63.9%) being the most abundant followed by di- (32.7%), tetra- (1.7%), hexa- (1.0%) and penta-nucleotide (0.7%) repeat types. The top six motifs included AG/TC (27.7%), AAG/TTC (17.4%), AAT/TTA (11.9%), ACC/TGG (7.72%), ACT/TGA (7.26%) and AT/TA (6.3%). Based on the 780 SSR-containing ESTs, a total of 290 primer pairs were successfully designed and used for validation of the amplification and assessment of the polymorphism among 22 genotypes of cultivated peanuts and 16 accessions of wild species. The results showed that 251 primer pairs yielded amplification products, of which 26 and 221 primer pairs exhibited polymorphism among the cultivated and wild species examined, respectively. Two to four alleles were found in cultivated peanuts, while 3–8 alleles were present in wild species. The apparent broad polymorphism was further confirmed by cloning and sequencing of amplified alleles. Sequence analysis of selected amplified alleles revealed that allelic diversity could be attributed mainly to differences in repeat type and length in the microsatellite regions. In addition, a few single base mutations were observed in the microsatellite flanking regions.
Conclusion: This study gives an insight into the frequency, type and distribution of peanut EST-SSRs and demonstrates successful development of EST-SSR markers in cultivated peanut. These EST-SSR markers could enrich the current resource of molecular markers for the peanut community and would be useful for qualitative and quantitative trait mapping, marker-assisted selection, and genetic diversity studies in cultivated peanut as well as related Arachis species. All of the 251 working primer pairs with names, motifs, repeat types, primer sequences, and alleles tested in cultivated and wild species are listed in Additional File 1. PMID:19309524
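The kind of data mining described above, scanning EST sequences for di- to hexa-nucleotide SSRs, can be sketched with a regular expression. This is a minimal illustration, not the pipeline used in the study: the minimum repeat counts in `MIN_REPEATS`, the function name and the example sequence are all assumptions, and only perfect (uninterrupted) repeats are detected.

```python
import re

# Assumed thresholds (motif length -> minimum copies); the abstract does not
# state the criteria actually used.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for size, min_n in sorted(min_repeats.items()):
        # Capture one motif of the given size, then require further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (AA, ATAT, ...).
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return hits

# Hypothetical EST fragment with an (AG)7 and an (AAG)5 repeat.
example = "TTGC" + "AG" * 7 + "CCGT" + "AAG" * 5 + "TGCA"
print(find_ssrs(example))
```

Reporting a motif together with its reverse complement, as in AG/TC above, would need an extra normalization step that this sketch omits.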
Bifunctional Anti-Huntingtin Proteasome-Directed Intrabodies Mediate Efficient Degradation of Mutant Huntingtin Exon 1 Protein Fragments

PubMed Central

Butler, David C.; Messer, Anne

2011-01-01

Huntington's disease (HD) is a fatal autosomal dominant neurodegenerative disorder caused by a trinucleotide (CAG)n repeat expansion in the coding sequence of the huntingtin gene, and an expanded polyglutamine (>37Q) tract in the protein. This results in misfolding and accumulation of huntingtin protein (htt), formation of neuronal intranuclear and cytoplasmic inclusions, and neuronal dysfunction/degeneration. Single-chain Fv antibodies (scFvs), expressed as intrabodies that bind htt and prevent aggregation, show promise as immunotherapeutics for HD. Intrastriatal delivery of anti-N-terminal htt scFv-C4 using an adeno-associated virus vector (AAV2/1) significantly reduces the size and number of aggregates in HDR6/1 transgenic mice; however, this protective effect diminishes with age and time after injection. We therefore explored enhancing intrabody efficacy via fusions to heterologous functional domains. Proteins containing a PEST motif are often targeted for proteasomal degradation and generally have a short half-life. In ST14A cells, fusion of the C-terminal PEST region of mouse ornithine decarboxylase (mODC) to scFv-C4 reduces htt exon 1 protein fragments with 72 glutamine repeats (httex1-72Q) by ∼80–90% when compared to scFv-C4 alone. Proteasomal targeting was verified by either scrambling the mODC-PEST motif, or via proteasomal inhibition with epoxomicin. For these constructs, the proteasomal degradation of the scFv intrabody proteins themselves was reduced <25% by the addition of the mODC-PEST motif, with or without antigens. The remaining intrabody levels were amply sufficient to target N-terminal httex1-72Q protein fragment turnover. Critically, scFv-C4-PEST prevents aggregation and toxicity of httex1-72Q fragments at significantly lower doses than scFv-C4. Fusion of the mODC-PEST motif to intrabodies is a valuable general approach to specifically target toxic antigens to the proteasome for degradation. PMID:22216210
Evolutionary dynamics of the immunodominant repeats of the Plasmodium vivax malaria-vaccine candidate circumsporozoite protein (CSP)

PubMed Central

Patil, Aarti; Orjuela-Sánchez, Pamela; da Silva-Nunes, Mônica; Ferreira, Marcelo U.

2010-01-01

The circumsporozoite protein (CSP) of Plasmodium vivax, a major target for malaria vaccine development, has immunodominant B-cell epitopes mapped to central nonapeptide repeat arrays. To determine whether rearrangements of repeat motifs during mitotic DNA replication of parasites create significant CSP diversity under conditions of low effective meiotic recombination rates, we examined csp alleles from sympatric P. vivax isolates systematically sampled from an area of low malaria endemicity in Brazil over a period of 14 months. Nine unique csp types, comprising six different nonapeptide repeats, were observed in 45 isolates analyzed. Identical or nearly identical repeats predominated in most arrays, consistent with their recent expansion. We found strong linkage disequilibrium at sites across the chromosome 8 segment flanking the csp locus, consistent with rare meiotic recombination in this region. We conclude that CSP repeat diversity may not be severely constrained by rare meiotic recombination in areas of low malaria endemicity. New repeat variants may be readily created by nonhomologous recombination even when meiotic recombination is rare, with potential implications for CSP-based vaccine development. PMID:20097310
Influence of a heptad repeat stutter on the pH-dependent conformational behavior of the central coiled-coil from influenza hemagglutinin HA2.

PubMed

Higgins, Chelsea D; Malashkevich, Vladimir N; Almo, Steven C; Lai, Jonathan R

2014-09-01

The coiled-coil is one of the most common protein structural motifs. Amino acid sequences of regions that participate in coiled-coils contain a heptad repeat in which every third then fourth residue is occupied by a hydrophobic residue. Here we examine the consequences of a "stutter," a deviation from the idealized heptad repeat that is found in the central coiled-coil of influenza hemagglutinin HA2. Characterization of a peptide containing the native stutter-containing HA2 sequence, as well as several variants in which the stutter was engineered out to restore an idealized heptad repeat pattern, revealed that the stutter is important for allowing coiled-coil formation in the WT HA2 at both neutral and low pH (7.1 and 4.5). By contrast, all variants that contained idealized heptad repeats exhibited marked pH-dependent coiled-coil formation with structures forming much more stably at low pH. A crystal structure of one variant containing an idealized heptad repeat, and comparison to the WT HA2 structure, suggest that the stutter distorts the optimal interhelical core packing arrangement, resulting in unwinding of the coiled-coil superhelix. Interactions between acidic side chains, in particular E69 and E74 (present in all peptides studied), are suggested to play a role in mediating these pH-dependent conformational effects. This conclusion is partially supported by studies on HA2 variant peptides in which these positions were altered to aspartic acid. These results provide new insight into the structural role of the heptad repeat stutter in HA2. © 2014 Wiley Periodicals, Inc.
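The heptad pattern described above (hydrophobic residues spaced alternately three and four positions apart, conventionally the a and d positions of an abcdefg register) can be made concrete with a short sketch. The hydrophobic residue set, the function names and the idealized test peptide are assumptions for illustration; the HA2 sequence itself is not reproduced here, and the code assumes an unbroken heptad, so a stutter would show up as a drop in the score past the discontinuity.

```python
HYDROPHOBIC = set("AILMFVWY")  # assumed hydrophobic set, illustrative only

def heptad_register(length, offset=0):
    """Assign heptad letters a-g to each position, starting at `offset`."""
    return ["abcdefg"[(i + offset) % 7] for i in range(length)]

def core_hydrophobic_fraction(seq, offset=0):
    """Fraction of a/d positions occupied by hydrophobic residues."""
    register = heptad_register(len(seq), offset)
    core = [res for res, pos in zip(seq.upper(), register) if pos in "ad"]
    return sum(res in HYDROPHOBIC for res in core) / len(core) if core else 0.0

def best_offset(seq):
    """Register offset (0-6) that maximizes hydrophobic a/d occupancy."""
    return max(range(7), key=lambda o: core_hydrophobic_fraction(seq, o))

# Hypothetical idealized peptide: Leu at every a and d position.
peptide = "LKELEDK" * 3
print(best_offset(peptide), core_hydrophobic_fraction(peptide, 0))
```

In the register itself, the a and d positions fall at indices 0, 3, 7, 10 and so on, which is the "third then fourth" spacing.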
CENP-B binds a novel centromeric sequence in the Asian mouse Mus caroli.

PubMed Central

Kipling, D; Mitchell, A R; Masumoto, H; Wilson, H E; Nicol, L; Cooke, H J

1995-01-01

Minor satellite DNA, found at Mus musculus centromeres, is not present in the genome of the Asian mouse Mus caroli. This repetitive sequence family is speculated to have a role in centromere function by providing an array of binding sites for the centromere-associated protein CENP-B. The apparent absence of CENP-B binding sites in the M. caroli genome poses a major challenge to this hypothesis. Here we describe two abundant satellite DNA sequences present at M. caroli centromeres. These satellites are organized as tandem repeat arrays, over 1 Mb in size, of either 60- or 79-bp monomers. All autosomes carry both satellites and small amounts of a sequence related to the M. musculus major satellite. The Y chromosome contains small amounts of both major satellite and the 60-bp satellite, whereas the X chromosome carries only major satellite sequences. M. caroli chromosomes segregate in M. caroli x M. musculus interspecific hybrid cell lines, indicating that the two sets of chromosomes can interact with the same mitotic spindle. Using a polyclonal CENP-B antiserum, we demonstrate that M. caroli centromeres can bind murine CENP-B in such an interspecific cell line, despite the absence of canonical 17-bp CENP-B binding sites in the M. caroli genome. Sequence analysis of the 79-bp M. caroli satellite reveals a 17-bp motif that contains all nine bases previously shown to be necessary for in vitro binding of CENP-B. This M. caroli motif binds CENP-B from HeLa cell nuclear extract in vitro, as indicated by gel mobility shift analysis. We therefore suggest that this motif also causes CENP-B to associate with M. caroli centromeres in vivo. Despite the sequence differences, M. caroli presents a third, novel mammalian centromeric sequence producing an array of binding sites for CENP-B. PMID:7623797
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats

PubMed Central

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-01-01

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. PMID:26481363
Isolation and mapping of telomeric pentanucleotide (TAACC)n repeats of the Pacific whiteleg shrimp, Penaeus vannamei, using fluorescence in situ hybridization.

PubMed

Alcivar-Warren, Acacia; Meehan-Meola, Dawn; Wang, Yongping; Guo, Ximing; Zhou, Linghua; Xiang, Jianhai; Moss, Shaun; Arce, Steve; Warren, William; Xu, Zhenkang; Bell, Kireina

2006-01-01

To develop genetic and physical maps for shrimp, accurate information on the actual number of chromosomes and a large number of genetic markers is needed. Previous reports have shown two different chromosome numbers for the Pacific whiteleg shrimp, Penaeus vannamei, the most important penaeid shrimp species cultured in the Western hemisphere. Preliminary results obtained by direct sequencing of clones from a Sau3A-digested genomic library of P. vannamei ovary identified a large number of (TAACC/GGTTA)-containing SSRs. The objectives of this study were to (1) examine the frequency of (TAACC)n repeats in 662 P. vannamei genomic clones that were directly sequenced, and perform homology searches of these clones, (2) confirm the number of chromosomes in testis of P. vannamei, and (3) localize the TAACC repeats in P. vannamei chromosome spreads using fluorescence in situ hybridization (FISH). Results for objective 1 showed that 395 out of the 662 clones sequenced contained single or multiple SSRs with three or more repeat motifs, 199 of which contained variable tandem repeats of the pentanucleotide (TAACC/GGTTA)n, with 3 to 14 copies per sequence. The frequency of (TAACC)n repeats in P. vannamei is one per 4.68 kb for SSRs with five or more repeat motifs. Sequence comparisons using the BLASTN nonredundant and expressed sequence tag (EST) databases indicated that most of the TAACC-containing clones were similar to either the core pentanucleotide repeat in PVPENTREP locus (GenBank accession no. X82619) or portions of 28S rRNA. Transposable elements (transposase for Tn1000 and reverse transcriptase family members), hypothetical or unnamed protein products, and genes of known function such as 18S and 28S rRNAs, heat shock protein 70, and thrombospondin were identified in non-TAACC-containing clones. For objective 2, the meiotic chromosome number of P. vannamei was confirmed as N = 44. 
For objective 3, four FISH probes (P1 to P4) containing different numbers of TAACC repeats produced positive signals on telomeres of P. vannamei chromosomes. A few chromosomes had positive signals interstitially. Probe signal strength and chromosome coverage differed in the general order of P1>P2>P3>P4, which correlated with the length of TAACC repeats within the probes: 83, 66, 35, and 30 bp, respectively, suggesting that the TAACC repeats, and not the flanking sequences, produced the TAACC signals at chromosome ends and TAACC is likely the telomere sequence for P. vannamei.
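The (TAACC/GGTTA) notation above pairs a repeat unit with its reverse complement, since a tandem array reads as one or the other depending on the strand sequenced. A minimal sketch of finding such runs in a clone sequence follows; the function names and the example sequence are hypothetical, and the default threshold of three copies simply mirrors the "three or more repeat motifs" criterion quoted in the abstract.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def tandem_runs(seq, unit="TAACC", min_copies=3):
    """Return sorted (start, motif, copies) for runs of unit or its reverse complement."""
    seq = seq.upper()
    runs = []
    for motif in (unit, reverse_complement(unit)):
        # Non-capturing group repeated at least min_copies times in tandem.
        for m in re.finditer("(?:%s){%d,}" % (motif, min_copies), seq):
            runs.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return sorted(runs)

# Hypothetical clone fragment carrying both orientations of the repeat.
clone = "ACGT" + "TAACC" * 4 + "GATC" + "GGTTA" * 3 + "TT"
print(reverse_complement("TAACC"))
print(tandem_runs(clone))
```

Runs written in a shifted frame of the same array (for example AACCT) would be reported from the first in-frame TAACC, so copy counts can be one lower than a frame-aware tool would give.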
A mutation in the Arabidopsis HYL1 gene encoding a dsRNA binding protein affects responses to abscisic acid, auxin, and cytokinin

NASA Technical Reports Server (NTRS)

Lu, C.; Fedoroff, N.

2000-01-01

Both physiological and genetic evidence indicate interconnections among plant responses to different hormones. We describe a pleiotropic recessive Arabidopsis transposon insertion mutation, designated hyponastic leaves (hyl1), that alters the plant's responses to several hormones. The mutant is characterized by shorter stature, delayed flowering, leaf hyponasty, reduced fertility, decreased rate of root growth, and an altered root gravitropic response. It also exhibits less sensitivity to auxin and cytokinin and hypersensitivity to abscisic acid (ABA). The auxin transport inhibitor 2,3,5-triiodobenzoic acid normalizes the mutant phenotype somewhat, whereas another auxin transport inhibitor, N-(1-naphthyl)phthalamic acid, exacerbates the phenotype. The gene, designated HYL1, encodes a 419-amino acid protein that contains two double-stranded RNA (dsRNA) binding motifs, a nuclear localization motif, and a C-terminal repeat structure suggestive of a protein-protein interaction domain. We present evidence that the HYL1 gene is ABA-regulated and encodes a nuclear dsRNA binding protein. We hypothesize that the HYL1 protein is a regulatory protein functioning at the transcriptional or post-transcriptional level.
Expressed sequence tags from the plant trypanosomatid Phytomonas serpens.

PubMed

Pappas, Georgios J; Benabdellah, Karim; Zingales, Bianca; González, Antonio

2005-08-01

We have generated 2190 expressed sequence tags (ESTs) from a cDNA library of the plant trypanosomatid Phytomonas serpens. Upon processing and clustering, the set of 1893 accepted sequences was reduced to 697 clusters consisting of 452 singletons and 245 contigs. Functional categories were assigned based on BLAST searches against a database of the eukaryotic orthologous groups of proteins (KOG). Thirty-six percent of the generated sequences showed no hits against the KOG database and 39.6% presented similarity to the KOG classes corresponding to translation, ribosomal structure and biogenesis. The most populated cluster contained 45 ESTs homologous to members of the glucose transporter family. This fact can be immediately correlated to the reported Phytomonas dependence on anaerobic glycolytic ATP production due to the lack of cytochrome-mediated respiratory chain. In this context, a number of enzymes of the glycolytic pathway were identified, as well as enzymes of the Krebs cycle and specific components of the respiratory chain. The data reported here, including a few hundred unique sequences and the description of tandemly repeated motifs and putative transcript stability motifs at untranslated mRNA ends, represent an initial approach to overcome the lack of information on the molecular biology of this organism.
Visual ModuleOrganizer: a graphical interface for the detection and comparative analysis of repeat DNA modules

PubMed Central

2014-01-01

Background: DNA repeats, such as transposable elements, minisatellites and palindromic sequences, are abundant in sequences and have been shown to have significant and functional roles in the evolution of the host genomes. In a previous study, we introduced the concept of a repeat DNA module, a flexible motif present in at least two occurrences in the sequences. This concept was embedded into ModuleOrganizer, a tool allowing the detection of repeat modules in a set of sequences. However, its implementation remains difficult for larger sequences. Results: Here we present Visual ModuleOrganizer, a Java graphical interface that enables a new and optimized version of the ModuleOrganizer tool. To implement this version, it was recoded in C++ with compressed suffix tree data structures. This leads to less memory usage (at least a 120-fold decrease on average) and reduces the computation time at least fourfold during the module detection process in large sequences. The Visual ModuleOrganizer interface allows users to easily choose ModuleOrganizer parameters and to graphically display the results. Moreover, Visual ModuleOrganizer dynamically handles graphical results through four main parameters: gene annotations, overlapping modules with known annotations, location of the module in a minimal number of sequences, and the minimal length of the modules. As a case study, the analysis of FoldBack4 sequences clearly demonstrated that our tools can be extended to comparative and evolutionary analyses of any repeat sequence elements in a set of genomic sequences. With the increasing number of sequences available in public databases, it is now possible to perform comparative analyses of repeated DNA modules in a graphic and friendly manner within a reasonable time period. Availability: The Visual ModuleOrganizer interface and the new version of the ModuleOrganizer tool are freely available at: http://lcb.cnrs-mrs.fr/spip.php?rubrique313. PMID:24678954
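The core idea in the abstract above, a motif that occurs at least twice in a set of sequences, can be illustrated with a minimal Python sketch. This is a deliberate simplification and not ModuleOrganizer's algorithm: it reports only exact k-mers and uses a plain dictionary index instead of flexible modules on a compressed suffix tree. The function name `repeated_kmers`, its parameters, and the toy sequences are hypothetical.

```python
from collections import defaultdict


def repeated_kmers(sequences, k, min_occurrences=2):
    """Map each exact k-mer seen at least `min_occurrences` times
    to its list of (sequence_index, position) hits.

    Hypothetical helper: exact matching only, no gaps or mismatches.
    """
    index = defaultdict(list)
    for seq_index, seq in enumerate(sequences):
        seq = seq.upper()
        # Slide a window of width k over the sequence.
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((seq_index, pos))
    return {kmer: hits for kmer, hits in index.items()
            if len(hits) >= min_occurrences}


# Toy input (made up for illustration).
toy = ["ACGTACGT", "TTACGTAA"]
hits = repeated_kmers(toy, k=5)
# hits == {"ACGTA": [(0, 0), (1, 2)], "TACGT": [(0, 3), (1, 1)]}
```

A dictionary index of this kind grows with the number of distinct k-mers, which is one reason a suffix-tree representation is attractive for large inputs.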
[Bioinformatics Analysis of Clustered Regularly Interspaced Short Palindromic Repeats in the Genomes of Shigella].

PubMed

Wang, Pengfei; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Wang, Linlin; Guo, Xiangjiao; Yang, Haiyan; Xi, Yuanlin

2015-04-01

This study aimed to explore the features of clustered regularly interspaced short palindromic repeat (CRISPR) structures in Shigella by using bioinformatics. We used bioinformatics methods, including BLAST, alignment and RNA structure prediction, to analyze the CRISPR structures of Shigella genomes. The results showed that CRISPRs existed in the four groups of Shigella, and the flanking sequences upstream of the CRISPRs could be classified into the same groups as those downstream. We also found some relatively conserved palindromic motifs in the leader sequences. Repeat sequences fell into the same groups as their corresponding flanking sequences, and could be classified into two different types by their RNA secondary structures, which contain a "stem" and a "ring". Some spacers were found to be homologous to partial sequences of plasmids or phages. The study indicated that there were correlations between repeat sequences and flanking sequences, and the repeats might act as a kind of recognition mechanism to mediate the interaction between foreign genetic elements and Cas proteins.
A mechanism for negative gene regulation in Autographa californica multinucleocapsid nuclear polyhedrosis virus

USGS Publications Warehouse

Leisy, D.J.; Rasmussen, C.; Owusu, E.O.; Rohrmann, G.F.

1997-01-01

The Autographa californica multinucleocapsid nuclear polyhedrosis virus (AcMNPV) ie-1 gene product (IE-1) is thought to play a central role in stimulating early viral transcription. IE-1 has been demonstrated to activate several early viral gene promoters and to negatively regulate the promoters of two other AcMNPV regulatory genes, ie-0 and ie-2. Our results indicate that IE-1 negatively regulates the expression of certain genes by binding directly, or as part of a complex, to promoter regions containing a specific IE-1-binding motif (5'-ACBYGTAA-3') near their mRNA start sites. The IE-1 binding motif was also found within the palindromic sequences of AcMNPV homologous repeat (hr) regions that have been shown to bind IE-1. The role of this IE-1 binding motif in the regulation of the ie-2 and pe-38 promoters was examined by introducing mutations in these promoters in which the central 6 bp were replaced with BglII sites. GUS reporter constructs containing ie-2 and pe-38 promoter fragments with and without these specific mutations were cotransfected into Sf9 cells with various amounts of an ie-1-containing plasmid (pIe-1). Comparisons of GUS expression produced by the mutant and wild-type constructs demonstrated that the IE-1 binding motif mediated a significant decrease in expression from the ie-2 and pe-38 promoters in response to increasing pIe-1 concentrations. Electrophoretic mobility shift assays with pIe-1-transfected cell extracts and supershift assays with IE-1-specific antiserum demonstrated that IE-1 binds to promoter fragments containing the IE-1 binding motif but does not bind to promoter fragments lacking this motif.
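The binding motif 5'-ACBYGTAA-3' above is written with IUPAC nucleotide ambiguity codes (B = C, G or T; Y = C or T). As a rough illustration of how such a motif could be located in a sequence, the Python sketch below expands the codes into a regular expression and scans the forward strand only. The helper names `iupac_to_regex` and `find_motif` and the test sequence are hypothetical; this is not the procedure used in the study.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Expand an IUPAC-coded motif into a regular-expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(seq, motif):
    """Return (0-based start, matched text) for every occurrence,
    including overlapping ones, on the given strand only."""
    # The lookahead lets overlapping occurrences be reported.
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


print(iupac_to_regex("ACBYGTAA"))              # AC[CGT][CT]GTAA
print(find_motif("GGACGTGTAATT", "ACBYGTAA"))  # [(2, 'ACGTGTAA')]
```

Scanning the reverse complement as well would be needed to cover both strands; that step is omitted here to keep the sketch short.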
Specific TATAA and bZIP requirements suggest that HTLV-I Tax has transcriptional activity subsequent to the assembly of an initiation complex

PubMed Central

Ching, Yick-Pang; Chun, Abel CS; Chin, King-Tung; Zhang, Zhi-Qing; Jeang, Kuan-Teh; Jin, Dong-Yan

2004-01-01

Background: Human T-cell leukemia virus type I (HTLV-I) Tax protein is a transcriptional regulator of viral and cellular genes. In this study we have examined in detail the determinants for Tax-mediated transcriptional activation. Results: Whereas previously the LTR enhancer elements were thought to be the sole Tax-targets, herein, we find that the core HTLV-I TATAA motif also provides specific responsiveness not seen with either the SV40 or the E1b TATAA boxes. When enhancer elements which can mediate Tax-responsiveness were compared, the authentic HTLV-I 21-bp repeats were found to be the most effective. Related bZIP factors such as CREB, ATF4, c-Jun and LZIP are often thought to recognize the 21-bp repeats equivalently. However, amongst bZIP factors, we found that CREB, by far, is preferred by Tax for activation. When LTR transcription was reconstituted by substituting either κB or serum response elements in place of the 21-bp repeats, Tax activated these surrogate motifs using surfaces which are different from that utilized for CREB interaction. Finally, we employed artificial recruitment of TATA-binding protein to the HTLV-I promoter in "bypass" experiments to show for the first time that Tax has transcriptional activity subsequent to the assembly of an initiation complex at the promoter. Conclusions: Optimal activation of the HTLV-I LTR by Tax specifically requires the core HTLV-I TATAA promoter, CREB and the 21-bp repeats. In addition, we also provide the first evidence for transcriptional activity of Tax after the recruitment of TATA-binding protein to the promoter. PMID:15285791
Specific Binding of Tetratricopeptide Repeat Proteins to Heat Shock Protein 70 (Hsp70) and Heat Shock Protein 90 (Hsp90) Is Regulated by Affinity and Phosphorylation.

PubMed

Assimon, Victoria A; Southworth, Daniel R; Gestwicki, Jason E

2015-12-08

Heat shock protein 70 (Hsp70) and heat shock protein 90 (Hsp90) require the help of tetratricopeptide repeat (TPR) domain-containing cochaperones for many of their functions. Each monomer of Hsp70 or Hsp90 can interact with only a single TPR cochaperone at a time, and each member of the TPR cochaperone family brings distinct functions to the complex. Thus, competition for TPR binding sites on Hsp70 and Hsp90 appears to shape chaperone activity. Recent structural and biophysical efforts have improved our understanding of chaperone-TPR contacts, focusing on the C-terminal EEVD motif that is present in both chaperones. To better understand these important protein-protein interactions on a wider scale, we measured the affinity of five TPR cochaperones, CHIP, Hop, DnaJC7, FKBP51, and FKBP52, for the C-termini of four members of the chaperone family, Hsc70, Hsp72, Hsp90α, and Hsp90β, in vitro. These studies identified some surprising selectivity among the chaperone-TPR pairs, including the selective binding of FKBP51/52 to Hsp90α/β. These results also revealed that other TPR cochaperones are only able to weakly discriminate between the chaperones or between their paralogs. We also explored whether mimicking phosphorylation of serine and threonine residues near the EEVD motif might impact affinity and found that pseudophosphorylation had selective effects on binding to CHIP but not other cochaperones. Together, these findings suggest that both intrinsic affinity and post-translational modifications tune the interactions between the Hsp70 and Hsp90 proteins and the TPR cochaperones.
Discovering Sequence Motifs with Arbitrary Insertions and Deletions

PubMed Central

Frith, Martin C.; Saunders, Neil F. W.; Kobe, Bostjan; Bailey, Timothy L.

2008-01-01

Biology is encoded in molecular sequences: deciphering this encoding remains a grand scientific challenge. Functional regions of DNA, RNA, and protein sequences often exhibit characteristic but subtle motifs; thus, computational discovery of motifs in sequences is a fundamental and much-studied problem. However, most current algorithms do not allow for insertions or deletions (indels) within motifs, and the few that do have other limitations. We present a method, GLAM2 (Gapped Local Alignment of Motifs), for discovering motifs allowing indels in a fully general manner, and a companion method GLAM2SCAN for searching sequence databases using such motifs. glam2 is a generalization of the gapless Gibbs sampling algorithm. It re-discovers variable-width protein motifs from the PROSITE database significantly more accurately than the alternative methods PRATT and SAM-T2K. Furthermore, it usefully refines protein motifs from the ELM database: in some cases, the refined motifs make orders of magnitude fewer overpredictions than the original ELM regular expressions. GLAM2 performs respectably on the BAliBASE multiple alignment benchmark, and may be superior to leading multiple alignment methods for “motif-like” alignments with N- and C-terminal extensions. Finally, we demonstrate the use of GLAM2 to discover protein kinase substrate motifs and a gapped DNA motif for the LIM-only transcriptional regulatory complex: using GLAM2SCAN, we identify promising targets for the latter. GLAM2 is especially promising for short protein motifs, and it should improve our ability to identify the protein cleavage sites, interaction sites, post-translational modification attachment sites, etc., that underlie much of biology. It may be equally useful for arbitrarily gapped motifs in DNA and RNA, although fewer examples of such motifs are known at present. GLAM2 is public domain software, available for download at http://bioinformatics.org.au/glam2. PMID:18437229
In silico search, characterization and validation of new EST-SSR markers in the genus Prunus.

PubMed

Sorkheh, Karim; Prudencio, Angela S; Ghebinejad, Azim; Dehkordi, Mehrana Kohei; Erogul, Deniz; Rubio, Manuel; Martínez-Gómez, Pedro

2016-07-07

Simple sequence repeats (SSRs) are tandem repeats of units 1 to 6 bp long that are abundant in eukaryotic genomes, occur in both coding and non-coding regions, and may affect the expression of genes. In this study, expressed sequence tags (ESTs) of eight Prunus species were analyzed for in silico mining of EST-SSRs, protein annotation, open reading frames (ORFs), and the identification of codon repetitions. A total of 316 SSRs were identified using MISA software. Dinucleotide SSR motifs (26.31%) were found to be the most abundant type of repeats, followed by tri- (14.58%), tetra- (0.53%), and penta- (0.27%) nucleotide motifs. Primer pair design was attempted for all 316 identified SSRs but succeeded for only 175 SSR sequences. The positions of SSRs with respect to ORFs were detected, and annotation of sequences containing SSRs was performed to assign function to each sequence. SSRs were also characterized (in terms of position in the reference genome and associated gene) using the two available Prunus reference genomes (mei and peach). Finally, 38 SSR markers were validated across peach, almond, plum, and apricot genotypes. This validation showed a higher transferability level of EST-SSRs developed in P. mume (mei) in comparison with the rest of the species analyzed. The findings will aid analysis of functionally important molecular markers and facilitate the analysis of genetic diversity.
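To make the SSR definition above concrete (tandem repeats of a 1 to 6 bp unit), here is a minimal Python sketch of regex-based SSR detection. It is not MISA and not the pipeline used in the study: the minimum repeat counts in `MIN_REPEATS` are illustrative assumptions, the function names are hypothetical, and edge cases such as compound or imperfect repeats are ignored.

```python
import re

# Assumed thresholds: unit length -> minimum number of tandem copies.
# Illustrative values only; the study's actual settings are not stated here.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter unit
    (e.g. 'AGAG' is rejected because it is 'AG' twice)."""
    n = len(unit)
    return not any(n % d == 0 and unit[:d] * (n // d) == unit
                   for d in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (0-based start, unit, copy number) for perfect SSRs."""
    seq = seq.upper()
    found = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured unit followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            if is_primitive(unit):
                found.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(found)


# Made-up test sequence: an (AG)7 dinucleotide and a (TAACC)5 pentanucleotide.
demo = "TTT" + "AG" * 7 + "GGG" + "TAACC" * 5 + "G"
print(find_ssrs(demo))  # [(3, 'AG', 7), (20, 'TAACC', 5)]
```

The primitive-unit check keeps a run such as (AG)12 from being reported a second time as a tetranucleotide (AGAG)6.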

Characterization and compilation of polymorphic simple sequence repeat (SSR) markers of peanut from public database

PubMed Central

2012-01-01

Background: There are several reports describing thousands of SSR markers in the peanut (Arachis hypogaea L.) genome. There is a need to integrate various research reports of peanut DNA polymorphism into a single platform. Further, because of the lack of uniformity in the labeling of these markers across publications, there is some confusion about the identities of many markers. We describe below an effort to develop a central comprehensive database of polymorphic SSR markers in peanut. Findings: We compiled 1,343 SSR markers that detect polymorphism (14.5%) from a total of 9,274 markers. Amongst all polymorphic SSRs examined, we found that the AG motif (36.5%) was the most abundant, followed by AAG (12.1%), AAT (10.9%), and AT (10.3%). The mean length of SSR repeats in dinucleotide SSRs was significantly longer than that in trinucleotide SSRs. Dinucleotide SSRs showed higher polymorphism frequency for genomic SSRs when compared to trinucleotide SSRs, while for EST-SSRs, the frequency of polymorphic SSRs was higher in trinucleotide SSRs than in dinucleotide SSRs. The correlation between SSR length and the frequency of polymorphism revealed that the frequency of polymorphism decreased as motif repeat number increased. Conclusions: The assembled polymorphic SSRs would enhance the density of the existing genetic maps of peanut, which could also be a useful source of DNA markers suitable for high-throughput QTL mapping and marker-assisted selection in peanut improvement and thus would be of value to breeders. PMID:22818284
Development and application of microsatellites in candidate genes related to wood properties in the Chinese white poplar (Populus tomentosa Carr.).

PubMed

Du, Qingzhang; Gong, Chenrui; Pan, Wei; Zhang, Deqiang

2013-02-01

Gene-derived simple sequence repeats (genic SSRs), also known as functional markers, are often preferred over random genomic markers because they represent variation in gene coding and/or regulatory regions. We characterized 544 genic SSR loci derived from 138 candidate genes involved in wood formation, distributed throughout the genome of Populus tomentosa, a key ecological and cultivated wood production species. Of these SSRs, three-quarters were located in the promoter or intron regions, and dinucleotide (59.7%) and trinucleotide repeat motifs (26.5%) predominated. By screening 15 wild P. tomentosa ecotypes, we identified 188 polymorphic genic SSRs with 861 alleles, 2-7 alleles for each marker. Transferability analysis of 30 random genic SSRs, testing whether these SSRs work in 26 genotypes from five sections of the genus Populus (outgroup, Salix matsudana), showed that 72% of the SSRs could be amplified in Turanga and 100% could be amplified in Leuce. Based on genotyping of these 26 genotypes, a neighbour-joining analysis showed the expected six phylogenetic groupings. In silico analysis of SSR variation in 220 sequences that are homologous between P. tomentosa and Populus trichocarpa suggested that genic SSR variations between relatives were predominantly affected by repeat motif variations or flanking sequence mutations. Inheritance tests and single-marker associations demonstrated the power of genic SSRs in family-based linkage mapping and candidate gene-based association studies, as well as marker-assisted selection and comparative genomic studies of P. tomentosa and related species.
Bioengineered Chimeric Spider Silk-Uranium Binding Proteins

PubMed Central

Krishnaji, Sreevidhya Tarakkad; Kaplan, David L.

2014-01-01

Heavy metals constitute a source of environmental pollution. Here, novel functional hybrid biomaterials for specific interactions with heavy metals are designed by bioengineering consensus sequence repeats from spider silk of Nephila clavipes with repeats of a uranium peptide recognition motif from a mutated 33-residue of calmodulin protein from Paramecium tetraurelia. The self-assembly features of the silk to control nanoscale organic/inorganic material interfaces provides new biomaterials for uranium recovery. With subsequent enzymatic digestion of the silk to concentrate the sequestered metals, options can be envisaged to use these new chimeric protein systems in environmental engineering, including to remediate environments contaminated by uranium. PMID:23212989
Investigating the role of GXXXG motifs in helical folding and self-association of plasticins, Gly/Leu-rich antimicrobial peptides.

PubMed

Carlier, Ludovic; Joanne, Pierre; Khemtémourian, Lucie; Lacombe, Claire; Nicolas, Pierre; El Amri, Chahrazade; Lequin, Olivier

2015-01-01

Plasticins (PTC) are dermaseptin-related antimicrobial peptides characterized by a large number of leucine and glycine residues arranged in GXXXG motifs that are often described to promote helix association within biological membranes. We report the structure and interaction properties of two plasticins, PTC-B1 from Phyllomedusa bicolor and a cationic analog of PTC-DA1 from Pachymedusa dacnicolor, which exhibit membrane-lytic activities on a broad range of microorganisms. Despite a high number of glycine residues, CD and NMR spectroscopy show that the two plasticins adopt mainly alpha-helical conformations in a wide variety of environments such as trifluoroethanol, detergent micelles and lipid vesicles. In DPC and SDS, plasticins adopt well-defined helices that lie parallel to the micelle surface, all glycine residues being located on the solvent-exposed face. Spectroscopic data and cross-linking experiments indicate that the GXXXG repeats in these amphipathic helices do not provide a strong oligomerization interface, suggesting a different role from GXXXG motifs found in transmembrane helices. Copyright © 2014 Elsevier B.V. All rights reserved.
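A GXXXG motif is simply two glycines separated by any three residues, and consecutive motifs can share a glycine. The short Python sketch below shows one way to list such motifs, overlaps included, in a peptide given in one-letter code. The function name `find_gxxxg` and the example peptide are hypothetical; the peptide is not a plasticin sequence.

```python
import re

# Lookahead so that motifs sharing a glycine (e.g. GxxxGxxxG) are all reported.
GXXXG = re.compile(r"(?=(G[A-Z]{3}G))")


def find_gxxxg(peptide):
    """Return (1-based start position, motif) for every GXXXG occurrence."""
    return [(m.start() + 1, m.group(1))
            for m in GXXXG.finditer(peptide.upper())]


# Made-up peptide: two motifs sharing the glycine at position 5.
print(find_gxxxg("GLVSGLLGG"))  # [(1, 'GLVSG'), (5, 'GLLGG')]
```

Positions are reported 1-based to match the usual residue numbering convention for peptides.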
A common antigenic motif recognized by naturally occurring human VH5-51/VL4-1 anti-tau antibodies with distinct functionalities.

PubMed

Apetri, Adrian; Crespo, Rosa; Juraszek, Jarek; Pascual, Gabriel; Janson, Roosmarijn; Zhu, Xueyong; Zhang, Heng; Keogh, Elissa; Holland, Trevin; Wadia, Jay; Verveen, Hanneke; Siregar, Berdien; Mrosek, Michael; Taggenbrock, Renske; van Ameijde, Jeroen; Inganäs, Hanna; van Winsen, Margot; Koldijk, Martin H; Zuijdgeest, David; Borgers, Marianne; Dockx, Koen; Stoop, Esther J M; Yu, Wenli; Brinkman-van der Linden, Els C; Ummenthum, Kimberley; van Kolen, Kristof; Mercken, Marc; Steinbacher, Stefan; de Marco, Donata; Hoozemans, Jeroen J; Wilson, Ian A; Koudstaal, Wouter; Goudsmit, Jaap

2018-05-31

Misfolding and aggregation of tau protein are closely associated with the onset and progression of Alzheimer's Disease (AD). By interrogating IgG+ memory B cells from asymptomatic donors with tau peptides, we have identified two somatically mutated VH5-51/VL4-1 antibodies. One of these, CBTAU-27.1, binds to the aggregation motif in the R3 repeat domain and blocks the aggregation of tau into paired helical filaments (PHFs) by sequestering monomeric tau. The other, CBTAU-28.1, binds to the N-terminal insert region and inhibits the spreading of tau seeds and mediates the uptake of tau aggregates into microglia by binding PHFs. Crystal structures revealed that the combination of VH5-51 and VL4-1 recognizes a common Pro-Xn-Lys motif driven by germline-encoded hotspot interactions while the specificity and thereby functionality of the antibodies are defined by the CDR3 regions. Affinity improvement led to improvement in functionality, identifying their epitopes as new targets for therapy and prevention of AD.
Automated extraction and classification of RNA tertiary structure cyclic motifs

PubMed Central

Lemieux, Sébastien; Major, François

2006-01-01

A minimum cycle basis of the tertiary structure of a large ribosomal subunit (LSU) X-ray crystal structure was analyzed. Most cycles are small, as they are composed of 3 to 5 nt, and repeated across the LSU tertiary structure. We used hierarchical clustering to quantify and classify the 4 nt cycles. One class is defined by the GNRA tetraloop motif. The inspection of the GNRA class revealed peculiar instances in sequence. First is the presence of UA, CA, UC and CC base pairs that substitute for the usual sheared GA base pair. Second is the revelation of GNR(Xn)A tetraloops, where Xn is bulged out of the classical GNRA structure, and of GN/RA formed by the two strands of interior loops. We were able to unambiguously characterize the cycle classes using base stacking and base pairing annotations. The cycles identified correspond to small and cyclic motifs that compose most of the LSU RNA tertiary structure and contribute to its thermodynamic stability. Consequently, the RNA minimum cycles could well be used as the basic elements of RNA tertiary structure prediction methods. PMID:16679452
Crystal Structure of FadA Adhesin from Fusobacterium nucleatum Reveals a Novel Oligomerization Motif, the Leucine Chain

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nithianantham, Stanley; Xu, Minghua; Yamada, Mitsunori

2009-04-07

Many bacterial appendages have filamentous structures, often composed of repeating monomers assembled in a head-to-tail manner. The mechanisms of such linkages vary. We report here a novel protein oligomerization motif identified in the FadA adhesin from the Gram-negative bacterium Fusobacterium nucleatum. The 2.0 Å crystal structure of the secreted form of FadA (mFadA) reveals two antiparallel α-helices connected by an intervening 8-residue hairpin loop. Leucine-leucine contacts play a prominent dual intra- and intermolecular role in the structure and function of FadA. First, they comprise the main association between the two helical arms of the monomer; second, they mediate the head-to-tail association of monomers to form the elongated polymers. This leucine-mediated filamentous assembly of FadA molecules constitutes a novel structural motif termed the 'leucine chain.' The essential role of these residues in FadA is corroborated by mutagenesis of selected leucine residues, which leads to the abrogation of oligomerization, filament formation, and binding to host cells.
The cytidine deaminase signature HxE(x)n CxxC of DYW1 binds zinc and is necessary for RNA editing of ndhD-1.

PubMed

Boussardon, Clément; Avon, Alexandra; Kindgren, Peter; Bond, Charles S; Challenor, Michael; Lurin, Claire; Small, Ian

2014-09-01

In flowering plants, RNA editing involves deamination of specific cytidines to uridines in both mitochondrial and chloroplast transcripts. Pentatricopeptide repeat (PPR) proteins and multiple organellar RNA editing factor (MORF) proteins have been shown to be involved in RNA editing but none have been shown to possess cytidine deaminase activity. The DYW domain of some PPR proteins contains a highly conserved signature resembling the zinc-binding active site motif of known nucleotide deaminases. We modified these highly conserved amino acids in the DYW motif of DYW1, an editing factor required for editing of the ndhD-1 site in Arabidopsis chloroplasts. We demonstrate that several amino acids of this signature motif are required for RNA editing in vivo and for zinc binding in vitro. We conclude that the DYW domain of DYW1 has features in common with cytidine deaminases, reinforcing the hypothesis that this domain forms part of the active enzyme that carries out RNA editing in plants. © 2014 The Authors. New Phytologist © 2014 New Phytologist Trust.
Hyperactive antifreeze proteins from longhorn beetles: some structural insights.

PubMed

Kristiansen, Erlend; Wilkens, Casper; Vincents, Bjarne; Friis, Dennis; Lorentzen, Anders Blomkild; Jenssen, Håvard; Løbner-Olesen, Anders; Ramløv, Hans

2012-11-01

This study reports on structural characteristics of hyperactive antifreeze proteins (AFPs) from two species of longhorn beetles. In Rhagium mordax, eight unique mRNAs coding for five different mature AFPs were identified from cold-hardy individuals. These AFPs are apparently homologous to a previously characterized AFP from the closely related species Rhagium inquisitor, and consist of six identifiable repeats of a putative ice-binding motif TxTxTxT spaced irregularly apart by segments varying in length from 13 to 20 residues. Circular dichroism spectra show that the AFPs from both species have a high content of β-sheet and low levels of α-helix and random coil. Theoretical predictions of residue-specific secondary structure locate these β-sheets within the putative ice-binding motifs and the central parts of the segments separating them, consistent with an overall β-helical structure with the ice-binding motifs stacked in a β-sheet on one side of the coil. Molecular dynamics models based on these findings show that these AFPs would be energetically stable in a β-helical conformation. Copyright © 2012 Elsevier Ltd. All rights reserved.
Sequence analyses reveal that a TPR–DP module, surrounded by recombinable flanking introns, could be at the origin of eukaryotic Hop and Hip TPR–DP domains and prokaryotic GerD proteins

PubMed Central

Papandreou, Nikolaos; Chomilier, Jacques

2008-01-01

The co-chaperone Hop [heat shock protein (HSP) organising protein] is known to bind both Hsp70 and Hsp90. Hop comprises three repeats of a tetratricopeptide repeat (TPR) domain, each consisting of three TPR motifs. The first and last TPR domains are followed by a domain containing several dipeptide (DP) repeats called the DP domain. These analyses suggest that the hop genes result from successive recombination events of an ancestral TPR–DP module. From a hydrophobic cluster analysis of homologous Hop protein sequences derived from gene families, we can postulate that shifts in the open reading frames are at the origin of the present sequences. Moreover, these shifts can be related to the presence or absence of biological function. We propose to extend the family of Hop co-chaperones into the kingdom of bacteria, as several structurally related genes have been identified by hydrophobic cluster analysis. We also provide evidence of common structural characteristics between hop and hip genes, suggesting a shared precursor of ancestral TPR–DP domains. PMID:18987995
An Efficient Scheme for Crystal Structure Prediction Based on Structural Motifs

DOE PAGES

Zhu, Zizhong; Wu, Ping; Wu, Shunqing; ...

2017-05-15

An efficient scheme based on structural motifs is proposed for the crystal structure prediction of materials. The key advantage of the present method is twofold: first, the degrees of freedom of the system are greatly reduced, since each structural motif, regardless of its size, can always be described by a set of parameters (R, θ, φ) with five degrees of freedom; second, the motifs could always appear in the predicted structures when the energies of the structures are relatively low. Both features make the present scheme a very efficient method for predicting desired materials. The method has been applied to the case of LiFePO4, an important cathode material for lithium-ion batteries. Numerous new structures of LiFePO4 have been found, compared to those currently available, demonstrating the reliability of the present methodology and illustrating the promise of the concept of structural motifs.
An Efficient Scheme for Crystal Structure Prediction Based on Structural Motifs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhu, Zizhong; Wu, Ping; Wu, Shunqing

An efficient scheme based on structural motifs is proposed for the crystal structure prediction of materials. The key advantage of the present method is twofold: first, the degrees of freedom of the system are greatly reduced, since each structural motif, regardless of its size, can always be described by a set of parameters (R, θ, φ) with five degrees of freedom; second, the motifs could always appear in the predicted structures when the energies of the structures are relatively low. Both features make the present scheme a very efficient method for predicting desired materials. The method has been applied to the case of LiFePO4, an important cathode material for lithium-ion batteries. Numerous new structures of LiFePO4 have been found, compared to those currently available, demonstrating the reliability of the present methodology and illustrating the promise of the concept of structural motifs.
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

PubMed

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-11-16

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Isolation and molecular characterization of dTnp1, a mobile and defective transposable element of Nicotiana plumbaginifolia.

PubMed

Meyer, C; Pouteau, S; Rouzé, P; Caboche, M

1994-01-01

By Northern blot analysis of nitrate reductase-deficient mutants of Nicotiana plumbaginifolia, we identified a mutant (mutant D65), obtained after gamma-ray irradiation of protoplasts, which contained an insertion sequence in the nitrate reductase (NR) mRNA. This insertion sequence was localized by polymerase chain reaction (PCR) in the first exon of NR and was also shown to be present in the NR gene. The mutant gene contained a 565 bp insertion sequence that exhibits the sequence characteristics of a transposable element, which was thus named dTnp1. The dTnp1 element has 14 bp terminal inverted repeats and is flanked by an 8-bp target site duplication generated upon transposition. These inverted repeats have significant sequence homology with those of other transposable elements. Judging by its size and the absence of a long open reading frame, dTnp1 appears to represent a defective, although mobile, transposable element. The octamer motif TTTAGGCC was found several times in direct orientation near the 5' and 3' ends of dTnp1 together with a perfect palindrome located after the 5' inverted repeat. Southern blot analysis using an internal probe of dTnp1 suggested that this element occurs as a single copy in the genome of N. plumbaginifolia. It is also present in N. tabacum, but absent in tomato or petunia. The dTnp1 element is therefore of potential use for gene tagging in Nicotiana species.
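The terminal inverted repeats mentioned in this abstract can be checked computationally: the first bases of an element should match the reverse complement of its last bases. The following is a minimal sketch of that check, not the authors' procedure. The function names and the toy sequence are hypothetical; the real dTnp1 sequence is not reproduced here, and only the 14 bp repeat length is taken from the abstract.

```python
# Hypothetical helper names; the toy element below is invented for illustration.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def tir_mismatches(element, tir_len=14):
    """Count mismatches between the left terminus and the reverse
    complement of the right terminus; 0 means perfect inverted repeats."""
    left = element[:tir_len].upper()
    right_rc = reverse_complement(element[-tir_len:])
    return sum(a != b for a, b in zip(left, right_rc))

# Toy element: an invented 14 bp terminus, a filler, and the matching inverted terminus.
left_end = "GGCCTAAATTGCAC"
toy_element = left_end + "AAAAAA" + reverse_complement(left_end)
print(tir_mismatches(toy_element))  # perfect inverted repeats give 0
```

A nonzero count would indicate imperfect repeats; how many mismatches to tolerate is a judgment call that this sketch leaves to the user.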
Development and characterization of 32 microsatellite loci in Genipa americana (Rubiaceae)

PubMed Central

Manoel, Ricardo O.; Freitas, Miguel L. M.; Barreto, Mariana A.; Moraes, Mário L. T.; Souza, Anete P.; Sebbenn, Alexandre M.

2014-01-01

• Premise of the study: Microsatellite primers were developed for the tree species Genipa americana (Rubiaceae) for further population genetic studies. • Methods and Results: We identified 144 clones containing 65 repeat motifs from a genomic library enriched for (CT)8 and (GT)8 motifs. Primer pairs were developed for 32 microsatellite loci and validated in 40 individuals of two natural G. americana populations. Seventeen loci were polymorphic, revealing from three to seven alleles per locus. The observed and expected heterozygosities ranged from 0.24 to 1.00 and from 0.22 to 0.78, respectively. • Conclusions: The 17 primers identified as polymorphic loci are suitable to study the genetic diversity and structure, mating system, and gene flow in G. americana. PMID:25202610
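As an illustration of what a (CT)8 or (GT)8 motif looks like in sequence data, the sketch below scans a DNA string for perfect dinucleotide runs of at least eight units. It is an assumption-laden example, not the enrichment or screening protocol used in the study; the function name, the default parameters, and the test sequence are hypothetical.

```python
import re

def find_dinucleotide_repeats(seq, units=("CT", "GT"), min_repeats=8):
    """Return sorted (start, end, unit, n_repeats) tuples for each
    perfect tandem run of a unit that is at least min_repeats long."""
    hits = []
    for unit in units:
        # e.g. "(?:CT){8,}" matches eight or more consecutive CT units
        pattern = re.compile(f"(?:{re.escape(unit)}){{{min_repeats},}}")
        for m in pattern.finditer(seq.upper()):
            n = (m.end() - m.start()) // len(unit)
            hits.append((m.start(), m.end(), unit, n))
    return sorted(hits)

# Invented sequence with one (CT)9 run and one (GT)8 run.
example = "AAG" + "CT" * 9 + "TTGA" + "GT" * 8 + "C"
print(find_dinucleotide_repeats(example))
# [(3, 21, 'CT', 9), (25, 41, 'GT', 8)]
```

Coordinates are 0-based and end-exclusive, following Python slicing. Imperfect or interrupted repeats are not reported by this pattern.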
Plasticity of signaling and mate choice in a trilling species of the Mecopoda complex (Orthoptera: Tettigoniidae).

PubMed

Krobath, I; Römer, H; Hartbauer, M

2017-01-01

Males of a trilling species in the Mecopoda complex produce conspicuous calling songs that consist of two motifs: an amplitude-modulated motif with alternating loud and soft segments (AM-motif) and a continuous, high-intensity trill. The function of these song motifs for female attraction and competition between males was investigated. We tested the hypothesis that males modify their signaling behavior depending on the social environment (presence/absence of females or rival males) when they compete for mates. Therefore, we analyzed acoustic signaling of males in three different situations: (1) solo singing, (2) acoustic interaction with another male, and (3) singing in the presence of a female. In addition, the preference of females for these song motifs and further song parameters was studied in two-choice experiments. As expected, females showed a preference for conspicuous and loud song elements, but nevertheless, males increased the proportion of the AM-motif in the presence of a female. In acoustic interactions, males reduced bout duration significantly compared to both other situations. However, song bouts in this situation still overlapped more than expected by chance, which indicates intentionally simultaneous singing. A multivariate statistical analysis revealed that the proportion of the AM-motif and the duration of loud segments within the AM-motif allow a reliable prediction of whether males sing in isolation, compete with another male, or sing in the presence of a female. These results indicate that the AM-motif plays a dominant role especially in close-range courtship and that males are challenged in finding a balance between attracting females and saving energy during repeated acoustic interactions. Males of acoustic insects often produce conspicuous calling songs that have a dual function in male-male competition and mate attraction. High signal amplitudes and signal rates are associated with high energetic costs for signal production. 
We would therefore predict that males adjust their signaling behavior according to their perception of the social context. Here we studied signal production and mate choice in a katydid, where males switch between loud and soft song segments in a dynamic way. Additionally, we examined the attractiveness of different song elements in female choice tests. Our results show how males of this katydid deal with the conflict of remaining attractive for females and competing with a costly signal with rivals.
TRStalker: an efficient heuristic for finding fuzzy tandem repeats.

PubMed

Pellegrini, Marco; Renda, M Elena; Vecchio, Alessio

2010-06-15

Genomes in higher eukaryotic organisms contain a substantial amount of repeated sequences. Tandem Repeats (TRs) constitute a large class of repetitive sequences that are originated via phenomena such as replication slippage and are characterized by close spatial contiguity. They play an important role in several molecular regulatory mechanisms, and also in several diseases (e.g. in the group of trinucleotide repeat disorders). While for TRs with a low or medium level of divergence the current methods are rather effective, the problem of detecting TRs with higher divergence (fuzzy TRs) is still open. The detection of fuzzy TRs is propaedeutic to enriching our view of their role in regulatory mechanisms and diseases. Fuzzy TRs are also important as tools to shed light on the evolutionary history of the genome, where higher divergence correlates with more remote duplication events. We have developed an algorithm (christened TRStalker) with the aim of detecting efficiently TRs that are hard to detect because of their inherent fuzziness, due to high levels of base substitutions, insertions and deletions. To attain this goal, we developed heuristics to solve a Steiner version of the problem for which the fuzziness is measured with respect to a motif string not necessarily present in the input string. This problem is akin to the 'generalized median string' that is known to be an NP-hard problem. Experiments with both synthetic and biological sequences demonstrate that our method performs better than current state of the art for fuzzy TRs and that the fuzzy TRs of the type we detect are indeed present in important biological sequences. TRStalker will be integrated in the web-based TRs Discovery Service (TReaDS) at bioalgo.iit.cnr.it. Supplementary data are available at Bioinformatics online.
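The idea that fuzziness is measured against a motif string that need not occur in the input can be shown with a small example. The sketch below is not the TRStalker heuristic: it assumes a known period, allows substitutions only (no insertions or deletions), and uses a column-wise majority consensus as the reference motif. All names and the toy sequence are hypothetical.

```python
from collections import Counter

def consensus_and_divergence(region, period):
    """Split region into consecutive copies of length period, build a
    column-wise majority consensus, and return (consensus, fraction of
    bases that differ from the consensus)."""
    copies = [region[i:i + period]
              for i in range(0, len(region) - period + 1, period)]
    consensus = "".join(Counter(col).most_common(1)[0][0]
                        for col in zip(*copies))
    mismatches = sum(a != b
                     for copy in copies
                     for a, b in zip(copy, consensus))
    return consensus, mismatches / (len(copies) * period)

# Four diverged copies of a 4 bp unit; each copy carries one substitution.
region = "TCGT" + "AGGT" + "ACCT" + "ACGA"
motif, divergence = consensus_and_divergence(region, 4)
print(motif, divergence)   # ACGT 0.25
print(motif in region)     # False: the consensus never occurs verbatim
```

Handling indels, unknown periods, and high divergence is precisely what makes the real problem hard, and this sketch addresses none of those.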
Association of a Model Transmembrane Peptide Containing Gly in a Heptad Sequence Motif

PubMed Central

Lear, James D.; Stouffer, Amanda L.; Gratkowski, Holly; Nanda, Vikas; DeGrado, William F.

2004-01-01

A peptide containing glycine at a and d positions of a heptad motif was synthesized to investigate the possibility that membrane-soluble peptides with a Gly-based, left-handed helical packing motif would associate. Based on analytical ultracentrifugation in C14-betaine detergent micelles, the peptide did associate in a monomer-dimer equilibrium, although the association constant was significantly less than that reported for the right-handed dimer of the glycophorin A transmembrane peptide in similar detergents. Fluorescence resonance energy transfer (FRET) experiments conducted on peptides labeled at their N-termini with either tetramethylrhodamine (TMR) or 7-nitrobenz-2-oxa-1,3-diazole (NBD) also indicated association. However, analysis of the FRET data using the usual assumption of complete quenching for NBD-TMR pairs in the dimer could not be quantitatively reconciled with the analytical ultracentrifugation-measured dimerization constant. This led us to develop a general treatment for the association of helices to either parallel or antiparallel structures of any aggregation state. Applying this treatment to the FRET data, constraining the dimerization constant to be within experimental uncertainty of that measured by analytical ultracentrifugation, we found the data could be well described by a monomer-dimer equilibrium with only partial quenching of the dimer, suggesting that the helices are most probably antiparallel. These results also suggest that a left-handed Gly heptad repeat motif can drive membrane helix association, but the affinity is likely to be less strong than the previously reported right-handed motif described for glycophorin A. PMID:15315956
NF-Y coassociates with FOS at promoters, enhancers, repetitive elements, and inactive chromatin regions, and is stereo-positioned with growth-controlling transcription factors

PubMed Central

Fleming, Joseph D.; Pavesi, Giulio; Benatti, Paolo; Imbriano, Carol; Mantovani, Roberto; Struhl, Kevin

2013-01-01

NF-Y, a trimeric transcription factor (TF) composed of two histone-like subunits (NF-YB and NF-YC) and a sequence-specific subunit (NF-YA), binds to the CCAAT motif, a common promoter element. Genome-wide mapping reveals 5000–15,000 NF-Y binding sites depending on the cell type, with the NF-YA and NF-YB subunits binding asymmetrically with respect to the CCAAT motif. Despite being characterized as a proximal promoter TF, only 25% of NF-Y sites map to promoters. A comparable number of NF-Y sites are located at enhancers, many of which are tissue specific, and nearly half of the NF-Y sites are in select subclasses of HERV LTR repeats. Unlike most TFs, NF-Y can access its target DNA motif in inactive (nonmodified) or polycomb-repressed chromatin domains. Unexpectedly, NF-Y extensively colocalizes with FOS in all genomic contexts, and this often occurs in the absence of JUN and the AP-1 motif. NF-Y also coassociates with a select cluster of growth-controlling and oncogenic TFs, consistent with the abundance of CCAAT motifs in the promoters of genes overexpressed in cancer. Interestingly, NF-Y and several growth-controlling TFs bind in a stereo-specific manner, suggesting a mechanism for cooperative action at promoters and enhancers. Our results indicate that NF-Y is not merely a commonly used proximal promoter TF, but rather performs a more diverse set of biological functions, many of which are likely to involve coassociation with FOS. PMID:23595228
Conserved structure and inferred evolutionary history of long terminal repeats (LTRs)

PubMed Central

2013-01-01

Background: Long terminal repeats (LTRs, consisting of U3-R-U5 portions) are important elements of retroviruses and related retrotransposons. They are difficult to analyse due to their variability. The aim was to obtain a more comprehensive view of structure, diversity and phylogeny of LTRs than hitherto possible. Results: Hidden Markov models (HMM) were created for 11 clades of LTRs belonging to Retroviridae (class III retroviruses), animal Metaviridae (Gypsy/Ty3) elements and plant Pseudoviridae (Copia/Ty1) elements, complementing our work with Orthoretrovirus HMMs. The great variation in LTR length of plant Metaviridae and the few divergent animal Pseudoviridae prevented building HMMs from both of these groups. Animal Metaviridae LTRs had the same conserved motifs as retroviral LTRs, confirming that the two groups are closely related. The conserved motifs were the short inverted repeats (SIRs), integrase recognition signals (5´ TGTTRNR…YNYAACA 3´); the polyadenylation signal or AATAAA motif; a GT-rich stretch downstream of the polyadenylation signal; and a less conserved AT-rich stretch corresponding to the core promoter element, the TATA box. Plant Pseudoviridae LTRs differed slightly in having a conserved TATA-box, TATATA, but no conserved polyadenylation signal, plus a much shorter R region. The sensitivity of the HMMs for detection in genomic sequences was around 50% for most models, at a relatively high specificity, suitable for genome screening. The HMMs yielded consensus sequences, which were aligned by creating an HMM model (a ‘Superviterbi’ alignment). This yielded a phylogenetic tree that was compared with a Pol-based tree. Both LTR and Pol trees supported monophyly of retroviruses. In both, Pseudoviridae was ancestral to all other LTR retrotransposons. However, the LTR trees showed the chromovirus portion of Metaviridae clustering together with Pseudoviridae, dividing Metaviridae into two portions with distinct phylogeny.
Conclusion: The HMMs clearly demonstrated a unitary conserved structure of LTRs, supporting that they arose once during evolution. We attempted to follow the evolution of LTRs by tracing their functional foundations, that is, acquisition of RNase H, a combined promoter/polyadenylation site, integrase, hairpin priming and the primer binding site (PBS). Available information did not support a simple evolutionary chain of events. PMID:23369192

A Repeating Sulfated Galactan Motif Resuscitates Dormant Micrococcus luteus Bacteria.

PubMed

Böttcher, Thomas; Szamosvári, Dávid; Clardy, Jon

2018-07-01

Only a small fraction of bacteria can autonomously initiate growth on agar plates. Nongrowing bacteria typically enter a metabolically inactive dormant state and require specific chemical trigger factors or signals to exit this state and to resume growth. Micrococcus luteus has become a model organism for this important yet poorly understood phenomenon. Only a few resuscitation signals have been described to date, and all of them are produced endogenously by bacterial species. We report the discovery of a novel type of resuscitation signal that allows M. luteus to grow on agar but not agarose plates. Fractionation of the agar polysaccharide complex and sulfation of agarose allowed us to identify the signal as highly sulfated saccharides found in agar or carrageenans. Purification of hydrolyzed κ-carrageenan ultimately led to the identification of the signal as a small fragment of a large linear polysaccharide, i.e., an oligosaccharide of five or more sugars with a repeating disaccharide motif containing d-galactose-4-sulfate (G4S) 1,4-linked to 3,6-anhydro-α-d-galactose (DA), G4S-(DA-G4S)n with n ≥ 2. IMPORTANCE: Most environmental bacteria cannot initiate growth on agar plates, but they can flourish on the same plates once growth is initiated. While there are a number of names for and manifestations of this phenomenon, the underlying cause appears to be the requirement for a molecular signal indicating safe growing conditions. Micrococcus luteus has become a model organism for studying this growth initiation process, often called resuscitation, because of its apparent connection with the persistent or dormant form of Mycobacterium tuberculosis, an important human pathogen. In this report, we identify a highly sulfated saccharide from agar or carrageenans that robustly resuscitates dormant M. luteus on agarose plates. We identified and characterized the signal as a small repeating disaccharide motif.
Our results indicate that signals inherent in or absent from the polysaccharide composition of solid growth media can have major effects on bacterial growth. Copyright © 2018 American Society for Microbiology.
Combining flagelliform and dragline spider silk motifs to produce tunable synthetic biopolymer fibers.

PubMed

Teulé, Florence; Addison, Bennett; Cooper, Alyssa R; Ayon, Joel; Henning, Robert W; Benmore, Chris J; Holland, Gregory P; Yarger, Jeffery L; Lewis, Randolph V

2012-06-01

The two Flag/MaSp 2 silk proteins produced recombinantly were based on the basic consensus repeat of the dragline silk spidroin 2 protein (MaSp 2) from the Nephila clavipes orb weaving spider. However, the proline-containing pentapeptides juxtaposed to the polyalanine segments resembled those found in the flagelliform silk protein (Flag) composing the web spiral: (GPGGX1 GPGGX2)2 with X1/X2 = A/A or Y/S. Fibers were formed from protein films in aqueous solutions or extruded from resolubilized protein dopes in organic conditions when the Flag motif was (GPGGX1 GPGGX2)2 with X1/X2 = Y/S or A/A, respectively. Post-fiber processing involved similar drawing ratios (2-2.5×) before or after water-treatment. Structural (ssNMR and XRD) and morphological (SEM) changes in the fibers were compared to the mechanical properties of the fibers at each step. Nuclear magnetic resonance indicated that the fraction of β-sheet nanocrystals in the polyalanine regions formed upon extrusion, increased during stretching, and was maximized after water-treatment. X-ray diffraction showed that nanocrystallite orientation parallel to the fiber axis increased the ultimate strength and initial stiffness of the fibers. Water furthered nanocrystal orientation and three-dimensional growth while plasticizing the amorphous regions, thus producing tougher fibers due to increased extensibility. These fibers were highly hygroscopic and had similar internal network organization, thus similar range of mechanical properties that depended on their diameters. The overall structure of the consensus repeat of the silk-like protein dictated the mechanical properties of the fibers while protein molecular weight limited these same properties. Subtle structural motif re-design impacted protein self-assembly mechanisms and requirements for fiber formation. Copyright © 2011 Wiley Periodicals, Inc.
Combining flagelliform and dragline spider silk motifs to produce tunable synthetic biopolymer fibers

PubMed Central

Teulé, Florence; Addison, Bennett; Cooper, Alyssa R.; Ayon, Joel; Henning, Robert W.; Benmore, Chris J.; Holland, Gregory P.; Yarger, Jeffery L.; Lewis, Randolph V.

2012-01-01

The two Flag/MaSp 2 silk proteins produced recombinantly were based on the basic consensus repeat of the dragline silk spidroin 2 protein (MaSp 2) from the Nephila clavipes orb weaving spider. However, the proline-containing pentapeptides juxtaposed to the polyalanine segments resembled those found in the flagelliform silk protein (Flag) composing the web spiral: (GPGGX1 GPGGX2)2 with X1/X2 = A/A or Y/S. Fibers were formed from protein films in aqueous solutions or extruded from resolubilized protein dopes in organic conditions when the Flag motif was (GPGGX1 GPGGX2)2 with X1/X2 = Y/S or A/A, respectively. Post fiber processing involved similar drawing ratios (2–2.5×) before or after water-treatment. Structural (ssNMR and XRD) and morphological (SEM) changes in the fibers were compared to the mechanical properties of the fibers at each step. NMR indicated that the fraction of β-sheet nanocrystals in the polyalanine regions formed upon extrusion, increased during stretching, and was maximized after water-treatment. XRD showed that nanocrystallite orientation parallel to the fiber axis increased the ultimate strength and initial stiffness of the fibers. Water furthered nanocrystal orientation and three-dimensional growth while plasticizing the amorphous regions, thus producing tougher fibers due to increased extensibility. These fibers were highly hygroscopic and had similar internal network organization, thus similar range of mechanical properties that depended on their diameters. The overall structure of the consensus repeat of the silk-like protein dictated the mechanical properties of the fibers while protein molecular weight limited these same properties. Subtle structural motif redesign impacted protein self-assembly mechanisms and requirements for fiber formation. PMID:22012252
Structural Insights Into the Recognition of Peroxisomal Targeting Signal 1 By Trypanosoma Brucei Peroxin 5

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sampathkumar, P.; Roach, C.; Michels, P.A.M.

2009-05-27

Glycosomes are peroxisome-like organelles essential for trypanosomatid parasites. Glycosome biogenesis is mediated by proteins called 'peroxins,' which are considered to be promising drug targets in pathogenic Trypanosomatidae. The first step during protein translocation across the glycosomal membrane of peroxisomal targeting signal 1 (PTS1)-harboring proteins is signal recognition by the cytosolic receptor peroxin 5 (PEX5). The C-terminal PTS1 motifs interact with the PTS1 binding domain (P1BD) of PEX5, which is made up of seven tetratricopeptide repeats. Obtaining diffraction-quality crystals of the P1BD of Trypanosoma brucei PEX5 (TbPEX5) required surface entropy reduction mutagenesis. Each of the seven tetratricopeptide repeats appears to have a residue in the alpha(L) conformation in the loop connecting helices A and B. Five crystal structures of the P1BD of TbPEX5 were determined, each in complex with a hepta- or decapeptide corresponding to a natural or nonnatural PTS1 sequence. The PTS1 peptides are bound between the two subdomains of the P1BD. These structures indicate precise recognition of the C-terminal Leu of the PTS1 motif and important interactions between the PTS1 peptide main chain and up to five invariant Asn side chains of PEX5. The TbPEX5 structures reported here reveal a unique hydrophobic pocket in the subdomain interface that might be explored to obtain compounds that prevent relative motions of the subdomains and interfere selectively with PTS1 motif binding or release in trypanosomatids, and would therefore disrupt glycosome biogenesis and prevent parasite growth.
Origin and diversification of leucine-rich repeat receptor-like protein kinase (LRR-RLK) genes in plants.

PubMed

Liu, Ping-Li; Du, Liang; Huang, Yuan; Gao, Shu-Min; Yu, Meng

2017-02-07

Leucine-rich repeat receptor-like protein kinases (LRR-RLKs) are the largest group of receptor-like kinases in plants and play crucial roles in development and stress responses. The evolutionary relationships among LRR-RLK genes have been investigated in flowering plants; however, no comprehensive studies have been performed for these genes in more ancestral groups. The subfamily classification of LRR-RLK genes in plants, the evolutionary history and driving force for the evolution of each LRR-RLK subfamily remain to be understood. We identified 119 LRR-RLK genes in the Physcomitrella patens moss genome, 67 LRR-RLK genes in the Selaginella moellendorffii lycophyte genome, and no LRR-RLK genes in five green algae genomes. Furthermore, these LRR-RLK sequences, along with previously reported LRR-RLK sequences from Arabidopsis thaliana and Oryza sativa, were subjected to evolutionary analyses. Phylogenetic analyses revealed that plant LRR-RLKs belong to 19 subfamilies, eighteen of which were established in early land plants, and one of which evolved in flowering plants. More importantly, we found that the basic structures of LRR-RLK genes for most subfamilies are established in early land plants and conserved within subfamilies and across different plant lineages, but divergent among subfamilies. In addition, most members of the same subfamily had common protein motif compositions, whereas members of different subfamilies showed variations in protein motif compositions. The unique gene structure and protein motif compositions of each subfamily differentiate the subfamily classifications and, more importantly, provide evidence for functional divergence among LRR-RLK subfamilies. Maximum likelihood analyses showed that some sites within four subfamilies were under positive selection. Much of the diversity of plant LRR-RLK genes was established in early land plants. Positive selection contributed to the evolution of a few LRR-RLK subfamilies.
Structure and function of the N-terminal domain of the yeast telomerase reverse transcriptase

PubMed Central

Petrova, Olga A; Mantsyzov, Alexey B; Rodina, Elena V; Efimov, Sergey V; Hackenberg, Claudia; Hakanpää, Johanna; Klochkov, Vladimir V; Lebedev, Andrej A; Chugunova, Anastasia A; Malyavko, Alexander N; Zatsepin, Timofei S; Mishin, Alexey V; Zvereva, Maria I

2018-01-01

The elongation of single-stranded DNA repeats at the 3′-ends of chromosomes by telomerase is a key process in maintaining genome integrity in eukaryotes. Abnormal activation of telomerase leads to uncontrolled cell division, whereas its down-regulation is attributed to ageing and several pathologies related to early cell death. Telomerase function is based on the dynamic interactions of its catalytic subunit (TERT) with nucleic acids: telomerase RNA, telomeric DNA and the DNA/RNA heteroduplex. Here, we present the crystallographic and NMR structures of the N-terminal (TEN) domain of TERT from the thermotolerant yeast Hansenula polymorpha and demonstrate the structural conservation of the core motif in evolutionarily divergent organisms. We identify the TEN residues that are involved in interactions with the telomerase RNA and in the recognition of the ‘fork’ at the distal end of the DNA product/RNA template heteroduplex. We propose that the TEN domain assists telomerase biological function and is involved in restricting the size of the heteroduplex during telomere repeat synthesis. PMID:29294091
De Novo Regulatory Motif Discovery Identifies Significant Motifs in Promoters of Five Classes of Plant Dehydrin Genes.

PubMed

Zolotarov, Yevgen; Strömvik, Martina

2015-01-01

Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved.
PISMA: A Visual Representation of Motif Distribution in DNA Sequences.

PubMed

Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina

2017-01-01

Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypothesis, PISMA aims at identifying motifs on DNA sequences, counting and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code-like, as a gene-map-like, and as a transcript scheme. We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf.
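To make the kind of output described here concrete, the sketch below locates every occurrence of a short motif in a DNA sequence and renders a crude text "bar-code" of its distribution. It is not PISMA's code (PISMA is a Java application whose internals are not described in the abstract); the function names and the toy sequence are hypothetical, and only the 2 to 10 base motif length limit is taken from the abstract.

```python
def motif_positions(sequence, motif):
    """Return 0-based start positions of all (possibly overlapping)
    occurrences of motif in sequence, ignoring case."""
    seq, motif = sequence.upper(), motif.upper()
    if not 2 <= len(motif) <= 10:
        raise ValueError("motif length must be between 2 and 10 bases")
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)  # step by 1 to allow overlaps
    return positions

def barcode(sequence, motif):
    """Draw one character per base: '|' where the motif starts, '.' elsewhere."""
    marks = ["."] * len(sequence)
    for p in motif_positions(sequence, motif):
        marks[p] = "|"
    return "".join(marks)

toy = "ACGTCGCGAT"  # invented sequence
print(motif_positions(toy, "CG"))  # [1, 4, 6]
print(barcode(toy, "CG"))          # .|..|.|...
```

For CpG counting on a real genome one would read the sequence from a FASTA file first; that step is omitted to keep the sketch self-contained.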
PISMA: A Visual Representation of Motif Distribution in DNA Sequences

PubMed Central

Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina

2017-01-01

Background: Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypothesis, PISMA aims at identifying motifs on DNA sequences, counting and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code–like, as a gene-map–like, and as a transcript scheme. Results: We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. Availability and Implementation: PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf. PMID:28469418
Exploring the repeat protein universe through computational protein design

DOE PAGES

Brunette, TJ; Parmeggiani, Fabio; Huang, Po-Ssu; ...

2015-12-16

A central question in protein evolution is the extent to which naturally occurring proteins sample the space of folded structures accessible to the polypeptide chain. Repeat proteins composed of multiple tandem copies of a modular structure unit are widespread in nature and have critical roles in molecular recognition, signalling, and other essential biological processes. Naturally occurring repeat proteins have been re-engineered for molecular recognition and modular scaffolding applications. In this paper, we use computational protein design to investigate the space of folded structures that can be generated by tandem repeating a simple helix–loop–helix–loop structural motif. Eighty-three designs with sequences unrelated to known repeat proteins were experimentally characterized. Of these, 53 are monomeric and stable at 95 °C, and 43 have solution X-ray scattering spectra consistent with the design models. Crystal structures of 15 designs spanning a broad range of curvatures are in close agreement with the design models with root mean square deviations ranging from 0.7 to 2.5 Å. Finally, our results show that existing repeat proteins occupy only a small fraction of the possible repeat protein sequence and structure space and that it is possible to design novel repeat proteins with precisely specified geometries, opening up a wide array of new possibilities for biomolecular engineering.
Signature motif-guided identification of receptors for peptide hormones essential for root meristem growth.

PubMed

Song, Wen; Liu, Li; Wang, Jizong; Wu, Zhen; Zhang, Heqiao; Tang, Jiao; Lin, Guangzhong; Wang, Yichuan; Wen, Xing; Li, Wenyang; Han, Zhifu; Guo, Hongwei; Chai, Jijie

2016-06-01

Peptide-mediated cell-to-cell signaling has crucial roles in coordination and definition of cellular functions in plants. Peptide-receptor matching is important for understanding the mechanisms underlying peptide-mediated signaling. Here we report the structure-guided identification of root meristem growth factor (RGF) receptors important for plant development. An assay based on a signature ligand recognition motif (Arg-x-Arg) conserved in a subfamily of leucine-rich repeat receptor kinases (LRR-RKs) identified the functionally uncharacterized LRR-RK At4g26540 as a receptor of RGF1 (RGFR1). We further solved the crystal structure of RGF1 in complex with the LRR domain of RGFR1 at a resolution of 2.6 Å, which reveals that the Arg-x-Gly-Gly (RxGG) motif is responsible for specific recognition of the sulfate group of RGF1 by RGFR1. Based on the RxGG motif, we identified additional four RGFRs. Participation of the five RGFRs in RGF-induced signaling is supported by biochemical and genetic data. We also offer evidence showing that SERKs function as co-receptors for RGFs. Taken together, our study identifies RGF receptors and co-receptors that can link RGF signals with their downstream components and provides a proof of principle for structure-based matching of LRR-RKs with their peptide ligands.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Schürpf, Thomas; Chen, Qiang; Liu, Jin-huan

Developmental endothelial cell locus-1 (Del-1) glycoprotein is secreted by endothelial cells and a subset of macrophages. Del-1 plays a regulatory role in vascular remodeling and functions in innate immunity through interaction with integrin αVβ3. Del-1 contains 3 epidermal growth factor (EGF)-like repeats and 2 discoidin-like domains. An Arg-Gly-Asp (RGD) motif in the second EGF domain (EGF2) mediates adhesion by endothelial cells and phagocytes. We report the crystal structure of its 3 EGF domains. The RGD motif of EGF2 forms a type II' β turn at the tip of a long protruding loop, dubbed the RGD finger. Whereas EGF2 and EGF3 constitute a rigid rod via an interdomain calcium ion binding site, the long linker between EGF1 and EGF2 lends considerable flexibility to EGF1. Two unique O-linked glycans and 1 N-linked glycan locate to the opposite side of EGF2 from the RGD motif. These structural features favor integrin binding of the RGD finger. Mutagenesis data confirm the importance of having the RGD motif at the tip of the RGD finger. A database search for EGF domain sequences shows that this RGD finger is likely an evolutionary insertion and unique to the EGF domain of Del-1 and its homologue milk fat globule-EGF 8. The RGD finger of Del-1 is a unique structural feature critical for integrin binding.
Dissection of Swa2p/Auxilin Domain Requirements for Cochaperoning Hsp70 Clathrin-uncoating Activity In Vivo

PubMed Central

Xiao, Jing; Kim, Leslie S.

2006-01-01

The auxilin family of J-domain proteins load Hsp70 onto clathrin-coated vesicles (CCVs) to drive uncoating. In vitro, auxilin function requires its ability to bind clathrin and stimulate Hsp70 ATPase activity via its J-domain. To test these requirements in vivo, we performed a mutational analysis of Swa2p, the yeast auxilin ortholog. Swa2p is a modular protein with three N-terminal clathrin-binding (CB) motifs, a ubiquitin association (UBA) domain, a tetratricopeptide repeat (TPR) domain, and a C-terminal J-domain. In vitro, clathrin binding is mediated by multiple weak interactions, but a Swa2p truncation lacking two CB motifs and the UBA domain retains nearly full function in vivo. Deletion of all CB motifs strongly abrogates clathrin disassembly but does not eliminate Swa2p function in vivo. Surprisingly, mutation of the invariant HPD motif within the J-domain to AAA only partially affects Swa2p function. Similarly, a TPR point mutation (G388R) causes a modest phenotype. However, Swa2p function is abolished when these TPR and J mutations are combined. The TPR and J-domains are not functionally redundant because deletion of either domain renders Swa2p nonfunctional. These data suggest that the TPR and J-domains collaborate in a bipartite interaction with Hsp70 to regulate its activity in clathrin disassembly. PMID:16687570
Massively parallel sequencing of forensic STRs: Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements.

PubMed

Parson, Walther; Ballard, David; Budowle, Bruce; Butler, John M; Gettings, Katherine B; Gill, Peter; Gusmão, Leonor; Hares, Douglas R; Irwin, Jodi A; King, Jonathan L; Knijff, Peter de; Morling, Niels; Prinz, Mechthild; Schneider, Peter M; Neste, Christophe Van; Willuweit, Sascha; Phillips, Christopher

2016-05-01

The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that provide a precise description of the repeat allele structure of a STR marker and variants that may reside in the flanking areas of the repeat region. When a STR contains a complex arrangement of repeat motifs, the level of genetic polymorphism revealed by the sequence data can increase substantially. As repeat structures can be complex and include substitutions, insertions, deletions, variable tandem repeat arrangements of multiple nucleotide motifs, and flanking region SNPs, established capillary electrophoresis (CE) allele descriptions must be supplemented by a new system of STR allele nomenclature, which retains backward compatibility with the CE data that currently populate national DNA databases and that will continue to be produced for the coming years. Thus, there is a pressing need to produce a standardized framework for describing complex sequences that enable comparison with currently used repeat allele nomenclature derived from conventional CE systems. It is important to discern three levels of information in hierarchical order: (i) the sequence, (ii) the alignment, and (iii) the nomenclature of STR sequence data. We propose a sequence (text) string format as the minimal requirement of data storage that laboratories should follow when adopting MPS of STRs. We further discuss the variant annotation and sequence comparison framework necessary to maintain compatibility among established and future data. This system must be easy to use and interpret by the DNA specialist, based on a universally accessible genome assembly, and in place before the uptake of MPS by the general forensic community starts to generate sequence data on a large scale. While the established nomenclature for CE-based STR analysis will remain unchanged in the future, the nomenclature of sequence-based STR genotypes will need to follow updated rules and be generated by expert systems that translate MPS sequences to match CE conventions in order to guarantee compatibility between the different generations of STR data. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
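The step from a raw sequence string to a repeat-based description can be sketched as follows. This is a simplified illustration, not the ISFG framework: the function name `bracket_notation`, the fixed motif length, and the assumption that the read starts in frame with the repeat unit are all hypothetical, and flanking-region variants and alignment to a genome assembly are ignored.

```python
def bracket_notation(sequence, motif_length=4):
    """Collapse consecutive identical motif_length-mers into '[MOTIF]n' blocks.

    Assumes the sequence starts in frame with the repeat unit; any bases
    left over at the end are appended unchanged.
    """
    sequence = sequence.upper()
    blocks = []
    i = 0
    while i + motif_length <= len(sequence):
        unit = sequence[i:i + motif_length]
        count = 1
        j = i + motif_length
        # Extend the block while the next window repeats the same unit.
        while sequence[j:j + motif_length] == unit:
            count += 1
            j += motif_length
        blocks.append(f"[{unit}]{count}")
        i = j
    tail = sequence[i:]
    if tail:
        blocks.append(tail)
    return " ".join(blocks)


print(bracket_notation("TCTATCTATCTATCTGTCTA"))
```

Two alleles with the same total length (and therefore the same CE allele call) can produce different bracketed strings, which is the additional polymorphism that sequence data reveal.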
In-silico analysis of cis-acting regulatory elements of pathogenesis-related proteins of Arabidopsis thaliana and Oryza sativa

PubMed Central

Kaur, Amritpreet; Pati, Pratap Kumar; Pati, Aparna Maitra; Nagpal, Avinash Kaur

2017-01-01

Pathogenesis-related (PR) proteins are a family of low-molecular-weight proteins induced in plants under various biotic and abiotic stresses. They play an important role in plant defense mechanisms. PRs have a wide range of functions, acting as hydrolases, peroxidases, chitinases, antifungal agents, protease inhibitors, etc. In the present study, an attempt has been made to analyze the promoter regions of PR1, PR2, PR5, PR9, PR10 and PR12 of Arabidopsis thaliana and Oryza sativa. Analysis of cis-element distribution revealed the functional multiplicity of PRs and provides insight into their gene regulation. CpG islands were observed only in rice PRs, which indicates that the monocot genome contains more GC-rich motifs than dicots. Tandem repeats were also observed in the 5' UTR of PR genes. Thus, the present study provides an understanding of the regulation of PR genes and their versatile roles in plants. PMID:28910327
Occurrence probability of structured motifs in random sequences.

PubMed

Robin, S; Daudin, J-J; Richard, H; Sagot, M-F; Schbath, S

2002-01-01

The problem of extracting from a set of nucleic acid sequences motifs which may have biological function is more and more important. In this paper, we are interested in particular motifs that may be implicated in the transcription process. These motifs, called structured motifs, are composed of two ordered parts separated by a variable distance and allowing for substitutions. In order to assess their statistical significance, we propose approximations of the probability of occurrences of such a structured motif in a given sequence. An application of our method to evaluate candidate promoters in E. coli and B. subtilis is presented. Simulations show the goodness of the approximations.
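A structured motif as defined in this abstract (two ordered parts separated by a variable distance, with substitutions allowed) can be searched for, and its occurrence probability in random sequences estimated by simulation, with the sketch below. This is not the authors' analytical approximation: it is a brute-force search plus a Monte Carlo estimate under an assumed uniform i.i.d. base model, and all function and parameter names are hypothetical.

```python
import random


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_structured_motif(seq, box1, box2, d_min, d_max, max_subst=0):
    """Return (start1, start2) pairs where box1 and box2 each match with at
    most max_subst substitutions and the spacer length lies in d_min..d_max."""
    hits = []
    n1, n2 = len(box1), len(box2)
    for i in range(len(seq) - n1 + 1):
        if hamming(seq[i:i + n1], box1) > max_subst:
            continue
        for d in range(d_min, d_max + 1):
            j = i + n1 + d
            if j + n2 > len(seq):
                break
            if hamming(seq[j:j + n2], box2) <= max_subst:
                hits.append((i, j))
    return hits


def estimate_occurrence_probability(length, box1, box2, d_min, d_max,
                                    max_subst=0, trials=1000, seed=0):
    """Monte Carlo estimate of P(at least one occurrence) in a random
    sequence with independent, uniformly distributed bases (an assumption)."""
    rng = random.Random(seed)
    found = 0
    for _ in range(trials):
        seq = "".join(rng.choice("ACGT") for _ in range(length))
        if find_structured_motif(seq, box1, box2, d_min, d_max, max_subst):
            found += 1
    return found / trials


# Toy promoter-like sequence: consensus -35 and -10 boxes with a 17-base spacer.
toy = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC"
print(find_structured_motif(toy, "TTGACA", "TATAAT", 16, 18))
```

Simulation of this kind is slow for rare motifs, which is one reason closed-form approximations of the occurrence probability are useful.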
The molecular mechanism of nuclear transport revealed by atomic-scale measurements

PubMed Central

Hough, Loren E; Dutta, Kaushik; Sparks, Samuel; Temel, Deniz B; Kamal, Alia; Tetenbaum-Novatt, Jaclyn; Rout, Michael P; Cowburn, David

2015-01-01

Nuclear pore complexes (NPCs) form a selective filter that allows the rapid passage of transport factors (TFs) and their cargoes across the nuclear envelope, while blocking the passage of other macromolecules. Intrinsically disordered proteins (IDPs) containing phenylalanyl-glycyl (FG)-rich repeats line the pore and interact with TFs. However, the reason that transport can be both fast and specific remains undetermined, through lack of atomic-scale information on the behavior of FGs and their interaction with TFs. We used nuclear magnetic resonance spectroscopy to address these issues. We show that FG repeats are highly dynamic IDPs, stabilized by the cellular environment. Fast transport of TFs is supported because the rapid motion of FG motifs allows them to exchange on and off TFs extremely quickly through transient interactions. Because TFs uniquely carry multiple pockets for FG repeats, only they can form the many frequent interactions needed for specific passage between FG repeats to cross the NPC. DOI: http://dx.doi.org/10.7554/eLife.10027.001 PMID:26371551
Structural determinants of the interaction between the Haemophilus influenzae Hap autotransporter and fibronectin.

PubMed

Spahich, Nicole A; Kenjale, Roma; McCann, Jessica; Meng, Guoyu; Ohashi, Tomoo; Erickson, Harold P; St Geme, Joseph W

2014-06-01

Haemophilus influenzae is a Gram-negative cocco-bacillus that initiates infection by colonizing the upper respiratory tract. Hap is an H. influenzae serine protease autotransporter protein that mediates adherence, invasion and microcolony formation in assays with human epithelial cells and is presumed to facilitate the process of colonization. Additionally, Hap mediates adherence to fibronectin, laminin and collagen IV, extracellular matrix (ECM) proteins that are present in the respiratory tract and are probably important targets for H. influenzae colonization. The region of Hap responsible for adherence to ECM proteins has been localized to the C-terminal 511 aa of the Hap passenger domain (HapS). In this study, we characterized the structural determinants of the interaction between HapS and fibronectin. Using defined fibronectin fragments, we established that Hap interacts with the fibronectin repeat fragment called FNIII(1-2). Using site-directed mutagenesis, we found a series of motifs in the C-terminal region of HapS that contribute to the interaction with fibronectin. Most of these motifs are located on the F1 and F3 faces of the HapS structure, suggesting that the F1 and F3 faces may be responsible for the HapS-fibronectin interaction. © 2014 The Authors.
Structural determinants of the interaction between the Haemophilus influenzae Hap autotransporter and fibronectin

PubMed Central

Spahich, Nicole A.; Kenjale, Roma; McCann, Jessica; Meng, Guoyu; Ohashi, Tomoo; Erickson, Harold P.

2014-01-01

Haemophilus influenzae is a Gram-negative cocco-bacillus that initiates infection by colonizing the upper respiratory tract. Hap is an H. influenzae serine protease autotransporter protein that mediates adherence, invasion and microcolony formation in assays with human epithelial cells and is presumed to facilitate the process of colonization. Additionally, Hap mediates adherence to fibronectin, laminin and collagen IV, extracellular matrix (ECM) proteins that are present in the respiratory tract and are probably important targets for H. influenzae colonization. The region of Hap responsible for adherence to ECM proteins has been localized to the C-terminal 511 aa of the Hap passenger domain (HapS). In this study, we characterized the structural determinants of the interaction between HapS and fibronectin. Using defined fibronectin fragments, we established that Hap interacts with the fibronectin repeat fragment called FNIII(1–2). Using site-directed mutagenesis, we found a series of motifs in the C-terminal region of HapS that contribute to the interaction with fibronectin. Most of these motifs are located on the F1 and F3 faces of the HapS structure, suggesting that the F1 and F3 faces may be responsible for the HapS–fibronectin interaction. PMID:24687948

Divergence and evolution of homologous regions of Bombyx mori nuclear polyhedrosis virus.

PubMed Central

Majima, K; Kobara, R; Maeda, S

1993-01-01

Homologous regions (hrs) (hr1, hr2-left, hr2-right, hr3, hr4-left, hr4-right, and hr5) similar to those found in the Autographa californica nuclear polyhedrosis virus (AcNPV) genome were found in the Bombyx mori NPV (BmNPV) genome. The BmNPV hrs contained two to eight repeats of a homologous nucleotide sequence which were on average about 75 bp long. All of these homologous sequence repeats contained a 26-bp-long palindrome motif with an EcoRI or EcoRI-like site at its core. The consensus sequence of the BmNPV hrs showed 95% conservation with respect to those found in AcNPV. Nucleotide sequence analysis indicated that hr2-left and hr2-right of BmNPV evolved from an ancestor similar to hr2 of AcNPV by inversion, cleavage, and ligation. The polarities of the BmNPV and AcNPV hrs were conserved except for that of hr4-left. Within hr4-right of BmNPV, four repeats of a previously undescribed palindrome motif were found. Bmhr5D, a BmNPV mutant which lacked hr5, replicated at a rate similar to that of wild-type BmNPV in BmN cells and silkworm larvae, indicating that hr5 was not essential for viral replication. After ten passages of Bmhr5D in BmN cells, no detectable changes in its genome were observed by restriction endonuclease analysis. The evolution and divergence of the BmNPV genome are also discussed. PMID:8230471
Host adaptation of Chlamydia pecorum towards low virulence evident in co-evolution of the ompA, incA, and ORF663 Loci.

PubMed

Mohamad, Khalil Yousef; Kaltenboeck, Bernhard; Rahman, Kh Shamsur; Magnino, Simone; Sachse, Konrad; Rodolakis, Annie

2014-01-01

Chlamydia (C.) pecorum, an obligate intracellular bacterium, may cause severe diseases in ruminants, swine and koalas, although asymptomatic infections are the norm. Recently, we identified genetic polymorphisms in the ompA, incA and ORF663 genes that potentially differentiate between high-virulence C. pecorum isolates from diseased animals and low-virulence isolates from asymptomatic animals. Here, we expand these findings by including additional ruminant, swine, and koala strains. Coding tandem repeats (CTRs) at the incA locus encoded a variable number of repeats of APA or AGA amino acid motifs. Addition of any non-APA/AGA repeat motif, such as APEVPA, APAVPA, APE, or APAPE, associated with low virulence (P < 10⁻⁴), as did a high number of amino acids in all incA CTRs (P = 0.0028). In ORF663, high numbers of 15-mer CTRs correlated with low virulence (P = 0.0001). Correction for ompA phylogram position in ORF663 and incA abolished the correlation between genetic changes and virulence, demonstrating co-evolution of ompA, incA, and ORF663 towards low virulence. Pairwise divergence of ompA, incA, and ORF663 among isolates from healthy animals was significantly higher than among strains isolated from diseased animals (P ≤ 10⁻⁵), confirming the longer evolutionary path traversed by low-virulence strains. All three markers combined identified 43 unique strains and 4 pairs of identical strains among all 57 isolates tested, demonstrating the suitability of these markers for epidemiological investigations.
DEAR1, a transcriptional repressor of DREB protein that mediates plant defense and freezing stress responses in Arabidopsis.

PubMed

Tsutsui, Tomokazu; Kato, Wataru; Asada, Yutaka; Sako, Kaori; Sato, Takeo; Sonoda, Yutaka; Kidokoro, Satoshi; Yamaguchi-Shinozaki, Kazuko; Tamaoki, Masanori; Arakawa, Keita; Ichikawa, Takanari; Nakazawa, Miki; Seki, Motoaki; Shinozaki, Kazuo; Matsui, Minami; Ikeda, Akira; Yamaguchi, Junji

2009-11-01

Plants have evolved intricate mechanisms to respond and adapt to a wide variety of biotic and abiotic stresses in their environment. The Arabidopsis DEAR1 (DREB and EAR motif protein 1; At3g50260) gene encodes a protein containing significant homology to the DREB1/CBF (dehydration-responsive element binding protein 1/C-repeat binding factor) domain and the EAR (ethylene response factor-associated amphiphilic repression) motif. We show here that DEAR1 mRNA accumulates in response to both pathogen infection and cold treatment. Transgenic Arabidopsis overexpressing DEAR1 (DEAR1ox) showed a dwarf phenotype and lesion-like cell death, together with constitutive expression of PR genes and accumulation of salicylic acid. DEAR1ox also showed more limited P. syringae pathogen growth compared to wild-type, consistent with an activated defense phenotype. In addition, transient expression experiments revealed that the DEAR1 protein represses DRE/CRT (dehydration-responsive element/C-repeat)-dependent transcription, which is regulated by low temperature. Furthermore, the induction of DREB1/CBF family genes by cold treatment was suppressed in DEAR1ox, leading to a reduction in freezing tolerance. These results suggest that DEAR1 has an upstream regulatory role in mediating crosstalk between signaling pathways for biotic and abiotic stress responses.
Detecting microsatellites within genomes: significant variation among algorithms.

PubMed

Leclercq, Sébastien; Rivals, Eric; Jarne, Philippe

2007-04-18

Microsatellites are short, tandemly-repeated DNA sequences which are widely distributed among genomes. Their structure, role and evolution can be analyzed based on exhaustive extraction from sequenced genomes. Several dedicated algorithms have been developed for this purpose. Here, we compared the detection efficiency of five of them (TRF, Mreps, Sputnik, STAR, and RepeatMasker). Our analysis was first conducted on the human X chromosome, and microsatellite distributions were characterized by microsatellite number, length, and divergence from a pure motif. The algorithms work with user-defined parameters, and we demonstrate that the parameter values chosen can strongly influence microsatellite distributions. The five algorithms were then compared by fixing parameters settings, and the analysis was extended to three other genomes (Saccharomyces cerevisiae, Neurospora crassa and Drosophila melanogaster) spanning a wide range of size and structure. Significant differences for all characteristics of microsatellites were observed among algorithms, but not among genomes, for both perfect and imperfect microsatellites. Striking differences were detected for short microsatellites (below 20 bp), regardless of motif. Since the algorithm used strongly influences empirical distributions, studies analyzing microsatellite evolution based on a comparison between empirical and theoretical size distributions should therefore be considered with caution. We also discuss why a typological definition of microsatellites limits our capacity to capture their genomic distributions.
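The sensitivity to user-defined parameters noted in this abstract can be seen even in a toy detector. The sketch below is not TRF, Mreps, Sputnik, STAR, or RepeatMasker; it is a regular-expression search for perfect microsatellites only, and the function name, parameter names, and default thresholds are assumptions chosen for illustration.

```python
import re


def find_perfect_microsatellites(seq, min_unit=1, max_unit=6,
                                 min_repeats=3, min_length=8):
    """Return (start, end, unit) for perfect tandem repeats of short units."""
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A unit of unit_len bases followed by at least min_repeats - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter unit
            # (for example 'CACA', which is already reported as 'CA').
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            if len(m.group(0)) >= min_length:
                hits.append((m.start(), m.end(), unit))
    return hits


toy = "GG" + "CA" * 5 + "TT"  # hypothetical sequence with a (CA)5 repeat
print(find_perfect_microsatellites(toy, min_repeats=3))
print(find_perfect_microsatellites(toy, min_repeats=6))
```

The same locus is reported under one threshold and missed under the other, which is the kind of parameter dependence that makes empirical microsatellite distributions hard to compare across studies.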
Detecting microsatellites within genomes: significant variation among algorithms

PubMed Central

Leclercq, Sébastien; Rivals, Eric; Jarne, Philippe

2007-01-01

Background Microsatellites are short, tandemly-repeated DNA sequences which are widely distributed among genomes. Their structure, role and evolution can be analyzed based on exhaustive extraction from sequenced genomes. Several dedicated algorithms have been developed for this purpose. Here, we compared the detection efficiency of five of them (TRF, Mreps, Sputnik, STAR, and RepeatMasker). Results Our analysis was first conducted on the human X chromosome, and microsatellite distributions were characterized by microsatellite number, length, and divergence from a pure motif. The algorithms work with user-defined parameters, and we demonstrate that the parameter values chosen can strongly influence microsatellite distributions. The five algorithms were then compared by fixing parameters settings, and the analysis was extended to three other genomes (Saccharomyces cerevisiae, Neurospora crassa and Drosophila melanogaster) spanning a wide range of size and structure. Significant differences for all characteristics of microsatellites were observed among algorithms, but not among genomes, for both perfect and imperfect microsatellites. Striking differences were detected for short microsatellites (below 20 bp), regardless of motif. Conclusion Since the algorithm used strongly influences empirical distributions, studies analyzing microsatellite evolution based on a comparison between empirical and theoretical size distributions should therefore be considered with caution. We also discuss why a typological definition of microsatellites limits our capacity to capture their genomic distributions. PMID:17442102
The Arabidopsis mediator complex subunits MED16, MED14, and MED2 regulate mediator and RNA polymerase II recruitment to CBF-responsive cold-regulated genes.

PubMed

Hemsley, Piers A; Hurst, Charlotte H; Kaliyadasa, Ewon; Lamb, Rebecca; Knight, Marc R; De Cothi, Elizabeth A; Steele, John F; Knight, Heather

2014-01-01

The Mediator16 (MED16; formerly termed SENSITIVE TO FREEZING6 [SFR6]) subunit of the plant Mediator transcriptional coactivator complex regulates cold-responsive gene expression in Arabidopsis thaliana, acting downstream of the C-repeat binding factor (CBF) transcription factors to recruit the core Mediator complex to cold-regulated genes. Here, we use loss-of-function mutants to show that RNA polymerase II recruitment to CBF-responsive cold-regulated genes requires MED16, MED2, and MED14 subunits. Transcription of genes known to be regulated via CBFs binding to the C-repeat motif/drought-responsive element promoter motif requires all three Mediator subunits, as does cold acclimation-induced freezing tolerance. In addition, these three subunits are required for low temperature-induced expression of some other, but not all, cold-responsive genes, including genes that are not known targets of CBFs. Genes inducible by darkness also required MED16 but required a different combination of Mediator subunits for their expression than the genes induced by cold. Together, our data illustrate that plants control transcription of specific genes through the action of subsets of Mediator subunits; the specific combination defined by the nature of the stimulus but also by the identity of the gene induced.
Panax ginseng genome examination for ginsenoside biosynthesis.

PubMed

Xu, Jiang; Chu, Yang; Liao, Baosheng; Xiao, Shuiming; Yin, Qinggang; Bai, Rui; Su, He; Dong, Linlin; Li, Xiwen; Qian, Jun; Zhang, Jingjing; Zhang, Yujun; Zhang, Xiaoyan; Wu, Mingli; Zhang, Jie; Li, Guozheng; Zhang, Lei; Chang, Zhenzhan; Zhang, Yuebin; Jia, Zhengwei; Liu, Zhixiang; Afreh, Daniel; Nahurira, Ruth; Zhang, Lianjuan; Cheng, Ruiyang; Zhu, Yingjie; Zhu, Guangwei; Rao, Wei; Zhou, Chao; Qiao, Lirui; Huang, Zhihai; Cheng, Yung-Chi; Chen, Shilin

2017-11-01

Ginseng, which contains ginsenosides as bioactive compounds, has been regarded as an important traditional medicine for several millennia. However, the genetic background of ginseng remains poorly understood, partly because of the plant's large and complex genome composition. We report the entire genome sequence of Panax ginseng using next-generation sequencing. The 3.5-Gb nucleotide sequence contains more than 60% repeats and encodes 42 006 predicted genes. Twenty-two transcriptome datasets and mass spectrometry images of ginseng roots were adopted to precisely quantify the functional genes. Thirty-one genes were identified to be involved in the mevalonic acid pathway. Eight of these genes were annotated as 3-hydroxy-3-methylglutaryl-CoA reductases, which displayed diverse structures and expression characteristics. A total of 225 UDP-glycosyltransferases (UGTs) were identified, and these UGTs accounted for one of the largest gene families of ginseng. Tandem repeats contributed to the duplication and divergence of UGTs. Molecular modeling of UGTs in the 71st, 74th, and 94th families revealed a regiospecific conserved motif located at the N-terminus. Molecular docking predicted that this motif captures ginsenoside precursors. The ginseng genome represents a valuable resource for understanding and improving the breeding, cultivation, and synthesis biology of this key herb. © The Author 2017. Published by Oxford University Press.
TAD-free analysis of architectural proteins and insulators.

PubMed

Mourad, Raphaël; Cuvier, Olivier

2018-03-16

The three-dimensional (3D) organization of the genome is intimately related to numerous key biological functions including gene expression and DNA replication regulations. The mechanisms by which molecular drivers functionally organize the 3D genome, such as topologically associating domains (TADs), remain to be explored. Current approaches consist in assessing the enrichments or influences of proteins at TAD borders. Here, we propose a TAD-free model to directly estimate the blocking effects of architectural proteins, insulators and DNA motifs on long-range contacts, making the model intuitive and biologically meaningful. In addition, the model allows analyzing the whole Hi-C information content (2D information) instead of only focusing on TAD borders (1D information). The model outperforms multiple logistic regression at TAD borders in terms of parameter estimation accuracy and is validated by enhancer-blocking assays. In Drosophila, the results support the insulating role of simple sequence repeats and suggest that the blocking effects depend on the number of repeats. Motif analysis uncovered the roles of the transcriptional factors pannier and tramtrack in blocking long-range contacts. In human, the results suggest that the blocking effects of the well-known architectural proteins CTCF, cohesin and ZNF143 depend on the distance between loci, where each protein may participate at different scales of the 3D chromatin organization.
A generic motif discovery algorithm for sequential data.

PubMed

Jensen, Kyle L; Styczynski, Mark P; Rigoutsos, Isidore; Stephanopoulos, Gregory N

2006-01-01

Motif discovery in sequential data is a problem of great interest and with many applications. However, previous methods have been unable to combine exhaustive search with complex motif representations and are each typically only applicable to a certain class of problems. Here we present a generic motif discovery algorithm (Gemoda) for sequential data. Gemoda can be applied to any dataset with a sequential character, including both categorical and real-valued data. As we show, Gemoda deterministically discovers motifs that are maximal in composition and length. In addition, the algorithm allows any choice of similarity metric for finding motifs. Finally, Gemoda's output motifs are representation-agnostic: they can be represented using regular expressions, position weight matrices or any number of other models for any type of sequential data. We demonstrate a number of applications of the algorithm, including the discovery of motifs in amino acid sequences, a new solution to the (l,d)-motif problem in DNA sequences and the discovery of conserved protein substructures. Gemoda is freely available at http://web.mit.edu/bamel/gemoda
2-D Structure of the A Region of Xist RNA and Its Implication for PRC2 Association

PubMed Central

Maenner, Sylvain; Blaud, Magali; Fouillen, Laetitia; Savoye, Anne; Marchand, Virginie; Dubois, Agnès; Sanglier-Cianférani, Sarah; Van Dorsselaer, Alain; Clerc, Philippe; Avner, Philip; Visvikis, Athanase; Branlant, Christiane

2010-01-01

In placental mammals, inactivation of one of the X chromosomes in female cells ensures sex chromosome dosage compensation. The 17 kb non-coding Xist RNA is crucial to this process and accumulates on the future inactive X chromosome. The most conserved Xist RNA region, the A region, contains eight or nine repeats separated by U-rich spacers. It is implicated in the recruitment of late inactivated X genes to the silencing compartment and likely in the recruitment of complex PRC2. Little is known about the structure of the A region and more generally about Xist RNA structure. Knowledge of its structure is restricted to an NMR study of a single A repeat element. Our study is the first experimental analysis of the structure of the entire A region in solution. By the use of chemical and enzymatic probes and FRET experiments, using oligonucleotides carrying fluorescent dyes, we resolved problems linked to sequence redundancies and established a 2-D structure for the A region that contains two long stem-loop structures each including four repeats. Interactions formed between repeats and between repeats and spacers stabilize these structures. Conservation of the spacer terminal sequences allows formation of such structures in all sequenced Xist RNAs. By combination of RNP affinity chromatography, immunoprecipitation assays, mass spectrometry, and Western blot analysis, we demonstrate that the A region can associate with components of the PRC2 complex in mouse ES cell nuclear extracts. Whilst a single four-repeat motif is able to associate with components of this complex, recruitment of Suz12 is clearly more efficient when the entire A region is present. Our data with their emphasis on the importance of inter-repeat pairing change fundamentally our conception of the 2-D structure of the A region of Xist RNA and support its possible implication in recruitment of the PRC2 complex. PMID:20052282
Mechanical unfolding of an ankyrin repeat protein.

PubMed

Serquera, David; Lee, Whasil; Settanni, Giovanni; Marszalek, Piotr E; Paci, Emanuele; Itzhaki, Laura S

2010-04-07

Ankyrin repeat proteins comprise tandem arrays of a 33-residue, predominantly alpha-helical motif that stacks roughly linearly to produce elongated and superhelical structures. They function as scaffolds mediating a diverse range of protein-protein interactions, and some have been proposed to play a role in mechanical signal transduction processes in the cell. Here we use atomic force microscopy and molecular-dynamics simulations to investigate the natural 7-ankyrin repeat protein gankyrin. We find that gankyrin unfolds under force via multiple distinct pathways. The reactions do not proceed in a cooperative manner, nor do they always involve fully stepwise unfolding of one repeat at a time. The peeling away of half an ankyrin repeat, or one or more ankyrin repeats, occurs at low forces; however, intermediate species are formed that are resistant to high forces, and the simulations indicate that in some instances they are stabilized by nonnative interactions. The unfolding of individual ankyrin repeats generates a refolding force, a feature that may be more easily detected in these proteins than in globular proteins because the refolding of a repeat involves a short contraction distance and incurs a low entropic cost. We discuss the origins of the differences between the force- and chemical-induced unfolding pathways of ankyrin repeat proteins, as well as the differences between the mechanics of naturally occurring ankyrin repeat proteins and those of designed consensus ankyrin repeat and globular proteins. Copyright (c) 2010 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Identity and functions of CxxC-derived motifs.

PubMed

Fomenko, Dmitri E; Gladyshev, Vadim N

2003-09-30

Two cysteines separated by two other residues (the CxxC motif) are employed by many redox proteins for formation, isomerization, and reduction of disulfide bonds and for other redox functions. The place of the C-terminal cysteine in this motif may be occupied by serine (the CxxS motif), modifying the functional repertoire of redox proteins. Here we found that the CxxC motif may also give rise to a motif, in which the C-terminal cysteine is replaced with threonine (the CxxT motif). Moreover, in contrast to a view that the N-terminal cysteine in the CxxC motif always serves as a nucleophilic attacking group, this residue could also be replaced with threonine (the TxxC motif), serine (the SxxC motif), or other residues. In each of these CxxC-derived motifs, the presence of a downstream alpha-helix was strongly favored. A search for conserved CxxC-derived motif/helix patterns in four complete genomes representing bacteria, archaea, and eukaryotes identified known redox proteins and suggested possible redox functions for several additional proteins. Catalytic sites in peroxiredoxins were major representatives of the TxxC motif, whereas those in glutathione peroxidases represented the CxxT motif. Structural assessments indicated that threonines in these enzymes could stabilize catalytic thiolates, suggesting revisions to previously proposed catalytic triads. Each of the CxxC-derived motifs was also observed in natural selenium-containing proteins, in which selenocysteine was present in place of a catalytic cysteine.
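As a minimal sketch of how the motif families named in this abstract could be located in a protein sequence, the Python snippet below scans for CxxC, CxxS, CxxT, TxxC and SxxC patterns with regular expressions. The function name `find_cxxc_derived` and the regex encoding are illustrative assumptions, not the authors' search procedure; in particular, the downstream alpha-helix requirement described in the abstract is not modeled here.

```python
import re

# Motif names follow the abstract; encoding "x" as "." (any residue) is an assumption.
MOTIFS = {
    "CxxC": "C..C",
    "CxxS": "C..S",
    "CxxT": "C..T",
    "TxxC": "T..C",
    "SxxC": "S..C",
}

def find_cxxc_derived(seq):
    """Return sorted (position, motif_name, matched_text) tuples with 0-based positions.

    `seq` is assumed to be a single-line, upper-case one-letter protein sequence.
    """
    hits = []
    for name, pattern in MOTIFS.items():
        # The lookahead keeps each match zero-width, so overlapping occurrences are reported.
        for m in re.finditer(f"(?=({pattern}))", seq):
            hits.append((m.start(), name, m.group(1)))
    return sorted(hits)
```

Sequence matches alone say nothing about redox function; a hit would only be a candidate for the kind of structural assessment the abstract describes.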
Motivated Proteins: A web application for studying small three-dimensional protein motifs

PubMed Central

Leader, David P; Milner-White, E James

2009-01-01

Background Small loop-shaped motifs are common constituents of the three-dimensional structure of proteins. Typically they comprise between three and seven amino acid residues, and are defined by a combination of dihedral angles and hydrogen bonding partners. The most abundant of these are αβ-motifs, asx-motifs, asx-turns, β-bulges, β-bulge loops, β-turns, nests, niches, Schellmann loops, ST-motifs, ST-staples and ST-turns. We have constructed a database of such motifs from a range of high-quality protein structures and built a web application as a visual interface to this. Description The web application, Motivated Proteins, provides access to these 12 motifs (with 48 sub-categories) in a database of over 400 representative proteins. Queries can be made for specific categories or sub-categories of motif, motifs in the vicinity of ligands, motifs which include part of an enzyme active site, overlapping motifs, or motifs which include a particular amino acid sequence. Individual proteins can be specified, or, where appropriate, motifs for all proteins listed. The results of queries are presented in textual form as an (X)HTML table, and may be saved as parsable plain text or XML. Motifs can be viewed and manipulated either individually or in the context of the protein in the Jmol applet structural viewer. Cartoons of the motifs imposed on a linear representation of protein secondary structure are also provided. Summary information for the motifs is available, as are histograms of amino acid distribution, and graphs of dihedral angles at individual positions in the motifs. Conclusion Motivated Proteins is a publicly and freely accessible web application that enables protein scientists to study small three-dimensional motifs without requiring knowledge of either Structured Query Language or the underlying database schema. PMID:19210785
Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space

PubMed Central

Karnik, Rahul; Beer, Michael A.

2015-01-01

The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs. PMID:26465884
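To illustrate what modeling a motif with a full PWM and a score threshold means in practice, here is a small Python sketch that builds a log-odds matrix from base counts and scans a sequence with it. This is a generic PWM scan and not the MotifSpec algorithm: the function names, the uniform background, the pseudocount and the hand-picked threshold are all assumptions, whereas MotifSpec is described as learning its threshold and search space.

```python
import math

def log_odds_pwm(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts (list of dicts) into log2-odds scores."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def scan(seq, pwm, threshold):
    """Return (start, score) for every window scoring at or above `threshold`.

    `seq` is assumed to contain only A, C, G and T.
    """
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        score = sum(pwm[j][base] for j, base in enumerate(seq[i:i + width]))
        if score >= threshold:
            hits.append((i, score))
    return hits
```

A discriminative method would choose the matrix and threshold to separate bound from unbound sequences, rather than fixing them up front as this sketch does.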
Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.

PubMed

Karnik, Rahul; Beer, Michael A

2015-01-01

The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs.
RGAugury: a pipeline for genome-wide prediction of resistance gene analogs (RGAs) in plants.

PubMed

Li, Pingchuan; Quan, Xiande; Jia, Gaofeng; Xiao, Jin; Cloutier, Sylvie; You, Frank M

2016-11-02

Resistance gene analogs (RGAs), such as NBS-encoding proteins, receptor-like protein kinases (RLKs) and receptor-like proteins (RLPs), are potential R-genes that contain specific conserved domains and motifs. Thus, RGAs can be predicted based on their conserved structural features using bioinformatics tools. Computer programs have been developed for the identification of individual domains and motifs from the protein sequences of RGAs but none offer a systematic assessment of the different types of RGAs. A user-friendly and efficient pipeline is needed for large-scale genome-wide RGA predictions of the growing number of sequenced plant genomes. An integrative pipeline, named RGAugury, was developed to automate RGA prediction. The pipeline first identifies RGA-related protein domains and motifs, namely nucleotide binding site (NB-ARC), leucine rich repeat (LRR), transmembrane (TM), serine/threonine and tyrosine kinase (STTK), lysin motif (LysM), coiled-coil (CC) and Toll/Interleukin-1 receptor (TIR). RGA candidates are identified and classified into four major families based on the presence of combinations of these RGA domains and motifs: NBS-encoding, TM-CC, and membrane associated RLP and RLK. All time-consuming analyses of the pipeline are parallelized to improve performance. The pipeline was evaluated using the well-annotated Arabidopsis genome. A total of 98.5, 85.2, and 100% of the reported NBS-encoding genes, membrane associated RLPs and RLKs were validated, respectively. The pipeline was also successfully applied to predict RGAs for 50 sequenced plant genomes. A user-friendly web interface was implemented to ease command line operations, facilitate visualization and simplify result management for multiple datasets. RGAugury is an efficient, integrative bioinformatics tool for large scale genome-wide identification of RGAs. It is freely available at Bitbucket: https://bitbucket.org/yaanlpc/rgaugury .
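The classification step described above, assigning a family from the combination of detected domains, can be sketched in a few lines of Python. The decision rules below are assumptions made for illustration only: the abstract names the domains and the four families but does not spell out the exact combinations, and `classify_rga` is a hypothetical helper rather than part of RGAugury.

```python
def classify_rga(domains):
    """Assign one of the four families from a set of detected domain labels, or None.

    Assumed rules (not taken from the RGAugury source):
      - any NB-ARC domain            -> "NBS-encoding"
      - TM + (LRR or LysM) + STTK    -> "RLK"
      - TM + (LRR or LysM), no STTK  -> "RLP"
      - TM + CC                      -> "TM-CC"
    """
    d = set(domains)
    if "NB-ARC" in d:
        return "NBS-encoding"
    has_ectodomain = bool(d & {"LRR", "LysM"})
    if "TM" in d and has_ectodomain:
        return "RLK" if "STTK" in d else "RLP"
    if "TM" in d and "CC" in d:
        return "TM-CC"
    return None
```

The ordering of the checks matters in this sketch: a protein carrying both NB-ARC and TM labels is reported as NBS-encoding because that rule is tested first.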
Structural analysis of the rDNA intergenic spacer of Brassica nigra: evolutionary divergence of the spacers of the three diploid Brassica species.

PubMed

Bhatia, S; Singh Negi, M; Lakshmikumaran, M

1996-11-01

EcoRI restriction of the B. nigra rDNA recombinants, isolated from a lambda genomic library, showed that the 3.9-kb fragment corresponded to the Intergenic Spacer (IGS), which was sequenced and found to be 3,928 bp in size. Sequence and dot-matrix analyses showed that the organization of the B. nigra rDNA IGS was typical of most rDNA spacers, consisting of a central repetitive region and flanking unique sequences on either side. The repetitive region was composed of two repeat families-RF 'A' and RF 'B.' The B. nigra RF 'A' consisted of a tandem array of three full-length copies of a 106-bp sequence element. RF 'B' was composed of 66 tandemly repeated elements. Each 'B' element was only 21 bp in size and this is the smallest repeat unit identified in plant rDNA to date. The putative transcription initiation site (TIS) was identified as nucleotide position 3,110. Based on the sequence analysis it was suggested that the present organization of the repeat families was generated by successive cycles of deletions and amplifications and was being maintained by homogenization processes such as gene conversion and crossing-over. A detailed comparison of the rDNA IGS sequences of the three diploid Brassica species-namely, B. nigra, B. campestris, and B. oleracea-was carried out. First, comparisons revealed that B. campestris and B. oleracea were close to each other as the repeat families in both showed high sequence homology between each other. Second, the repeat elements in both the species were organized in an interspersed manner. Third, a 52-bp sequence, present just downstream of the repeats in B. campestris, was found to be identical to the B. oleracea repeats, thereby suggesting a common progenitor. On the other hand, in B. nigra no interspersion pattern of organization of repeats was observed. Further, the B. nigra RF 'A' was identified as distinct from the repeat families of B. campestris and B. oleracea.
Based on this analysis, it was suggested that during speciation B. campestris and B. oleracea evolved in one lineage whereas B. nigra diverged into a separate lineage. The comparative analysis of the IGS helped in identifying not only conserved ancestral sequence motifs of possible functional significance such as promoters and enhancers, but also sequences which showed variation between the three diploid species and were therefore identified as species-specific sequences.
Sequence heuristics to encode phase behaviour in intrinsically disordered protein polymers

PubMed Central

Quiroz, Felipe García; Chilkoti, Ashutosh

2015-01-01

Proteins and synthetic polymers that undergo aqueous phase transitions mediate self-assembly in nature and in man-made material systems. Yet little is known about how the phase behaviour of a protein is encoded in its amino acid sequence. Here, by synthesizing intrinsically disordered, repeat proteins to test motifs that we hypothesized would encode phase behaviour, we show that the proteins can be designed to exhibit tunable lower or upper critical solution temperature (LCST and UCST, respectively) transitions in physiological solutions. We also show that mutation of key residues at the repeat level abolishes phase behaviour or encodes an orthogonal transition. Furthermore, we provide heuristics to identify, at the proteome level, proteins that might exhibit phase behaviour and to design novel protein polymers consisting of biologically active peptide repeats that exhibit LCST or UCST transitions. These findings set the foundation for the prediction and encoding of phase behaviour at the sequence level. PMID:26390327
Development of Simple Sequence Repeats (SSR) markers in Setaria italica (Poaceae) and cross-amplification in related species.

PubMed

Lin, Heng-Sheng; Chiang, Chih-Yun; Chang, Song-Bin; Kuoh, Chang-Sheng

2011-01-01

Foxtail millet is one of the world's oldest cultivated crops. It has been adopted as a model organism for providing a deeper understanding of plant biology. In this study, 45 simple sequence repeats (SSR) markers of Setaria italica were developed. These markers showing polymorphism were screened in 223 samples from 12 foxtail millet populations around Taiwan. The most common dinucleotide and trinucleotide repeat motifs are AC/TG (84.21%) and CAT (46.15%). The average number of alleles (N(a)), the average heterozygosities observed (H(o)) and expected (H(e)) are 3.73, 0.714, 0.587, respectively. In addition, 24 SSR markers had shown transferability to six related Poaceae species. These new markers provide tools for examining genetic relatedness among foxtail millet populations and other related species. They are suitable for germplasm management and protection in Poaceae.
Development of Simple Sequence Repeats (SSR) Markers in Setaria italica (Poaceae) and Cross-Amplification in Related Species

PubMed Central

Lin, Heng-Sheng; Chiang, Chih-Yun; Chang, Song-Bin; Kuoh, Chang-Sheng

2011-01-01

Foxtail millet is one of the world’s oldest cultivated crops. It has been adopted as a model organism for providing a deeper understanding of plant biology. In this study, 45 simple sequence repeats (SSR) markers of Setaria italica were developed. These markers showing polymorphism were screened in 223 samples from 12 foxtail millet populations around Taiwan. The most common dinucleotide and trinucleotide repeat motifs are AC/TG (84.21%) and CAT (46.15%). The average number of alleles (Na), the average heterozygosities observed (Ho) and expected (He) are 3.73, 0.714, 0.587, respectively. In addition, 24 SSR markers had shown transferability to six related Poaceae species. These new markers provide tools for examining genetic relatedness among foxtail millet populations and other related species. They are suitable for germplasm management and protection in Poaceae. PMID:22174636

kpLogo: positional k-mer analysis reveals hidden specificity in biological sequences

PubMed Central

2017-01-01

Motifs of only 1–4 letters can play important roles when present at key locations within macromolecules. Because existing motif-discovery tools typically miss these position-specific short motifs, we developed kpLogo, a probability-based logo tool for integrated detection and visualization of position-specific ultra-short motifs from a set of aligned sequences. kpLogo also overcomes the limitations of conventional motif-visualization tools in handling positional interdependencies and utilizing ranked or weighted sequences increasingly available from high-throughput assays. kpLogo can be found at http://kplogo.wi.mit.edu/. PMID:28460012
Vanilla mosaic virus isolates from French Polynesia and the Cook Islands are Dasheen mosaic virus strains that exclusively infect vanilla.

PubMed

Farreyrol, K; Pearson, M N; Grisoni, M; Cohen, D; Beck, D

2006-05-01

Sequence was determined for the coat protein (CP) gene and 3' non-translated region (3'NTR) of two vanilla mosaic virus (VanMV) isolates from Vanilla tahitensis, respectively from the Cook Islands (VanMV-CI) and French Polynesia (VanMV-FP). Both viruses displayed distinctive features in the N-terminal region of their CPs; for VanMV-CI, a 16-amino-acid deletion including the aphid transmission-related DAG motif, and for VanMV-FP, a stretch of GTN repeats that putatively belongs to the class of natively unfolded proteins. VanMV-FP CP also has a novel DVG motif in place of the DAG motif, and an uncommon Q//V protease cleavage site. The sequences were compared to a range of Dasheen mosaic virus (DsMV) strains and to potyviruses infecting orchids. Identity was low to DsMV strains across the entire CP coding region and across the 3'NTR, but high across the CP core and the CI-6K2-NIa region. In accordance with current ICTV criteria for species demarcation within the family Potyviridae, VanMV-CI and VanMV-FP are strains of DsMV that exclusively infect vanilla.
Microsatellite abundance across the Anthozoa and Hydrozoa in the phylum Cnidaria.

PubMed

Ruiz-Ramos, Dannise V; Baums, Iliana B

2014-10-27

Microsatellite loci have high mutation rates and thus are indicative of mutational processes within the genome. By concentrating on the symbiotic and aposymbiotic cnidarians, we investigated if microsatellite abundances follow a phylogenetic or ecological pattern. Individuals from eight species were shotgun sequenced using 454 GS-FLX Titanium technology. Sequences from the three available cnidarian genomes (Nematostella vectensis, Hydra magnipapillata and Acropora digitifera) were added to the analysis for a total of eleven species representing two classes, three subclasses and eight orders within the phylum Cnidaria. Trinucleotide and tetranucleotide repeats were the most abundant motifs, followed by hexa- and dinucleotides. Pentanucleotides were the least abundant motif in the data set. Hierarchical clustering and log likelihood ratio tests revealed a weak relationship between phylogeny and microsatellite content. Further, comparisons between cnidaria harboring intracellular dinoflagellates and those that do not, show microsatellite coverage is higher in the latter group. Our results support previous studies that found tri- and tetranucleotides to be the most abundant motifs in invertebrates. Differences in microsatellite coverage and composition between symbiotic and non-symbiotic cnidaria suggest the presence/absence of dinoflagellates might place restrictions on the host genome.
MOTIFSIM 2.1: An Enhanced Software Platform for Detecting Similarity in Multiple DNA Motif Data Sets

PubMed Central

Huang, Chun-Hsi

2017-01-01

Finding binding site motifs plays an important role in bioinformatics as it reveals the transcription factors that control gene expression. The development of motif finders has flourished in the past years, with many tools having been introduced to the research community. Although these tools possess exceptional features for detecting motifs, they report different results for an identical data set. Hence, using multiple tools is recommended because motifs reported by several tools are likely biologically significant. However, the results from multiple tools need to be compared to obtain the common significant motifs. The MOTIFSIM web tool and command-line tool were developed for this purpose. In this work, we present several technical improvements as well as additional features to further support motif analysis in our new release, MOTIFSIM 2.1. PMID:28632401
Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh].

PubMed

Dutta, Sutapa; Kumawat, Giriraj; Singh, Bikram P; Gupta, Deepak K; Singh, Sangeeta; Dogra, Vivek; Gaikwad, Kishor; Sharma, Tilak R; Raje, Ranjeet S; Bandhopadhya, Tapas K; Datta, Subhojit; Singh, Mahendra N; Bashasab, Fakrudin; Kulwal, Pawan; Wanjari, K B; K Varshney, Rajeev; Cook, Douglas R; Singh, Nagendra K

2011-01-20

Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥ 18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. 
We developed 550 validated genic-SSR markers in pigeonpea using deep transcriptome sequencing. From these, 20 highly polymorphic markers were used to evaluate the genetic relationship among species of the genus Cajanus. A comprehensive set of genic-SSR markers was developed as an important genomic resource for diversity analysis and genetic mapping in pigeonpea.
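The SSR screening criteria reported in this record (di- to hexanucleotide motifs, homopolymers excluded, tract length of at least 18 bp) can be sketched as a small regular-expression scan in Python. This is an illustrative reconstruction under stated assumptions, not the tool the authors used: `find_ssrs` and `is_primitive` are hypothetical names, only perfect (uninterrupted) repeats are detected, and compound repeats are not handled specially.

```python
import re

def is_primitive(unit):
    """True if `unit` is not itself a repetition of a shorter unit (rejects 'AA', 'ATAT')."""
    n = len(unit)
    return not any(n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n))

def find_ssrs(seq, min_len=18):
    """Return sorted (start, unit, copies) for perfect di- to hexanucleotide tracts.

    A tract is reported only if it spans at least `min_len` bp; `seq` is assumed
    to be an upper-case DNA string over A, C, G and T.
    """
    hits = []
    for k in range(2, 7):
        min_copies = -(-min_len // k)  # ceiling division
        # One captured unit followed by at least (min_copies - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            if is_primitive(unit):
                hits.append((m.start(), unit, len(m.group(0)) // k))
    return sorted(hits)
```

Requiring a primitive unit keeps a dinucleotide tract from being counted again as a tetra- or hexanucleotide repeat, which would otherwise distort motif-class frequencies such as those quoted in the abstract.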
Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh]

PubMed Central

2011-01-01

Background Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. Results In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. 
Conclusion We developed 550 validated genic-SSR markers in pigeonpea using deep transcriptome sequencing. From these, 20 highly polymorphic markers were used to evaluate the genetic relationship among species of the genus Cajanus. A comprehensive set of genic-SSR markers was developed as an important genomic resource for diversity analysis and genetic mapping in pigeonpea. PMID:21251263
CombiMotif: A new algorithm for network motifs discovery in protein-protein interaction networks

NASA Astrophysics Data System (ADS)

Luo, Jiawei; Li, Guanghui; Song, Dan; Liang, Cheng

2014-12-01

Discovering motifs in protein-protein interaction networks is becoming a major challenge in computational biology, since the distribution of the number of network motifs can reveal significant systemic differences among species. However, this task can be computationally expensive because it involves graph isomorphism detection. In this paper, we present a new algorithm (CombiMotif) that incorporates combinatorial techniques to count non-induced occurrences of subgraph topologies in the form of trees. The efficiency of our algorithm is demonstrated by comparing the obtained results with the current state-of-the-art subgraph counting algorithms. We also show major differences between unicellular and multicellular organisms. The datasets and source code of CombiMotif are freely available upon request.
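To make the idea of counting non-induced tree occurrences combinatorially concrete, the Python sketch below counts star-shaped subgraphs directly from node degrees, without any isomorphism testing. It covers only the simplest tree topology and is not the CombiMotif algorithm; the function name `count_stars` and the edge-list input format are assumptions.

```python
from math import comb

def count_stars(edges, leaves):
    """Count non-induced occurrences of a star with `leaves` leaves (leaves >= 2).

    `edges` is an iterable of (u, v) pairs describing a simple undirected graph
    (no self-loops, no duplicate edges). Each node of degree d is the centre of
    C(d, leaves) stars, so the total is a sum of binomial coefficients.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sum(comb(d, leaves) for d in degree.values())
```

For a triangle, `count_stars(edges, 2)` gives three 3-node paths even though none of them is an induced subgraph, which is exactly the distinction between non-induced and induced counting.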
Computation of direct and inverse mutations with the SEGM web server (Stochastic Evolution of Genetic Motifs): an application to splice sites of human genome introns.

PubMed

Benard, Emmanuel; Michel, Christian J

2009-08-01

We present here the SEGM web server (Stochastic Evolution of Genetic Motifs) in order to study the evolution of genetic motifs both in the direct evolutionary sense (past-present) and in the inverse evolutionary sense (present-past). The genetic motifs studied can be nucleotides, dinucleotides and trinucleotides. As an example of an application of SEGM and to understand its functionalities, we give an analysis of inverse mutations of splice sites of human genome introns. SEGM is freely accessible at http://lsiit-bioinfo.u-strasbg.fr:8080/webMathematica/SEGM/SEGM.html directly or by the web site http://dpt-info.u-strasbg.fr/~michel/. To our knowledge, this SEGM web server is to date the only computational biology software in this evolutionary approach.
Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas

PubMed Central

Petrov, Anton I.; Zirbel, Craig L.; Leontis, Neocles B.

2013-01-01

The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access. PMID:23970545
Statistical Methods for Identifying Sequence Motifs Affecting Point Mutations

PubMed Central

Zhu, Yicheng; Neeman, Teresa; Yap, Von Bing; Huttley, Gavin A.

2017-01-01

Mutation processes differ between types of point mutation, genomic locations, cells, and biological species. For some point mutations, specific neighboring bases are known to be mechanistically influential. Beyond these cases, numerous questions remain unresolved, including: what are the sequence motifs that affect point mutations? How large are the motifs? Are they strand symmetric? And, do they vary between samples? We present new log-linear models that allow explicit examination of these questions, along with sequence logo style visualization to enable identifying specific motifs. We demonstrate the performance of these methods by analyzing mutation processes in human germline and malignant melanoma. We recapitulate the known CpG effect, and identify novel motifs, including a highly significant motif associated with A→G mutations. We show that major effects of neighbors on germline mutation lie within ±2 of the mutating base. Models are also presented for contrasting the entire mutation spectra (the distribution of the different point mutations). We show the spectra vary significantly between autosomes and X-chromosome, with a difference in T→C transition dominating. Analyses of malignant melanoma confirmed reported characteristic features of this cancer, including statistically significant strand asymmetry, and markedly different neighboring influences. The methods we present are made freely available as a Python library https://bitbucket.org/pycogent3/mutationmotif. PMID:27974498
BayesMotif: de novo protein sorting motif discovery from impure datasets.

PubMed

Hu, Jianjun; Zhang, Fan

2010-01-18

Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals are amino acid sub-sequences usually located at the N-termini or C-termini of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms. We formulated the protein sorting motif discovery problem as a classification problem and proposed a Bayesian classifier based algorithm (BayesMotif) for de novo identification of a common type of protein sorting motif in which a highly conserved anchor is present along with less conserved motif regions. A false positive removal procedure is developed to iteratively remove sequences that are unlikely to contain true motifs so that the algorithm can identify motifs from impure input sequences. Experiments on both implanted motif datasets and real-world datasets showed that the enhanced BayesMotif algorithm can identify anchored sorting motifs from pure or impure protein sequence datasets. They also show that the false positive removal procedure can help to identify true motifs even when only 20% of the input sequences contain true motif instances. We proposed BayesMotif, a novel Bayesian classification based algorithm for de novo discovery of a special category of anchored protein sorting motifs from impure datasets. Compared to conventional motif discovery algorithms such as MEME, our algorithm can find less-conserved motifs with short highly conserved anchors. Our algorithm also has the advantage of easy incorporation of additional meta-sequence features such as hydrophobicity or charge of the motifs, which may help to overcome the limitations of the PWM (position weight matrix) motif model.
cDNA sequence and expression of a cold-responsive gene in Citrus unshiu.

PubMed

Hara, M; Wakasugi, Y; Ikoma, Y; Yano, M; Ogawa, K; Kuboi, T

1999-02-01

A cDNA clone encoding a protein of the dehydrin family (CuCOR19), whose sequence is similar to that of Poncirus COR19, was isolated from the epicarp of Citrus unshiu. The molecular mass of the predicted protein was 18,980 daltons. CuCOR19 was highly hydrophilic and contained three repeating elements including Lys-rich motifs. Expression of the gene in leaves was increased by cold stress.
SA-Mot: a web server for the identification of motifs of interest extracted from protein loops

PubMed Central

Regad, Leslie; Saladin, Adrien; Maupetit, Julien; Geneix, Colette; Camproux, Anne-Claude

2011-01-01

The detection of functional motifs is an important step for the determination of protein functions. We present here a new web server SA-Mot (Structural Alphabet Motif) for the extraction and location of structural motifs of interest from protein loops. Contrary to other methods, SA-Mot does not focus only on functional motifs, but it extracts recurrent and conserved structural motifs involved in structural redundancy of loops. SA-Mot uses the structural word notion to extract all structural motifs from uni-dimensional sequences corresponding to loop structures. Then, SA-Mot provides a description of these structural motifs using statistics computed in the loop data set and in SCOP superfamily, sequence and structural parameters. SA-Mot results correspond to an interactive table listing all structural motifs extracted from a target structure and their associated descriptors. Using this information, the users can easily locate loop regions that are important for the protein folding and function. The SA-Mot web server is available at http://sa-mot.mti.univ-paris-diderot.fr. PMID:21665924
SA-Mot: a web server for the identification of motifs of interest extracted from protein loops.

PubMed

Regad, Leslie; Saladin, Adrien; Maupetit, Julien; Geneix, Colette; Camproux, Anne-Claude

2011-07-01

The detection of functional motifs is an important step for the determination of protein functions. We present here a new web server SA-Mot (Structural Alphabet Motif) for the extraction and location of structural motifs of interest from protein loops. Contrary to other methods, SA-Mot does not focus only on functional motifs, but it extracts recurrent and conserved structural motifs involved in structural redundancy of loops. SA-Mot uses the structural word notion to extract all structural motifs from uni-dimensional sequences corresponding to loop structures. Then, SA-Mot provides a description of these structural motifs using statistics computed in the loop data set and in SCOP superfamily, sequence and structural parameters. SA-Mot results correspond to an interactive table listing all structural motifs extracted from a target structure and their associated descriptors. Using this information, the users can easily locate loop regions that are important for the protein folding and function. The SA-Mot web server is available at http://sa-mot.mti.univ-paris-diderot.fr.
Systematic comparison of the response properties of protein and RNA mediated gene regulatory motifs.

PubMed

Iyengar, Bharat Ravi; Pillai, Beena; Venkatesh, K V; Gadgil, Chetan J

2017-05-30

We present a framework enabling the dissection of the effects of motif structure (feedback or feedforward), the nature of the controller (RNA or protein), and the regulation mode (transcriptional, post-transcriptional or translational) on the response to a step change in the input. We have used a common model framework for gene expression where both motif structures have an activating input and repressing regulator, with the same set of parameters, to enable a comparison of the responses. We studied the global sensitivity of the system properties, such as steady-state gain, overshoot, peak time, and peak duration, to parameters. We find that, in all motifs, overshoot correlated negatively whereas peak duration varied concavely with peak time. Differences in the other system properties were found to be mainly dependent on the nature of the controller rather than the motif structure. Protein mediated motifs showed a higher degree of adaptation i.e. a tendency to return to baseline levels; in particular, feedforward motifs exhibited perfect adaptation. RNA mediated motifs had a mild regulatory effect; they also exhibited a lower peaking tendency and mean overshoot. Protein mediated feedforward motifs showed higher overshoot and lower peak time compared to the corresponding feedback motifs.
SLIDER: a generic metaheuristic for the discovery of correlated motifs in protein-protein interaction networks.

PubMed

Boyen, Peter; Van Dyck, Dries; Neven, Frank; van Ham, Roeland C H J; van Dijk, Aalt D J

2011-01-01

Correlated motif mining (CMM) is the problem of finding overrepresented pairs of patterns, called motifs, in sequences of interacting proteins. Algorithmic solutions for CMM thereby provide a computational method for predicting binding sites for protein interaction. In this paper, we adopt a motif-driven approach where the support of candidate motif pairs is evaluated in the network. We experimentally establish the superiority of the Chi-square-based support measure over other support measures. Furthermore, we show that CMM is an NP-hard problem for a large class of support measures (including Chi-square) and reformulate the search for correlated motifs as a combinatorial optimization problem. We then present the generic metaheuristic SLIDER, which uses steepest ascent with a neighborhood function based on sliding motifs and employs the Chi-square-based support measure. We show that SLIDER outperforms existing motif-driven CMM methods and scales to large protein-protein interaction networks. The SLIDER implementation and the data used in the experiments are available at http://bioinformatics.uhasselt.be.
Statistical tests to compare motif count exceptionalities

PubMed Central

Robin, Stéphane; Schbath, Sophie; Vandewalle, Vincent

2007-01-01

Background Finding over- or under-represented motifs in biological sequences is now a common task in genomics. Thanks to p-value calculation for motif counts, exceptional motifs are identified and represent candidate functional motifs. The present work addresses the related question of comparing the exceptionality of one motif in two different sequences. Just comparing the motif count p-values in each sequence is indeed not sufficient to decide if this motif is significantly more exceptional in one sequence compared to the other one. A statistical test is required. Results We develop and analyze two statistical tests, an exact binomial one and an asymptotic likelihood ratio test, to decide whether the exceptionality of a given motif is equivalent or significantly different in two sequences of interest. For that purpose, motif occurrences are modeled by Poisson processes, with a special care for overlapping motifs. Both tests can take the sequence compositions into account. As an illustration, we compare the octamer exceptionalities in the Escherichia coli K-12 backbone versus variable strain-specific loops. Conclusion The exact binomial test is particularly adapted for small counts. For large counts, we advise to use the likelihood ratio test which is asymptotic but strongly correlated with the exact binomial test and very simple to use. PMID:17346349
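The exact binomial comparison described above can be illustrated with a minimal sketch. It assumes the simplest setting only: motif counts in the two sequences are independent Poisson variables, so that, conditional on the total count, the count in the first sequence is binomial with success probability e1 / (e1 + e2), where e1 and e2 are the expected counts under each sequence's background model. The function name and arguments are hypothetical, and the correction for overlapping motifs mentioned in the abstract is not modelled.

```python
from math import comb

def exact_binomial_p(n1, n2, e1, e2):
    """Two-sided exact binomial p-value for H0: the motif is equally
    exceptional in both sequences (same observed/expected ratio).

    n1, n2: observed motif counts; e1, e2: expected counts (assumed given).
    """
    n = n1 + n2
    p = e1 / (e1 + e2)  # share of the total count expected in sequence 1
    # Binomial(n, p) probability of every possible split of the total count
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[n1]
    # Sum the probabilities of all splits no more likely than the observed one
    # (small tolerance guards against floating-point ties)
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)

if __name__ == "__main__":
    print(exact_binomial_p(10, 0, 1.0, 1.0))
```

Because the full probability table is enumerated, this sketch is only practical for small counts, which is the regime the abstract recommends for the exact test.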
Dissecting protein loops with a statistical scalpel suggests a functional implication of some structural motifs.

PubMed

Regad, Leslie; Martin, Juliette; Camproux, Anne-Claude

2011-06-20

One of the strategies for protein function annotation is to search particular structural motifs that are known to be shared by proteins with a given function. Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins.
Dissecting protein loops with a statistical scalpel suggests a functional implication of some structural motifs

PubMed Central

2011-01-01

Background One of the strategies for protein function annotation is to search particular structural motifs that are known to be shared by proteins with a given function. Results Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Conclusions Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins. PMID:21689388
The Nuclear Protein Database (NPD): sub-nuclear localisation and functional annotation of the nuclear proteome

PubMed Central

Dellaire, G.; Farrall, R.; Bickmore, W.A.

2003-01-01

The Nuclear Protein Database (NPD) is a curated database that contains information on more than 1300 vertebrate proteins that are thought, or are known, to localise to the cell nucleus. Each entry is annotated with information on predicted protein size and isoelectric point, as well as any repeats, motifs or domains within the protein sequence. In addition, information on the sub-nuclear localisation of each protein is provided and the biological and molecular functions are described using Gene Ontology (GO) terms. The database is searchable by keyword, protein name, sub-nuclear compartment and protein domain/motif. Links to other databases are provided (e.g. Entrez, SWISS-PROT, OMIM, PubMed, PubMed Central). Thus, NPD provides a gateway through which the nuclear proteome may be explored. The database can be accessed at http://npd.hgu.mrc.ac.uk and is updated monthly. PMID:12520015

Epigenetics of the myotonic dystrophy-associated DMPK gene neighborhood

PubMed Central

Buckley, Lauren; Lacey, Michelle; Ehrlich, Melanie

2016-01-01

Aim: Identify epigenetic marks in the vicinity of DMPK (linked to myotonic dystrophy, DM1) that help explain tissue-specific differences in its expression. Materials & methods: At DMPK and its flanking genes (DMWD, SIX5, BHMG1 and RSPH6A), we analyzed many epigenetic and transcription profiles from myoblasts, myotubes, skeletal muscle, heart and 30 nonmuscle samples. Results: In the DMPK gene neighborhood, muscle-associated DNA hypermethylation and hypomethylation, enhancer chromatin, and CTCF binding were seen. Myogenic DMPK hypermethylation correlated with high expression and decreased alternative promoter usage. Testis/sperm hypomethylation of BHMG1 and RSPH6A was associated with testis-specific expression. G-quadruplex (G4) motifs and sperm-specific hypomethylation were found near the DM1-linked CTG repeats within DMPK. Conclusion: Tissue-specific epigenetic features in DMPK and neighboring genes help regulate its expression. G4 motifs in DMPK DNA and RNA might contribute to DM1 pathology. PMID:26756355
Development of Genome Engineering Tools from Plant-Specific PPR Proteins Using Animal Cultured Cells.

PubMed

Kobayashi, Takehito; Yagi, Yusuke; Nakamura, Takahiro

2016-01-01

The pentatricopeptide repeat (PPR) motif is a sequence-specific RNA/DNA-binding module. Elucidation of the RNA/DNA recognition mechanism has enabled engineering of PPR motifs as new RNA/DNA manipulation tools in living cells, including for genome editing. However, the biochemical characteristics of PPR proteins remain unknown, mostly due to the instability and/or unfolding propensities of PPR proteins in heterologous expression systems such as bacteria and yeast. To overcome this issue, we constructed reporter systems using animal cultured cells. The cell-based system has highly attractive features for PPR engineering: robust eukaryotic gene expression; availability of various vectors, reagents, and antibodies; highly efficient DNA delivery ratio (>80 %); and rapid, high-throughput data production. In this chapter, we introduce an example of such reporter systems: a PPR-based sequence-specific translational activation system. The cell-based reporter system can be applied to characterize plant genes of interest and to PPR engineering.
PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

PubMed

Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

2011-01-01

PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs of 1-6 bp and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin phase variations and antigenic variations seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As a part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as the location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
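The definition of an SSR given above (tandem repeats of a 1-6 bp motif) lends itself to a short illustration. The following is a minimal sketch, not the procedure used to build PSSRdb: the function name, the minimum copy number of three, and the restriction to exact (perfect) repeats are all assumptions made for the example.

```python
import re

def find_ssrs(seq, min_repeats=3, max_motif_len=6):
    """Return sorted (start, motif, copies) tuples for perfect tandem
    repeats of 1..max_motif_len bp motifs (0-based start positions)."""
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif_len + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), which are reported at the shorter length
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

if __name__ == "__main__":
    for start, motif, copies in find_ssrs("GGCATATATATCCAAAAAAG"):
        print(start, motif, copies)
```

Running the same function on the corresponding region of several strains and comparing the copy numbers at matching loci is one simple way to flag length-polymorphic repeats of the kind the database records.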
Comparative Analysis of Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) of Streptococcus thermophilus St-I and its Bacteriophage-Insensitive Mutants (BIM) Derivatives.

PubMed

Li, Wan; Bian, Xin; Evivie, Smith Etareri; Huo, Gui-Cheng

2016-09-01

The CRISPR-Cas (CRISPR together with CRISPR-associated proteins) modules act as an adaptive and heritable immune system in bacteria and archaea. CRISPR-based immunity acts by integrating short virus sequences in the cell's CRISPR locus, allowing the cell to remember, recognize, and clear infections. In this study, the homology of CRISPR sequences in BIMs (bacteriophage-insensitive mutants) of Streptococcus thermophilus St-I was analyzed. Secondary structures of the repeats and the PAMs (protospacer-adjacent motifs) of each CRISPR locus were also predicted. Results showed that CRISPR1 has 27 repeat-spacer units, 5 of which had duplicates; CRISPR2 has one repeat-spacer unit; CRISPR3 has 28 repeat-spacer units. Only BIM1 had a new spacer acquisition in CRISPR3, while BIM2 and BIM3 had no new spacer insertions, thus indicating that while CRISPR1 was generally more active than CRISPR3, new spacer acquisition occurred only in CRISPR3 in some situations. These findings will help establish the foundation for the study of CRISPR-Cas systems in lactic acid bacteria.
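The comparison of repeat-spacer units between a parent strain and its BIMs can be sketched as follows. This is an illustrative simplification, not the authors' pipeline: it assumes the repeat sequence is known and identical in every unit, and the function names and toy sequences are hypothetical.

```python
def extract_spacers(array_seq, repeat):
    """Split a CRISPR array on an exact repeat and return the spacers.

    The text before the first repeat (leader) and after the last repeat
    (trailer) is discarded.
    """
    parts = array_seq.upper().split(repeat.upper())
    return parts[1:-1]

def new_spacers(parent_spacers, mutant_spacers):
    """Spacers present in the mutant array but absent from the parent array."""
    known = set(parent_spacers)
    return [s for s in mutant_spacers if s not in known]

if __name__ == "__main__":
    repeat = "GTTTT"  # toy repeat, far shorter than a real CRISPR repeat
    parent = "AAC" + repeat + "ACGA" + repeat + "CCTA" + repeat + "GG"
    mutant = "AAC" + repeat + "CGCA" + repeat + "ACGA" + repeat + "CCTA" + repeat + "GG"
    print(new_spacers(extract_spacers(parent, repeat), extract_spacers(mutant, repeat)))
```

Real repeats often carry point mutations, particularly in the terminal unit, so a practical tool would need approximate matching rather than an exact string split.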
Facilitation of Allergic Sensitization and Allergic Airway Inflammation by Pollen-Induced Innate Neutrophil Recruitment

PubMed Central

Hosoki, Koa; Aguilera-Aguirre, Leopoldo; Brasier, Allan R.; Kurosky, Alexander; Boldogh, Istvan

2016-01-01

Neutrophil recruitment is a hallmark of rapid innate immune responses. Exposure of airways of naive mice to pollens rapidly induces neutrophil recruitment. The innate mechanisms that regulate pollen-induced neutrophil recruitment and the contribution of this neutrophilic response to subsequent induction of allergic sensitization and inflammation need to be elucidated. Here we show that ragweed pollen extract (RWPE) challenge in naive mice induces C-X-C motif ligand (CXCL) chemokine synthesis, which stimulates chemokine (C-X-C motif) receptor 2 (CXCR2)-dependent recruitment of neutrophils into the airways. Deletion of Toll-like receptor 4 (TLR4) abolishes CXCL chemokine secretion and neutrophil recruitment induced by a single RWPE challenge and inhibits induction of allergic sensitization and airway inflammation after repeated exposures to RWPE. Forced induction of CXCL chemokine secretion and neutrophil recruitment in mice lacking TLR4 also reconstitutes the ability of multiple challenges of RWPE to induce allergic airway inflammation. Blocking RWPE-induced neutrophil recruitment in wild-type mice by administration of a CXCR2 inhibitor inhibits the ability of repeated exposures to RWPE to stimulate allergic sensitization and airway inflammation. Administration of neutrophils derived from naive donor mice into the airways of Tlr4 knockout recipient mice after each repeated RWPE challenge reconstitutes allergic sensitization and inflammation in these mice. Together these observations indicate that pollen-induced recruitment of neutrophils is TLR4 and CXCR2 dependent and that recruitment of neutrophils is a critical rate-limiting event that stimulates induction of allergic sensitization and airway inflammation. Inhibiting pollen-induced recruitment of neutrophils, such as by administration of CXCR2 antagonists, may be a novel strategy to prevent initiation of pollen-induced allergic airway inflammation. PMID:26086549
Facilitation of Allergic Sensitization and Allergic Airway Inflammation by Pollen-Induced Innate Neutrophil Recruitment.

PubMed

Hosoki, Koa; Aguilera-Aguirre, Leopoldo; Brasier, Allan R; Kurosky, Alexander; Boldogh, Istvan; Sur, Sanjiv

2016-01-01

Neutrophil recruitment is a hallmark of rapid innate immune responses. Exposure of airways of naive mice to pollens rapidly induces neutrophil recruitment. The innate mechanisms that regulate pollen-induced neutrophil recruitment and the contribution of this neutrophilic response to subsequent induction of allergic sensitization and inflammation need to be elucidated. Here we show that ragweed pollen extract (RWPE) challenge in naive mice induces C-X-C motif ligand (CXCL) chemokine synthesis, which stimulates chemokine (C-X-C motif) receptor 2 (CXCR2)-dependent recruitment of neutrophils into the airways. Deletion of Toll-like receptor 4 (TLR4) abolishes CXCL chemokine secretion and neutrophil recruitment induced by a single RWPE challenge and inhibits induction of allergic sensitization and airway inflammation after repeated exposures to RWPE. Forced induction of CXCL chemokine secretion and neutrophil recruitment in mice lacking TLR4 also reconstitutes the ability of multiple challenges of RWPE to induce allergic airway inflammation. Blocking RWPE-induced neutrophil recruitment in wild-type mice by administration of a CXCR2 inhibitor inhibits the ability of repeated exposures to RWPE to stimulate allergic sensitization and airway inflammation. Administration of neutrophils derived from naive donor mice into the airways of Tlr4 knockout recipient mice after each repeated RWPE challenge reconstitutes allergic sensitization and inflammation in these mice. Together these observations indicate that pollen-induced recruitment of neutrophils is TLR4 and CXCR2 dependent and that recruitment of neutrophils is a critical rate-limiting event that stimulates induction of allergic sensitization and airway inflammation. Inhibiting pollen-induced recruitment of neutrophils, such as by administration of CXCR2 antagonists, may be a novel strategy to prevent initiation of pollen-induced allergic airway inflammation.
In vitro selection of DNA elements highly responsive to the human T-cell lymphotropic virus type I transcriptional activator, Tax.

PubMed

Paca-Uccaralertkun, S; Zhao, L J; Adya, N; Cross, J V; Cullen, B R; Boros, I M; Giam, C Z

1994-01-01

The human T-cell lymphotropic virus type I (HTLV-I) transactivator, Tax, the ubiquitous transcriptional factor cyclic AMP (cAMP) response element-binding protein (CREB protein), and the 21-bp repeats in the HTLV-I transcriptional enhancer form a ternary nucleoprotein complex (L. J. Zhao and C. Z. Giam, Proc. Natl. Acad. Sci. USA 89:7070-7074, 1992). Using an antibody directed against the COOH-terminal region of Tax along with purified Tax and CREB proteins, we selected DNA elements bound specifically by the Tax-CREB complex in vitro. Two distinct but related groups of sequences containing the cAMP response element (CRE) flanked by long runs of G and C residues in the 5' and 3' regions, respectively, were preferentially recognized by Tax-CREB. In contrast, CREB alone binds only to CRE motifs (GNTGACG[T/C]) without neighboring G- or C-rich sequences. The Tax-CREB-selected sequences bear a striking resemblance to the 5' or 3' two-thirds of the HTLV-I 21-bp repeats and are highly inducible by Tax. Gel electrophoretic mobility shift assays, DNA transfection, and DNase I footprinting analyses indicated that the G- and C-rich sequences flanking the CRE motif are crucial for Tax-CREB-DNA ternary complex assembly and Tax transactivation but are not in direct contact with the Tax-CREB complex. These data show that Tax recruits CREB to form a multiprotein complex that specifically recognizes the viral 21-bp repeats. The expanded DNA binding specificity of Tax-CREB and the obligatory role the ternary Tax-CREB-DNA complex plays in transactivation reveal a novel mechanism for regulating the transcriptional activity of leucine zipper proteins like CREB.
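The CRE consensus quoted above, GNTGACG[T/C], can be located in a sequence with a simple pattern scan. The sketch below is only an illustration of consensus matching: it writes the final position as the IUPAC code Y (C or T), searches the given strand only, and uses hypothetical function names and a made-up test sequence.

```python
import re

# Subset of IUPAC nucleotide codes needed for this consensus
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "Y": "[CT]", "R": "[AG]"}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[ch] for ch in consensus.upper())

def find_motif(seq, consensus="GNTGACGY"):
    """Return (start, matched_site) for every occurrence on the given strand.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = re.compile("(?=(%s))" % consensus_to_regex(consensus))
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]

if __name__ == "__main__":
    print(find_motif("CCGGTGACGTAAGATGACGCTT"))
```

A scan of this kind reports only the core CRE; the G-rich and C-rich flanks that the abstract identifies as important for Tax-CREB recognition would have to be assessed separately, for example by measuring base composition in windows on either side of each hit.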
Sel1-like repeat proteins in signal transduction.

PubMed

Mittl, Peer R E; Schneider-Brachert, Wulf

2007-01-01

Solenoid proteins, which are distinguished from general globular proteins by their modular architectures, are frequently involved in signal transduction pathways. Proteins from the tetratricopeptide repeat (TPR) and Sel1-like repeat (SLR) families share similar alpha-helical conformations but different consensus sequence lengths and superhelical topologies. Both families are characterized by low sequence similarity levels, rendering the identification of functional homologs difficult. Therefore current knowledge of the molecular and cellular functions of the SLR proteins Sel1, Hrd3, Chs4, Nif1, PodJ, ExoR, AlgK, HcpA, Hsp12, EnhC, LpnE, MotX, and MerG has been reviewed. Although SLR proteins possess different cellular functions, they all seem to serve as adaptor proteins for the assembly of macromolecular complexes. Sel1, Hrd3, Hsp12 and LpnE are activated under cellular stress. The eukaryotic Sel1 and Hrd3 proteins are involved in ER-associated protein degradation, whereas the bacterial LpnE, EnhC, HcpA, ExoR, and AlgK proteins mediate the interactions between bacterial and eukaryotic host cells. LpnE and EnhC are responsible for the entry of L. pneumophila into epithelial cells and macrophages. ExoR from the symbiotic microorganism S. meliloti and AlgK from the pathogen P. aeruginosa regulate exopolysaccharide synthesis. Nif1 and Chs4 from yeast are responsible for the regulation of mitosis and septum formation during cell division, respectively, and PodJ guides cellular differentiation during the cell cycle of the bacterium C. crescentus. Taken together, the SLR motif establishes a link between signal transduction pathways from eukaryotes and bacteria. The SLR motif is so far absent from archaea. Therefore the SLR could have developed in the last common ancestor of eukaryotes and bacteria.
CRISPR-Cas systems target a diverse collection of invasive mobile genetic elements in human microbiomes

PubMed Central

2013-01-01

Background Bacteria and archaea develop immunity against invading genomes by incorporating pieces of the invaders' sequences, called spacers, into a clustered regularly interspaced short palindromic repeats (CRISPR) locus between repeats, forming arrays of repeat-spacer units. When spacers are expressed, they direct CRISPR-associated (Cas) proteins to silence complementary invading DNA. In order to characterize the invaders of human microbiomes, we use spacers from CRISPR arrays that we had previously assembled from shotgun metagenomic datasets, and identify contigs that contain these spacers' targets. Results We discover 95,000 contigs that are putative invasive mobile genetic elements, some targeted by hundreds of CRISPR spacers. We find that oral sites in healthy human populations have a much greater variety of mobile genetic elements than stool samples. Mobile genetic elements carry genes encoding diverse functions: only 7% of the mobile genetic elements are similar to known phages or plasmids, although a much greater proportion contain phage- or plasmid-related genes. A small number of contigs share similarity with known integrative and conjugative elements, providing the first examples of CRISPR defenses against this class of element. We provide detailed analyses of a few large mobile genetic elements of various types, and a relative abundance analysis of mobile genetic elements and putative hosts, exploring the dynamic activities of mobile genetic elements in human microbiomes. A joint analysis of mobile genetic elements and CRISPRs shows that protospacer-adjacent motifs drive their interaction network; however, some CRISPR-Cas systems target mobile genetic elements lacking motifs. Conclusions We identify a large collection of invasive mobile genetic elements in human microbiomes, an important resource for further study of the interaction between the CRISPR-Cas immune system and invaders. PMID:23628424
New paradigm in ankyrin repeats: Beyond protein-protein interaction module.

PubMed

Islam, Zeyaul; Nagampalli, Raghavendra Sashi Krishna; Fatima, Munazza Tamkeen; Ashraf, Ghulam Md

2018-04-01

Classically, ankyrin repeat (ANK) proteins are built from tandems of two or more repeats and form curved solenoid structures that are associated with protein-protein interactions. The ankyrin repeat is a short, widespread structural motif of around 33 amino acids that repeats in tandem, has a canonical helix-loop-helix fold, and is found individually or in combination with other domains. The multiplicity of structural patterns enables it to form assemblies of diverse sizes, required for its ability to confer multiple binding and structural roles on proteins. Three-dimensional structures of these repeats determined to date reveal a degree of structural variability that translates into the considerable functional versatility of this protein superfamily. Recent work on the ANK has provided novel structural information, especially on protein-lipid, protein-sugar and protein-protein interactions. Self-assembly of these repeats was also shown to prevent the associated protein from forming filaments. In this review, we summarize the latest findings and how the new structural information has increased our understanding of the structural determinants of ANK proteins. We discuss the latest findings on how these proteins participate in various interactions to diversify the ANK roles in numerous biological processes, and explore the emerging and evolving field of designer ankyrins and its framework for protein engineering, with emphasis on biotechnological applications.
P-Finder: Reconstruction of Signaling Networks from Protein-Protein Interactions and GO Annotations.

PubMed

Young-Rae Cho; Yanan Xin; Speegle, Greg

2015-01-01

Because most complex genetic diseases are caused by defects of cell signaling, illuminating a signaling cascade is essential for understanding their mechanisms. We present three novel computational algorithms to reconstruct signaling networks between a starting protein and an ending protein using genome-wide protein-protein interaction (PPI) networks and gene ontology (GO) annotation data. A signaling network is represented as a directed acyclic graph in a merged form of multiple linear pathways. An advanced semantic similarity metric is applied for weighting PPIs as the preprocessing of all three methods. The first algorithm repeatedly extends the list of nodes based on path frequency towards an ending protein. The second algorithm repeatedly appends edges based on the occurrence of network motifs which indicate the link patterns more frequently appearing in a PPI network than in a random graph. The last algorithm uses the information propagation technique which iteratively updates edge orientations based on the path strength and merges the selected directed edges. Our experimental results demonstrate that the proposed algorithms achieve higher accuracy than previous methods when they are tested on well-studied pathways of S. cerevisiae. Furthermore, we introduce an interactive web application tool, called P-Finder, to visualize reconstructed signaling networks.
Evaluation of a disintegrin-like and metalloprotease with thrombospondin type 1 repeat motifs 13 (ADAMTS13) activity enzyme-linked immunosorbent assay for measuring plasma ADAMTS13 activity in dogs.

PubMed

Maruyama, Haruhiko; Kaneko, Michiko; Otake, Taiga; Kano, Rui; Yamaya, Yoshiki; Watari, Toshihiro; Hasegawa, Atsuhiko; Kamata, Hiroshi

2014-03-01

A disintegrin-like and metalloprotease with thrombospondin type 1 repeat motifs 13 (ADAMTS13) is a von Willebrand factor (vWF)-cleaving protease. Deficiencies in ADAMTS13 activity are known to cause thrombotic diseases in human beings. The present study evaluated whether the human ADAMTS13 activity enzyme-linked immunosorbent assay (ELISA) kit containing human vWF73 (a minimal substrate) and anti-N10 antibody (which specifically recognizes the decapeptide of the C-terminal edge of cleaved vWF by human ADAMTS13) is applicable to the measurement of canine plasma ADAMTS13 activity. Human vWF73 fused with a GST-tag and a His-tag (GST-hvWF73-His) was reacted with recombinant canine (rc)ADAMTS13, canine plasma, and human plasma, and then used in Western blotting using anti-N10 antibody. Linearity and intra- and interassay reproducibility of the human ADAMTS13 activity ELISA kit in canine plasma were further evaluated. Finally, plasma ADAMTS13 activity was measured in 13 healthy dogs and 6 dogs with bacteremia using the human ADAMTS13 activity ELISA kit. Cleaved products with a 28-kDa GST-hvWF73-His were detected specifically in rcADAMTS13 as well as in human ADAMTS13, and also in canine plasma by anti-N10 antibody, showing excellent linearity. Intra-assay coefficient of variation (CV) was 3.0-12.4%, and interassay CV was 11.5-12.5%. The ADAMTS13 activity was significantly lower in dogs with bacteremia than in healthy dogs (P = 0.0025). The current study revealed that the human ADAMTS13 activity ELISA kit is applicable for measurement of canine plasma ADAMTS13 activity to elucidate the pathology of thrombotic diseases in dogs.
Partial information decomposition as a unified approach to the specification of neural goal functions.

PubMed

Wibral, Michael; Priesemann, Viola; Kay, Jim W; Lizier, Joseph T; Phillips, William A

2017-03-01

In many neural systems anatomical motifs are present repeatedly, but despite their structural similarity they can serve very different tasks. A prime example for such a motif is the canonical microcircuit of six-layered neo-cortex, which is repeated across cortical areas, and is involved in a number of different tasks (e.g. sensory, cognitive, or motor tasks). This observation has spawned interest in finding a common underlying principle, a 'goal function', of information processing implemented in this structure. By definition such a goal function, if universal, cannot be cast in processing-domain specific language (e.g. 'edge filtering', 'working memory'). Thus, to formulate such a principle, we have to use a domain-independent framework. Information theory offers such a framework. However, while the classical framework of information theory focuses on the relation between one input and one output (Shannon's mutual information), we argue that neural information processing crucially depends on the combination of multiple inputs to create the output of a processor. To account for this, we use a very recent extension of Shannon Information theory, called partial information decomposition (PID). PID allows to quantify the information that several inputs provide individually (unique information), redundantly (shared information) or only jointly (synergistic information) about the output. First, we review the framework of PID. Then we apply it to reevaluate and analyze several earlier proposals of information theoretic neural goal functions (predictive coding, infomax and coherent infomax, efficient coding). We find that PID allows to compare these goal functions in a common framework, and also provides a versatile approach to design new goal functions from first principles. Building on this, we design and analyze a novel goal function, called 'coding with synergy', which builds on combining external input and prior knowledge in a synergistic manner. 
We suggest that this novel goal function may be highly useful in neural information processing. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
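For orientation, the decomposition described in this abstract is often written as follows for the two-input case. This is a sketch in commonly used PID notation, not necessarily the notation of the paper; the symbols $X_1$, $X_2$ (inputs), $Y$ (output) and the subscripts unq, shd, syn are assumptions introduced here.

```latex
% Two inputs X_1, X_2 and one output Y (notation assumed, not taken from the abstract).
% The joint mutual information splits into unique, shared and synergistic parts:
\begin{align}
I(Y; X_1, X_2) &= I_{\mathrm{unq}}(Y; X_1 \setminus X_2)
                + I_{\mathrm{unq}}(Y; X_2 \setminus X_1)
                + I_{\mathrm{shd}}(Y; X_1, X_2)
                + I_{\mathrm{syn}}(Y; X_1, X_2), \\
% Each single-input mutual information contains only its unique and the shared part:
I(Y; X_1) &= I_{\mathrm{unq}}(Y; X_1 \setminus X_2) + I_{\mathrm{shd}}(Y; X_1, X_2), \\
I(Y; X_2) &= I_{\mathrm{unq}}(Y; X_2 \setminus X_1) + I_{\mathrm{shd}}(Y; X_1, X_2).
\end{align}
```

These three equations constrain four unknown terms, so an additional definition of one term (for example, the shared information) is needed before the others can be computed.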
Repeat-containing protein effectors of plant-associated organisms

PubMed Central

Mesarich, Carl H.; Bowen, Joanna K.; Hamiaux, Cyril; Templeton, Matthew D.

2015-01-01

Many plant-associated organisms, including microbes, nematodes, and insects, deliver effector proteins into the apoplast, vascular tissue, or cell cytoplasm of their prospective hosts. These effectors function to promote colonization, typically by altering host physiology or by modulating host immune responses. The same effectors however, can also trigger host immunity in the presence of cognate host immune receptor proteins, and thus prevent colonization. To circumvent effector-triggered immunity, or to further enhance host colonization, plant-associated organisms often rely on adaptive effector evolution. In recent years, it has become increasingly apparent that several effectors of plant-associated organisms are repeat-containing proteins (RCPs) that carry tandem or non-tandem arrays of an amino acid sequence or structural motif. In this review, we highlight the diverse roles that these repeat domains play in RCP effector function. We also draw attention to the potential role of these repeat domains in adaptive evolution with regards to RCP effector function and the evasion of effector-triggered immunity. The aim of this review is to increase the profile of RCP effectors from plant-associated organisms. PMID:26557126
Whole-Gene Positive Selection, Elevated Synonymous Substitution Rates, Duplication, and Indel Evolution of the Chloroplast clpP1 Gene

PubMed Central

Erixon, Per; Oxelman, Bengt

2008-01-01

Background Synonymous DNA substitution rates in the plant chloroplast genome are generally relatively slow and lineage dependent. Non-synonymous rates are usually even slower due to purifying selection acting on the genes. Positive selection is expected to speed up non-synonymous substitution rates, whereas synonymous rates are expected to be unaffected. Until recently, positive selection has seldom been observed in chloroplast genes, and large-scale structural rearrangements leading to gene duplications are hitherto supposed to be rare. Methodology/Principal Findings We found high substitution rates in the exons of the plastid clpP1 gene in Oenothera (the Evening Primrose family) and three separate lineages in the tribe Sileneae (Caryophyllaceae, the Carnation family). Introns have been lost in some of the lineages, but where present, the intron sequences have substitution rates similar to those found in other introns of their genomes. The elevated substitution rates of clpP1 are associated with statistically significant whole-gene positive selection in three branches of the phylogeny. In two of the lineages we found multiple copies of the gene. Neighboring genes present in the duplicated fragments do not show signs of elevated substitution rates or positive selection. Although non-synonymous substitutions account for most of the increase in substitution rates, synonymous rates are also markedly elevated in some lineages. Whereas plant clpP1 genes experiencing negative (purifying) selection are characterized by having very conserved lengths, genes under positive selection often have large insertions of more or less repetitive amino acid sequence motifs. Conclusions/Significance We found positive selection of the clpP1 gene in various plant lineages to be correlated with repeated duplication of the clpP1 gene and surrounding regions, repetitive amino acid sequences, and an increase in synonymous substitution rates. 
The present study sheds light on the controversial issue of whether negative or positive selection is to be expected after gene duplications by providing evidence for the latter alternative. The observed increase in synonymous substitution rates in some of the lineages indicates that the detection of positive selection may be obscured under such circumstances. Future studies are required to explore the functional significance of the large inserted repeated amino acid motifs, as well as the possibility that synonymous substitution rates may be affected by positive selection. PMID:18167545
RSAT matrix-clustering: dynamic exploration and redundancy reduction of transcription factor binding motif collections

PubMed Central

Jaeger, Sébastien; Thieffry, Denis

2017-01-01

Transcription factor (TF) databases contain multitudes of binding motifs (TFBMs) from various sources, from which non-redundant collections are derived by manual curation. The advent of high-throughput methods stimulated the production of novel collections with increasing numbers of motifs. Meta-databases, built by merging these collections, contain redundant versions, because available tools are not suited to automatically identify and explore biologically relevant clusters among thousands of motifs. Motif discovery from genome-scale data sets (e.g. ChIP-seq) also produces redundant motifs, hampering the interpretation of results. We present matrix-clustering, a versatile tool that clusters similar TFBMs into multiple trees, and automatically creates non-redundant TFBM collections. A feature unique to matrix-clustering is its dynamic visualisation of aligned TFBMs, and its capability to simultaneously treat multiple collections from various sources. We demonstrate that matrix-clustering considerably simplifies the interpretation of combined results from multiple motif discovery tools, and highlights biologically relevant variations of similar motifs. We also ran a large-scale application to cluster ∼11 000 motifs from 24 entire databases, showing that matrix-clustering correctly groups motifs belonging to the same TF families, and drastically reduced motif redundancy. matrix-clustering is integrated within the RSAT suite (http://rsat.eu/), accessible through a user-friendly web interface or command-line for its integration in pipelines. PMID:28591841
Discovery of phosphorylation motif mixtures in phosphoproteomics data

PubMed Central

Ritz, Anna; Shakhnarovich, Gregory; Salomon, Arthur R.; Raphael, Benjamin J.

2009-01-01

Motivation: Modification of proteins via phosphorylation is a primary mechanism for signal transduction in cells. Phosphorylation sites on proteins are determined in part through particular patterns, or motifs, present in the amino acid sequence. Results: We describe an algorithm that simultaneously discovers multiple motifs in a set of peptides that were phosphorylated by several different kinases. Such sets of peptides are routinely produced in proteomics experiments. Our motif-finding algorithm uses the principle of minimum description length to determine a mixture of sequence motifs that distinguish a foreground set of phosphopeptides from a background set of unphosphorylated peptides. We show that our algorithm outperforms existing motif-finding algorithms on synthetic datasets consisting of mixtures of known phosphorylation sites. We also derive a motif specificity score that quantifies whether or not the phosphoproteins containing an instance of a motif have a significant number of known interactions. Application of our motif-finding algorithm to recently published human and mouse proteomic studies recovers several known phosphorylation motifs and reveals a number of novel motifs that are enriched for interactions with a particular kinase or phosphatase. Our tools provide a new approach for uncovering the sequence specificities of uncharacterized kinases or phosphatases. Availability: Software is available at http://cs.brown.edu/people/braphael/software.html. Contact: aritz@cs.brown.edu; braphael@cs.brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:18996944
A motif detection and classification method for peptide sequences using genetic programming.

PubMed

Tomita, Yasuyuki; Kato, Ryuji; Okochi, Mina; Honda, Hiroyuki

2008-08-01

An exploration of common rules (property motifs) in amino acid sequences has been required for the design of novel sequences and elucidation of the interactions between molecules controlled by the structural or physical environment. In the present study, we developed a new method to search for property motifs that are common in peptide sequence data. Our method has the following two characteristics: (i) the automatic determination of the position and length of common property motifs by calculating the physicochemical similarity of amino acids, and (ii) the quick and effective exploration of motif candidates that discriminate the positives from the negatives by the introduction of genetic programming (GP). Our method was evaluated on two types of model data sets. First, artificially derived peptide data containing intentionally buried property motifs were searched. As a result, the expected property motifs were correctly extracted by our algorithm. Second, peptide data that interact with MHC class II molecules were analyzed as a model of biologically active peptides with buried motifs of various lengths. Compared to the existing scoring matrix method, twofold more MHC class II binding peptides were identified with the rule obtained using our method. In conclusion, our GP-based motif searching approach made it possible to obtain knowledge of functional aspects of the peptides without any prior knowledge.
Myotonin protein-kinase [AGC]n trinucleotide repeat in seven nonhuman primates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Novelli, G.; Sineo, L.; Pontieri, E.

Myotonic dystrophy (DM) is due to a genomic instability of a trinucleotide [AGC]n motif, located at the 3′ UTR region of a protein-kinase gene (myotonin protein kinase, MT-PK). The [AGC] repeat is meiotically and mitotically unstable, and it is directly related to the manifestations of the disorder. Although a gene dosage effect of the MT-PK has been demonstrated in DM muscle, the mechanism(s) by which the intragenic repeat expansion leads to disease is largely unknown. This non-standard mutational event could reflect an evolutionary mechanism widespread among animal genomes. We have isolated and sequenced the complete 3′ UTR region of the MT-PK gene in seven primates (macaque, orangutan, gorilla, chimpanzee, gibbon, owl monkey, saimiri), and examined by comparative nucleotide sequence analysis the [AGC]n intragenic repeat and the surrounding nucleotides. The genomic organization, including the [AGC]n repeat structure, was conserved in all examined species, excluding the gibbon (Hylobates agilis), in which the [AGC]n upstream sequence (GGAA) is replaced by a GA dinucleotide. The number of [AGC]n in the examined species ranged between 7 (gorilla) and 13 repeats (owl monkeys), with a polymorphism informative content (PIC) similar to that observed in humans. These results indicate that the 3′ UTR [AGC] repeat within the MT-PK gene is evolutionarily conserved, supporting that this region has important regulatory functions.
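Counting the number of [AGC] units in a sequenced region, as reported above for each species, can be illustrated with a short sketch. This is a minimal example and not the authors' procedure: the function name `longest_tandem_run` and the toy sequence are hypothetical, and the sequence is not real primate data.

```python
import re


def longest_tandem_run(sequence: str, unit: str = "AGC") -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    # One or more consecutive copies of the repeat unit.
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


# Hypothetical toy sequence: a GGAA flank, seven AGC units, then an arbitrary tail.
toy = "GGAA" + "AGC" * 7 + "TTCA"
print(longest_tandem_run(toy))  # prints 7
```

The regular expression only counts uninterrupted copies, so a single substitution inside the array splits it into two shorter runs; whether that matches how interrupted alleles should be scored is a separate decision.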

Composite Structural Motifs of Binding Sites for Delineating Biological Functions of Proteins

PubMed Central

Kinjo, Akira R.; Nakamura, Haruki

2012-01-01

Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs that represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures. PMID:22347478
Detecting Statistically Significant Communities of Triangle Motifs in Undirected Networks

DTIC Science & Technology

2015-03-16

moderately-sized networks. As a consequence, throughout this effort, a simulated annealing (SA) algorithm will be employed to effectively search the...then increment k by 1 and repeat the search to find z∗3. One can continue to increment k until W < zδ, at which point the algorithm will stop and...
Fission yeast RNA triphosphatase reads an Spt5 CTD code

DOE PAGES

Doamekpor, Selom K.; Schwer, Beate; Sanchez, Ana M.; ...

2014-11-20

mRNA capping enzymes are directed to nascent RNA polymerase II (Pol2) transcripts via interactions with the carboxy-terminal domains (CTDs) of Pol2 and transcription elongation factor Spt5. Fission yeast RNA triphosphatase binds to the Spt5 CTD, comprising a tandem repeat of nonapeptide motif TPAWNSGSK. Here we report the crystal structure of a Pct1·Spt5-CTD complex, which revealed two CTD docking sites on the Pct1 homodimer that engage TPAWN segments of the motif. Each Spt5 CTD interface, composed of elements from both subunits of the homodimer, is dominated by van der Waals contacts from Pct1 to the tryptophan of the CTD. The bound CTD adopts a distinctive conformation in which the peptide backbone makes a tight U-turn so that the proline stacks over the tryptophan. We show that Pct1 binding to Spt5 CTD is antagonized by threonine phosphorylation. Our results fortify an emerging concept of an “Spt5 CTD code” in which (i) the Spt5 CTD is structurally plastic and can adopt different conformations that are templated by particular cellular Spt5 CTD receptor proteins; and (ii) threonine phosphorylation of the Spt5 CTD repeat inscribes a binary on–off switch that is read by diverse CTD receptors, each in its own distinctive manner.
A MicroRNA Superfamily Regulates Nucleotide Binding Site–Leucine-Rich Repeats and Other mRNAs

PubMed Central

Shivaprasad, Padubidri V.; Chen, Ho-Ming; Patel, Kanu; Bond, Donna M.; Santos, Bruno A.C.M.; Baulcombe, David C.

2012-01-01

Analysis of tomato (Solanum lycopersicum) small RNA data sets revealed the presence of a regulatory cascade affecting disease resistance. The initiators of the cascade are microRNA members of an unusually diverse superfamily in which miR482 and miR2118 are prominent members. Members of this superfamily are variable in sequence and abundance in different species, but all variants target the coding sequence for the P-loop motif in the mRNA sequences for disease resistance proteins with nucleotide binding site (NBS) and leucine-rich repeat (LRR) motifs. We confirm, using transient expression in Nicotiana benthamiana, that miR482 targets mRNAs for NBS-LRR disease resistance proteins with coiled-coil domains at their N terminus. The targeting causes mRNA decay and production of secondary siRNAs in a manner that depends on RNA-dependent RNA polymerase 6. At least one of these secondary siRNAs targets other mRNAs of a defense-related protein. The miR482-mediated silencing cascade is suppressed in plants infected with viruses or bacteria so that expression of mRNAs with miR482 or secondary siRNA target sequences is increased. We propose that this process allows pathogen-inducible expression of NBS-LRR proteins and that it contributes to a novel layer of defense against pathogen attack. PMID:22408077
Identification of a common single nucleotide polymorphism at the primer binding site of D2S1360 that causes heterozygote peak imbalance when using the Investigator HDplex Kit.

PubMed

Inokuchi, Shota; Yamashita, Yasuhiro; Nishimura, Kazuma; Nakanishi, Hiroaki; Saito, Kazuyuki

2017-11-01

Phenomena known as null alleles and peak imbalance can occur because of mutations in the primer binding sites used for DNA typing. In these cases, an accurate statistical evaluation of DNA typing is difficult. The estimated likelihood ratio is incorrectly calculated because of the null allele and allele dropout caused by mutation-induced peak imbalance. Although a number of studies have attempted to uncover examples of these phenomena, few reports are available on the human identification kit manufactured by Qiagen. In this study, 196 Japanese individuals who were heterozygous at D2S1360 were genotyped using an Investigator HDplex Kit with optimal amounts of DNA. A peak imbalance was frequently observed at the D2S1360 locus. We performed a sequencing analysis of the area surrounding the D2S1360 repeat motif to identify the cause for peak imbalance. A point mutation (G>A transition) 136 nucleotides upstream from the D2S1360 repeat motif was discovered in a number of samples. The allele frequency of the mutation was 0.0566 in the Japanese population. Therefore, human identification or kinship testing using the Investigator HDplex Kit requires caution because of the higher frequency of single nucleotide polymorphisms at the primer binding site of D2S1360 locus in the Japanese population.
Gentamicin Binds to the Megalin Receptor as a Competitive Inhibitor Using the Common Ligand Binding Motif of Complement Type Repeats

PubMed Central

Dagil, Robert; O'Shea, Charlotte; Nykjær, Anders; Bonvin, Alexandre M. J. J.; Kragelund, Birthe B.

2013-01-01

Gentamicin is an aminoglycoside widely used in treatments of, in particular, enterococcal, mycobacterial, and severe Gram-negative bacterial infections. Large doses of gentamicin cause nephrotoxicity and ototoxicity, entering the cell via the receptor megalin. Until now, no structural information has been available to describe the interaction with gentamicin in atomic detail, and neither have any three-dimensional structures of domains from the human megalin receptor been solved. To address this gap in our knowledge, we have solved the NMR structure of the 10th complement type repeat of human megalin and investigated its interaction with gentamicin. Using NMR titration data in HADDOCK, we have generated a three-dimensional model describing the complex between megalin and gentamicin. Gentamicin binds to megalin with low affinity and exploits the common ligand binding motif previously described (Jensen, G. A., Andersen, O. M., Bonvin, A. M., Bjerrum-Bohr, I., Etzerodt, M., Thogersen, H. C., O'Shea, C., Poulsen, F. M., and Kragelund, B. B. (2006) J. Mol. Biol. 362, 700–716) utilizing the indole side chain of Trp-1126 and the negatively charged residues Asp-1129, Asp-1131, and Asp-1133. Binding to megalin is highly similar to gentamicin binding to calreticulin. We discuss the impact of this novel insight for the future structure-based design of gentamicin antagonists. PMID:23275343
The Arabidopsis Mediator Complex Subunits MED16, MED14, and MED2 Regulate Mediator and RNA Polymerase II Recruitment to CBF-Responsive Cold-Regulated Genes

PubMed Central

Hemsley, Piers A.; Hurst, Charlotte H.; Kaliyadasa, Ewon; Lamb, Rebecca; Knight, Marc R.; De Cothi, Elizabeth A.; Steele, John F.; Knight, Heather

2014-01-01

The Mediator16 (MED16; formerly termed SENSITIVE TO FREEZING6 [SFR6]) subunit of the plant Mediator transcriptional coactivator complex regulates cold-responsive gene expression in Arabidopsis thaliana, acting downstream of the C-repeat binding factor (CBF) transcription factors to recruit the core Mediator complex to cold-regulated genes. Here, we use loss-of-function mutants to show that RNA polymerase II recruitment to CBF-responsive cold-regulated genes requires MED16, MED2, and MED14 subunits. Transcription of genes known to be regulated via CBFs binding to the C-repeat motif/drought-responsive element promoter motif requires all three Mediator subunits, as does cold acclimation–induced freezing tolerance. In addition, these three subunits are required for low temperature–induced expression of some other, but not all, cold-responsive genes, including genes that are not known targets of CBFs. Genes inducible by darkness also required MED16 but required a different combination of Mediator subunits for their expression than the genes induced by cold. Together, our data illustrate that plants control transcription of specific genes through the action of subsets of Mediator subunits; the specific combination defined by the nature of the stimulus but also by the identity of the gene induced. PMID:24415770
Plasmodium cysteine repeat modular proteins 1-4: complex proteins with roles throughout the malaria parasite life cycle.

PubMed

Thompson, Joanne; Fernandez-Reyes, Delmiro; Sharling, Lisa; Moore, Sally G; Eling, Wijnand M; Kyes, Sue A; Newbold, Christopher I; Kafatos, Fotis C; Janse, Chris J; Waters, Andrew P

2007-06-01

The Cysteine Repeat Modular Proteins (PCRMP1-4) of Plasmodium are encoded by a small gene family that is conserved in malaria and other Apicomplexan parasites. They are very large, predicted surface proteins with multipass transmembrane domains containing motifs that are conserved within families of cysteine-rich, predicted surface proteins in a range of unicellular eukaryotes, and a unique combination of protein-binding motifs, including a >100 kDa cysteine-rich modular region, an epidermal growth factor-like domain and a Kringle domain. PCRMP1 and 2 are expressed in life cycle stages in both the mosquito and vertebrate. They colocalize with PfEMP1 (P. falciparum Erythrocyte Membrane Antigen-1) during its export from P. falciparum blood-stage parasites and are exposed on the surface of haemolymph- and salivary gland-sporozoites in the mosquito, consistent with a role in host tissue targeting and invasion. Gene disruption of pcrmp1 and 2 in the rodent malaria model, P. berghei, demonstrated that both are essential for transmission of the parasite from the mosquito to the mouse and has established their discrete and important roles in sporozoite targeting to the mosquito salivary gland. The unprecedented expression pattern and structural features of the PCRMPs thus suggest a variety of roles mediating host-parasite interactions throughout the parasite life cycle.
A flexible motif search technique based on generalized profiles.

PubMed

Bucher, P; Karplus, K; Moeri, N; Hofmann, K

1996-03-01

A flexible motif search technique is presented which has two major components: (1) a generalized profile syntax serving as a motif definition language; and (2) a motif search method specifically adapted to the problem of finding multiple instances of a motif in the same sequence. The new profile structure, which is the core of the generalized profile syntax, combines the functions of a variety of motif descriptors implemented in other methods, including regular expression-like patterns, weight matrices, previously used profiles, and certain types of hidden Markov models (HMMs). The relationship between generalized profiles and other biomolecular motif descriptors is analyzed in detail, with special attention to HMMs. Generalized profiles are shown to be equivalent to a particular class of HMMs, and conversion procedures in both directions are given. The conversion procedures provide an interpretation for local alignment in the framework of stochastic models, allowing for clear, simple significance tests. A mathematical statement of the motif search problem defines the new method exactly without linking it to a specific algorithmic solution. Part of the definition includes a new definition of disjointness of alignments.
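The problem of finding multiple instances of a motif in the same sequence can be illustrated with one of the simpler descriptors the abstract mentions, a weight matrix. The sketch below is not the generalized profile method itself: the function names, the uniform background, the pseudocount, and the greedy rule for keeping non-overlapping hits are all assumptions made here for illustration, and the paper's own definition of disjoint alignments may differ.

```python
import math


def log_odds_matrix(counts, background=0.25, pseudocount=1.0):
    """Turn per-position base counts into log2-odds scores against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + pseudocount * 4
        matrix.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Score every window, then greedily keep the best-scoring non-overlapping hits.

    Assumes an uppercase sequence containing only A, C, G and T.
    """
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((score, start))
    chosen = []
    # Highest scores first; a hit is kept only if it overlaps no hit already kept.
    for score, start in sorted(hits, reverse=True):
        if all(abs(start - other) >= width for _, other in chosen):
            chosen.append((score, start))
    return sorted(start for _, start in chosen)


# Hypothetical counts for a 4-column motif that always reads TGAC.
toy_counts = [{"T": 10}, {"G": 10}, {"A": 10}, {"C": 10}]
toy_matrix = log_odds_matrix(toy_counts)
print(scan("AATGACCCTGACTT", toy_matrix, threshold=6.0))  # prints [2, 8]
```

The greedy selection is the simplest way to report disjoint instances; it does not guarantee the set of hits with the highest total score, which a dynamic-programming formulation would.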
Characterization of OfWRKY3, a transcription factor that positively regulates the carotenoid cleavage dioxygenase gene OfCCD4 in Osmanthus fragrans.

PubMed

Han, Yuanji; Wu, Miao; Cao, Liya; Yuan, Wangjun; Dong, Meifang; Wang, Xiaohui; Chen, Weicai; Shang, Fude

2016-07-01

The sweet osmanthus carotenoid cleavage dioxygenase 4 (OfCCD4) cleaves carotenoids such as β-carotene and zeaxanthin to yield β-ionone. OfCCD4 is a member of the CCD gene family, and its promoter contains a W-box palindrome with two reversely oriented TGAC repeats, which are the proposed binding sites of WRKY transcription factors. We isolated three WRKY cDNAs from the petal of Osmanthus fragrans. One of them, OfWRKY3, encodes a protein containing two WRKY domains and two zinc finger motifs. OfWRKY3 and OfCCD4 had nearly identical expression profile in petals of 'Dangui' and 'Yingui' at different flowering stages and showed similar expression patterns in petals treated by salicylic acid, jasmonic acid and abscisic acid. Activation of OfCCD4pro:GUS by OfWRKY3 was detected in coinfiltrated tobacco leaves and very weak GUS activity was detected in control tissues, indicating that OfWRKY3 can interact with the OfCCD4 promoter. Yeast one-hybrid and electrophoretic mobility shift assay showed that OfWRKY3 was able to bind to the W-box palindrome motif present in the OfCCD4 promoter. These results suggest that OfWRKY3 is a positive regulator of the OfCCD4 gene, and might partly account for the biosynthesis of β-ionone in sweet osmanthus.
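Two reversely oriented TGAC repeats mean that one copy of the core reads TGAC on the forward strand and the other reads TGAC on the reverse strand, where it appears as GTCA in the forward sequence. A minimal sketch of locating both orientations follows; the function names and the toy promoter fragment are hypothetical and are not taken from the OfCCD4 promoter.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_core_sites(promoter: str, core: str = "TGAC"):
    """Return (position, strand) for every occurrence of `core` on either strand."""
    promoter = promoter.upper()
    rc_core = reverse_complement(core)
    sites = []
    for i in range(len(promoter) - len(core) + 1):
        window = promoter[i:i + len(core)]
        if window == core:
            sites.append((i, "+"))
        if window == rc_core:
            sites.append((i, "-"))
    return sites


# Hypothetical fragment with one forward and one reverse-oriented TGAC core.
toy_promoter = "AATGACGTCATT"
print(find_core_sites(toy_promoter))  # prints [(2, '+'), (6, '-')]
```

Positions are 0-based offsets on the forward strand, including for reverse-strand hits.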
Arabidopsis ASYMMETRIC LEAVES2 protein required for leaf morphogenesis consistently forms speckles during mitosis of tobacco BY-2 cells via signals in its specific sequence.

PubMed

Luo, Lilan; Ando, Sayuri; Sasabe, Michiko; Machida, Chiyoko; Kurihara, Daisuke; Higashiyama, Tetsuya; Machida, Yasunori

2012-09-01

Leaf primordia with high division and developmental competencies are generated around the periphery of stem cells at the shoot apex. Arabidopsis ASYMMETRIC-LEAVES2 (AS2) protein plays a key role in the regulation of many genes responsible for flat symmetric leaf formation. The AS2 gene, expressed in leaf primordia, encodes a plant-specific nuclear protein containing an AS2/LOB domain with cysteine repeats (C-motif). AS2 proteins are present in speckles in and around the nucleoli, and in the nucleoplasm of some leaf epidermal cells. We used the tobacco cultured cell line BY-2 expressing the AS2-fused yellow fluorescent protein to examine subnuclear localization of AS2 in dividing cells. AS2 mainly localized to speckles (designated AS2 bodies) in cells undergoing mitosis and distributed in a pairwise manner during the separation of sets of daughter chromosomes. Few interphase cells contained AS2 bodies. Deletion analyses showed that a short stretch of the AS2 amino-terminal sequence and the C-motif play negative and positive roles, respectively, in localizing AS2 to the bodies. These results suggest that AS2 bodies function to properly distribute AS2 to daughter cells during cell division in leaf primordia; and this process is controlled at least partially by signals encoded by the AS2 sequence itself.
A type III-B CRISPR-Cas effector complex mediating massive target DNA destruction.

PubMed

Han, Wenyuan; Li, Yingjun; Deng, Ling; Feng, Mingxia; Peng, Wenfang; Hallstrøm, Søren; Zhang, Jing; Peng, Nan; Liang, Yun Xiang; White, Malcolm F; She, Qunxin

2017-02-28

The CRISPR (clustered regularly interspaced short palindromic repeats) system protects archaea and bacteria by eliminating nucleic acid invaders in a crRNA-guided manner. The Sulfolobus islandicus type III-B Cmr-α system targets invading nucleic acid at both RNA and DNA levels and DNA targeting relies on the directional transcription of the protospacer in vivo. To gain further insight into the involved mechanism, we purified a native effector complex of III-B Cmr-α from S. islandicus and characterized it in vitro. Cmr-α cleaved RNAs complementary to crRNA present in the complex and its ssDNA destruction activity was activated by target RNA. The ssDNA cleavage required mismatches between the 5′-tag of crRNA and the 3′-flanking region of target RNA. An invader plasmid assay showed that mutation either in the histidine-aspartate acid (HD) domain (a quadruple mutation) or in the GGDD motif of the Cmr-2α protein resulted in attenuation of the DNA interference in vivo. However, double mutation of the HD motif only abolished the DNase activity in vitro. Furthermore, the activated Cmr-α binary complex functioned as a highly active DNase to destroy a large excess DNA substrate, which could provide a powerful means to rapidly degrade replicating viral DNA. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
The highly dynamic CRISPR1 system of Streptococcus agalactiae controls the diversity of its mobilome.

PubMed

Lopez-Sanchez, Maria-José; Sauvage, Elisabeth; Da Cunha, Violette; Clermont, Dominique; Ratsima Hariniaina, Elisoa; Gonzalez-Zorn, Bruno; Poyart, Claire; Rosinski-Chupin, Isabelle; Glaser, Philippe

2012-09-01

Clustered regularly interspaced short palindromic repeats (CRISPR) confer immunity against mobile genetic elements (MGEs) in prokaryotes. Streptococcus agalactiae, a leading cause of neonatal infections, contains in its genome two CRISPR/Cas systems. We show that type I-C CRISPR2 is present in few strains but type II-A CRISPR1 is ubiquitous. Comparative sequence analysis of the CRISPR1 spacer content of 351 S. agalactiae strains revealed that it is extremely diverse due to the acquisition of new spacers, spacer duplications and spacer deletions that witness the dynamics of this system. The spacer content profile mirrors the S. agalactiae population structure. Transfer of a conjugative transposon targeted by CRISPR1 selected for spacer rearrangements, suggesting that deletions and duplications pre-exist in the population. The comparison of protospacers located within MGE or the core genome and protospacer-associated motif-shuffling demonstrated that the GG motif is sufficient to discriminate self and non-self and for spacer selection and integration. Strikingly, more than 40% of the 949 different CRISPR1 spacers identified target MGEs found in S. agalactiae genomes. We thus propose that the S. agalactiae type II-A CRISPR1/Cas system modulates the cohabitation of the species with its mobilome, as such contributing to the diversity of MGEs in the population. © 2012 Blackwell Publishing Ltd.
Molecular dynamics simulations of electrostatics and hydration distributions around RNA and DNA motifs

NASA Astrophysics Data System (ADS)

Marlowe, Ashley E.; Singh, Abhishek; Semichaevsky, Andrey V.; Yingling, Yaroslava G.

2009-03-01

Nucleic acid nanoparticles can self-assemble through the formation of complementary loop-loop interactions or stem-stem interactions. The presence and concentration of ions can significantly affect the self-assembly process and the stability of the nanostructure. In this presentation we use explicit molecular dynamics simulations to examine the variations in cationic distributions and hydration environment around DNA and RNA helices and loop-loop interactions. Our simulations show that the potassium and sodium ionic distributions are different around RNA and DNA motifs, which could be indicative of ion-mediated relative stability of loop-loop complexes. Moreover, in RNA loop-loop motifs, ions are consistently present and exchanged through a distinct electronegative channel. We will also show how we used the specific RNA loop-loop motif to design an RNA hexagonal nanoparticle.
Evolutionary Dynamics of Microsatellite Distribution in Plants: Insight from the Comparison of Sequenced Brassica, Arabidopsis and Other Angiosperm Species

PubMed Central

Shi, Jiaqin; Huang, Shunmou; Fu, Donghui; Yu, Jinyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong

2013-01-01

Despite their ubiquity and functional importance, microsatellites have been largely ignored in comparative genomics, mostly due to the lack of genomic information. In the current study, microsatellite distribution was characterized and compared in the whole genomes and both the coding and non-coding DNA sequences of the sequenced Brassica, Arabidopsis and other angiosperm species to investigate their evolutionary dynamics in plants. The variation in the microsatellite frequencies of these angiosperm species was much smaller than that for their microsatellite numbers and genome sizes, suggesting that microsatellite frequency may be relatively stable in plants. The microsatellite frequencies of these angiosperm species were significantly negatively correlated with both their genome sizes and transposable element contents. The pattern of microsatellite distribution may differ according to the different genomic regions (such as coding and non-coding sequences). The observed differences in many important microsatellite characteristics (especially the distribution with respect to motif length, type and repeat number) of these angiosperm species were generally accordant with their phylogenetic distance, which suggested that the evolutionary dynamics of microsatellite distribution may be generally consistent with plant divergence/evolution. Importantly, by comparing these microsatellite characteristics (especially the distribution with respect to motif type) the angiosperm species (aside from a few species) all clustered into two obviously different groups that were largely represented by monocots and dicots, suggesting a complex and generally dichotomous evolutionary pattern of microsatellite distribution in angiosperms. 
Polyploidy may lead to a slight increase in microsatellite frequency in the coding sequences and a significant decrease in microsatellite frequency in the whole genome/non-coding sequences, but have little effect on the microsatellite distribution with respect to motif length, type and repeat number. Interestingly, several microsatellite characteristics seemed to be constant in plant evolution, which can be well explained by the general biological rules. PMID:23555856
PERF: an exhaustive algorithm for ultra-fast and efficient identification of microsatellites from large DNA sequences.

PubMed

Avvaru, Akshay Kumar; Sowpati, Divya Tej; Mishra, Rakesh Kumar

2018-03-15

Microsatellites or Simple Sequence Repeats (SSRs) are short tandem repeats of DNA motifs present in all genomes. They have long been used for a variety of purposes in the areas of population genetics, genotyping, marker-assisted selection and forensics. Numerous studies have highlighted their functional roles in genome organization and gene regulation. Though several tools are currently available to identify SSRs from genomic sequences, they have significant limitations. We present a novel algorithm called PERF for extremely fast and comprehensive identification of microsatellites from DNA sequences of any size. PERF is several fold faster than existing algorithms and uses up to 5-fold less memory. It provides a clean and flexible command-line interface to change the default settings, and produces output in an easily-parseable tab-separated format. In addition, PERF generates an interactive and stand-alone HTML report with charts and tables for easy downstream analysis. PERF is implemented in the Python programming language. It is freely available on PyPI under the package name perf_ssr, and can be installed directly using pip or easy_install. The documentation of PERF is available at https://github.com/rkmlab/perf. The source code of PERF is deposited in GitHub at https://github.com/rkmlab/perf under an MIT license. Contact: tej@ccmb.res.in. Supplementary data are available at Bioinformatics online.
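To make the notion of a microsatellite concrete, the following is a minimal Python sketch of a naive scan for perfect tandem repeats. It is not the PERF algorithm and does not use the perf_ssr package; the function names (`find_ssrs`, `is_primitive`) and the thresholds (`max_motif=6`, `min_length=12`) are assumptions chosen only for illustration.

```python
def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq, max_motif=6, min_length=12):
    """Return sorted (start, end, motif, repeats) tuples for perfect tandem repeats.

    Hypothetical helper: coordinates are 0-based and end-exclusive, and only
    whole copies of the motif are counted.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for k in range(1, max_motif + 1):
        i = 0
        while i + k <= n:
            # Extend the run for as long as the sequence has period k.
            j = i + k
            while j < n and seq[j] == seq[j - k]:
                j += 1
            motif = seq[i:i + k]
            repeats = (j - i) // k
            if (repeats >= 2 and repeats * k >= min_length
                    and set(motif) <= set("ACGT") and is_primitive(motif)):
                hits.append((i, i + repeats * k, motif, repeats))
            # Every start before j - k + 1 lies inside the run just examined.
            i = j - k + 1
    return sorted(hits)


if __name__ == "__main__":
    example = "TTGC" + "AG" * 7 + "CT" + "A" * 12 + "GC"
    for start, end, motif, repeats in find_ssrs(example):
        print(start, end, motif, repeats, sep="\t")
```

The primitive-motif check keeps a dinucleotide run such as AGAGAG from being reported again as a tetra- or hexanucleotide repeat. A dedicated tool would still be the better choice for whole genomes.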
Genetic determinants of mate recognition in Brachionus manjavacas (Rotifera)

PubMed Central

Snell, Terry W; Shearer, Tonya L; Smith, Hilary A; Kubanek, Julia; Gribble, Kristin E; Welch, David B Mark

2009-01-01

Background: Mate choice is of central importance to most animals, influencing population structure, speciation, and ultimately the survival of a species. Mating behavior of male brachionid rotifers is triggered by the product of a chemosensory gene, a glycoprotein on the body surface of females called the mate recognition pheromone. The mate recognition pheromone has been biochemically characterized, but little was known about the gene(s). We describe the isolation and characterization of the mate recognition pheromone gene through protein purification, N-terminal amino acid sequence determination, identification of the mate recognition pheromone gene from a cDNA library, sequencing, and RNAi knockdown to confirm the functional role of the mate recognition pheromone gene in rotifer mating. Results: A 29 kD protein capable of eliciting rotifer male circling was isolated by high-performance liquid chromatography. Two transcript types containing the N-terminal sequence were identified in a cDNA library; further characterization by screening a genomic library and by polymerase chain reaction revealed two genes belonging to each type. Each gene begins with a signal peptide region followed by nearly perfect repeats of an 87 to 92 codon motif with no codons between repeats and the final motif prematurely terminated by the stop codon. The two Type A genes contain four and seven repeats and the two Type B genes contain three and five repeats, respectively. Only the Type B gene with three repeats encodes a peptide with a molecular weight of 29 kD. Each repeat of the Type B gene products contains three asparagines as potential sites for N-glycosylation; there are no asparagines in the Type A genes. RNAi with Type A double-stranded RNA did not result in less circling than in the phosphate-buffered saline control, but transfection with Type B double-stranded RNA significantly reduced male circling by 17%. 
The very low divergence between repeat units, even at synonymous positions, suggests that the repeats are kept nearly identical through a process of concerted evolution. Information-rich molecules like surface glycoproteins are well adapted for chemical communication and aquatic animals may have evolved signaling systems based on these compounds, whereas insects use cuticular hydrocarbons. Conclusion: Owing to its critical role in mating, the mate recognition pheromone gene will be a useful molecular marker for exploring the mechanisms and rates of selection and the evolution of reproductive isolation and speciation using rotifers as a model system. The phylogenetic variation in the mate recognition pheromone gene can now be studied in conjunction with the large amount of ecological and population genetic data being gathered for the Brachionus plicatilis species complex to understand better the evolutionary drivers of cryptic speciation. PMID:19740420
Genetic determinants of mate recognition in Brachionus manjavacas (Rotifera).

PubMed

Snell, Terry W; Shearer, Tonya L; Smith, Hilary A; Kubanek, Julia; Gribble, Kristin E; Welch, David B Mark

2009-09-09

Mate choice is of central importance to most animals, influencing population structure, speciation, and ultimately the survival of a species. Mating behavior of male brachionid rotifers is triggered by the product of a chemosensory gene, a glycoprotein on the body surface of females called the mate recognition pheromone. The mate recognition pheromone has been biochemically characterized, but little was known about the gene(s). We describe the isolation and characterization of the mate recognition pheromone gene through protein purification, N-terminal amino acid sequence determination, identification of the mate recognition pheromone gene from a cDNA library, sequencing, and RNAi knockdown to confirm the functional role of the mate recognition pheromone gene in rotifer mating. A 29 kD protein capable of eliciting rotifer male circling was isolated by high-performance liquid chromatography. Two transcript types containing the N-terminal sequence were identified in a cDNA library; further characterization by screening a genomic library and by polymerase chain reaction revealed two genes belonging to each type. Each gene begins with a signal peptide region followed by nearly perfect repeats of an 87 to 92 codon motif with no codons between repeats and the final motif prematurely terminated by the stop codon. The two Type A genes contain four and seven repeats and the two Type B genes contain three and five repeats, respectively. Only the Type B gene with three repeats encodes a peptide with a molecular weight of 29 kD. Each repeat of the Type B gene products contains three asparagines as potential sites for N-glycosylation; there are no asparagines in the Type A genes. RNAi with Type A double-stranded RNA did not result in less circling than in the phosphate-buffered saline control, but transfection with Type B double-stranded RNA significantly reduced male circling by 17%. 
The very low divergence between repeat units, even at synonymous positions, suggests that the repeats are kept nearly identical through a process of concerted evolution. Information-rich molecules like surface glycoproteins are well adapted for chemical communication and aquatic animals may have evolved signaling systems based on these compounds, whereas insects use cuticular hydrocarbons. Owing to its critical role in mating, the mate recognition pheromone gene will be a useful molecular marker for exploring the mechanisms and rates of selection and the evolution of reproductive isolation and speciation using rotifers as a model system. The phylogenetic variation in the mate recognition pheromone gene can now be studied in conjunction with the large amount of ecological and population genetic data being gathered for the Brachionus plicatilis species complex to understand better the evolutionary drivers of cryptic speciation.
Effect of C(60) fullerene on the duplex formation of i-motif DNA with complementary DNA in solution.

PubMed

Jin, Kyeong Sik; Shin, Su Ryon; Ahn, Byungcheol; Jin, Sangwoo; Rho, Yecheol; Kim, Heesoo; Kim, Seon Jeong; Ree, Moonhor

2010-04-15

The structural effects of fullerene on i-motif DNA were investigated by characterizing the structures of fullerene-free and fullerene-bound i-motif DNA, in the presence of cDNA and in solutions of varying pH, using circular dichroism and synchrotron small-angle X-ray scattering. To facilitate a direct structural comparison between the i-motif and duplex structures in response to pH stimulus, we developed atomic scale structural models for the duplex and i-motif DNA structures, and for the C(60)/i-motif DNA hybrid associated with the cDNA strand, assuming that the DNA strands are present in an ideal right-handed helical conformation. We found that fullerene shifted the pH-induced conformational transition between the i-motif and the duplex structure, possibly due to the hydrophobic interactions between the terminal fullerenes and between the terminal fullerenes and an internal TAA loop in the DNA strand. The hybrid structure showed a dramatic reduction in cyclic hysteresis.

Anion induced conformational preference of CαNN motif residues in functional proteins.

PubMed

Patra, Piya; Ghosh, Mahua; Banerjee, Raja; Chakrabarti, Jaydeb

2017-12-01

Among different ligand-binding motifs, the anion-binding CαNN motif, consisting of peptide backbone atoms of three consecutive residues, is observed to be important for recognition of free anions, like sulphate or biphosphate, and participates in different key functions. Here we study the interaction of sulphate and biphosphate with the CαNN motif present in different proteins. Instead of the total protein, a peptide fragment has been studied, keeping the CαNN motif flanked by other residues. We use classical force field based molecular dynamics simulations to understand the stability of this motif. Our data indicate fluctuations in conformational preferences of the motif residues in the absence of the anion. The anion gives stability to one of these conformations. However, the anion-induced conformational preferences are highly sequence dependent and specific to the type of anion. In particular, the polar residues are more favourable compared to the other residues for recognising the anion. © 2017 Wiley Periodicals, Inc.
The binding of Varp to VAMP7 traps VAMP7 in a closed, fusogenically inactive conformation

PubMed Central

Schäfer, Ingmar B.; Hesketh, Geoffrey G.; Bright, Nicholas A.; Gray, Sally R.; Pryor, Paul R.; Evans, Philip R; Luzio, J. Paul; Owen, David J.

2012-01-01

SNAREs provide energy and specificity to membrane fusion events. Fusogenic trans-SNARE complexes are assembled from Q-SNAREs embedded in one membrane and an R-SNARE embedded in the other. Regulation of membrane fusion events is crucial for intracellular trafficking. We identify the endosomal protein Varp as an R-SNARE-binding regulator of SNARE complex formation. Varp co-localises with and binds to VAMP7, an R-SNARE involved in both endocytic and secretory pathways. We present the structure of the second ankyrin repeat domain of mammalian Varp in complex with the cytosolic portion of VAMP7. The VAMP7 SNARE motif is trapped between Varp and the VAMP7 longin domain and hence Varp kinetically inhibits VAMP7’s ability to form SNARE complexes. This inhibition will be increased when Varp can also bind to other proteins present on the same membrane as the VAMP7 such as Rab32:GTP. PMID:23104059
New horizons for lipoprotein receptors: communication by β-propellers

PubMed Central

Andersen, Olav M.; Dagil, Robert; Kragelund, Birthe B.

2013-01-01

The lipoprotein receptor (LR) family constitutes a large group of structurally closely related receptors with broad ligand-binding specificity. Traditionally, ligand binding to LRs has been anticipated to involve merely the complement type repeat (CR)-domains omnipresent in the family. Recently, this dogma has transformed with the observation that β-propellers of some LRs actively engage in complex formation too. Based on an in-depth decomposition of current structures and sequences, we suggest that exploitation of the β-propellers as binding targets depends on receptor subgroups. In particular, we highlight the shutter mechanism of β-propellers as a general recognition motif for NxI-containing ligands, and we present indications that the generalized β-propeller-induced ligand release mechanism is not applicable for the larger LRs. For the giant LR members, we present evidence that their β-propellers may also actively engage in ligand binding. We therefore advocate for an increased focus on solving the structure-function relationship of this group of important biological receptors. PMID:23881912
Amyloid fibril formation from sequences of a natural beta-structured fibrous protein, the adenovirus fiber.

PubMed

Papanikolopoulou, Katerina; Schoehn, Guy; Forge, Vincent; Forsyth, V Trevor; Riekel, Christian; Hernandez, Jean-François; Ruigrok, Rob W H; Mitraki, Anna

2005-01-28

Amyloid fibrils are fibrous beta-structures that derive from abnormal folding and assembly of peptides and proteins. Despite a wealth of structural studies on amyloids, the nature of the amyloid structure remains elusive; possible connections to natural, beta-structured fibrous motifs have been suggested. In this work we focus on understanding amyloid structure and formation from sequences of a natural, beta-structured fibrous protein. We show that short peptides (25 to 6 amino acids) corresponding to repetitive sequences from the adenovirus fiber shaft have an intrinsic capacity to form amyloid fibrils as judged by electron microscopy, Congo Red binding, infrared spectroscopy, and x-ray fiber diffraction. In the presence of the globular C-terminal domain of the protein that acts as a trimerization motif, the shaft sequences adopt a triple-stranded, beta-fibrous motif. We discuss the possible structure and arrangement of these sequences within the amyloid fibril, as compared with the one adopted within the native structure. A 6-amino acid peptide, corresponding to the last beta-strand of the shaft, was found to be sufficient to form amyloid fibrils. Structural analysis of these amyloid fibrils suggests that perpendicular stacking of beta-strand repeat units is an underlying common feature of amyloid formation.
β-Catenin recognizes a specific RNA motif in the cyclooxygenase-2 mRNA 3′-UTR and interacts with HuR in colon cancer cells

PubMed Central

Kim, Inae; Kwak, Hoyun; Lee, Hee Kyu; Hyun, Soonsil; Jeong, Sunjoo

2012-01-01

RNA-binding proteins regulate multiple steps of RNA metabolism through both dynamic and combined binding. In addition to its crucial roles in cell adhesion and Wnt-activated transcription in cancer cells, β-catenin regulates RNA alternative splicing and stability possibly by binding to target RNA in cells. An RNA aptamer was selected for specific binding to β-catenin to address RNA recognition by β-catenin more specifically. Here, we characterized the structural properties of the RNA aptamer as a model and identified a β-catenin RNA motif. A similar RNA motif was found in a cellular RNA, the cyclooxygenase-2 (COX-2) mRNA 3′-untranslated region (3′-UTR). More significantly, the C-terminal domain of β-catenin interacted with HuR and the Armadillo repeat domain associated with RNA to form the RNA–β-catenin–HuR complex in vitro and in cells. Furthermore, the tertiary RNA–protein complex was predominantly found in the cytoplasm of colon cancer cells; thus, it might be related to COX-2 protein level and cancer progression. Taken together, the β-catenin RNA aptamer was valuable for deducing the cellular RNA aptamer and identifying novel and oncogenic RNA–protein networks in colon cancer cells. PMID:22544606
Sequence motifs and prokaryotic expression of the reptilian paramyxovirus fusion protein

USGS Publications Warehouse

Franke, J.; Batts, W.N.; Ahne, W.; Kurath, G.; Winton, J.R.

2006-01-01

Fourteen reptilian paramyxovirus isolates were chosen to represent the known extent of genetic diversity among this novel group of viruses. Selected regions of the fusion (F) gene were sequenced, analyzed and compared. The F gene of all isolates contained conserved motifs homologous to those described for other members of the family Paramyxoviridae including: signal peptide, transmembrane domain, furin cleavage site, fusion peptide, N-linked glycosylation sites, and two heptad repeats, the second of which (HRB-LZ) had the characteristics of a leucine zipper. Selected regions of the fusion gene of isolate Gono-GER85 were inserted into a prokaryotic expression system to generate three recombinant protein fragments of various sizes. The longest recombinant protein was cleaved by furin into two fragments of predicted length. Western blot analysis with virus-neutralizing rabbit-antiserum against this isolate demonstrated that only the longest construct reacted with the antiserum. This construct was unique in containing 30 additional C-terminal amino acids that included most of the HRB-LZ. These results indicate that the F genes of reptilian paramyxoviruses contain highly conserved motifs typical of other members of the family and suggest that the HRB-LZ domain of the reptilian paramyxovirus F protein contains a linear antigenic epitope. © Springer-Verlag 2005.
Interaction of Cu(+) with cytosine and formation of i-motif-like C-M(+)-C complexes: alkali versus coinage metals.

PubMed

Gao, Juehan; Berden, Giel; Rodgers, M T; Oomens, Jos

2016-03-14

The Watson-Crick structure of DNA is among the most well-known molecular structures of our time. However, alternative base-pairing motifs are also known to occur, often depending on base sequence, pH, or the presence of cations. Pairing of cytosine (C) bases induced by the sharing of a single proton (C-H(+)-C) may give rise to the so-called i-motif, which occurs primarily in expanded trinucleotide repeats and the telomeric region of DNA, particularly at low pH. At physiological pH, silver cations were recently found to stabilize C dimers in a C-Ag(+)-C structure analogous to the hemiprotonated C-dimer. Here we use infrared ion spectroscopy in combination with density functional theory calculations at the B3LYP/6-311G+(2df,2p) level to show that copper in the 1+ oxidation state induces an analogous formation of C-Cu(+)-C structures. In contrast to protons and these transition metal ions, alkali metal ions induce a different dimer structure, where each ligand coordinates the alkali metal ion in a bidentate fashion in which the N3 and O2 atoms of both cytosine ligands coordinate to the metal ion, sacrificing hydrogen-bonding interactions between the ligands for improved chelation of the metal cation.
Modeling gene regulatory network motifs using statecharts

PubMed Central

2012-01-01

Background: Gene regulatory networks are widely used by biologists to describe the interactions among genes, proteins and other components at the intra-cellular level. Recently, a great effort has been devoted to giving gene regulatory networks a formal semantics based on existing computational frameworks. For this purpose, we consider Statecharts, which are a modular, hierarchical and executable formal model widely used to represent software systems. We use Statecharts for modeling small and recurring patterns of interactions in gene regulatory networks, called motifs. Results: We present an improved method for modeling gene regulatory network motifs using Statecharts and we describe the successful modeling of several motifs, including those which could not be modeled or whose models could not be distinguished using the method of a previous proposal. We model motifs in an easy and intuitive way by taking advantage of the visual features of Statecharts. Our modeling approach is able to simulate some interesting temporal properties of gene regulatory network motifs: the delay in the activation and the deactivation of the "output" gene in the coherent type-1 feedforward loop, the pulse in the incoherent type-1 feedforward loop, the bistable nature of double positive and double negative feedback loops, the oscillatory behavior of the negative feedback loop, and the "lock-in" effect of positive autoregulation. Conclusions: We present a Statecharts-based approach for the modeling of gene regulatory network motifs in biological systems. The basic motifs used to build more complex networks (that is, simple regulation, reciprocal regulation, feedback loop, feedforward loop, and autoregulation) can be faithfully described and their temporal dynamics can be analyzed. PMID:22536967
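The delay behaviour of the coherent type-1 feedforward loop mentioned above can be illustrated with a very small discrete-time state machine. This is a hedged Python sketch, not the Statecharts model of the record: the function name `simulate_c1_ffl`, the parameter `y_delay` and the `gate` switch are hypothetical, and the rule that the intermediate gene Y follows the input X only after `y_delay` consecutive steps is a simplifying assumption.

```python
def simulate_c1_ffl(x_signal, y_delay=3, gate="and"):
    """Toy coherent type-1 feedforward loop: X -> Y, X -> Z, Y -> Z.

    x_signal is a sequence of 0/1 input values, one per time step.
    Y switches on (off) only after X has been on (off) for y_delay
    consecutive steps. Z combines X and Y with an AND or an OR gate.
    Returns the 0/1 trace of the output gene Z.
    """
    y = 0
    on_steps = off_steps = 0
    z_trace = []
    for x in x_signal:
        if x:
            on_steps, off_steps = on_steps + 1, 0
            if on_steps >= y_delay:
                y = 1
        else:
            on_steps, off_steps = 0, off_steps + 1
            if off_steps >= y_delay:
                y = 0
        z = (x and y) if gate == "and" else (x or y)
        z_trace.append(int(bool(z)))
    return z_trace


if __name__ == "__main__":
    x = [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    print("X      ", x)
    print("Z (AND)", simulate_c1_ffl(x, gate="and"))
    print("Z (OR) ", simulate_c1_ffl(x, gate="or"))
```

In this toy model the AND gate delays only the switch-on of Z, while the OR gate delays only the switch-off. A Statecharts tool would express the same states and transitions graphically and hierarchically.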
Proteolytic dissection of Zab, the Z-DNA-binding domain of human ADAR1

NASA Technical Reports Server (NTRS)

Schwartz, T.; Lowenhaupt, K.; Kim, Y. G.; Li, L.; Brown, B. A. 2nd; Herbert, A.; Rich, A.

1999-01-01

Zalpha is a peptide motif that binds to Z-DNA with high affinity. This motif binds to alternating dC-dG sequences stabilized in the Z-conformation by means of bromination or supercoiling, but not to B-DNA. Zalpha is part of the N-terminal region of double-stranded RNA adenosine deaminase (ADAR1), a candidate enzyme for nuclear pre-mRNA editing in mammals. Zalpha is conserved in ADAR1 from many species; in each case, there is a second similar motif, Zbeta, separated from Zalpha by a more divergent linker. To investigate the structure-function relationship of Zalpha, its domain structure was studied by limited proteolysis. Proteolytic profiles indicated that Zalpha is part of a domain, Zab, of 229 amino acids (residues 133-361 in human ADAR1). This domain contains both Zalpha and Zbeta as well as a tandem repeat of a 49-amino acid linker module. Prolonged proteolysis revealed a minimal core domain of 77 amino acids (positions 133-209), containing only Zalpha, which is sufficient to bind left-handed Z-DNA; however, the substrate binding is strikingly different from that of Zab. The second motif, Zbeta, retains its structural integrity only in the context of Zab and does not bind Z-DNA as a separate entity. These results suggest that Zalpha and Zbeta act as a single bipartite domain. In the presence of substrate DNA, Zab becomes more resistant to proteases, suggesting that it adopts a more rigid structure when bound to its substrate, possibly with conformational changes in parts of the protein.
Localization and characterization of the calsequestrin-binding domain of triadin 1. Evidence for a charged beta-strand in mediating the protein-protein interaction.

PubMed

Kobayashi, Y M; Alseikhan, B A; Jones, L R

2000-06-09

Triadin is an integral membrane protein of the junctional sarcoplasmic reticulum that binds to the high capacity Ca(2+)-binding protein calsequestrin and anchors it to the ryanodine receptor. The lumenal domain of triadin contains multiple repeats of alternating lysine and glutamic acid residues, which have been defined as KEKE motifs and have been proposed to promote protein associations. Here we identified the specific residues of triadin responsible for binding to calsequestrin by mutational analysis of triadin 1, the major cardiac isoform. A series of deletional fusion proteins of triadin 1 was generated, and by using metabolically labeled calsequestrin in filter-overlay assays, the calsequestrin-binding domain of triadin 1 was localized to a single KEKE motif comprised of 25 amino acids. Alanine mutagenesis within this motif demonstrated that the critical amino acids of triadin binding to calsequestrin are the even-numbered residues Lys(210), Lys(212), Glu(214), Lys(216), Gly(218), Gln(220), Lys(222), and Lys(224). Replacement of the odd-numbered residues within this motif by alanine had no effect on calsequestrin binding to triadin. The results suggest a model in which residues 210-224 of triadin form a beta-strand, with the even-numbered residues in the strand interacting with charged residues of calsequestrin, stabilizing a "polar zipper" that links the two proteins together. This small, highly charged beta-strand of triadin may tether calsequestrin to the junctional face membrane, allowing calsequestrin to sequester Ca(2+) in the vicinity of the ryanodine receptor during Ca(2+) uptake and Ca(2+) release.
CRISPR-spacer integration reporter plasmids reveal distinct genuine acquisition specificities among CRISPR-Cas I-E variants of Escherichia coli

PubMed Central

Díez-Villaseñor, César; Guzmán, Noemí M.; Almendros, Cristóbal; García-Martínez, Jesús; Mojica, Francisco J.M.

2013-01-01

Prokaryotes immunize themselves against transmissible genetic elements by the integration (acquisition) in clustered regularly interspaced short palindromic repeats (CRISPR) loci of spacers homologous to invader nucleic acids, defined as protospacers. Following acquisition, mono-spacer CRISPR RNAs (termed crRNAs) guide CRISPR-associated (Cas) proteins to degrade (interference) protospacers flanked by an adjacent motif in extrachromosomal DNA. During acquisition, selection of spacer-precursors adjoining the protospacer motif and proper orientation of the integrated fragment with respect to the leader (sequence leading transcription of the flanking CRISPR array) grant efficient interference by at least some CRISPR-Cas systems. This adaptive stage of the CRISPR action is poorly characterized, mainly due to the lack of appropriate genetic strategies to address its study and, at least in Escherichia coli, the need for Cas overproduction for insertion detection. In this work, we describe the development and application in Escherichia coli strains of an interference-independent assay based on engineered selectable CRISPR-spacer integration reporter plasmids. By using this tool without the constraint of interference or cas overexpression, we confirmed fundamental aspects of this process such as the critical requirement of Cas1 and Cas2 and the identity of the CTT protospacer motif for the E. coli K12 system. In addition, we defined the CWT motif for a non-K12 CRISPR-Cas variant, and obtained data supporting the implication of the leader in spacer orientation, the preferred acquisition from plasmids harboring cas genes and the occurrence of a sequential cleavage at the insertion site by a ruler mechanism. PMID:23445770
CRISPR-spacer integration reporter plasmids reveal distinct genuine acquisition specificities among CRISPR-Cas I-E variants of Escherichia coli.

PubMed

Díez-Villaseñor, César; Guzmán, Noemí M; Almendros, Cristóbal; García-Martínez, Jesús; Mojica, Francisco J M

2013-05-01

Prokaryotes immunize themselves against transmissible genetic elements by the integration (acquisition) in clustered regularly interspaced short palindromic repeats (CRISPR) loci of spacers homologous to invader nucleic acids, defined as protospacers. Following acquisition, mono-spacer CRISPR RNAs (termed crRNAs) guide CRISPR-associated (Cas) proteins to degrade (interference) protospacers flanked by an adjacent motif in extrachromosomal DNA. During acquisition, selection of spacer-precursors adjoining the protospacer motif and proper orientation of the integrated fragment with respect to the leader (sequence leading transcription of the flanking CRISPR array) grant efficient interference by at least some CRISPR-Cas systems. This adaptive stage of the CRISPR action is poorly characterized, mainly due to the lack of appropriate genetic strategies to address its study and, at least in Escherichia coli, the need for Cas overproduction for insertion detection. In this work, we describe the development and application in Escherichia coli strains of an interference-independent assay based on engineered selectable CRISPR-spacer integration reporter plasmids. By using this tool without the constraint of interference or cas overexpression, we confirmed fundamental aspects of this process such as the critical requirement of Cas1 and Cas2 and the identity of the CTT protospacer motif for the E. coli K12 system. In addition, we defined the CWT motif for a non-K12 CRISPR-Cas variant, and obtained data supporting the implication of the leader in spacer orientation, the preferred acquisition from plasmids harboring cas genes and the occurrence of a sequential cleavage at the insertion site by a ruler mechanism.
Motif Discovery in Speech: Application to Monitoring Alzheimer's Disease.

PubMed

Garrard, Peter; Nemes, Vanda; Nikolic, Dragana; Barney, Anna

2017-01-01

Perseveration - repetition of words, phrases or questions in speech - is commonly described in Alzheimer's disease (AD). Measuring perseveration is difficult, but may index cognitive performance, aiding diagnosis and disease monitoring. Continuous recording of speech would produce a large quantity of data requiring painstaking manual analysis, and risk violating patients' and others' privacy. A secure record and an automated approach to analysis are required. The objectives were to record bone-conducted acoustic energy fluctuations from a subject's vocal apparatus using an accelerometer, to describe the recording and analysis stages in detail, and to demonstrate that the approach is feasible in AD. Speech-related vibration was captured by an accelerometer, affixed above the temporomandibular joint. Healthy subjects read a script with embedded repetitions. Features were extracted from recorded signals and combined using Principal Component Analysis to obtain a one-dimensional representation of the feature vector. Motif discovery techniques were used to detect repeated segments. The equipment was tested in AD patients to determine device acceptability and recording quality. Comparison with the known location of embedded motifs suggests that, with appropriate parameter tuning, the motif discovery method can detect repetitions. The device was acceptable to patients and produced adequate signal quality in their home environments. We established that continuously recording bone-conducted speech and detecting perseverative patterns were both possible. In future studies we plan to associate the frequency of verbal repetitions with stage, progression and type of dementia. It is possible that the method could contribute to the assessment of disease-modifying treatments.
Computational study of the fibril organization of polyglutamine repeats reveals a common motif identified in beta-helices.

PubMed

Zanuy, David; Gunasekaran, Kannan; Lesk, Arthur M; Nussinov, Ruth

2006-04-21

The formation of fibril aggregates by long polyglutamine sequences is assumed to play a major role in neurodegenerative diseases such as Huntington's disease. Here, we model peptides rich in glutamine, through a series of molecular dynamics simulations. Starting from a rigid nanotube-like conformation, we have obtained a new conformational template that shares structural features of a tubular helix and of a beta-helix conformational organization. Our new model can be described as a super-helical arrangement of flat beta-sheet segments linked by planar turns or bends. Interestingly, our comprehensive analysis of the Protein Data Bank reveals that this is a common motif in beta-helices (termed beta-bend), although it has not been identified so far. The motif is based on the alternation of beta-sheet and helical conformation as the protein sequence is followed from the N to the C termini (beta-alpha(R)-beta-polyPro-beta). We further identify this motif in the ssNMR structure of the protofibril of the amyloidogenic peptide Abeta(1-40). The recurrence of the beta-bend suggests a general mode of connecting long parallel beta-sheet segments that would allow the growth of partially ordered fibril structures. The design allows the peptide backbone to change direction with a minimal loss of main chain hydrogen bonds. The identification of a coherent organization beyond that of the beta-sheet segments in different folds rich in parallel beta-sheets suggests a higher degree of ordered structure in protein fibrils, in agreement with their low solubility and dense molecular packing.
DNA motif alignment by evolving a population of Markov chains.

PubMed

Bi, Chengpeng

2009-01-30

Deciphering cis-regulatory elements, or de novo motif-finding in genomes, remains elusive although much algorithmic effort has been expended. Markov chain Monte Carlo (MCMC) methods such as Gibbs motif samplers have been widely employed to solve the de novo motif-finding problem through local sequence alignment. Nonetheless, MCMC-based motif samplers still suffer from local maxima, as does EM. Therefore, as a prerequisite for finding good local alignments, these motif algorithms are often run independently many times, but without information exchange between different chains. A new algorithm design enabling such information exchange would therefore be worthwhile. This paper presents a novel motif-finding algorithm that evolves a population of Markov chains with information exchange (PMC), each of which is initialized as a random alignment and run by the Metropolis-Hastings sampler (MHS). Each chain is progressively updated through a series of stochastically sampled local alignments. Explicitly, the PMC motif algorithm performs stochastic sampling as specified by a population-based proposal distribution rather than individual ones, and adaptively evolves the population as a whole towards a global maximum. The alignment information exchange is accomplished by taking advantage of the pooled motif site distributions. A distinct method for running multiple independent Markov chains (IMC) without information exchange, dubbed the IMC motif algorithm, is also devised for comparison with its PMC counterpart. Experimental studies demonstrate that performance can be improved if pooled information is used to run a population of motif samplers. The new PMC algorithm improved convergence and outperformed other popular algorithms tested using simulated and biological motif sequences.
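As a rough illustration of the sampling idea in this record, the sketch below runs a single Metropolis chain over motif start positions. It is not the PMC algorithm: there is no population and no pooled proposal distribution. The scoring function, the pseudocount, the uniform background, and every function name are assumptions made for illustration only.

```python
import math
import random

BASES = "ACGT"

def alignment_score(seqs, starts, width, pseudo=1.0):
    """Log-likelihood ratio of the aligned sites against a uniform background."""
    score = 0.0
    total = len(seqs) + pseudo * len(BASES)
    for col in range(width):
        counts = {b: 0 for b in BASES}
        for seq, start in zip(seqs, starts):
            counts[seq[start + col]] += 1
        for b in BASES:
            freq = (counts[b] + pseudo) / total
            score += counts[b] * math.log(freq / 0.25)
    return score

def metropolis_step(seqs, starts, width, rng):
    """Propose a new site in one random sequence; accept by the Metropolis rule."""
    i = rng.randrange(len(seqs))
    proposal = list(starts)
    # Symmetric (uniform) proposal, so the acceptance ratio is the target ratio.
    proposal[i] = rng.randrange(len(seqs[i]) - width + 1)
    delta = (alignment_score(seqs, proposal, width)
             - alignment_score(seqs, starts, width))
    if delta >= 0 or rng.random() < math.exp(delta):
        return proposal
    return starts

def run_chain(seqs, width, steps=2000, seed=0):
    """Run one chain from a random alignment and return the best alignment seen."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    best = starts
    for _ in range(steps):
        starts = metropolis_step(seqs, starts, width, rng)
        if alignment_score(seqs, starts, width) > alignment_score(seqs, best, width):
            best = starts
    return best
```

Running several such chains with different seeds corresponds loosely to the independent-chain (IMC) baseline described in the abstract; exchanging site distributions between chains is the part this sketch leaves out.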
Disparate requirements for the Walker A and B ATPase motifs of human RAD51D in homologous recombination.

PubMed

Wiese, Claudia; Hinz, John M; Tebbs, Robert S; Nham, Peter B; Urbin, Salustra S; Collins, David W; Thompson, Larry H; Schild, David

2006-01-01

In vertebrates, homologous recombinational repair (HRR) requires RAD51 and five RAD51 paralogs (XRCC2, XRCC3, RAD51B, RAD51C and RAD51D) that all contain conserved Walker A and B ATPase motifs. In human RAD51D we examined the requirement for these motifs in interactions with XRCC2 and RAD51C, and for survival of cells in response to DNA interstrand crosslinks (ICLs). Ectopic expression of wild-type human RAD51D or mutants having a non-functional A or B motif was used to test for complementation of a rad51d knockout hamster CHO cell line. Although A-motif mutants complement very efficiently, B-motif mutants do not. Consistent with these results, experiments using the yeast two- and three-hybrid systems show that the interactions between RAD51D and its XRCC2 and RAD51C partners also require a functional RAD51D B motif, but not motif A. Similarly, hamster Xrcc2 is unable to bind to the non-complementing human RAD51D B-motif mutants in co-immunoprecipitation assays. We conclude that a functional Walker B motif, but not A motif, is necessary for RAD51D's interactions with other paralogs and for efficient HRR. We present a model in which ATPase sites are formed in a bipartite manner between RAD51D and other RAD51 paralogs.
RSAT matrix-clustering: dynamic exploration and redundancy reduction of transcription factor binding motif collections.

PubMed

Castro-Mondragon, Jaime Abraham; Jaeger, Sébastien; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

2017-07-27

Transcription factor (TF) databases contain multitudes of binding motifs (TFBMs) from various sources, from which non-redundant collections are derived by manual curation. The advent of high-throughput methods stimulated the production of novel collections with increasing numbers of motifs. Meta-databases, built by merging these collections, contain redundant versions, because available tools are not suited to automatically identify and explore biologically relevant clusters among thousands of motifs. Motif discovery from genome-scale data sets (e.g. ChIP-seq) also produces redundant motifs, hampering the interpretation of results. We present matrix-clustering, a versatile tool that clusters similar TFBMs into multiple trees, and automatically creates non-redundant TFBM collections. A feature unique to matrix-clustering is its dynamic visualisation of aligned TFBMs, and its capability to simultaneously treat multiple collections from various sources. We demonstrate that matrix-clustering considerably simplifies the interpretation of combined results from multiple motif discovery tools, and highlights biologically relevant variations of similar motifs. We also ran a large-scale application to cluster ∼11 000 motifs from 24 entire databases, showing that matrix-clustering correctly groups motifs belonging to the same TF families, and drastically reduced motif redundancy. matrix-clustering is integrated within the RSAT suite (http://rsat.eu/), accessible through a user-friendly web interface or command-line for its integration in pipelines.
Encryption of agonistic motifs for TLR4 into artificial antigens augmented the maturation of antigen-presenting cells.

PubMed

Ito, Masaki; Hayashi, Kazumi; Minamisawa, Tamiko; Homma, Sadamu; Koido, Shigeo; Shiba, Kiyotaka

2017-01-01

Adjuvants are indispensable for achieving a sufficient immune response from vaccinations. From a functional viewpoint, adjuvants are classified into two categories: "physical adjuvants" increase the efficacy of antigen presentation by antigen-presenting cells (APC) and "signal adjuvants" induce the maturation of APC. Our previous study has demonstrated that a physical adjuvant can be encrypted into proteinous antigens by creating artificial proteins from combinatorial assemblages of epitope peptides and those peptide sequences having propensities to form certain protein structures (motif programming). However, the artificial antigens still require a signal adjuvant to maturate the APC; for example, co-administration of the Toll-like receptor 4 (TLR4) agonist monophosphoryl lipid A (MPLA) was required to induce an in vivo immunoreaction. In this study, we further modified the previous artificial antigens by appending the peptide motifs, which have been reported to have agonistic activity for TLR4, to create "adjuvant-free" antigens. The created antigens with triple TLR4 agonistic motifs in their C-terminus have activated NF-κB signaling pathways through TLR4. These proteins also induced the production of the inflammatory cytokine TNF-α, and the expression of the co-stimulatory molecule CD40 in APC, supporting the maturation of APC in vitro. Unexpectedly, these signal adjuvant-encrypted proteins have lost their ability to be physical adjuvants because they did not induce cytotoxic T lymphocytes (CTL) in vivo, while the parental proteins induced CTL. These results confirmed that the manifestation of a motif's function is context-dependent and simple addition does not always work for motif-programing. Further optimization of the molecular context of the TLR4 agonistic motifs in antigens should be required to create adjuvant-free antigens.
ssHMM: extracting intuitive sequence-structure motifs from high-throughput RNA-binding protein data

PubMed Central

Krestel, Ralf; Ohler, Uwe; Vingron, Martin; Marsico, Annalisa

2017-01-01

RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional regulation and recognize target RNAs via sequence-structure motifs. The extent to which RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. Existing RNA motif finders either take the structure of the RNA only partially into account, or employ models which are not directly interpretable as sequence-structure motifs. We developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and Gibbs sampling which fully captures the relationship between RNA sequence and secondary structure preference of a given RBP. Compared to previous methods which output separate logos for sequence and structure, it directly produces a combined sequence-structure motif when trained on a large set of sequences. ssHMM’s model is visualized intuitively as a graph and facilitates biological interpretation. ssHMM can be used to find novel bona fide sequence-structure motifs of uncharacterized RBPs, such as the one presented here for the YY1 protein. ssHMM reaches a high motif recovery rate on synthetic data, it recovers known RBP motifs from CLIP-Seq data, and scales linearly on the input size, being considerably faster than MEMERIS and RNAcontext on large datasets while being on par with GraphProt. It is freely available on Github and as a Docker image. PMID:28977546
QuadBase2: web server for multiplexed guanine quadruplex mining and visualization

PubMed Central

Dhapola, Parashar; Chowdhury, Shantanu

2016-01-01

DNA guanine quadruplexes or G4s are non-canonical DNA secondary structures which affect genomic processes like replication, transcription and recombination. G4s are computationally identified by specific nucleotide motifs which are also called putative G4 (PG4) motifs. Despite the general relevance of these structures, there is currently no tool available that allows batch queries and genome-wide analysis of these motifs in a user-friendly interface. QuadBase2 (quadbase.igib.res.in) presents a completely reinvented web server version of the previously published QuadBase database. QuadBase2 enables users to mine PG4 motifs in up to 178 eukaryotes through the EuQuad module. This module interfaces with the Ensembl Compara database to allow users to mine PG4 motifs in the orthologues of genes of interest across eukaryotes. PG4 motifs can be mined across genes and their promoter sequences in 1719 prokaryotes through the ProQuad module. This module includes a feature that allows genome-wide mining of PG4 motifs and their visualization as circular histograms. TetraplexFinder, the module for mining PG4 motifs in user-provided sequences, is now capable of handling up to 20 MB of data. QuadBase2 is a comprehensive PG4 motif mining tool that further expands the configurations and algorithms for mining PG4 motifs in a user-friendly way. PMID:27185890
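To make "identified by specific nucleotide motifs" concrete, here is a minimal sketch of a PG4 scan. The abstract does not state which pattern QuadBase2 uses; the pattern below (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) is a commonly used definition and is an assumption here, as is the function name `find_pg4`.

```python
import re

# Assumed PG4 definition: G{3+} N{1-7} G{3+} N{1-7} G{3+} N{1-7} G{3+}
PG4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)

def find_pg4(seq):
    """Return (start, end, text) for non-overlapping PG4 candidates on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in PG4_PATTERN.finditer(seq)]
```

This scans only the strand supplied; a C-rich pattern or the reverse complement would be needed to cover the opposite strand, and real tools typically expose the stem and loop lengths as parameters.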

CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats.

PubMed

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2007-07-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) constitute a particular family of tandem repeats found in a wide range of prokaryotic genomes (half of eubacteria and almost all archaea). They consist of a succession of highly conserved regions (DR) varying in size from 23 to 47 bp, separated by similarly sized unique sequences (spacer) of usually viral origin. A CRISPR cluster is flanked on one side by an AT-rich sequence called the leader and assumed to be a transcriptional promoter. Recent studies suggest that this structure represents a putative RNA-interference-based immune system. Here we describe CRISPRFinder, a web service offering tools to (i) detect CRISPRs including the shortest ones (one or two motifs); (ii) define DRs and extract spacers; (iii) get the flanking sequences to determine the leader; (iv) blast spacers against Genbank database and (v) check if the DR is found elsewhere in prokaryotic sequenced genomes. CRISPRFinder is freely accessible at http://crispr.u-psud.fr/Server/CRISPRfinder.php.
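The repeat-spacer layout described in this record can be illustrated with a naive exact-match scan. This is not CRISPRFinder's method; it only looks for identical k-mers that recur with a spacer of plausible length between them. The function name, the default `k`, and the spacer bounds are assumptions chosen to sit near the 23 to 47 bp range quoted in the abstract.

```python
def find_direct_repeats(seq, k=23, min_spacer=20, max_spacer=50):
    """Return (first_start, second_start) pairs of identical k-mers
    separated by a spacer whose length lies in [min_spacer, max_spacer]."""
    seq = seq.upper()
    positions = {}  # k-mer -> list of earlier start positions
    hits = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        for j in positions.get(kmer, []):
            spacer = i - (j + k)  # bases between the two copies
            if min_spacer <= spacer <= max_spacer:
                hits.append((j, i))
        positions.setdefault(kmer, []).append(i)
    return hits
```

Chaining consecutive pairs gives a candidate array, and the sequence between each pair is the candidate spacer. Real repeats often carry mismatches, so a practical detector would tolerate approximate matches and would merge the many overlapping k-mer hits that a repeat longer than `k` produces.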
Structural analysis of Notch-regulating Rumi reveals basis for pathogenic mutations

PubMed Central

Yu, Hongjun; Takeuchi, Hideyuki; Takeuchi, Megumi; Liu, Qun; Kantharia, Joshua; Haltiwanger, Robert S.; Li, Huilin

2016-01-01

Rumi O-glucosylates the EGF repeats of a growing list of proteins essential in metazoan development including Notch. Rumi is essential for Notch signaling, and Rumi dysregulation is linked to several human diseases. Despite Rumi’s critical roles, it is unknown how Rumi glucosylates a serine of many but not all EGF repeats. Here we report crystal structures of Drosophila Rumi as binary or ternary complexes with a folded EGF repeat and/or donor substrates. These structures provide insights into the catalytic mechanism, and show that Rumi recognizes structural signatures of the EGF motif, the U-shaped consensus sequence, C-X-S-X-(P/A)-C and a conserved hydrophobic region. We found that five Rumi mutations identified in cancers and Dowling-Degos disease are clustered around the enzyme active site and adversely affect its activity. Our study suggests that loss of Rumi activity may underlie these diseases, and the mechanistic insights may facilitate the development of modulators of Notch signaling. PMID:27428513
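The consensus sequence quoted above, C-X-S-X-(P/A)-C, translates directly into a regular expression. The sketch below is only a sequence-level scan under the assumption that X stands for any residue; it cannot capture the structural signatures or the hydrophobic region that the study reports as also being recognized. The function name is hypothetical.

```python
import re

# C-X-S-X-(P/A)-C, wrapped in a lookahead so overlapping sites are all reported.
CONSENSUS = re.compile(r"(?=(C.S.[PA]C))")

def find_rumi_consensus(protein):
    """Return (start, site) for each occurrence of the consensus in a protein sequence."""
    protein = protein.upper()
    return [(m.start(1), m.group(1)) for m in CONSENSUS.finditer(protein)]
```

A hit marks a candidate serine at offset 2 within the site; whether that serine is actually glucosylated depends on the folded EGF context described in the abstract.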
Constraints and consequences of the emergence of amino acid repeats in eukaryotic proteins.

PubMed

Chavali, Sreenivas; Chavali, Pavithra L; Chalancon, Guilhem; de Groot, Natalia Sanchez; Gemayel, Rita; Latysheva, Natasha S; Ing-Simmons, Elizabeth; Verstrepen, Kevin J; Balaji, Santhanam; Babu, M Madan

2017-09-01

Proteins with amino acid homorepeats have the potential to be detrimental to cells and are often associated with human diseases. Why, then, are homorepeats prevalent in eukaryotic proteomes? In yeast, homorepeats are enriched in proteins that are essential and pleiotropic and that buffer environmental insults. The presence of homorepeats increases the functional versatility of proteins by mediating protein interactions and facilitating spatial organization in a repeat-dependent manner. During evolution, homorepeats are preferentially retained in proteins with stringent proteostasis, which might minimize repeat-associated detrimental effects such as unregulated phase separation and protein aggregation. Their presence facilitates rapid protein divergence through accumulation of amino acid substitutions, which often affect linear motifs and post-translational-modification sites. These substitutions may result in rewiring protein interaction and signaling networks. Thus, homorepeats are distinct modules that are often retained in stringently regulated proteins. Their presence facilitates rapid exploration of the genotype-phenotype landscape of a population, thereby contributing to adaptation and fitness.
Development of Novel SSR Markers for Flax (Linum usitatissimum L.) Using Reduced-Representation Genome Sequencing.

PubMed

Wu, Jianzhong; Zhao, Qian; Wu, Guangwen; Zhang, Shuquan; Jiang, Tingbo

2016-01-01

Flax (Linum usitatissimum L.) is a major fiber and oil yielding crop grown in northeastern China. Identification of flax molecular markers is a key step toward improving flax yield and quality via marker-assisted breeding. Simple sequence repeat (SSR) markers, which are based on genomic structural variation, are considered the most valuable type of genetic marker for this purpose. In this study, we screened 1574 microsatellites from Linum usitatissimum L. obtained using reduced representation genome sequencing (RRGS) to systematically identify SSR markers. The resulting set of microsatellites consisted mainly of trinucleotide (56.10%) and dinucleotide (35.23%) repeats, with each motif consisting of 5-8 repeats. We then evaluated marker sensitivity and specificity based on samples of 48 flax isolates obtained from northeastern China. Using the new SSR panel, the results demonstrated that fiber flax and oilseed flax varieties clustered into two well separated groups. The novel SSR markers developed in this study show potential value for selection of varieties for use in flax breeding programs.
Genetic Fingerprinting Using Microsatellite Markers in a Multiplex PCR Reaction: A Compilation of Methodological Approaches from Primer Design to Detection Systems.

PubMed

Krüger, Jacqueline; Schleinitz, Dorit

2017-01-01

Microsatellites are polymorphic DNA loci comprising repeated sequence motifs of two to five base pairs which are dispersed throughout the genome. Genotyping of microsatellites is a widely accepted tool for diagnostic and research purposes such as forensic investigations and parentage testing, but also in clinics (e.g. monitoring of bone marrow transplantation), as well as for the agriculture and food industries. The co-amplification of several short tandem repeat (STR) systems in a multiplex reaction with simultaneous detection helps to obtain more information from a DNA sample where its availability may be limited. Here, we introduce and describe this commonly used genotyping technique, providing an overview on available resources on STRs, multiplex design, and analysis.
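The definition given here (tandem repeats of a 2 to 5 bp motif) can be sketched as a short regular-expression scan. The minimum repeat count of 5, the exclusion of homopolymer runs, and the function name are assumptions for illustration; genotyping pipelines use dedicated tools with their own thresholds.

```python
import re

def find_microsatellites(seq, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats
    of a 2-5 bp motif occurring at least min_repeats times."""
    # Lazy group prefers the shortest motif; the backreference demands further copies.
    pattern = re.compile(r"([ACGT]{2,5}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((m.start(), motif, len(m.group()) // len(motif)))
    return hits
```

For example, a sequence containing `AC` six times in a row followed later by `ATG` five times yields one dinucleotide and one trinucleotide hit. Imperfect or compound repeats are not detected by this pattern.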
Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

PubMed

Bonatelli, Isabel A S; Carstens, Bryan C; Moraes, Evandro M

2015-01-01

Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms.
Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae)

PubMed Central

Bonatelli, Isabel A. S.; Carstens, Bryan C.; Moraes, Evandro M.

2015-01-01

Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms. PMID:26561396
The Evolution of Dark Matter in the Mitogenome of Seed Beetles

PubMed Central

Sayadi, Ahmed; Immonen, Elina; Tellgren-Roth, Christian

2017-01-01

Animal mitogenomes are generally thought of as being economic and optimized for rapid replication and transcription. We use long-read sequencing technology to assemble the remarkable mitogenomes of four species of seed beetles. These are the largest circular mitogenomes ever assembled in insects, ranging from 24,496 to 26,613 bp in total length, and are exceptional in that some 40% consists of non-coding DNA. The size expansion is due to two very long intergenic spacers (LIGSs), rich in tandem repeats. The two LIGSs are present in all species but vary greatly in length (114–10,408 bp), show very low sequence similarity, divergent tandem repeat motifs, a very high AT content and concerted length evolution. The LIGSs have been retained for at least some 45 my but must have undergone repeated reductions and expansions, despite strong purifying selection on protein coding mtDNA genes. The LIGSs are located in two intergenic sites where a few recent studies of insects have also reported shorter LIGSs (>200 bp). These sites may represent spaces that tolerate neutral repeat array expansions or, alternatively, the LIGSs may function to allow a more economic translational machinery. Mitochondrial respiration in adult seed beetles is based almost exclusively on fatty acids, which reduces the need for building complex I of the oxidative phosphorylation pathway (NADH dehydrogenase). One possibility is thus that the LIGSs may allow depressed transcription of NAD genes. RNA sequencing showed that LIGSs are partly transcribed and transcriptional profiling suggested that all seven mtDNA NAD genes indeed show low levels of transcription and co-regulation of transcription across sexes and tissues. PMID:29048527
The origin and evolution of human glutaminases and their atypical C-terminal ankyrin repeats

DOE PAGES

Pasquali, Camila Cristina; Islam, Zeyaul; Adamoski, Douglas; ...

2017-05-19

On the basis of tissue-specific enzyme activity and inhibition by catalytic products, Hans Krebs first demonstrated the existence of multiple glutaminases in mammals. Currently, two human genes are known to encode at least four glutaminase isoforms. But, the phylogeny of these medically relevant enzymes remains unclear, prompting us to investigate their origin and evolution. Using prokaryotic and eukaryotic glutaminase sequences, we built a phylogenetic tree whose topology suggested that the multidomain architecture was inherited from bacterial ancestors, probably simultaneously with the hosting of the proto-mitochondrion endosymbiont. We propose an evolutionary model wherein the appearance of the most active enzyme isoform, glutaminase C (GAC), which is expressed in many cancers, was a late retrotransposition event that occurred in fishes from the Chondrichthyes class. The ankyrin (ANK) repeats in the glutaminases were acquired early in their evolution. In order to obtain information on ANK folding, we solved two high-resolution structures of the ANK repeat-containing C termini of both kidney-type glutaminase (KGA) and GLS2 isoforms (glutaminase B and liver-type glutaminase). We also found that the glutaminase ANK repeats form unique intramolecular contacts through two highly conserved motifs; curiously, this arrangement occludes a region usually involved in ANK-mediated protein-protein interactions. We also solved the crystal structure of full-length KGA and present a small-angle X-ray scattering model for full-length GLS2. These structures explain these proteins' compromised ability to assemble into catalytically active supra-tetrameric filaments, as previously shown for GAC. Collectively, these results provide information about glutaminases that may aid in the design of isoform-specific glutaminase inhibitors.
The origin and evolution of human glutaminases and their atypical C-terminal ankyrin repeats

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pasquali, Camila Cristina; Islam, Zeyaul; Adamoski, Douglas

On the basis of tissue-specific enzyme activity and inhibition by catalytic products, Hans Krebs first demonstrated the existence of multiple glutaminases in mammals. Currently, two human genes are known to encode at least four glutaminase isoforms. However, the phylogeny of these medically relevant enzymes remains unclear, prompting us to investigate their origin and evolution. Using prokaryotic and eukaryotic glutaminase sequences, we built a phylogenetic tree whose topology suggested that the multidomain architecture was inherited from bacterial ancestors, probably simultaneously with the hosting of the proto-mitochondrion endosymbiont. We propose an evolutionary model wherein the appearance of the most active enzyme isoform, glutaminase C (GAC), which is expressed in many cancers, was a late retrotransposition event that occurred in fishes from the Chondrichthyes class. The ankyrin (ANK) repeats in the glutaminases were acquired early in their evolution. In order to obtain information on ANK folding, we solved two high-resolution structures of the ANK repeat-containing C termini of both kidney-type glutaminase (KGA) and GLS2 isoforms (glutaminase B and liver-type glutaminase). We also found that the glutaminase ANK repeats form unique intramolecular contacts through two highly conserved motifs; curiously, this arrangement occludes a region usually involved in ANK-mediated protein-protein interactions. We also solved the crystal structure of full-length KGA and present a small-angle X-ray scattering model for full-length GLS2. These structures explain these proteins' compromised ability to assemble into catalytically active supra-tetrameric filaments, as previously shown for GAC. Collectively, these results provide information about glutaminases that may aid in the design of isoform-specific glutaminase inhibitors.
Self-organisation of an oligodeoxynucleotide containing the G- and C-rich stretches of the direct repeats of the human mitochondrial DNA.

PubMed

Nonin-Lecomte, Sylvie; Dardel, Frédéric; Lestienne, Patrick

2005-08-01

Stretches of cytosines and guanosines have been shown in vitro to adopt non-canonical structures known as i-motifs and G-quartets, respectively. When combined, such sequences are expected to either retain their structure or form duplexes or triple helices. All these structures may occur in vivo whenever the sequence criteria are met. Such stretches are present in the circular genome of human mitochondria, as two 10 nucleotide-long perfect tandem direct repeats (DR1 and DR2). The DR1 and DR2 repeats are G-rich on the heavy strand and C-rich on the light strand. Previous results suggested that during replication, transient formation of a parallel GGC triple helix between the neo-synthesised G-rich DR1 and the double-stranded homologous DR2 could be involved in a rearrangement process leading to genome instability. In order to get structural insights into the interaction between the two repeats, we have studied by nuclear magnetic resonance (NMR) the assembly properties of a 24-mer oligodeoxyribonucleotide in which the C- and G-rich segments of the DRs are covalently tethered by a TTTT linker. We show here that this 24-mer self-associates into a triplex-containing symmetrical tetramer. The core of the structure is composed of anti-parallel Watson-Crick (WC) base pairs. Two additional strands are hydrogen-bonded to the Hoogsteen side of the Gs, thus forming CGC(+) triple helices, with G-rich ends folding into G-quartets. These results suggest that such structures could occur when the two DRs are put to close proximity in a biological context.
Different distribution of Helicobacter pylori EPIYA-cagA motifs and dupA genes in the upper gastrointestinal diseases and correlation with clinical outcomes in Iranian patients.

PubMed

Haddadi, Mohammad Hossein; Bazargani, Abdollah; Khashei, Reza; Fattahi, Mohammad Reza; Bagheri Lankarani, Kamran; Moini, Maryam; Rokni Hosseini, Seyed Mohammad Hossein

2015-01-01

Our aim was to determine the EPIYA-cagA phosphorylation sites and dupA gene in H. pylori isolates among patients with upper gastrointestinal diseases. Pathogenicity of the cagA-positive Helicobacter pylori is associated with EPIYA motifs, and a higher number of EPIYA-C segments is a risk factor for gastric cancer, while duodenal ulcer-promoting gene (dupA) is determined as a protective factor against gastric cancer. A total of 280 non-repeated gastric biopsies were obtained from patients undergoing endoscopy from January 2013 till July 2013. Samples were cultured on selective horse blood agar and incubated in microaerophilic atmosphere. The isolated organisms were identified as H. pylori by Gram staining and positive oxidase, catalase, and urease tests. Various motif types of cagA and the prevalence of dupA were determined by PCR method. Out of 280 specimens, 128 (54.7%) isolated organisms were identified as H. pylori. Of 120 H. pylori isolates, 35.9% were dupA positive and 56.26% were cagA positive, while cagA with ABC and ABCC motifs were 55.5% and 44.5%, respectively. Fifty-six percent of the isolates with the ABCC motif had dupA genes. We also found a significant association between strains with genotypes of dupA-ABC and duodenal ulcer disease (p = 0.007). The results of this study showed that the prevalence of cagA-positive H. pylori in Shiraz was as high as in western countries and higher numbers of EPIYA-C segments were seen in gastric cancer patients. We may also use dupA as a prognostic and pathogenic marker for duodenal ulcer disease and cagA with the segment C for gastric cancer and gastric ulcer disease in this region.
Different distribution of Helicobacter pylori EPIYA-cagA motifs and dupA genes in the upper gastrointestinal diseases and correlation with clinical outcomes in Iranian patients

PubMed Central

Haddadi, Mohammad Hossein; Bazargani, Abdollah; Khashei, Reza; Fattahi, Mohammad Reza; Bagheri Lankarani, Kamran; Moini, Maryam; Rokni Hosseini, Seyed Mohammad Hossein

2015-01-01

Aim: Our aim was to determine the EPIYA-cagA phosphorylation sites and dupA gene in H. pylori isolates among patients with upper gastrointestinal diseases. Background: Pathogenicity of the cagA-positive Helicobacter pylori is associated with EPIYA motifs, and a higher number of EPIYA-C segments is a risk factor for gastric cancer, while duodenal ulcer-promoting gene (dupA) is determined as a protective factor against gastric cancer. Patients and methods: A total of 280 non-repeated gastric biopsies were obtained from patients undergoing endoscopy from January 2013 till July 2013. Samples were cultured on selective horse blood agar and incubated in microaerophilic atmosphere. The isolated organisms were identified as H. pylori by Gram staining and positive oxidase, catalase, and urease tests. Various motif types of cagA and the prevalence of dupA were determined by PCR method. Results: Out of 280 specimens, 128 (54.7%) isolated organisms were identified as H. pylori. Of 120 H. pylori isolates, 35.9% were dupA positive and 56.26% were cagA positive, while cagA with ABC and ABCC motifs were 55.5% and 44.5%, respectively. Fifty-six percent of the isolates with the ABCC motif had dupA genes. We also found a significant association between strains with genotypes of dupA-ABC and duodenal ulcer disease (p = 0.007). Conclusion: The results of this study showed that the prevalence of cagA-positive H. pylori in Shiraz was as high as in western countries and higher numbers of EPIYA-C segments were seen in gastric cancer patients. We may also use dupA as a prognostic and pathogenic marker for duodenal ulcer disease and cagA with the segment C for gastric cancer and gastric ulcer disease in this region. PMID:26171136
Ser/Thr Motifs in Transmembrane Proteins: Conservation Patterns and Effects on Local Protein Structure and Dynamics

PubMed Central

del Val, Coral; White, Stephen H.

2014-01-01

We combined systematic bioinformatics analyses and molecular dynamics simulations to assess the conservation patterns of Ser and Thr motifs in membrane proteins, and the effect of such motifs on the structure and dynamics of α-helical transmembrane (TM) segments. We find that Ser/Thr motifs are often present in β-barrel TM proteins. At least one Ser/Thr motif is present in almost half of the sequences of α-helical proteins analyzed here. The extensive bioinformatics analyses and inspection of protein structures led to the identification of molecular transporters with noticeable numbers of Ser/Thr motifs within the TM region. Given the energetic penalty for burying multiple Ser/Thr groups in the membrane hydrophobic core, the observation of transporters with multiple membrane-embedded Ser/Thr is intriguing and raises the question of how the presence of multiple Ser/Thr affects protein local structure and dynamics. Molecular dynamics simulations of four different Ser-containing model TM peptides indicate that backbone hydrogen bonding of membrane-buried Ser/Thr hydroxyl groups can significantly change the local structure and dynamics of the helix. Ser groups located close to the membrane interface can hydrogen bond to solvent water instead of protein backbone, leading to an enhanced local solvation of the peptide. PMID:22836667
Crystallographic and Computational Studies of a Class II MHC Complex with a Nonconforming Peptide: HLA-DRA/DRB3*0101

NASA Astrophysics Data System (ADS)

Parry, Christian S.; Gorski, Jack; Stern, Lawrence J.

2003-03-01

The stable binding of processed foreign peptide to a class II major histocompatibility (MHC) molecule and subsequent presentation to a T cell receptor is a central event in immune recognition and regulation. Polymorphic residues on the floor of the peptide binding site form pockets that anchor peptide side chains. These and other residues in the helical wall of the groove determine the specificity of each allele and define a motif. Allele-specific motifs allow the prediction of epitopes from the sequence of pathogens. There are, however, known epitopes that do not satisfy these motifs: anchor motifs are not adequate for predicting epitopes as there are apparently major and minor motifs. We present crystallographic studies into the nature of the interactions that govern the binding of these so-called nonconforming peptides. We would like to understand the role of the P10 pocket and find out whether the peptides that do not obey the consensus anchor motif bind in the canonical conformation observed in prior structures of class II MHC-peptide complexes. HLA-DRB3*0101 complexed with peptide crystallized in unit cell 92.10 x 92.10 x 248.30 (90, 90, 90), P41212, and the diffraction data is reliable to 2.2 Å. We are complementing our studies with dynamical long time simulations to answer these questions, particularly the interplay of the anchor motifs in peptide binding, the range of protein and ligand conformations, and water hydration structures.
Onco-Regulon: an integrated database and software suite for site specific targeting of transcription factors of cancer genes

PubMed Central

Tomar, Navneet; Mishra, Akhilesh; Mrinal, Nirotpal; Jayaram, B.

2016-01-01

Transcription factors (TFs) bind at multiple sites in the genome and regulate expression of many genes. Regulating TF binding in a gene-specific manner remains a formidable challenge in drug discovery because the same binding motif may be present at multiple locations in the genome. Here, we present Onco-Regulon (http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm), an integrated database of regulatory motifs of cancer genes combined with Unique Sequence-Predictor (USP), a software suite that identifies unique sequences for each of these regulatory DNA motifs at the specified position in the genome. USP works by extending a given DNA motif in the 5′→3′ direction, the 3′→5′ direction, or both, adding one nucleotide at each step, and calculates the frequency of each extended motif in the genome with the Frequency Counter programme. This step is iterated until the frequency of the extended motif becomes unity in the genome. Thus, for each given motif, we get three possible unique sequences. The Closest Sequence Finder program predicts off-target drug binding in the genome. Inclusion of DNA-protein structural information further makes Onco-Regulon a highly informative repository for gene-specific drug development. We believe that Onco-Regulon will help researchers to design drugs which will bind to an exclusive site in the genome with no off-target effects, theoretically. Database URL: http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm PMID:27515825
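The extension step described in this abstract can be illustrated with a minimal sketch. It is not the USP implementation: the function names `count_occurrences` and `extend_until_unique` are hypothetical, the genome is assumed to be a plain forward-strand string, and reverse-complement matches are ignored for simplicity.

```python
def count_occurrences(genome: str, pattern: str) -> int:
    """Count (possibly overlapping) occurrences of pattern in genome."""
    count = 0
    start = genome.find(pattern)
    while start != -1:
        count += 1
        start = genome.find(pattern, start + 1)
    return count


def extend_until_unique(genome: str, start: int, end: int, direction: str = "right"):
    """Grow genome[start:end] one base per step until it occurs exactly once.

    direction is "right" (5'->3'), "left" (3'->5') or "both".
    Returns the unique sequence, or None if the genome edge is reached first.
    Hypothetical helper for illustration only.
    """
    while count_occurrences(genome, genome[start:end]) > 1:
        grew = False
        if direction in ("left", "both") and start > 0:
            start -= 1
            grew = True
        if direction in ("right", "both") and end < len(genome):
            end += 1
            grew = True
        if not grew:
            return None  # cannot extend any further in the requested direction
    return genome[start:end]


genome = "ACGTTACGAAACGTG"  # toy sequence; "ACG" occurs three times
print(extend_until_unique(genome, 5, 8, "right"))
print(extend_until_unique(genome, 5, 8, "left"))
print(extend_until_unique(genome, 5, 8, "both"))
```

Running the three directions on the same starting occurrence gives three candidate unique sequences, which matches the abstract's remark that each motif yields three possible unique sequences.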
SARNAclust: Semi-automatic detection of RNA protein binding motifs from immunoprecipitation data

PubMed Central

Dotu, Ivan; Adamson, Scott I.; Coleman, Benjamin; Fournier, Cyril; Ricart-Altimiras, Emma; Eyras, Eduardo

2018-01-01

RNA-protein binding is critical to gene regulation, controlling fundamental processes including splicing, translation, localization and stability, and aberrant RNA-protein interactions are known to play a role in a wide variety of diseases. However, molecular understanding of RNA-protein interactions remains limited; in particular, identification of RNA motifs that bind proteins has long been challenging, especially when such motifs depend on both sequence and structure. Moreover, although RNA binding proteins (RBPs) often contain more than one binding domain, algorithms capable of identifying more than one binding motif simultaneously have not been developed. In this paper we present a novel pipeline to determine binding peaks in crosslinking immunoprecipitation (CLIP) data, to discover multiple possible RNA sequence/structure motifs among them, and to experimentally validate such motifs. At the core is a new semi-automatic algorithm SARNAclust, the first unsupervised method to identify and deconvolve multiple sequence/structure motifs simultaneously. SARNAclust computes similarity between sequence/structure objects using a graph kernel, providing the ability to isolate the impact of specific features through the bulge graph formalism. Application of SARNAclust to synthetic data shows its capability of clustering 5 motifs at once with a V-measure value of over 0.95, while GraphClust achieves only a V-measure of 0.083 and RNAcontext cannot detect any of the motifs. When applied to existing eCLIP sets, SARNAclust finds known motifs for SLBP and HNRNPC and novel motifs for several other RBPs such as AGGF1, AKAP8L and ILF3. We demonstrate an experimental validation protocol, a targeted Bind-n-Seq-like high-throughput sequencing approach that relies on RNA inverse folding for oligo pool design, that can validate the components within the SLBP motif. Finally, we use this protocol to experimentally interrogate the SARNAclust motif predictions for protein ILF3. 
Our results support a newly identified partially double-stranded UUUUUGAGA motif similar to that known for the splicing factor HNRNPC. PMID:29596423
Membrane cofactor protein (CD46) is a keratinocyte receptor for the M protein of the group A streptococcus.

PubMed

Okada, N; Liszewski, M K; Atkinson, J P; Caparon, M

1995-03-28

The pathogenic Gram-positive bacterium Streptococcus pyogenes (group A streptococcus) is the causative agent of numerous suppurative diseases of human skin. The M protein of S. pyogenes mediates the adherence of the bacterium to keratinocytes, the most numerous cell type in the epidermis. In this study, we have constructed and analyzed a series of mutant M proteins and have shown that the C repeat domain of the M molecule is responsible for cell recognition. The binding of factor H, a serum regulator of complement activation, to the C repeat region of M protein blocked bacterial adherence. Factor H is a member of a large family of complement regulatory proteins that share a homologous structural motif termed the short consensus repeat. Membrane cofactor protein (MCP), or CD46, is a short consensus repeat-containing protein found on the surface of keratinocytes, and purified MCP could competitively inhibit the adherence of S. pyogenes to these cells. Furthermore, the M protein was found to bind directly to MCP, whereas mutant M proteins that lacked the C repeat domain did not bind MCP, suggesting that recognition of MCP plays an important role in the ability of the streptococcus to adhere to keratinocytes.
Automatic annotation of protein motif function with Gene Ontology terms.

PubMed

Lu, Xinghua; Zhai, Chengxiang; Gopalakrishnan, Vanathi; Buchanan, Bruce G

2004-09-02

Conserved protein sequence motifs are short stretches of amino acid sequence patterns that potentially encode the function of proteins. Several sequence pattern searching algorithms and programs exist for identifying candidate protein motifs at the whole genome level. However, a much-needed and important task is to determine the functions of the newly identified protein motifs. The Gene Ontology (GO) project is an endeavor to annotate the function of genes or protein sequences with terms from a dynamic, controlled vocabulary and these annotations serve well as a knowledge base. This paper presents methods to mine the GO knowledge base and use the association between the GO terms assigned to a sequence and the motifs matched by the same sequence as evidence for predicting the functions of novel protein motifs automatically. The task of assigning GO terms to protein motifs is viewed as both a binary classification and information retrieval problem, where PROSITE motifs are used as samples for model training and functional prediction. The mutual information of a motif and a GO term association is found to be a very useful feature. We take advantage of the known motifs to train a logistic regression classifier, which allows us to combine mutual information with other frequency-based features and obtain a probability of correct association. The trained logistic regression model has intuitively meaningful and logically plausible parameter values, and performs very well empirically according to our evaluation criteria. In this research, different methods for automatic annotation of protein motifs have been investigated. Empirical results demonstrated that the methods have a great potential for detecting and augmenting information about the functions of newly discovered candidate protein motifs.
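As a rough illustration of the mutual-information feature mentioned above, the sketch below computes the standard mutual information between two binary events (a sequence matches a motif; a sequence carries a GO term) from a 2x2 table of sequence counts. The abstract does not give the authors' exact formulation, so the function name `mutual_information` and the count-table interface are assumptions.

```python
import math


def mutual_information(n11: int, n10: int, n01: int, n00: int) -> float:
    """Mutual information in bits between two binary events.

    n11: sequences with the motif and the GO term
    n10: sequences with the motif only
    n01: sequences with the GO term only
    n00: sequences with neither
    """
    n = n11 + n10 + n01 + n00
    table = {(1, 1): n11, (1, 0): n10, (0, 1): n01, (0, 0): n00}
    p_motif = {1: (n11 + n10) / n, 0: (n01 + n00) / n}
    p_term = {1: (n11 + n01) / n, 0: (n10 + n00) / n}
    mi = 0.0
    for (m, g), count in table.items():
        if count == 0:
            continue  # a zero cell contributes nothing (0 * log 0 is taken as 0)
        p_joint = count / n
        mi += p_joint * math.log2(p_joint / (p_motif[m] * p_term[g]))
    return mi


print(mutual_information(25, 25, 25, 25))  # motif and term independent
print(mutual_information(50, 0, 0, 50))    # motif and term always co-occur
```

A value near zero means that knowing the motif says little about the GO term, while larger values indicate a stronger association that a downstream classifier could use as a feature.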
Simultaneously learning DNA motif along with its position and sequence rank preferences through expectation maximization algorithm.

PubMed

Zhang, ZhiZhuo; Chang, Cheng Wei; Hugo, Willy; Cheung, Edwin; Sung, Wing-Kin

2013-03-01

Although de novo motifs can be discovered through mining over-represented sequence patterns, this approach misses some real motifs and generates many false positives. To improve accuracy, one solution is to consider some additional binding features (i.e., position preference and sequence rank preference). This information is usually required from the user. This article presents a de novo motif discovery algorithm called SEME (sampling with expectation maximization for motif elicitation), which uses pure probabilistic mixture model to model the motif's binding features and uses expectation maximization (EM) algorithms to simultaneously learn the sequence motif, position, and sequence rank preferences without asking for any prior knowledge from the user. SEME is both efficient and accurate thanks to two important techniques: the variable motif length extension and importance sampling. Using 75 large-scale synthetic datasets, 32 metazoan compendium benchmark datasets, and 164 chromatin immunoprecipitation sequencing (ChIP-Seq) libraries, we demonstrated the superior performance of SEME over existing programs in finding transcription factor (TF) binding sites. SEME is further applied to a more difficult problem of finding the co-regulated TF (coTF) motifs in 15 ChIP-Seq libraries. It identified significantly more correct coTF motifs and, at the same time, predicted coTF motifs with better matching to the known motifs. Finally, we show that the learned position and sequence rank preferences of each coTF reveals potential interaction mechanisms between the primary TF and the coTF within these sites. Some of these findings were further validated by the ChIP-Seq experiments of the coTFs. The application is available online.

Analyses of carnivore microsatellites and their intimate association with tRNA-derived SINEs.

PubMed

López-Giráldez, Francesc; Andrés, Olga; Domingo-Roura, Xavier; Bosch, Montserrat

2006-10-23

The popularity of microsatellites has greatly increased in the last decade on account of their many applications. However, little is currently understood about the factors that influence their genesis and distribution among and within species genomes. In this work, we analyzed carnivore microsatellite clones from GenBank to study their association with interspersed repeats and elucidate the role of the latter in microsatellite genesis and distribution. We constructed a comprehensive carnivore microsatellite database comprising 1236 clones from GenBank. Thirty-three species of 11 out of 12 carnivore families were represented, although two distantly related species, the domestic dog and cat, were clearly overrepresented. Of these clones, 330 contained tRNALys-derived SINEs and 357 contained other interspersed repeats. Our rough estimates of tRNA SINE copies per haploid genome were much higher than published ones. Our results also revealed a distinct juxtaposition of AG and A-rich repeats and tRNALys-derived SINEs suggesting their coevolution. Both microsatellites arose repeatedly in two regions of the interspersed repeat. Moreover, microsatellites associated with tRNALys-derived SINEs showed the highest complexity and less potential instability. Our results suggest that tRNALys-derived SINEs are a significant source for microsatellite generation in carnivores, especially for AG and A-rich repeat motifs. These observations indicate two modes of microsatellite generation: the expansion and variation of pre-existing tandem repeats and the conversion of sequences with high cryptic simplicity into a repeat array; mechanisms which are not specific to tRNALys-derived SINEs. 
Microsatellite and interspersed repeat coevolution could also explain the different distribution of repeat types among and within species genomes. Finally, because microsatellites associated with tRNALys-derived SINEs have higher complexity and lower potential informative content, we recommend avoiding their use as genetic markers.
The multiple roles of epidermal growth factor repeat O-glycans in animal development

PubMed Central

Haltom, Amanda R; Jafar-Nejad, Hamed

2015-01-01

The epidermal growth factor (EGF)-like repeat is a common, evolutionarily conserved motif found in secreted proteins and the extracellular domain of transmembrane proteins. EGF repeats harbor six cysteine residues which form three disulfide bonds and help generate the three-dimensional structure of the EGF repeat. A subset of EGF repeats harbor consensus sequences for the addition of one or more specific O-glycans, which are initiated by O-glucose, O-fucose or O-N-acetylglucosamine. These glycans are relatively rare compared to mucin-type O-glycans. However, genetic experiments in model organisms and cell-based assays indicate that at least some of the glycosyltransferases involved in the addition of O-glycans to EGF repeats play important roles in animal development. These studies, combined with state-of-the-art biochemical and structural biology experiments have started to provide an in-depth picture of how these glycans regulate the function of the proteins to which they are linked. In this review, we will discuss the biological roles assigned to EGF repeat O-glycans and the corresponding glycosyltransferases. Since Notch receptors are the best studied proteins with biologically-relevant O-glycans on EGF repeats, a significant part of this review is devoted to the role of these glycans in the regulation of the Notch signaling pathway. We also discuss recently identified proteins other than Notch which depend on EGF repeat glycans to function properly. Several glycosyltransferases involved in the addition or elongation of O-glycans on EGF repeats are mutated in human diseases. Therefore, mechanistic understanding of the functional roles of these carbohydrate modifications is of interest from both basic science and translational perspectives. PMID:26175457
cWINNOWER Algorithm for Finding Fuzzy DNA Motifs

NASA Technical Reports Server (NTRS)

Liang, Shoudan

2003-01-01

The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if multiple mutated copies of the motif (i.e., the signals) are present in the DNA sequence in sufficient abundance. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum number of detectable motifs qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc, by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12000 for (l,d) = (15,4).
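The definition of a signal used here (a window of length l that differs from the motif in at most d positions) can be sketched as follows. This only illustrates the definition and is not the cWINNOWER clique-based search; the names `hamming` and `find_signals` are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_signals(sequence: str, motif: str, d: int) -> list:
    """Start positions of windows within d substitutions of the motif."""
    l = len(motif)
    return [
        i
        for i in range(len(sequence) - l + 1)
        if hamming(sequence[i:i + l], motif) <= d
    ]


sequence = "TTACGTAAACCTAA"
print(find_signals(sequence, "ACGT", 0))  # exact copies only
print(find_signals(sequence, "ACGT", 1))  # copies with up to one mutation
```

In the actual motif-finding problem the motif is unknown, so an algorithm such as cWINNOWER must infer it from the signals rather than scan for a given pattern as this sketch does.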
Discovering Motifs in Biological Sequences Using the Micron Automata Processor.

PubMed

Roy, Indranil; Aluru, Srinivas

2016-01-01

Finding approximately conserved sequences, called motifs, across multiple DNA or protein sequences is an important problem in computational biology. In this paper, we consider the (l, d) motif search problem of identifying one or more motifs of length l present in at least q of the n given sequences, with each occurrence differing from the motif in at most d substitutions. The problem is known to be NP-complete, and the largest solved instance reported to date is (26,11). We propose a novel algorithm for the (l, d) motif search problem using streaming execution over a large set of non-deterministic finite automata (NFA). This solution is designed to take advantage of the Micron Automata Processor, a new technology close to deployment that can simultaneously execute multiple NFA in parallel. We demonstrate the capability for solving much larger instances of the (l, d) motif search problem using the resources available within a single automata processor board, by estimating run-times for problem instances (39,18) and (40,17). The paper serves as a useful guide to solving problems using this new accelerator technology.
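To make the problem statement concrete, here is a brute-force sketch of the (l, d) motif search with quorum q. It enumerates every candidate motif over the alphabet, so it is exponential in l and only usable on toy inputs; it is not the automata-based method of the paper, and the function names are hypothetical.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(seq: str, motif: str, d: int) -> bool:
    """True if some window of seq is within d substitutions of motif."""
    l = len(motif)
    return any(
        hamming(seq[i:i + l], motif) <= d for i in range(len(seq) - l + 1)
    )


def ld_motif_search(sequences, l: int, d: int, q: int, alphabet: str = "ACGT"):
    """All length-l motifs that occur (within d substitutions) in >= q sequences."""
    hits = []
    for letters in product(alphabet, repeat=l):
        motif = "".join(letters)
        support = sum(occurs_within(s, motif, d) for s in sequences)
        if support >= q:
            hits.append(motif)
    return hits


seqs = ["TTACGT", "ACGAAA", "GGACCT"]
print(ld_motif_search(seqs, l=4, d=1, q=3))
```

The candidate space grows as 4^l for DNA, which is one way to see why instances such as (26,11) are considered hard and why specialised hardware is attractive.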
Structure and Dynamics of RNA Repeat Expansions That Cause Huntington's Disease and Myotonic Dystrophy Type 1.

PubMed

Chen, Jonathan L; VanEtten, Damian M; Fountain, Matthew A; Yildirim, Ilyas; Disney, Matthew D

2017-07-11

RNA repeat expansions cause a host of incurable, genetically defined diseases. The most common class of RNA repeats consists of trinucleotide repeats. These long, repeating transcripts fold into hairpins containing 1 × 1 internal loops that can mediate disease via a variety of mechanism(s) in which RNA is the central player. Two of these disorders are Huntington's disease and myotonic dystrophy type 1, which are caused by r(CAG) and r(CUG) repeats, respectively. We report the structures of two RNA constructs containing three copies of a r(CAG) [r(3×CAG)] or r(CUG) [r(3×CUG)] motif that were modeled with nuclear magnetic resonance spectroscopy and simulated annealing with restrained molecular dynamics. The 1 × 1 internal loops of r(3×CAG) are stabilized by one-hydrogen bond (cis Watson-Crick/Watson-Crick) AA pairs, while those of r(3×CUG) prefer one- or two-hydrogen bond (cis Watson-Crick/Watson-Crick) UU pairs. Assigned chemical shifts for the residues depended on the identity of neighbors or next nearest neighbors. Additional insights into the dynamics of these RNA constructs were gained by molecular dynamics simulations and a discrete path sampling method. Results indicate that the global structures of the RNA are A-form and that the loop regions are dynamic. The results will be useful for understanding the dynamic trajectory of these RNA repeats but also may aid in the development of therapeutics.
Development and Molecular Characterization of Novel Polymorphic Genomic DNA SSR Markers in Lentinula edodes.

PubMed

Moon, Suyun; Lee, Hwa-Yong; Shim, Donghwan; Kim, Myungkil; Ka, Kang-Hyeon; Ryoo, Rhim; Ko, Han-Gyu; Koo, Chang-Duck; Chung, Jong-Wook; Ryu, Hojin

2017-06-01

Sixteen genomic DNA simple sequence repeat (SSR) markers of Lentinula edodes were developed from 205 SSR motifs present in 46.1-Mb long L. edodes genome sequences. The number of alleles ranged from 3 to 14 and the major allele frequency ranged from 0.17 to 0.96. The values of observed and expected heterozygosity ranged from 0.00 to 0.76 and from 0.07 to 0.90, respectively. The polymorphic information content value ranged from 0.07 to 0.89. A dendrogram, based on 16 SSR markers clustered by the paired hierarchical clustering method, showed that 33 shiitake cultivars could be divided into three major groups and successfully identified. These SSR markers will contribute to the efficient breeding of this species by providing diversity in shiitake varieties. Furthermore, the genomic information covered by the markers can provide a valuable resource for genetic linkage map construction, molecular mapping, and marker-assisted selection in the shiitake mushroom.
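The marker statistics reported above can be computed from allele frequencies at a locus. The sketch below uses the common textbook definitions, expected heterozygosity He = 1 - sum(p_i^2) and the Botstein-style polymorphic information content PIC = He - sum over i<j of 2 p_i^2 p_j^2. The abstract does not state which estimators the authors used, so treat these formulas and the function names as assumptions.

```python
from itertools import combinations


def expected_heterozygosity(freqs) -> float:
    """He = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs) -> float:
    """Polymorphic information content: He - sum_{i<j} 2 * p_i^2 * p_j^2."""
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - cross


# Hypothetical locus with three alleles (frequencies must sum to 1).
freqs = [0.6, 0.3, 0.1]
print(round(expected_heterozygosity(freqs), 4))
print(round(pic(freqs), 4))
```

PIC is never larger than He, and both approach zero when one allele dominates, which is consistent with the low end of the ranges reported for markers with a major allele frequency near 0.96.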
Recent advances in heterobimetallic catalysis across a "transition metal-tin" motif.

PubMed

Das, Debjit; Mohapatra, Swapna Sarita; Roy, Sujit

2015-06-07

Heterobimetallic catalysts, bearing a metal-metal bond between a transition metal (TM) and a tin atom, are very promising due to their ability to mediate a wide variety of organic transformations. Indeed the utilization of such catalysts is a challenging and evolving area in the field of homogeneous catalysis. Catalysis across a 'TM-Sn' motif is an emerging area in the broader domain of multimetallic catalysis. The present review apprises the chemists' community of the past, present and future scope of this versatile catalytic motif. The TM-Sn catalyzed reactions presented include, among others, Friedel-Crafts alkylation, carbonylation, polymerization, cyclization, olefin metathesis, Heck coupling, hydroarylation, Michael addition and tandem coupling. The mechanistic aspects of the reactions have been highlighted as well.
13C, 2H NMR Studies of Structural and Dynamical Modifications of Glucose-Exposed Porcine Aortic Elastin

PubMed Central

Silverstein, Moshe C.; Bilici, Kübra; Morgan, Steven W.; Wang, Yunjie; Zhang, Yanhang; Boutis, Gregory S.

2015-01-01

Elastin, the principal component of the elastic fiber of the extracellular matrix, imparts to vertebrate tissues remarkable resilience and longevity. This work focuses on elucidating dynamical and structural modifications of porcine aortic elastin exposed to glucose by solid-state NMR spectroscopic and relaxation methodologies. Results from macroscopic stress-strain tests are also presented and indicate that glucose-treated elastin is mechanically stiffer than the same tissue without glucose treatment. These measurements show a large hysteresis in the stress-strain behavior of glucose-treated elastin—a well-known signature of viscoelasticity. Two-dimensional relaxation NMR methods were used to investigate the correlation time, distribution, and population of water in these samples. Differences are observed between the relative populations of water, whereas the measured correlation times of tumbling motion of water across the samples were similar. 13C magic-angle-spinning NMR methods were applied to investigate structural and dynamical modifications after glucose treatment. Although some overall structure is preserved, the process of glucose exposure results in more heterogeneous structures and slower mobility. The correlation times of tumbling motion of the 13C-1H internuclear vectors in the glucose-treated sample are larger than in untreated samples, pointing to their more rigid structure. The 13C cross-polarization spectra reveal a notably increased α-helical character in the alanine motifs after glucose exposure. Results from molecular dynamics simulations are provided that add further insight into dynamical and structural changes of a short repeat, [VPGVG]5, an alanine pentamer, desmosine, and isodesmosine sites with and without glucose. The simulations point to changes in the entropic and energetic contributions in the retractive forces of VPGVG and AAAAA motifs. 
The most notable change is the increase of the energetic contribution in the retractive force due to peptide-glucose interactions of the VPGVG motif, which may play an important role in the observed stiffening in glucose-treated elastin. PMID:25863067
Proteome-wide Identification of Novel Ceramide-binding Proteins by Yeast Surface cDNA Display and Deep Sequencing.

PubMed

Bidlingmaier, Scott; Ha, Kevin; Lee, Nam-Kyung; Su, Yang; Liu, Bin

2016-04-01

Although the bioactive sphingolipid ceramide is an important cell signaling molecule, relatively few direct ceramide-interacting proteins are known. We used an approach combining yeast surface cDNA display and deep sequencing technology to identify novel proteins binding directly to ceramide. We identified 234 candidate ceramide-binding protein fragments and validated binding for 20. Most (17) bound selectively to ceramide, although a few (3) bound to other lipids as well. Several novel ceramide-binding domains were discovered, including the EF-hand calcium-binding motif, the heat shock chaperonin-binding motif STI1, the SCP2 sterol-binding domain, and the tetratricopeptide repeat region motif. Interestingly, four of the verified ceramide-binding proteins (HPCA, HPCAL1, NCS1, and VSNL1) and an additional three candidate ceramide-binding proteins (NCALD, HPCAL4, and KCNIP3) belong to the neuronal calcium sensor family of EF hand-containing proteins. We used mutagenesis to map the ceramide-binding site in HPCA and to create a mutant HPCA that does not bind to ceramide. We demonstrated selective binding to ceramide by mammalian cell-produced wild type but not mutant HPCA. Intriguingly, we also identified a fragment from prostaglandin D2 synthase that binds preferentially to ceramide 1-phosphate. The wide variety of proteins and domains capable of binding to ceramide suggests that many of the signaling functions of ceramide may be regulated by direct binding to these proteins. Based on the deep sequencing data, we estimate that our yeast surface cDNA display library covers ∼60% of the human proteome and our selection/deep sequencing protocol can identify target-interacting protein fragments that are present at extremely low frequency in the starting library. Thus, the yeast surface cDNA display/deep sequencing approach is a rapid, comprehensive, and flexible method for the analysis of protein-ligand interactions, particularly for the study of non-protein ligands.
© 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Dienogest inhibits C-C motif chemokine ligand 20 expression in human endometriotic epithelial cells.

PubMed

Mita, Shizuka; Nakakuki, Masanori; Ichioka, Masayuki; Shimizu, Yutaka; Hashiba, Masamichi; Miyazaki, Hiroyasu; Kyo, Satoru

2017-07-01

C-C motif chemokine ligand 20 is thought to contribute to the development of endometriosis by recruiting Th17 lymphocytes into endometriotic foci. The present study investigated the effects of dienogest, a progesterone receptor agonist used to treat endometriosis, on C-C motif chemokine ligand 20 expression by endometriotic cells. Effects of dienogest on mRNA expression and protein secretion of C-C motif chemokine ligand 20 induced by interleukin 1β were assessed in three immortalized endometriotic epithelial cell lines, parental cells (EMosis-CC/TERT1), and stably expressing human progesterone receptor isoform A (EMosis-CC/TERT1/PRA+) or isoform B (EMosis-CC/TERT1/PRA-/PRB+). Dienogest markedly inhibited interleukin 1β-stimulated C-C motif chemokine ligand 20 mRNA expression and protein secretion in EMosis-CC/TERT1/PRA-/PRB+, which was abrogated by the progesterone receptor antagonist RU486. In EMosis-CC/TERT1/PRA+, dienogest slightly inhibited C-C motif chemokine ligand 20 mRNA and protein. In EMosis-CC/TERT1, dienogest slightly inhibited C-C motif chemokine ligand 20 mRNA, but had no effect on C-C motif chemokine ligand 20 protein. Dienogest inhibited interleukin 1β-induced up-regulation of C-C motif chemokine ligand 20 in endometriotic epithelial cells, mainly mediated by progesterone receptor B. Copyright © 2017 Elsevier B.V. All rights reserved.
Noncoding RNA danger motifs bridge innate and adaptive immunity and are potent adjuvants for vaccination

PubMed Central

Wang, Lilin; Smith, Dan; Bot, Simona; Dellamary, Luis; Bloom, Amy; Bot, Adrian

2002-01-01

The adaptive immune response is triggered by recognition of T and B cell epitopes and is influenced by “danger” motifs that act via innate immune receptors. This study shows that motifs associated with noncoding RNA are essential features in the immune response reminiscent of viral infection, mediating rapid induction of proinflammatory chemokine expression, recruitment and activation of antigen-presenting cells, modulation of regulatory cytokines, subsequent differentiation of Th1 cells, isotype switching, and stimulation of cross-priming. The heterogeneity of RNA-associated motifs results in differential binding to cellular receptors, and specifically impacts the immune profile. Naturally occurring double-stranded RNA (dsRNA) triggered activation of dendritic cells and enhancement of specific immunity, similar to selected synthetic dsRNA motifs. Based on the ability of specific RNA motifs to block tolerance induction and effectively organize the immune defense during viral infection, we conclude that such RNA species are potent danger motifs. We also demonstrate the feasibility of using selected RNA motifs as adjuvants in the context of novel aerosol carriers for optimizing the immune response to subunit vaccines. In conclusion, RNA-associated motifs produced during viral infection bridge the early response with the late adaptive phase, regulating the activation and differentiation of antigen-specific B and T cells, in addition to a short-term impact on innate immunity. PMID:12393853
Structure of thrombospondin type 3 repeats in bacterial outer membrane protein A reveals its intra-repeat disulfide bond-dependent calcium-binding capability

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dai, Shuyan; Sun, Cancan; Tan, Kemin

Eukaryotic thrombospondin type 3 repeat (TT3R) is an efficient calcium ion (Ca2+) binding motif found only in the mammalian thrombospondin family. TT3R has also been found in prokaryotic cellulase Cel5G, which was thought to forfeit the Ca2+-binding capability due to the formation of intra-repeat disulfide bonds, instead of the inter-repeat ones possessed by eukaryotic TT3Rs. In this study, we have identified an enormous number of prokaryotic TT3R-containing proteins belonging to several different protein families, including outer membrane protein A (OmpA), an important structural protein connecting the outer membrane and the periplasmic peptidoglycan layer in gram-negative bacteria. Here, we report the crystal structure of the periplasmic region of OmpA from Capnocytophaga gingivalis, which contains a linker region comprising five consecutive TT3Rs. The structure of OmpA-TT3R exhibits a well-ordered architecture organized around two tightly-coordinated Ca2+ and confirms the presence of abnormal intra-repeat disulfide bonds. Further mutagenesis studies showed that the Ca2+-binding capability of OmpA-TT3R is indeed dependent on the proper formation of intra-repeat disulfide bonds, which help to fix a conserved glycine residue at its proper position for Ca2+ coordination. Additionally, despite lacking inter-repeat disulfide bonds, the interfaces between adjacent OmpA-TT3Rs are enhanced by both hydrophobic and conserved aromatic-proline interactions.
Microsatellites in the Genome of the Edible Mushroom, Volvariella volvacea

PubMed Central

Chen, Mingjie; Wang, Hong; Bao, Dapeng

2014-01-01

Using bioinformatics software and database, we have characterized the microsatellite pattern in the V. volvacea genome and compared it with microsatellite patterns found in the genomes of four other edible fungi: Coprinopsis cinerea, Schizophyllum commune, Agaricus bisporus, and Pleurotus ostreatus. A total of 1346 microsatellites have been identified, with mono-nucleotides being the most frequent motif. The relative abundance of microsatellites was lower in coding regions with 21 No./Mb. However, the microsatellites in the V. volvacea gene models showed a greater tendency to be located in the CDS regions. There was also a higher preponderance of trinucleotide repeats, especially in the kinase genes, which implied a possible role in phenotypic variation. Among the five fungal genomes, microsatellite abundance appeared to be unrelated to genome size. Furthermore, the short motifs (mono- to tri-nucleotides) outnumbered other categories although these differed in proportion. Data analysis indicated a possible relationship between the most frequent microsatellite types and the genetic distance between the five fungal genomes. PMID:24575404
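The abstract above does not name the software or the repeat thresholds used, so the following is only a minimal Python sketch of how perfect microsatellites (mono- to hexa-nucleotide motifs) might be located and how a relative abundance in No./Mb could be computed. The function names and the minimum copy numbers in `MIN_REPEATS` are illustrative assumptions, not values taken from the study.

```python
import re

# Assumed minimum copy numbers per motif length (illustrative thresholds only;
# the study's actual settings are not given in the abstract).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif[:d] * (n // d) == motif for d in range(1, n)
    )


def find_microsatellites(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, copies in min_repeats.items():
        # One motif of length k followed by at least (copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip e.g. 'AA' runs, which are already counted as mono-nucleotide.
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def abundance_per_mb(n_hits, seq_length):
    """Relative abundance expressed as number of microsatellites per megabase."""
    return n_hits / (seq_length / 1_000_000)


# Toy sequence: a 12-copy mono-nucleotide run and a 5-copy tri-nucleotide repeat.
toy = "GG" + "A" * 12 + "CT" + "ACG" * 5 + "TT"
hits = find_microsatellites(toy)
print(hits, abundance_per_mb(len(hits), len(toy)))
```

Restricting the search to coding sequences versus whole-genome sequence and comparing the two abundance values would mirror the coding-region comparison described in the abstract; imperfect or compound repeats are not handled by this sketch.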
Microsatellites in the genome of the edible mushroom, Volvariella volvacea.

PubMed

Wang, Ying; Chen, Mingjie; Wang, Hong; Wang, Jing-Fang; Bao, Dapeng

2014-01-01

Using bioinformatics software and database, we have characterized the microsatellite pattern in the V. volvacea genome and compared it with microsatellite patterns found in the genomes of four other edible fungi: Coprinopsis cinerea, Schizophyllum commune, Agaricus bisporus, and Pleurotus ostreatus. A total of 1346 microsatellites have been identified, with mono-nucleotides being the most frequent motif. The relative abundance of microsatellites was lower in coding regions with 21 No./Mb. However, the microsatellites in the V. volvacea gene models showed a greater tendency to be located in the CDS regions. There was also a higher preponderance of trinucleotide repeats, especially in the kinase genes, which implied a possible role in phenotypic variation. Among the five fungal genomes, microsatellite abundance appeared to be unrelated to genome size. Furthermore, the short motifs (mono- to tri-nucleotides) outnumbered other categories although these differed in proportion. Data analysis indicated a possible relationship between the most frequent microsatellite types and the genetic distance between the five fungal genomes.
Tn552 transposase purification and in vitro activities.

PubMed Central

Rowland, S J; Sherratt, D J; Stark, W M; Boocock, M R

1995-01-01

The Staphylococcus aureus transposon Tn552 encodes a protein (p480) containing the 'D,D(35)E' motif common to retroviral integrases and the transposases of a number of bacterial elements, including phage Mu, the integron-containing element Tn5090, Tn7 and IS3. p480 and a histidine-tagged derivative were overexpressed in Escherichia coli and purified by methods involving denaturation and renaturation. DNase I footprinting and gel binding assays demonstrated that p480 binds to two adjacent, directly repeated 23 bp motifs at each end of Tn552. Although donor strand cleavage by p480 was not detected, in vitro conditions were defined for strand transfer activity with transposon end fragments having pre-cleaved 3' termini. Strand transfer was Mn(2+)-dependent and appeared to join a single left or right end fragment to target DNA. The importance of the terminal dinucleotide CA-3' was demonstrated by mutation. The in vitro activities of p480 are consistent with its proposed function as the Tn552 transposase. PMID:7828593
The PE/PPE multigene family codes for virulence factors and is a possible source of mycobacterial antigenic variation: perhaps more?

PubMed

Akhter, Yusuf; Ehebauer, Matthias T; Mukhopadhyay, Sangita; Hasnain, Seyed E

2012-01-01

The PE/PPE multigene family codes for approximately 10% of the Mycobacterium tuberculosis proteome and is encoded by 176 open reading frames. These proteins possess, and have been named after, the conserved proline-glutamate (PE) or proline-proline-glutamate (PPE) motifs at their N-terminus. Their genes have a conserved structure and repeat motifs that could be a potential source of antigenic variation in M. tuberculosis. PE/PPE genes are scattered throughout the genome and PE/PPE pairs are usually encoded in bicistronic operons although this is not universally so. This gene family has evolved by specific gene duplication events. PE/PPE proteins are either secreted or localized to the cell surface. Several are thought to be virulence factors, which participate in evasion of the host immune response. This review summarizes the current knowledge about the gene family in order to better understand its biological function. Copyright © 2011 Elsevier Masson SAS. All rights reserved.
LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms.

PubMed

Yang, Peng; Wu, Min; Guo, Jing; Kwoh, Chee Keong; Przytycka, Teresa M; Zheng, Jie

2014-02-17

As a fundamental genomic element, meiotic recombination hotspot plays important roles in life sciences. Thus uncovering its regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators for meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignore the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Recently, an algorithm called "LDsplit" has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit on SNPs inside an established 7-mer motif bound by PRDM9 we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm of MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs. 
It is the first computational method that utilizes the genetic variation of recombination hotspots among individuals, opening a new avenue for motif finding. Tested on an established motif and simulated datasets, LDsplit shows promise to discover novel DNA motifs for meiotic recombination hotspots.
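The association measure described in this abstract (the difference in population recombination rate at a hotspot between the two alleles of a candidate SNP) can be sketched in Python as follows. This is not the LDsplit implementation: the function name `allele_rate_difference` is hypothetical, the rate estimator is passed in by the caller, and the estimator used in the toy example is a crude placeholder (fraction of distinct haplotypes) rather than a real population recombination rate estimate.

```python
def allele_rate_difference(haplotypes, snp_index, estimate_rate):
    """Split haplotypes by the allele at snp_index and compare estimated rates.

    haplotypes    -- equal-length strings, one per sampled chromosome
    snp_index     -- position of the candidate SNP within each haplotype
    estimate_rate -- caller-supplied function mapping a list of haplotypes
                     to a recombination-rate estimate (assumed, not provided)
    """
    groups = {}
    for hap in haplotypes:
        groups.setdefault(hap[snp_index], []).append(hap)
    if len(groups) != 2:
        raise ValueError("candidate SNP must be biallelic in the sample")
    (allele_a, haps_a), (allele_b, haps_b) = sorted(groups.items())
    rate_a = estimate_rate(haps_a)
    rate_b = estimate_rate(haps_b)
    return allele_a, rate_a, allele_b, rate_b, rate_a - rate_b


def distinct_haplotype_fraction(haps):
    """Placeholder estimator for illustration only; NOT a recombination rate."""
    return len(set(haps)) / len(haps)


sample = ["ACGT", "ACGA", "ACCT", "TCGT", "TCGT", "TCGT"]
print(allele_rate_difference(sample, 0, distinct_haplotype_fraction))
```

In practice the per-allele rates would come from a coalescent-based estimator run separately on each haplotype subset, and the observed difference would need a significance assessment before a SNP is called hotspot-associated; neither step is shown here.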
LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms

PubMed Central

2014-01-01

Background: As a fundamental genomic element, meiotic recombination hotspot plays important roles in life sciences. Thus uncovering its regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators for meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignore the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Results: Recently, an algorithm called “LDsplit” has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit on SNPs inside an established 7-mer motif bound by PRDM9 we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm of MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. Conclusions: LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs.
It is the first computational method that utilizes the genetic variation of recombination hotspots among individuals, opening a new avenue for motif finding. Tested on an established motif and simulated datasets, LDsplit shows promise to discover novel DNA motifs for meiotic recombination hotspots. PMID:24533858
Direct Interaction between the WD40 Repeat Protein WDR-23 and SKN-1/Nrf Inhibits Binding to Target DNA

PubMed Central

Leung, Chi K.; Hasegawa, Koichi; Wang, Ying; Deonarine, Andrew; Tang, Lanlan; Miwa, Johji

2014-01-01

SKN-1/Nrf transcription factors activate cytoprotective genes in response to reactive small molecules and strongly influence stress resistance, longevity, and development. The molecular mechanisms of SKN-1/Nrf regulation are poorly defined. We previously identified the WD40 repeat protein WDR-23 as a repressor of Caenorhabditis elegans SKN-1 that functions with a ubiquitin ligase to presumably target the factor for degradation. However, SKN-1 activity and nuclear accumulation are not always correlated, suggesting that there could be additional regulatory mechanisms. Here, we integrate forward genetics and biochemistry to gain insights into how WDR-23 interacts with and regulates SKN-1. We provide evidence that WDR-23 preferentially regulates one of three SKN-1 variants through a direct interaction that is required for normal stress resistance and development. Homology modeling predicts that WDR-23 folds into a β-propeller, and we identify the top of this structure and four motifs at the termini of SKN-1c as essential for the interaction. Two of these SKN-1 motifs are highly conserved in human Nrf1 and Nrf2 and two directly interact with target DNA. Lastly, we demonstrate that WDR-23 can block the ability of SKN-1c to interact with DNA sequences of target promoters identifying a new mechanism of regulation that is independent of the ubiquitin proteasome system, which can become occupied with damaged proteins during stress. PMID:24912676
Molecular interactions involved in the transactivation of the human T-cell leukemia virus type 1 promoter mediated by Tax and CREB-2 (ATF-4).

PubMed

Gachon, F; Thebault, S; Peleraux, A; Devaux, C; Mesnard, J M

2000-05-01

The human T-cell leukemia virus type 1 (HTLV-1) Tax protein activates viral transcription through three 21-bp repeats located in the U3 region of the HTLV-1 long terminal repeat and called Tax-responsive elements (TxREs). Each TxRE contains nucleotide sequences corresponding to imperfect cyclic AMP response elements (CRE). In this study, we demonstrate that the bZIP transcriptional factor CREB-2 is able to bind in vitro to the TxREs and that CREB-2 binding to each of the 21-bp motifs is enhanced by Tax. We also demonstrate that Tax can weakly interact with CREB-2 bound to a cellular palindromic CRE motif such as that found in the somatostatin promoter. Mutagenesis of Tax and CREB-2 demonstrates that both N- and C-terminal domains of Tax and the C-terminal region of CREB-2 are required for direct interaction between the two proteins. In addition, the Tax mutant M47, defective for HTLV-1 activation, is unable to form in vitro a ternary complex with CREB-2 and TxRE. In agreement with recent results suggesting that Tax can recruit the coactivator CREB-binding protein (CBP) on the HTLV-1 promoter, we provide evidence that Tax, CREB-2, and CBP are capable of cooperating to stimulate viral transcription. Taken together, our data highlight the major role played by CREB-2 in Tax-mediated transactivation.

Mechanistic Insights from Structural Analyses of Ran-GTPase-Driven Nuclear Export of Proteins and RNAs.

PubMed

Matsuura, Yoshiyuki

2016-05-22

Understanding how macromolecules are rapidly exchanged between the nucleus and the cytoplasm through nuclear pore complexes is a fundamental problem in biology. Exportins are Ran-GTPase-dependent nuclear transport factors that belong to the karyopherin-β family and mediate nuclear export of a plethora of proteins and RNAs, except for bulk mRNA nuclear export. Exportins bind cargo macromolecules in a Ran-GTP-dependent manner in the nucleus, forming exportin-cargo-Ran-GTP complexes (nuclear export complexes). Transient weak interactions between exportins and nucleoporins containing characteristic FG (phenylalanine-glycine) repeat motifs facilitate nuclear pore complex passage of nuclear export complexes. In the cytoplasm, nuclear export complexes are disassembled, thereby releasing the cargo. GTP hydrolysis by Ran promoted in the cytoplasm makes the disassembly reaction virtually irreversible and provides thermodynamic driving force for the overall export reaction. In the past decade, X-ray crystallography of some of the exportins in various functional states coupled with functional analyses, single-particle electron microscopy, molecular dynamics simulations, and small-angle solution X-ray scattering has provided rich insights into the mechanism of cargo binding and release and also begins to elucidate how exportins interact with the FG repeat motifs. The knowledge gained from structural analyses of nuclear export is being translated into development of clinically useful inhibitors of nuclear export to treat human diseases such as cancer and influenza. Copyright © 2015 Elsevier Ltd. All rights reserved.
Molecular Interactions Involved in the Transactivation of the Human T-Cell Leukemia Virus Type 1 Promoter Mediated by Tax and CREB-2 (ATF-4)

PubMed Central

Gachon, Frederic; Thebault, Sabine; Peleraux, Annick; Devaux, Christian; Mesnard, Jean-Michel

2000-01-01

The human T-cell leukemia virus type 1 (HTLV-1) Tax protein activates viral transcription through three 21-bp repeats located in the U3 region of the HTLV-1 long terminal repeat and called Tax-responsive elements (TxREs). Each TxRE contains nucleotide sequences corresponding to imperfect cyclic AMP response elements (CRE). In this study, we demonstrate that the bZIP transcriptional factor CREB-2 is able to bind in vitro to the TxREs and that CREB-2 binding to each of the 21-bp motifs is enhanced by Tax. We also demonstrate that Tax can weakly interact with CREB-2 bound to a cellular palindromic CRE motif such as that found in the somatostatin promoter. Mutagenesis of Tax and CREB-2 demonstrates that both N- and C-terminal domains of Tax and the C-terminal region of CREB-2 are required for direct interaction between the two proteins. In addition, the Tax mutant M47, defective for HTLV-1 activation, is unable to form in vitro a ternary complex with CREB-2 and TxRE. In agreement with recent results suggesting that Tax can recruit the coactivator CREB-binding protein (CBP) on the HTLV-1 promoter, we provide evidence that Tax, CREB-2, and CBP are capable of cooperating to stimulate viral transcription. Taken together, our data highlight the major role played by CREB-2 in Tax-mediated transactivation. PMID:10779337
An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes.

PubMed

Liu, Bingqiang; Zhang, Hanyuan; Zhou, Chuan; Li, Guojun; Fennell, Anne; Wang, Guanghui; Kang, Yu; Liu, Qi; Ma, Qin

2016-08-09

Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, has several difficulties for optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to Escherichia coli k12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP(3) consistently outperformed other popular motif finding tools. We have integrated MP(3) into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. The performance evaluation indicated that MP(3) is effective for predicting regulatory motifs in prokaryotic genomes. 
Its application may enhance progress in elucidating transcription regulation mechanism, thus provide benefit to the genomic research community and prokaryotic genome researchers in particular.
Mining for class-specific motifs in protein sequence classification

PubMed Central

2013-01-01

Background: In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features indeed contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. Results: We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function, initially, harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where the similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution has resulted in a large increase in the number of discriminatory n-grams harvested. Due to the unbalanced nature of the dataset, the frequencies of the n-grams are normalized using a dampening factor, which gives more weightage to the n-grams that appear in fewer classes and vice-versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences.
Our method fared well compared to an existing motif finding method known as Wordspy. We have validated our enriched set of class-specific motifs against the functionally important motifs obtained from the NLSdb, Prosite and ELM databases. We demonstrate that this method is very generic; thus can be widely applied to detect class-specific motifs in many protein sequence classification tasks. Conclusion: The proposed scoring function and methodology is able to identify class-specific motifs using discriminative n-grams derived from the protein sequences. The implementation of amino acid substitution scores for similarity detection, and the dampening factor to normalize the unbalanced datasets have significant effect on the performance of the scoring function. Our multipronged validation tests demonstrate that this method can detect class-specific motifs from a wide variety of protein sequence classes with a potential application to detecting proteome-specific motifs of different organisms. PMID:23496846
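The harvesting and normalization steps described in this abstract can be illustrated with a short Python sketch. The abstract does not give the exact form of the dampening factor, so the inverse-class-frequency weight used below (log of 1 plus the number of classes divided by the number of classes containing the n-gram) is an assumption chosen only because it gives more weight to n-grams that appear in fewer classes. The BLOSUM62-based merging of similar n-grams is omitted, and all function names are hypothetical.

```python
import math
from collections import Counter


def harvest_ngrams(sequences, n_min=4, n_max=8):
    """Count every contiguous n-gram (n_min..n_max) across a list of sequences."""
    counts = Counter()
    for seq in sequences:
        for n in range(n_min, n_max + 1):
            for i in range(len(seq) - n + 1):
                counts[seq[i:i + n]] += 1
    return counts


def discriminative_ngrams(classes, threshold, n_min=4, n_max=8):
    """Return, per class, the n-grams whose dampened frequency meets the threshold.

    classes -- dict mapping a class label to a list of protein sequences
    """
    per_class = {
        label: harvest_ngrams(seqs, n_min, n_max) for label, seqs in classes.items()
    }
    n_classes = len(per_class)
    # Number of classes in which each n-gram occurs at least once.
    class_spread = Counter()
    for counts in per_class.values():
        class_spread.update(counts.keys())

    selected = {}
    for label, counts in per_class.items():
        total = sum(counts.values()) or 1
        scored = {}
        for gram, count in counts.items():
            # Assumed dampening factor: larger when fewer classes share the n-gram.
            damp = math.log(1 + n_classes / class_spread[gram])
            score = (count / total) * damp
            if score >= threshold:
                scored[gram] = score
        selected[label] = scored
    return selected


toy = {"a": ["AAAAC"], "b": ["AAAAG"]}
print(discriminative_ngrams(toy, threshold=0.4, n_min=4, n_max=4))
```

In the toy input the 4-gram shared by both classes falls below the threshold while each class-specific 4-gram passes it, which is the qualitative behavior the abstract attributes to the dampening step.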
Euglena gracilis and Trypanosomatids possess common patterns in predicted mitochondrial targeting presequences.

PubMed

Krnáčová, Katarína; Vesteg, Matej; Hampl, Vladimír; Vlček, Čestmír; Horváth, Anton

2012-10-01

Euglena gracilis possessing chloroplasts of secondary green algal origin and parasitic trypanosomatids Trypanosoma brucei, Trypanosoma cruzi and Leishmania major belong to the protist phylum Euglenozoa. Euglenozoa might be among the earliest eukaryotic branches bearing ancestral traits reminiscent of the last eukaryotic common ancestor (LECA) or missing features present in other eukaryotes. LECA most likely possessed mitochondria of endosymbiotic α-proteobacterial origin. In this study, we searched for the presence of homologs of mitochondria-targeted proteins from other organisms in the currently available EST dataset of E. gracilis. The common motifs in predicted N-terminal presequences and corresponding homologs from T. brucei, T. cruzi and L. major (if found) were analyzed. Other trypanosomatid mitochondrial protein precursor (e.g., those involved in RNA editing) were also included in the analysis. Mitochondrial presequences of E. gracilis and these trypanosomatids seem to be highly variable in sequence length (5-118 aa), but apparently share statistically significant similarities. In most cases, the common (M/L)RR motif is present at the N-terminus and it is probably responsible for recognition via import apparatus of mitochondrial outer membrane. Interestingly, this motif is present inside the predicted presequence region in some cases. In most presequences, this motif is followed by a hydrophobic region rich in alanine, leucine, and valine. In conclusion, either RR motif or arginine-rich region within hydrophobic aa-s present at the N-terminus of a preprotein can be sufficient signals for mitochondrial import irrespective of presequence length in Euglenozoa.
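As an illustration of the pattern described in this abstract, the Python sketch below scans a predicted presequence for the (M/L)RR motif and reports the fraction of alanine, leucine, and valine in the residues that follow it. The function names and the toy peptide are hypothetical; the sketch does not reproduce the statistical analysis used in the study.

```python
import re

# (M/L)RR: methionine or leucine followed by two arginines.
MLRR = re.compile(r"[ML]RR")


def mlrr_positions(presequence):
    """Return 0-based start positions of every (M/L)RR match in the presequence."""
    return [m.start() for m in MLRR.finditer(presequence.upper())]


def alv_fraction_after_motif(presequence):
    """Fraction of A/L/V among residues after the first motif, or None if absent."""
    positions = mlrr_positions(presequence)
    if not positions:
        return None
    tail = presequence.upper()[positions[0] + 3:]
    if not tail:
        return 0.0
    return sum(residue in "ALV" for residue in tail) / len(tail)


toy = "MRRLLAAVA"  # invented peptide, not taken from any of the organisms studied
print(mlrr_positions(toy), alv_fraction_after_motif(toy))
```

A match at position 0 corresponds to the common N-terminal case, while a match at a later position corresponds to the cases where the motif lies inside the predicted presequence region.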
Di-isodityrosine is the intermolecular cross-link of isodityrosine-rich extensin analogs cross-linked in vitro.

PubMed

Held, Michael A; Tan, Li; Kamyab, Abdolreza; Hare, Michael; Shpak, Elena; Kieliszewski, Marcia J

2004-12-31

Extensins are cell wall hydroxyproline-rich glycoproteins that form covalent networks putatively involving tyrosyl and lysyl residues in cross-links catalyzed by one or more extensin peroxidases. The precise cross-links remain to be chemically identified both as network components in muro and as enzymic products generated in vitro with native extensin monomers as substrates. However, some extensin monomers contain variations within their putative cross-linking motifs that complicate cross-link identification. Other simpler extensins are recalcitrant to isolation including the ubiquitous P3-type extensin whose major repetitive motif, (Hyp)(4)-Ser-Hyp-Ser-(Hyp)(4)-Tyr-Tyr-Tyr-Lys, is of particular interest, not least because its Tyr-Tyr-Tyr intramolecular isodityrosine cross-link motifs are also putative candidates for further intermolecular cross-linking to form di-isodityrosine. Therefore, we designed a set of extensin analogs encoding tandem repeats of the P3 motif, including Tyr --> Phe and Lys --> Leu variations. Expression of these P3 analogs in Nicotiana tabacum cells yielded glycoproteins with virtually all Pro residues hydroxylated and subsequently arabinosylated and with likely galactosylated Ser residues. This was consistent with earlier analyses of P3 glycopeptides isolated from cell wall digests and the predictions of the Hyp contiguity hypothesis. The tyrosine-rich P3 analogs also contained isodityrosine, formed in vivo. Significantly, these isodityrosine-containing analogs were further cross-linked in vitro by an extensin peroxidase to form the tetra-tyrosine intermolecular cross-link amino acid di-isodityrosine. This is the first identification of an inter-molecular cross-link amino acid in an extensin module and corroborates earlier suggestions that di-isodityrosine represents one mechanism for cross-linking extensins in muro.
Multivalent binding of formin-binding protein 21 (FBP21)-tandem-WW domains fosters protein recognition in the pre-spliceosome.

PubMed

Klippel, Stefan; Wieczorek, Marek; Schümann, Michael; Krause, Eberhard; Marg, Berenice; Seidel, Thorsten; Meyer, Tim; Knapp, Ernst-Walter; Freund, Christian

2011-11-04

The high abundance of repetitive but nonidentical proline-rich sequences in spliceosomal proteins raises the question of how these known interaction motifs recruit their interacting protein domains. Whereas complex formation of these adaptors with individual motifs has been studied in great detail, little is known about the binding mode of domains arranged in tandem repeats and long proline-rich sequences including multiple motifs. Here we studied the interaction of the two adjacent WW domains of spliceosomal protein FBP21 with several ligands of different lengths and composition to elucidate the hallmarks of multivalent binding for this class of recognition domains. First, we show that many of the proteins that define the cellular proteome interacting with FBP21-WW1-WW2 contain multiple proline-rich motifs. Among these is the newly identified binding partner SF3B4. Fluorescence resonance energy transfer (FRET) analysis reveals the tandem-WW domains of FBP21 to interact with splicing factor 3B4 (SF3B4) in nuclear speckles where splicing takes place. Isothermal titration calorimetry and NMR show that the tandem arrangement of WW domains and the multivalency of the proline-rich ligands both contribute to affinity enhancement. However, ligand exchange remains fast compared with the NMR time scale. Surprisingly, an N-terminal spin label attached to a bivalent ligand induces NMR line broadening of signals corresponding to both WW domains of the FBP21-WW1-WW2 protein. This suggests that distinct orientations of the ligand contribute to a delocalized and semispecific binding mode that should facilitate search processes within the spliceosome.
Multivalent Binding of Formin-binding Protein 21 (FBP21)-Tandem-WW Domains Fosters Protein Recognition in the Pre-spliceosome*

PubMed Central

Klippel, Stefan; Wieczorek, Marek; Schümann, Michael; Krause, Eberhard; Marg, Berenice; Seidel, Thorsten; Meyer, Tim; Knapp, Ernst-Walter; Freund, Christian

2011-01-01

The high abundance of repetitive but nonidentical proline-rich sequences in spliceosomal proteins raises the question of how these known interaction motifs recruit their interacting protein domains. Whereas complex formation of these adaptors with individual motifs has been studied in great detail, little is known about the binding mode of domains arranged in tandem repeats and long proline-rich sequences including multiple motifs. Here we studied the interaction of the two adjacent WW domains of spliceosomal protein FBP21 with several ligands of different lengths and composition to elucidate the hallmarks of multivalent binding for this class of recognition domains. First, we show that many of the proteins that define the cellular proteome interacting with FBP21-WW1-WW2 contain multiple proline-rich motifs. Among these is the newly identified binding partner SF3B4. Fluorescence resonance energy transfer (FRET) analysis reveals the tandem-WW domains of FBP21 to interact with splicing factor 3B4 (SF3B4) in nuclear speckles where splicing takes place. Isothermal titration calorimetry and NMR show that the tandem arrangement of WW domains and the multivalency of the proline-rich ligands both contribute to affinity enhancement. However, ligand exchange remains fast compared with the NMR time scale. Surprisingly, an N-terminal spin label attached to a bivalent ligand induces NMR line broadening of signals corresponding to both WW domains of the FBP21-WW1-WW2 protein. This suggests that distinct orientations of the ligand contribute to a delocalized and semispecific binding mode that should facilitate search processes within the spliceosome. PMID:21917930
Identification and characterization of gene-based SSR markers in date palm (Phoenix dactylifera L.).

PubMed

Zhao, Yongli; Williams, Roxanne; Prakash, C S; He, Guohao

2012-12-15

Date palm (Phoenix dactylifera L.) is an important tree in the Middle East and North Africa due to the nutritional value of its fruit. Molecular breeding would accelerate genetic improvement of fruit trees through marker-assisted selection. However, the lack of molecular markers in date palm restricts the application of molecular breeding. In this study, we analyzed 28,889 EST sequences from the date palm genome database to identify simple-sequence repeats (SSRs) and to develop gene-based markers, i.e. expressed sequence tag-SSRs (EST-SSRs). We identified 4,609 ESTs as containing SSRs, among which trinucleotide motifs (69.7%) were the most common, followed by tetranucleotide (10.4%) and dinucleotide motifs (9.6%). The motif AG (85.7%) was most abundant in dinucleotides, while motifs AGG (26.8%), AAG (19.3%), and AGC (16.1%) were most common among trinucleotides. A total of 4,967 primer pairs were designed for EST-SSR markers from the computational data. In a follow-up laboratory study, we tested a sample of 20 randomly selected primer pairs for amplification and polymorphism detection using genomic DNA from date palm cultivars. Nearly one-third of these primer pairs detected DNA polymorphism to differentiate the twelve date palm cultivars used. Functional categorization of EST sequences containing SSRs revealed that 3,108 (67.4%) of such ESTs had homology with known proteins. Date palm EST sequences are a good resource for developing gene-based markers. These genic markers identified in our study may provide a valuable genetic and genomic tool for further genetic research and varietal development in date palm, such as diversity study, QTL mapping, and molecular breeding.
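The kind of computational SSR screen this abstract describes can be sketched as follows. This is only an illustration: the abstract does not name the tool or thresholds the authors used, so the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions, and only perfect di-, tri-, and tetranucleotide repeats are detected.

```python
import re

# Hypothetical minimum copy numbers per motif length (not taken from the abstract).
MIN_REPEATS = {2: 6, 3: 5, 4: 4}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect tandem repeat in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_count in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one unit; \1{n-1,} demands at least n copies in total.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_count - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip units that are repeats of a shorter unit (e.g. AA, ATAT).
            if any(motif == motif[:d] * (unit_len // d)
                   for d in range(1, unit_len) if unit_len % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return hits

if __name__ == "__main__":
    example = "TTT" + "AG" * 7 + "CCC" + "AGG" * 5 + "T"
    for start, motif, copies in find_ssrs(example):
        print(start, motif, copies)
```

Counting the motifs returned over a set of ESTs would give motif-class frequencies of the kind reported above, although the actual percentages depend on the thresholds chosen.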
Candida species biofilm and Candida albicans ALS3 polymorphisms in clinical isolates

PubMed Central

Bruder-Nascimento, Ariane; Camargo, Carlos Henrique; Mondelli, Alessandro Lia; Sugizaki, Maria Fátima; Sadatsune, Terue; Bagagli, Eduardo

2014-01-01

Over the last decades, there have been important changes in the epidemiology of Candida infections. In recent years, Candida species have emerged as important causes of invasive infections mainly among immunocompromised patients. This study analyzed Candida spp. isolates and compared the frequency and biofilm production of different species among the different sources of isolation: blood, urine, vulvovaginal secretions and peritoneal dialysis fluid. Biofilm production was quantified in 327 Candida isolates obtained from patients attended at a Brazilian tertiary public hospital (Botucatu, Sao Paulo). C. albicans ALS3 gene polymorphism was also evaluated by determining the number of repeated motifs in the central domain. Of the 198 total biofilm-positive isolates, 72 and 126 were considered as low and high biofilm producers, respectively. Biofilm production by C. albicans was significantly lower than that by non-albicans isolates and was most frequently observed in C. tropicalis. Biofilm production was more frequent among bloodstream isolates than among isolates from other clinical sources. In urine, the isolates displayed a peculiar distribution with two distinct peaks, one containing biofilm-negative isolates and the other containing isolates with intense biofilm production. The numbers of tandem-repeat copies per allele were not associated with biofilm production, suggesting the involvement of other genetic determinants. PMID:25763043
1H, 15N and 13C resonance assignments for free and IEEVD peptide-bound forms of the tetratricopeptide repeat domain from the human E3 ubiquitin ligase CHIP.

PubMed

Zhang, Huaqun; McGlone, Cameron; Mannion, Matthew M; Page, Richard C

2017-04-01

The ubiquitin ligase CHIP catalyzes covalent attachment of ubiquitin to unfolded proteins chaperoned by the heat shock proteins Hsp70/Hsc70 and Hsp90. CHIP interacts with Hsp70/Hsc70 and Hsp90 by binding of a C-terminal IEEVD motif found in Hsp70/Hsc70 and Hsp90 to the tetratricopeptide repeat (TPR) domain of CHIP. Although recruitment of heat shock proteins to CHIP via interaction with the CHIP-TPR domain is well established, alterations in structure and dynamics of CHIP upon binding are not well understood. In particular, the absence of a structure for CHIP-TPR in the free form presents a significant limitation upon studies seeking to rationally design inhibitors that may disrupt interactions between CHIP and heat shock proteins. Here we report the 1H, 13C, and 15N backbone and side chain chemical shift assignments for CHIP-TPR in the free form, and backbone chemical shift assignments for CHIP-TPR in the IEEVD-bound form. The NMR resonance assignments will enable further studies examining the roles of dynamics and structure in regulating interactions between CHIP and the heat shock proteins Hsp70/Hsc70 and Hsp90.
DNA fingerprinting in zoology: past, present, future.

PubMed

Chambers, Geoffrey K; Curtis, Caitlin; Millar, Craig D; Huynen, Leon; Lambert, David M

2014-02-03

In 1962, Thomas Kuhn famously argued that the progress of scientific knowledge results from periodic 'paradigm shifts' during a period of crisis in which new ideas dramatically change the status quo. Although this is generally true, Alec Jeffreys' identification of hypervariable repeat motifs in the human beta-globin gene, and the subsequent development of a technology known now as 'DNA fingerprinting', also resulted in a dramatic shift in the life sciences, particularly in ecology, evolutionary biology, and forensics. The variation Jeffreys recognized has been used to identify individuals from tissue samples of not just humans, but also of many animal species. In addition, the technology has been used to determine the sex of individuals, as well as paternity/maternity and close kinship. We review a broad range of such studies involving a wide diversity of animal species. For individual researchers, Jeffreys' invention resulted in many ecologists and evolutionary biologists being given the opportunity to develop skills in molecular biology to augment their whole organism focus. Few developments in science, even among the subsequent genome discoveries of the 21st century, have the same wide-reaching significance. Even the later development of PCR-based genotyping of individuals using microsatellite repeat sequences, and their use in determining multiple paternity, is conceptually rooted in Alec Jeffreys' pioneering work.
Detection and Preliminary Analysis of Motifs in Promoters of Anaerobically Induced Genes of Different Plant Species

PubMed Central

MOHANTY, BIJAYALAXMI; KRISHNAN, S. P. T.; SWARUP, SANJAY; BAJIC, VLADIMIR B.

2005-01-01

• Background and Aims: Plants can suffer from oxygen limitation during flooding or more complete submergence and may therefore switch from Krebs cycle respiration to fermentation in association with the expression of anaerobically inducible genes coding for enzymes involved in glycolysis and fermentation. The aim of this study was to clarify mechanisms of transcriptional regulation of these anaerobic genes by identifying motifs shared by their promoter regions. • Methods: Statistically significant motifs were detected by an in silico method from 13 promoters of anaerobic genes. The selected motifs were common for the majority of analysed promoters. Their significance was evaluated by searching for their presence in transcription factor-binding site databases (TRANSFAC, PlantCARE and PLACE). Using several negative control data sets, it was tested whether the motifs found were specific to the anaerobic group. • Key Results: Previously, anaerobic response elements have been identified in maize (Zea mays) and arabidopsis (Arabidopsis thaliana) genes. Known functional motifs were detected, such as GT and GC motifs, but also other motifs shared by most of the genes examined. Five motifs detected have not been found in plants hitherto but are present in the promoters of animal genes with various functions. The consensus sequences of these novel motifs are 5′-AAACAAA-3′, 5′-AGCAGC-3′, 5′-TCATCAC-3′, 5′-GTTT(A/C/T)GCAA-3′ and 5′-TTCCCTGTT-3′. • Conclusions: It is believed that the promoter motifs identified could be functional by conferring anaerobic sensitivity to the genes that possess them. This proposal now requires experimental verification. PMID:16027132
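To make the consensus notation in this abstract concrete, the sketch below converts a consensus such as GTTT(A/C/T)GCAA into a regular expression and scans both strands of a promoter sequence for the five reported motifs. It is a minimal illustration, not the in silico method used in the study, which the abstract does not describe; the function names are hypothetical.

```python
import re

# Consensus sequences quoted in the abstract; "(A/C/T)" marks an ambiguous position.
CONSENSUS = ["AAACAAA", "AGCAGC", "TCATCAC", "GTTT(A/C/T)GCAA", "TTCCCTGTT"]

def consensus_to_regex(consensus):
    """Turn 'GTTT(A/C/T)GCAA' into the regex 'GTTT[ACT]GCAA'."""
    return re.sub(r"\(([ACGT/]+)\)",
                  lambda m: "[" + m.group(1).replace("/", "") + "]",
                  consensus)

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def scan_promoter(seq, motifs=CONSENSUS):
    """Return sorted (position, strand, consensus) hits, overlaps included.

    Positions on the '-' strand are offsets into the reverse complement.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for consensus in motifs:
            # A lookahead lets finditer report overlapping occurrences.
            pattern = re.compile("(?=(%s))" % consensus_to_regex(consensus))
            for m in pattern.finditer(s):
                hits.append((m.start(), strand, consensus))
    return sorted(hits)

if __name__ == "__main__":
    print(scan_promoter("CCGTTTCGCAACC"))
```

Exact matching of short consensus strings produces many chance hits in real promoters, which is presumably why the study assessed statistical significance and used negative control sets.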
Two Cathelicidin Genes Are Present in both Rainbow Trout (Oncorhynchus mykiss) and Atlantic Salmon (Salmo salar)

PubMed Central

Chang, Chin-I; Zhang, Yong-An; Zou, Jun; Nie, Pin; Secombes, Christopher J.

2006-01-01

Further to the previous finding of the rainbow trout rtCATH_1 gene, this paper describes three more cathelicidin genes found in salmonids: two in Atlantic salmon, named asCATH_1 and asCATH_2, and one in rainbow trout, named rtCATH_2. All three new salmonid cathelicidin genes share the common characteristics of mammalian cathelicidin genes, such as consisting of four exons and possessing a highly conserved preproregion and four invariant cysteines clustered in the C-terminal region of the cathelin-like domain. The asCATH_1 gene is homologous to the rainbow trout rtCATH_1 gene, in that it possesses three repeat motifs of TGGGGGTGGC in exon IV and two cysteine residues in the predicted mature peptide, while the asCATH_2 gene and rtCATH_2 gene are homologues of each other, with 96% nucleotide identity. Salmonid cathelicidins possess the same elastase-sensitive residue, threonine, as hagfish cathelicidins and the rabbit CAP18 molecule. The cleavage site of the four salmonid cathelicidins is within a conserved amino acid motif of QKIRTRR, which is at the beginning of the sequence encoded by exon IV. Two 36-residue peptides corresponding to the core part of rtCATH_1 and rtCATH_2 were chemically synthesized and shown to exhibit potent antimicrobial activity. rtCATH_2 was expressed constitutively in gill, head kidney, intestine, skin and spleen, while the expression of rtCATH_1 was inducible in gill, head kidney, and spleen after bacterial challenge. Four cathelicidin genes have now been characterized in salmonids and two were identified in hagfish, confirming that cathelicidin genes evolved early and are likely present in all vertebrates. PMID:16377685
A novel approach to identifying regulatory motifs in distantly related genomes

PubMed Central

Van Hellemont, Ruth; Monsieurs, Pieter; Thijs, Gert; De Moor, Bart; Van de Peer, Yves; Marchal, Kathleen

2005-01-01

Although proven successful in the identification of regulatory motifs, phylogenetic footprinting methods still show some shortcomings. To assess these difficulties, most apparent when applying phylogenetic footprinting to distantly related organisms, we developed a two-step procedure that combines the advantages of sequence alignment and motif detection approaches. The results on well-studied benchmark datasets indicate that the presented method outperforms other methods when the sequences become either too long or too heterogeneous in size. PMID:16420672
Comparison of the carboxy-terminal DP-repeat region in the co-chaperones Hop and Hip

PubMed Central

Nelson, Gregory M.; Huffman, Holly; Smith, David F.

2003-01-01

Functional steroid receptor complexes are assembled and maintained by an ordered pathway of interactions involving multiple components of the cellular chaperone machinery. Two of these components, Hop and Hip, serve as co-chaperones to the major heat shock proteins (Hsps), Hsp70 and Hsp90, and participate in intermediate stages of receptor assembly. In an effort to better understand the functions of Hop and Hip in the assembly process, we focused on a region of similarity located near the C-terminus of each co-chaperone. Contained within this region is a repeated sequence motif we have termed the DP repeat. Earlier mutagenesis studies implicated the DP repeat of either Hop or Hip in Hsp70 binding and in normal assembly of the co-chaperones with progesterone receptor (PR) complexes. We report here that the DP repeat lies within a protease-resistant domain that extends to or is near the C-terminus of both co-chaperones. Point mutations in the DP repeats render the C-terminal regions hypersensitive to proteolysis. In addition, a Hop DP mutant displays altered proteolytic digestion patterns, which suggest that the DP-repeat region influences the folding of other Hop domains. Although the respective DP regions of Hop and Hip share sequence and structural similarities, they are not functionally interchangeable. Moreover, a double-point mutation within the second DP-repeat unit of Hop that converts this to the sequence found in Hip disrupts Hop function; however, the corresponding mutation in Hip does not alter its function. We conclude that the DP repeats are important structural elements within a C-terminal domain, which is important for Hop and Hip function. PMID:14627198
Comparison of the carboxy-terminal DP-repeat region in the co-chaperones Hop and Hip.

PubMed

Nelson, Gregory M; Huffman, Holly; Smith, David F

2003-01-01

Functional steroid receptor complexes are assembled and maintained by an ordered pathway of interactions involving multiple components of the cellular chaperone machinery. Two of these components, Hop and Hip, serve as co-chaperones to the major heat shock proteins (Hsps), Hsp70 and Hsp90, and participate in intermediate stages of receptor assembly. In an effort to better understand the functions of Hop and Hip in the assembly process, we focused on a region of similarity located near the C-terminus of each co-chaperone. Contained within this region is a repeated sequence motif we have termed the DP repeat. Earlier mutagenesis studies implicated the DP repeat of either Hop or Hip in Hsp70 binding and in normal assembly of the co-chaperones with progesterone receptor (PR) complexes. We report here that the DP repeat lies within a protease-resistant domain that extends to or is near the C-terminus of both co-chaperones. Point mutations in the DP repeats render the C-terminal regions hypersensitive to proteolysis. In addition, a Hop DP mutant displays altered proteolytic digestion patterns, which suggest that the DP-repeat region influences the folding of other Hop domains. Although the respective DP regions of Hop and Hip share sequence and structural similarities, they are not functionally interchangeable. Moreover, a double-point mutation within the second DP-repeat unit of Hop that converts this to the sequence found in Hip disrupts Hop function; however, the corresponding mutation in Hip does not alter its function. We conclude that the DP repeats are important structural elements within a C-terminal domain, which is important for Hop and Hip function.
Analysis of secondary structural elements in human microRNA hairpin precursors.

PubMed

Liu, Biao; Childs-Disney, Jessica L; Znosko, Brent M; Wang, Dan; Fallahi, Mohammad; Gallo, Steven M; Disney, Matthew D

2016-03-01

MicroRNAs (miRNAs) regulate gene expression by targeting complementary mRNAs for destruction or translational repression. Aberrant expression of miRNAs has been associated with various diseases including cancer, thus making them interesting therapeutic targets. The composite of secondary structural elements that comprise miRNAs could aid the design of small molecules that modulate their function. We analyzed the secondary structural elements, or motifs, present in all human miRNA hairpin precursors and compared them to highly expressed human RNAs with known structures and other RNAs from various organisms. Amongst human miRNAs, there are 3808 unique motifs, many residing in processing sites. Further, we identified motifs in miRNAs that are not present in other highly expressed human RNAs, desirable targets for small molecules. MiRNA motifs were incorporated into a searchable database that is freely available. We also analyzed the most frequently occurring bulges and internal loops for each RNA class and found that the smallest loops possible prevail. However, the distribution of loops and the preferred closing base pairs were unique to each class. Collectively, we have completed a broad survey of motifs found in human miRNA precursors, highly expressed human RNAs, and RNAs from other organisms. Interestingly, unique motifs were identified in human miRNA processing sites, binding to which could inhibit miRNA maturation and hence function.
Phosphatidylinositol-4-kinase type II alpha contains an AP-3-sorting motif and a kinase domain that are both required for endosome traffic.

PubMed

Craige, Branch; Salazar, Gloria; Faundez, Victor

2008-04-01

The adaptor complex 3 (AP-3) targets membrane proteins from endosomes to lysosomes, lysosome-related organelles and synaptic vesicles. Phosphatidylinositol-4-kinase type II alpha (PI4KIIalpha) is one of several proteins possessing catalytic domains that regulate AP-3-dependent sorting. Here we present evidence that PI4KIIalpha uniquely behaves both as a membrane protein cargo as well as an enzymatic regulator of adaptor function. In fact, AP-3 and PI4KIIalpha form a complex that requires a dileucine-sorting motif present in PI4KIIalpha. Mutagenesis of either the PI4KIIalpha-sorting motif or its kinase-active site indicates that both are necessary to interact with AP-3 and properly localize PI4KIIalpha to LAMP-1-positive endosomes. Similarly, both the kinase activity and the sorting signal present in PI4KIIalpha are necessary to rescue endosomal PI4KIIalpha siRNA-induced mutant phenotypes. We propose a mechanism whereby adaptors use canonical sorting motifs to selectively recruit a regulatory enzymatic activity to restricted membrane domains.
The Sam-Sam interaction between Ship2 and the EphA2 receptor: design and analysis of peptide inhibitors.

PubMed

Mercurio, Flavia Anna; Di Natale, Concetta; Pirone, Luciano; Iannitti, Roberta; Marasco, Daniela; Pedone, Emilia Maria; Palumbo, Rosanna; Leone, Marilisa

2017-12-12

The lipid phosphatase Ship2 represents a drug discovery target for the treatment of different diseases, including cancer. Its C-terminal sterile alpha motif domain (Ship2-Sam) associates with the Sam domain from the EphA2 receptor (EphA2-Sam). This interaction is expected to mainly induce pro-oncogenic effects in cells; therefore, inhibition of the Ship2-Sam/EphA2-Sam complex may represent an innovative route to discover anti-cancer therapeutics. In the present work, we designed and analyzed several peptide sequences encompassing the interaction interface of EphA2-Sam for Ship2-Sam. Peptide conformational analyses and interaction assays with Ship2-Sam, conducted through diverse techniques (CD, NMR, SPR and MST), identified a positively charged penta-amino acid native motif in EphA2-Sam that, once repeated three times in tandem, binds Ship2-Sam. NMR experiments show that the peptide targets the negatively charged binding site of Ship2-Sam for EphA2-Sam. Preliminary in vitro cell-based assays indicate that, at 50 µM concentration, it induces necrosis of PC-3 prostate cancer cells with more cytotoxic effect on cancer cells than on normal dermal fibroblasts. This work represents a pioneering study that opens further opportunities for the development of inhibitors of the Ship2-Sam/EphA2-Sam complex for therapeutic applications.

Molecular cloning and characterization of human trabeculin-alpha, a giant protein defining a new family of actin-binding proteins.

PubMed

Sun, Y; Zhang, J; Kraeft, S K; Auclair, D; Chang, M S; Liu, Y; Sutherland, R; Salgia, R; Griffin, J D; Ferland, L H; Chen, L B

1999-11-19

We describe the molecular cloning and characterization of a novel giant human cytoplasmic protein, trabeculin-alpha (M(r) = 614,000). Analysis of the deduced amino acid sequence reveals homologies with several putative functional domains, including a pair of alpha-actinin-like actin binding domains; regions of homology to plakins at either end of the giant polypeptide; 29 copies of a spectrin-like motif in the central region of the protein; two potential Ca(2+)-binding EF-hand motifs; and a Ser-rich region containing a repeated GSRX motif. With similarities to both plakins and spectrins, trabeculin-alpha appears to have evolved as a hybrid of these two families of proteins. The functionality of the actin binding domains located near the N terminus was confirmed with an F-actin binding assay using glutathione S-transferase fusion proteins comprising amino acids 9-486 of the deduced peptide. Northern and Western blotting and immunofluorescence studies suggest that trabeculin is ubiquitously expressed and is distributed throughout the cytoplasm, though the protein was found to be greatly up-regulated upon differentiation of myoblasts into myotubes. Finally, the presence of cDNAs similar to, yet distinct from, trabeculin-alpha in both human and mouse suggests that trabeculins may form a new subfamily of giant actin-binding/cytoskeletal cross-linking proteins.
Cryo-EM near-atomic structure of a dsRNA fungal virus shows ancient structural motifs preserved in the dsRNA viral lineage

PubMed Central

Luque, Daniel; Gómez-Blanco, Josué; Garriga, Damiá; Brilot, Axel F.; González, José M.; Havens, Wendy M.; Carrascosa, José L.; Trus, Benes L.; Verdaguer, Nuria; Ghabrial, Said A.; Castón, José R.

2014-01-01

Viruses evolve so rapidly that sequence-based comparison is not suitable for detecting relatedness among distant viruses. Structure-based comparisons suggest that evolution led to a small number of viral classes or lineages that can be grouped by capsid protein (CP) folds. Here, we report that the CP structure of the fungal dsRNA Penicillium chrysogenum virus (PcV) shows the progenitor fold of the dsRNA virus lineage and suggests a relationship between lineages. Cryo-EM structure at near-atomic resolution showed that the 982-aa PcV CP is formed by a repeated α-helical core, indicative of gene duplication despite lack of sequence similarity between the two halves. Superimposition of secondary structure elements identified a single “hotspot” at which variation is introduced by insertion of peptide segments. Structural comparison of PcV and other distantly related dsRNA viruses detected preferential insertion sites at which the complexity of the conserved α-helical core, made up of ancestral structural motifs that have acted as a skeleton, might have increased, leading to evolution of the highly varied current structures. Analyses of structural motifs only apparent after systematic structural comparisons indicated that the hallmark fold preserved in the dsRNA virus lineage shares a long (spinal) α-helix tangential to the capsid surface with the head-tailed phage and herpesvirus viral lineage. PMID:24821769
A Small Molecule that Targets r(CGG)exp and Improves Defects in Fragile X-Associated Tremor Ataxia Syndrome

PubMed Central

Disney, Matthew D.; Liu, Biao; Yang, Wang-Yong; Sellier, Chantal; Tran, Tuan; Charlet-Berguerand, Nicolas; Childs-Disney, Jessica L.

2012-01-01

The development of small molecule chemical probes or therapeutics that target RNA remains a significant challenge despite the great interest in such compounds. The most significant barrier to compound development is a lack of knowledge of the chemical and RNA motif spaces that interact specifically. Herein, we describe a bioactive small molecule probe that targets expanded r(CGG) repeats, or r(CGG)exp, that causes Fragile X-associated Tremor Ataxia Syndrome (FXTAS). The compound was identified by using information on the chemotypes and RNA motifs that interact. Specifically, 9-hydroxy-5,11-dimethyl-2-(2-(piperidin-1-yl)ethyl)-6H-pyrido[4,3-b]carbazol-2-ium binds the 5’CGG/3’GGC motifs in r(CGG)exp and disrupts a toxic r(CGG)exp-protein complex in vitro. Structure-activity relationship (SAR) studies determined that the alkylated pyridyl and phenolic side chains are important chemotypes that drive molecular recognition to r(CGG)exp. Importantly, the compound is efficacious in FXTAS model cellular systems as evidenced by its ability to improve FXTAS-associated pre-mRNA splicing defects and to reduce the size and number of r(CGG)exp-protein aggregates. This approach may establish a general strategy to identify lead ligands that target RNA while also providing a chemical probe to dissect the varied mechanisms by which r(CGG)exp promotes toxicity. PMID:22948243
A small molecule that targets r(CGG)(exp) and improves defects in fragile X-associated tremor ataxia syndrome.

PubMed

Disney, Matthew D; Liu, Biao; Yang, Wang-Yong; Sellier, Chantal; Tran, Tuan; Charlet-Berguerand, Nicolas; Childs-Disney, Jessica L

2012-10-19

The development of small molecule chemical probes or therapeutics that target RNA remains a significant challenge despite the great interest in such compounds. The most significant barrier to compound development is defining which chemical and RNA motif spaces interact specifically. Herein, we describe a bioactive small molecule probe that targets expanded r(CGG) repeats, or r(CGG)(exp), that causes Fragile X-associated Tremor Ataxia Syndrome (FXTAS). The compound was identified by using information on the chemotypes and RNA motifs that interact. Specifically, 9-hydroxy-5,11-dimethyl-2-(2-(piperidin-1-yl)ethyl)-6H-pyrido[4,3-b]carbazol-2-ium binds the 5'CGG/3'GGC motifs in r(CGG)(exp) and disrupts a toxic r(CGG)(exp)-protein complex in vitro. Structure-activity relationship studies determined that the alkylated pyridyl and phenolic side chains are important chemotypes that drive molecular recognition of r(CGG)(exp). Importantly, the compound is efficacious in FXTAS model cellular systems as evidenced by its ability to improve FXTAS-associated pre-mRNA splicing defects and to reduce the size and number of r(CGG)(exp)-containing nuclear foci. This approach may establish a general strategy to identify lead ligands that target RNA while also providing a chemical probe to dissect the varied mechanisms by which r(CGG)(exp) promotes toxicity.
The Complete Mitochondrial Genome of the Rice Moth, Corcyra cephalonica

PubMed Central

Wu, Yu-Peng; Li, Jie; Zhao, Jin-Liang; Su, Tian-Juan; Luo, A-Rong; Fan, Ren-Jun; Chen, Ming-Chang; Wu, Chun-Sheng; Zhu, Chao-Dong

2012-01-01

The complete mitochondrial genome (mitogenome) of the rice moth, Corcyra cephalonica Stainton (Lepidoptera: Pyralidae), was determined to be a circular molecule of 15,273 bp. The mitogenome composition (37 genes) and gene order are the same as in other lepidopterans. Nucleotide composition of the C. cephalonica mitogenome is highly A+T biased (80.43%), as in other insects. Twelve protein-coding genes start with a typical ATN codon; the exception is the cox1 gene, which uses CGA as the initial codon. Nine protein-coding genes have the common stop codon TAA, while nad2, cox1, cox2, and nad4 have a single T as an incomplete stop codon. The 22 tRNA genes showed cloverleaf secondary structure. The mitogenome has several large intergenic spacer regions, including spacer 1 between the trnQ and nad2 genes, which is common in Lepidoptera. Spacer 3, between trnE and trnF, includes the microsatellite-like repeat regions (AT)18 and (TTAT)3. Spacer 4 (16 bp), between the trnS2 and nad1 genes, has the motif ATACTAT; another species, Sesamia inferens, encodes ATCATAT at the same position, while other lepidopteran insects encode a similar ATACTAA motif. Spacer 6 is an A+T-rich region that includes the motif ATAGA, a 20-bp poly(T) stretch, and two microsatellite elements, (AT)9 and (AT)8. PMID:23413968
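The A+T bias and the repeat shorthand used in this abstract can be made concrete with a short sketch. It assumes only a plain DNA string as input; the function names `at_content` and `expand_repeat` are hypothetical, and no sequence data from the study is reproduced here.

```python
import re

def at_content(seq):
    """Percentage of A and T among the unambiguous bases of seq."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        return 0.0
    return 100.0 * sum(b in "AT" for b in counted) / len(counted)

def expand_repeat(notation):
    """Expand '(AT)18' or '(TTAT)(3)' into the literal repeat sequence."""
    m = re.fullmatch(r"\(([ACGT]+)\)\(?(\d+)\)?", notation)
    if m is None:
        raise ValueError("unrecognised repeat notation: %r" % notation)
    return m.group(1) * int(m.group(2))

if __name__ == "__main__":
    repeat = expand_repeat("(AT)18")
    print(len(repeat), at_content(repeat))
```

Applied to a full mitogenome sequence, `at_content` would give the kind of overall figure quoted above (80.43% for C. cephalonica).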
The complete mitochondrial genome of the rice moth, Corcyra cephalonica.

PubMed

Wu, Yu-Peng; Li, Jie; Zhao, Jin-Liang; Su, Tian-Juan; Luo, A-Rong; Fan, Ren-Jun; Chen, Ming-Chang; Wu, Chun-Sheng; Zhu, Chao-Dong

2012-01-01

The complete mitochondrial genome (mitogenome) of the rice moth, Corcyra cephalonica Stainton (Lepidoptera: Pyralidae), was determined to be a circular molecule of 15,273 bp. The mitogenome composition (37 genes) and gene order are the same as in other lepidopterans. Nucleotide composition of the C. cephalonica mitogenome is highly A+T biased (80.43%), as in other insects. Twelve protein-coding genes start with a typical ATN codon; the exception is the cox1 gene, which uses CGA as the initial codon. Nine protein-coding genes have the common stop codon TAA, while nad2, cox1, cox2, and nad4 have a single T as an incomplete stop codon. The 22 tRNA genes showed cloverleaf secondary structure. The mitogenome has several large intergenic spacer regions, including spacer 1 between the trnQ and nad2 genes, which is common in Lepidoptera. Spacer 3, between trnE and trnF, includes the microsatellite-like repeat regions (AT)18 and (TTAT)(3). Spacer 4 (16 bp), between the trnS2 and nad1 genes, has the motif ATACTAT; another species, Sesamia inferens, encodes ATCATAT at the same position, while other lepidopteran insects encode a similar ATACTAA motif. Spacer 6 is an A+T-rich region that includes the motif ATAGA, a 20-bp poly(T) stretch, and two microsatellite elements, (AT)(9) and (AT)(8).
Organisation of the plant genome in chromosomes.

PubMed

Heslop-Harrison, J S Pat; Schwarzacher, Trude

2011-04-01

The plant genome is organized into chromosomes that provide the structure for the genetic linkage groups and allow faithful replication, transcription and transmission of the hereditary information. Genome sizes in plants are remarkably diverse, with a 2350-fold range from 63 to 149,000 Mb, divided into n=2 to n= approximately 600 chromosomes. Despite this huge range, structural features of chromosomes like centromeres, telomeres and chromatin packaging are well-conserved. The smallest genomes consist of mostly coding and regulatory DNA sequences present in low copy, along with highly repeated rDNA (rRNA genes and intergenic spacers), centromeric and telomeric repetitive DNA and some transposable elements. The larger genomes have similar numbers of genes, with abundant tandemly repeated sequence motifs, and transposable elements alone represent more than half the DNA present. Chromosomes evolve by fission, fusion, duplication and insertion events, allowing evolution of chromosome size and chromosome number. A combination of sequence analysis, genetic mapping and molecular cytogenetic methods with comparative analysis, all only becoming widely available in the 21st century, is elucidating the exact nature of the chromosome evolution events at all timescales, from the base of the plant kingdom, to intraspecific or hybridization events associated with recent plant breeding. As well as being of fundamental interest, understanding and exploiting evolutionary mechanisms in plant genomes is likely to be a key to crop development for food production. © 2011 The Authors. The Plant Journal © 2011 Blackwell Publishing Ltd.
A kinesin-1 binding motif in vaccinia virus that is widespread throughout the human genome

PubMed Central

Dodding, Mark P; Mitter, Richard; Humphries, Ashley C; Way, Michael

2011-01-01

Transport of cargoes by kinesin-1 is essential for many cellular processes. Nevertheless, the number of proteins known to recruit kinesin-1 via its cargo binding light chain (KLC) is still quite small. We also know relatively little about the molecular features that define kinesin-1 binding. We now show that a bipartite tryptophan-based kinesin-1 binding motif, originally identified in Calsyntenin is present in A36, a vaccinia integral membrane protein. This bipartite motif in A36 is required for kinesin-1-dependent transport of the virus to the cell periphery. Bioinformatic analysis reveals that related bipartite tryptophan-based motifs are present in over 450 human proteins. Using vaccinia as a surrogate cargo, we show that regions of proteins containing this motif can function to recruit KLC and promote virus transport in the absence of A36. These proteins interact with the kinesin light chain outside the context of infection and have distinct preferences for KLC1 and KLC2. Our observations demonstrate that KLC binding can be conferred by a common set of features that are found in a wide range of proteins associated with diverse cellular functions and human diseases. PMID:21915095
Cave acoustics in prehistory: Exploring the association of Palaeolithic visual motifs and acoustic response.

PubMed

Fazenda, Bruno; Scarre, Chris; Till, Rupert; Pasalodos, Raquel Jiménez; Guerra, Manuel Rojo; Tejedor, Cristina; Peredo, Roberto Ontañón; Watson, Aaron; Wyatt, Simon; Benito, Carlos García; Drinkall, Helen; Foulds, Frederick

2017-09-01

During the 1980s, acoustic studies of Upper Palaeolithic imagery in French caves, using the technology then available, suggested a relationship between acoustic response and the location of visual motifs. This paper presents an investigation, using modern acoustic measurement techniques, into such relationships within the caves of La Garma, Las Chimeneas, La Pasiega, El Castillo, and Tito Bustillo in Northern Spain. It addresses methodological issues concerning acoustic measurement at enclosed archaeological sites and outlines a general framework for extraction of acoustic features that may be used to support archaeological hypotheses. The analysis explores possible associations between the position of visual motifs (which may be up to 40 000 yrs old) and localized acoustic responses. Results suggest that motifs, in general, and lines and dots, in particular, are statistically more likely to be found in places where reverberation is moderate and where the low frequency acoustic response has evidence of resonant behavior. The work presented suggests that an association of the location of Palaeolithic motifs with acoustic features is a statistically weak but tenable hypothesis, and that an appreciation of sound could have influenced behavior among Palaeolithic societies of this region.
RNA Bricks—a database of RNA 3D motifs and their interactions

PubMed Central

Chojnowski, Grzegorz; Waleń, Tomasz; Bujnicki, Janusz M.

2014-01-01

The RNA Bricks database (http://iimcb.genesilico.pl/rnabricks) stores information about recurrent RNA 3D motifs and their interactions, found in experimentally determined RNA structures and in RNA–protein complexes. In contrast to other similar tools (RNA 3D Motif Atlas, RNA Frabase, Rloom), RNA motifs, i.e. ‘RNA bricks’, are presented in the molecular environment in which they were determined, including RNA, protein, metal ions, water molecules and ligands. All nucleotide residues in RNA bricks are annotated with structural quality scores that describe real-space correlation coefficients with the electron density data (if available), backbone geometry and possible steric conflicts, which can be used to identify poorly modeled residues. The database is also equipped with an algorithm for 3D motif search and comparison. The algorithm compares spatial positions of backbone atoms of the user-provided query structure and of stored RNA motifs, without relying on sequence or secondary structure information. This enables the identification of local structural similarities among evolutionarily related and unrelated RNA molecules. In addition, the search utility enables searching ‘RNA bricks’ according to sequence similarity, and makes it possible to identify motifs with modified ribonucleotide residues at specific positions. PMID:24220091
GPUmotif: An Ultra-Fast and Energy-Efficient Motif Analysis Program Using Graphics Processing Units

PubMed Central

Zandevakili, Pooya; Hu, Ming; Qin, Zhaohui

2012-01-01

Computational detection of TF binding patterns has become an indispensable tool in functional genomics research. With the rapid advance of new sequencing technologies, large amounts of protein-DNA interaction data have been produced. Analyzing this data can provide substantial insight into the mechanisms of transcriptional regulation. However, the massive amount of sequence data presents daunting challenges. In our previous work, we have developed a novel algorithm called Hybrid Motif Sampler (HMS) that enables more scalable and accurate motif analysis. Despite much improvement, HMS is still time-consuming due to the requirement to calculate matching probabilities position-by-position. Using the NVIDIA CUDA toolkit, we developed a graphics processing unit (GPU)-accelerated motif analysis program named GPUmotif. We proposed a "fragmentation" technique to hide data transfer time between memories. Performance comparison studies showed that commonly-used model-based motif scan and de novo motif finding procedures such as HMS can be dramatically accelerated when running GPUmotif on NVIDIA graphics cards. As a result, energy consumption can also be greatly reduced when running motif analysis using GPUmotif. The GPUmotif program is freely available at http://sourceforge.net/projects/gpumotif/ PMID:22662128
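To make the "position-by-position" cost mentioned in the abstract concrete, the following is a minimal Python sketch of a model-based motif scan on the CPU. It is not the GPUmotif or HMS code: the function name `scan_pwm`, the three-column position weight matrix and the uniform background of 0.25 are all illustrative assumptions.

```python
import math

def scan_pwm(seq, pwm, background=0.25):
    """Return the log-odds score of the motif at every window of seq.

    pwm is a list of dicts (one per motif column) mapping base -> probability.
    A uniform background frequency is assumed for simplicity.
    """
    width = len(pwm)
    scores = []
    for i in range(len(seq) - width + 1):      # every start position
        score = 0.0
        for j in range(width):                 # every motif column
            score += math.log2(pwm[j][seq[i + j]] / background)
        scores.append(score)
    return scores

# Hypothetical 3-column motif that prefers "ACG".
pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
scores = scan_pwm("TTACGTT", pwm)
best = max(range(len(scores)), key=scores.__getitem__)  # index of best window
```

Each window is scored independently of the others, which is the property that makes this kind of scan a natural candidate for parallel hardware.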
Discovery of candidate KEN-box motifs using cell cycle keyword enrichment combined with native disorder prediction and motif conservation.

PubMed

Michael, Sushama; Travé, Gilles; Ramu, Chenna; Chica, Claudia; Gibson, Toby J

2008-02-15

KEN-box-mediated target selection is one of the mechanisms used in the proteasomal destruction of mitotic cell cycle proteins via the APC/C complex. While annotating the Eukaryotic Linear Motif resource (ELM, http://elm.eu.org/), we found that KEN motifs were significantly enriched in human protein entries with cell cycle keywords in the UniProt/Swiss-Prot database, implying that KEN-boxes might be more common than reported. Matches to short linear motifs in protein database searches are not, per se, significant. KEN-box enrichment with cell cycle Gene Ontology terms suggests that collectively these motifs are functional but does not prove that any given instance is so. Candidates were surveyed for native disorder prediction using GlobPlot and IUPred and for motif conservation in homologues. Among >25 strong new candidates, the most notable are human HIPK2, CHFR, CDC27, Dab2, Upf2, kinesin Eg5, DNA Topoisomerase 1 and yeast Cdc5 and Swi5. A similar number of weaker candidates were present. These proteins have yet to be tested for APC/C targeted destruction, providing potential new avenues of research.
De novo discovery of structural motifs in RNA 3D structures through clustering.

PubMed

Ge, Ping; Islam, Shahidul; Zhong, Cuncong; Zhang, Shaojie

2018-05-18

As functional components in three-dimensional (3D) conformation of an RNA, the RNA structural motifs provide an easy way to associate the molecular architectures with their biological mechanisms. In the past years, many computational tools have been developed to search motif instances by using the existing knowledge of well-studied families. Recently, with the rapidly increasing number of resolved RNA 3D structures, there is an urgent need to discover novel motifs with the newly presented information. In this work, we classify all the loops in non-redundant RNA 3D structures to detect plausible RNA structural motif families by using a clustering pipeline. Compared with other clustering approaches, our method has two benefits: first, the underlying alignment algorithm is tolerant to the variations in 3D structures. Second, sophisticated downstream analysis has been performed to ensure the clusters are valid and easily applied to further research. The final clustering results contain many interesting new variants of known motif families, such as GNAA tetraloop, kink-turn, sarcin-ricin and T-loop. We have also discovered potential novel functional motifs conserved in ribosomal RNA, sgRNA, SRP RNA, riboswitch and ribozyme.
GPUmotif: an ultra-fast and energy-efficient motif analysis program using graphics processing units.

PubMed

Zandevakili, Pooya; Hu, Ming; Qin, Zhaohui

2012-01-01

Computational detection of TF binding patterns has become an indispensable tool in functional genomics research. With the rapid advance of new sequencing technologies, large amounts of protein-DNA interaction data have been produced. Analyzing this data can provide substantial insight into the mechanisms of transcriptional regulation. However, the massive amount of sequence data presents daunting challenges. In our previous work, we have developed a novel algorithm called Hybrid Motif Sampler (HMS) that enables more scalable and accurate motif analysis. Despite much improvement, HMS is still time-consuming due to the requirement to calculate matching probabilities position-by-position. Using the NVIDIA CUDA toolkit, we developed a graphics processing unit (GPU)-accelerated motif analysis program named GPUmotif. We proposed a "fragmentation" technique to hide data transfer time between memories. Performance comparison studies showed that commonly-used model-based motif scan and de novo motif finding procedures such as HMS can be dramatically accelerated when running GPUmotif on NVIDIA graphics cards. As a result, energy consumption can also be greatly reduced when running motif analysis using GPUmotif. The GPUmotif program is freely available at http://sourceforge.net/projects/gpumotif/
Transcription factor ThWRKY4 binds to a novel WLS motif and a RAV1A element in addition to the W-box to regulate gene expression.

PubMed

Xu, Hongyun; Shi, Xinxin; Wang, Zhibo; Gao, Caiqiu; Wang, Chao; Wang, Yucheng

2017-08-01

WRKY transcription factors play important roles in many biological processes, and mainly bind to the W-box element to regulate gene expression. Previously, we characterized a WRKY gene from Tamarix hispida, ThWRKY4, in response to abiotic stress, and showed that it bound to the W-box motif. However, whether ThWRKY4 could bind to other motifs remains unknown. In this study, we employed a Transcription Factor-Centered Yeast one Hybrid (TF-Centered Y1H) screen to study the motifs recognized by ThWRKY4. In addition to the W-box core cis-element (termed W-box), we identified that ThWRKY4 could bind to two other motifs: the RAV1A element (CAACA) and a novel motif with sequence of GTCTA (W-box like sequence, WLS). The distributions of these motifs were screened in the promoter regions of genes regulated by some WRKYs. The results showed that the W-box, RAV1A, and WLS motifs were all present in high numbers, suggesting that they play key roles in gene expression mediated by WRKYs. Furthermore, five WRKY proteins from different WRKY subfamilies in Arabidopsis thaliana were selected and confirmed to bind to the RAV1A and WLS motifs, indicating that they are recognized commonly by WRKYs. These findings will help to further reveal the functions of WRKY proteins. Copyright © 2017 Elsevier B.V. All rights reserved.
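The promoter screen described above can be sketched as a simple motif count. The Python below is a minimal illustration, not the authors' pipeline: it uses only the two motif sequences given in the abstract (RAV1A, CAACA; WLS, GTCTA), counts both strands (an assumption), and the promoter string is invented for the example.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def count_motif(promoter, motif):
    """Count occurrences of motif on both strands, overlaps allowed."""
    promoter = promoter.upper()
    total = 0
    # A set avoids double counting when a motif is its own reverse complement.
    for pattern in {motif, reverse_complement(motif)}:
        start = promoter.find(pattern)
        while start != -1:
            total += 1
            start = promoter.find(pattern, start + 1)
    return total

# Hypothetical promoter fragment, made up for illustration only.
promoter = "AACAACAGGTCTATTTAGACTGTTG"
counts = {name: count_motif(promoter, seq)
          for name, seq in {"RAV1A": "CAACA", "WLS": "GTCTA"}.items()}
```

Counting both strands is a design choice here; a strand-specific screen would simply drop the reverse-complement pattern.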
Unique molecular architecture of silk fibroin in the waxmoth, Galleria mellonella.

PubMed

Zurovec, Michal; Sehnal, Frantisek

2002-06-21

Proteins of silk fibers are characterized by reiterations of amino acid repeats. Physical properties of the fiber are determined by the amino acid composition, the complexity of repetitive units, and arrangement of these units into higher order arrays. Except for very short motifs of 6-10 residues, the length of repetitive units and the number of these units concatenated in higher order assemblies vary in all spider and lepidopteran silks analyzed so far. This paper describes an exceptional silk protein represented by the 500-kDa heavy chain fibroin (H-fibroin) of the waxmoth, Galleria mellonella. Its non-repetitive N-terminal (175 residues) and C-terminal (60 residues) parts, the overall gene organization, and the nucleotide sequence around the TATA box show that it is homologous to the H-fibroins of other Lepidoptera. However, over 95% of the protein consists of highly ordered repetitive structures that are unmatched in other species. The repetitive region includes 11 assemblies AB(1)AB(1)AB(1)AB(2)(AB(2))AB(2) of remarkably conserved polypeptide repeats A (63 amino acid residues), B(1) (43 residues), and B(2) (18 residues). The repeats contain a high proportion of Gly (31.6%), Ala (23.8%), Ser (18.1%), and of residues with long hydrophobic side chains (16% for Leu, Ile, and Val combined). The presence of the GLGGLG and SSAASAA(AA) motifs suggests formation of pleated beta-sheets and their stacking into crystallites. Conspicuous conservation of the apolar sequence VIVI followed by DD or ED is interpreted as indicating the importance of hydrophobicity and electrostatic charge in H-fibroin cross-linking. The environment of G. mellonella larvae within bee cultures requires continuous production of silk that must be both strong and elastic. The spectacular arrangement of the repetitive H-fibroin region apparently evolved to meet these requirements.
De novo Transcriptome Sequencing Reveals a Considerable Bias in the Incidence of Simple Sequence Repeats towards the Downstream of ‘Pre-miRNAs’ of Black Pepper

PubMed Central

Joy, Nisha; Asha, Srinivasan; Mallika, Vijayan; Soniya, Eppurathu Vasudevan

2013-01-01

Next generation sequencing has an advantage on transformational development of species with limited available sequence data as it helps to decode the genome and transcriptome. We carried out the de novo sequencing using Illumina HiSeq™ 2000 to generate the first leaf transcriptome of black pepper (Piper nigrum L.), an important spice variety native to South India and also grown in other tropical regions. Despite the economic and biochemical importance of pepper, a scientifically rigorous study at the molecular level is far from complete due to lack of sufficient sequence information and cytological complexity of its genome. The 55 million raw reads obtained, when assembled using the Trinity program, generated 223,386 contigs and 128,157 unigenes. Reports suggest that the repeat-rich genomic regions give rise to small non-coding functional RNAs. MicroRNAs (miRNAs) are the most abundant type of non-coding regulatory RNAs. In spite of the widespread research on miRNAs, little is known about the hair-pin precursors of miRNAs bearing Simple Sequence Repeats (SSRs). We used the array of transcripts generated, for the in silico prediction and detection of ‘43 pre-miRNA candidates bearing different types of SSR motifs’. The analysis identified 3913 different types of SSR motifs with an average of one SSR per 3.04 MB of the transcriptome. About 0.033% of the transcriptome constituted ‘pre-miRNA candidates bearing SSRs’. The abundance, type and distribution of SSR motifs studied across the hair-pin miRNA precursors, showed a significant bias in the position of SSRs towards the downstream of predicted ‘pre-miRNA candidates’. The catalogue of transcripts identified, together with the demonstration of reliable existence of SSRs in the miRNA precursors, permits future opportunities for understanding the genetic mechanism of black pepper and likely functions of ‘tandem repeats’ in miRNAs. PMID:23469176
A multiplex PCR system for 13 RM Y-STRs with separate amplification of two different repeat motif structures in DYF403S1a.

PubMed

Lee, Eun Young; Lee, Hwan Young; Kwon, So Yeun; Oh, Yu Na; Yang, Woo Ick; Shin, Kyoung-Jin

2017-01-01

In forensic science and human genetics, Y-chromosomal short tandem repeats (Y-STRs) have been used as very useful markers. Recently, more Y-STR markers have been analyzed to enhance the resolution power in haplotype analysis, and 13 rapidly mutating (RM) Y-STRs have been suggested as revolutionary tools that can widen Y-chromosomal application from paternal lineage differentiation to male individualization. We have constructed two multiplex PCR sets for the amplification of 13 RM Y-STRs, which yield small-sized amplicons (<400bp) and a more balanced PCR efficiency with minimum PCR cycling. In particular, with the developed multiplex PCR system, we could separate three copies of DYF403S1a into two copies of DYF403S1a and one of DYF403S1b1. This is because DYF403S1b1 possesses distinguishable sequences from DYF403S1a at both the front and rear flanking regions of the repeat motif; therefore, the locus could be separately amplified using sequence-specific primers. In addition, the other copy, defined as DYF403S1b by Ballantyne et al., was renamed DYF403S1b2 because of its similar flanking region sequence to DYF403S1b1. By redefining DYF403S1 with the developed multiplex system, all genotypes of four copies could be successfully typed and more diverse haplotypes were obtained. We analyzed haplotype distributions in 705 Korean males based on four different Y-STR subsets: Yfiler, PowerPlex Y23, Yfiler Plus, and RM Y-STRs. All haplotypes obtained from RM Y-STRs were the most diverse and showed strong discriminatory power in Korean population. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
In vitro characterization of the RS motif in N-terminal head domain of goldfish germinal vesicle lamin B3 necessary for phosphorylation of the p34cdc2 target serine by SRPK1

PubMed Central

Yamaguchi, Akihiko; Iwatani, Miho; Ogawa, Mariko; Kitano, Hajime; Matsuyama, Michiya

2013-01-01

The nuclear envelopes surrounding the oocyte germinal vesicles of lower vertebrates (fish and frog) are supported by the lamina, which consists of the protein lamin B3 encoded by a gene found also in birds but lost in the lineage leading to mammals. Like other members of the lamin family, goldfish lamin B3 (gfLB3) contains two putative consensus phosphoacceptor p34cdc2 sites (Ser-28 and Ser-398) for the M-phase kinase to regulate lamin polymerization on the N- and C-terminal regions flanking a central rod domain. Partial phosphorylation of gfLB3 occurs on Ser-28 in the N-terminal head domain in immature oocytes prior to germinal vesicle breakdown, which suggests continual rearrangement of lamins by a novel lamin kinase in fish oocytes. We applied the expression-screening method to isolate lamin kinases by using phosphorylation site Ser-28-specific monoclonal antibody and a vector encoding substrate peptides from a goldfish ovarian cDNA library. As a result, SRPK1 was screened as a prominent lamin kinase candidate. The gfLB3 has a short stretch of RS repeats (9-SRASTVRSSRRS-20) upstream of Ser-28, within the N-terminal head. This stretch of repeats is conserved among fish lamin B3 but is not found in other lamins. In vitro phosphorylation studies and GST pull-down assay revealed that SRPK1 bound to the region of sequential RS repeats (9–20) with affinity and recruited serine into the active site in a grab-and-pull manner. These results indicate SRPK1 may phosphorylate the p34cdc2 site in the N-terminal head of GV-lamin B3 at the RS motifs, which have the general property of aggregation. PMID:23772390
Extensive T-Cell Epitope Repertoire Sharing among Human Proteome, Gastrointestinal Microbiome, and Pathogenic Bacteria: Implications for the Definition of Self

PubMed Central

Bremel, Robert D.; Homan, E. Jane

2015-01-01

T-cell receptor binding to MHC-bound peptides plays a key role in discrimination between self and non-self. Only a subset, typically a pentamer, of amino acids in a MHC-bound peptide form the motif exposed to the T-cell receptor. We categorize and compare the T-cell exposed amino acid motif repertoire of the total proteomes of two groups of bacteria, comprising pathogens and gastrointestinal microbiome organisms, with the human proteome and immunoglobulins. Given the maximum of 20^5, or 3.2 million, such motifs that bind T-cell receptors, there is considerable overlap in motif usage. We show that the human proteome, exclusive of immunoglobulins, only comprises three quarters of the possible motifs, of which 65.3% are also present in both composite bacterial proteomes. Very few motifs are unique to the human proteome. Immunoglobulin variable regions carry a broad diversity of T-cell exposed motifs (TCEMs) that provides a stratified random sample of the motifs found in pathogens, microbiome, and the human proteome. Individual bacterial genera and species vary in the content of immunoglobulin and human proteome matched motifs that they carry. Mycobacteria and Burkholderia spp carry a particularly high content of such matched motifs. Some bacteria retain a unique motif signature and motif sharing pattern with the human proteome. The implication is that distinguishing self from non-self does not depend on individual TCEMs, but on a complex and dynamic overlay of signals wherein the same TCEM may play different roles in different organisms, and the frequency with which a particular TCEM appears influences its effect. The patterns observed provide clues to bacterial immune evasion and to strategies for intervention, including vaccine design. The breadth and distinct frequency patterns of the immunoglobulin-derived peptides suggest a role of immunoglobulins in maintaining a broadly responsive T-cell repertoire. PMID:26557118
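The size of the motif space quoted above follows from counting pentamers over the 20 standard amino acids. A short Python check of that arithmetic, with the derived counts treated as rough figures because "three quarters" is a rounded value in the abstract:

```python
AMINO_ACIDS = 20     # standard amino acid alphabet
MOTIF_LENGTH = 5     # pentamer exposed to the T-cell receptor

motif_space = AMINO_ACIDS ** MOTIF_LENGTH          # all possible pentamers

# Rounded shares taken from the abstract; the resulting counts are estimates.
human_motifs = round(motif_space * 0.75)           # "three quarters"
shared_with_bacteria = round(human_motifs * 0.653) # "65.3%" of the human set

print(f"{motif_space:,} possible motifs")
print(f"~{human_motifs:,} in the human proteome")
print(f"~{shared_with_bacteria:,} of those also in both bacterial groups")
```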

Methods and compositions for targeting macromolecules into the nucleus

DOEpatents

Chook, Yuh Min

2013-06-25

The present invention includes compositions, methods and kits for directing an agent across the nuclear membrane of a cell. The present invention includes a Karyopherin beta2 translocation motif in a polypeptide having a slightly positively charged region or a slightly hydrophobic region and one or more R/K/H-X(2-5)-P-Y motifs. The polypeptide targets the agent into the cell nucleus.
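Reading the motif as one of R, K or H, followed by two to five arbitrary residues, followed by P-Y, it can be written as a regular expression. The Python below is a hedged sketch of that reading only; the function name and the test peptide are hypothetical, and the charged or hydrophobic flanking region mentioned in the abstract is not modelled.

```python
import re

# [RKH], then 2-5 residues of any kind, then PY.
# The lookahead lets overlapping candidates at different start positions
# all be reported; one (the longest) match is kept per start position.
PY_MOTIF = re.compile(r"(?=([RKH].{2,5}PY))")

def find_py_motifs(protein):
    """Return (start index, matched text) for each motif candidate."""
    return [(m.start(), m.group(1))
            for m in PY_MOTIF.finditer(protein.upper())]

# Hypothetical peptide, invented for illustration.
hits = find_py_motifs("MSGRAAGPYLK")
```

A match here only flags a candidate site in the sequence; it says nothing about whether the site is functional.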
Encryption of agonistic motifs for TLR4 into artificial antigens augmented the maturation of antigen-presenting cells

PubMed Central

Hayashi, Kazumi; Minamisawa, Tamiko; Homma, Sadamu; Koido, Shigeo; Shiba, Kiyotaka

2017-01-01

Adjuvants are indispensable for achieving a sufficient immune response from vaccinations. From a functional viewpoint, adjuvants are classified into two categories: “physical adjuvants” increase the efficacy of antigen presentation by antigen-presenting cells (APC) and “signal adjuvants” induce the maturation of APC. Our previous study has demonstrated that a physical adjuvant can be encrypted into proteinous antigens by creating artificial proteins from combinatorial assemblages of epitope peptides and those peptide sequences having propensities to form certain protein structures (motif programming). However, the artificial antigens still require a signal adjuvant to maturate the APC; for example, co-administration of the Toll-like receptor 4 (TLR4) agonist monophosphoryl lipid A (MPLA) was required to induce an in vivo immunoreaction. In this study, we further modified the previous artificial antigens by appending the peptide motifs, which have been reported to have agonistic activity for TLR4, to create “adjuvant-free” antigens. The created antigens with triple TLR4 agonistic motifs in their C-terminus have activated NF-κB signaling pathways through TLR4. These proteins also induced the production of the inflammatory cytokine TNF-α, and the expression of the co-stimulatory molecule CD40 in APC, supporting the maturation of APC in vitro. Unexpectedly, these signal adjuvant-encrypted proteins have lost their ability to be physical adjuvants because they did not induce cytotoxic T lymphocytes (CTL) in vivo, while the parental proteins induced CTL. These results confirmed that the manifestation of a motif’s function is context-dependent and simple addition does not always work for motif-programing. Further optimization of the molecular context of the TLR4 agonistic motifs in antigens should be required to create adjuvant-free antigens. PMID:29190754
A Noncoding Expansion in EIF4A3 Causes Richieri-Costa-Pereira Syndrome, a Craniofacial Disorder Associated with Limb Defects

PubMed Central

Favaro, Francine P.; Alvizi, Lucas; Zechi-Ceide, Roseli M.; Bertola, Debora; Felix, Temis M.; de Souza, Josiane; Raskin, Salmo; Twigg, Stephen R.F.; Weiner, Andrea M.J.; Armas, Pablo; Margarit, Ezequiel; Calcaterra, Nora B.; Andersen, Gregers R.; McGowan, Simon J.; Wilkie, Andrew O.M.; Richieri-Costa, Antonio; de Almeida, Maria L.G.; Passos-Bueno, Maria Rita

2014-01-01

Richieri-Costa-Pereira syndrome is an autosomal-recessive acrofacial dysostosis characterized by mandibular median cleft associated with other craniofacial anomalies and severe limb defects. Learning and language disabilities are also prevalent. We mapped the mutated gene to a 122 kb region at 17q25.3 through identity-by-descent analysis in 17 genealogies. Sequencing strategies identified an expansion of a region with several repeats of 18- or 20-nucleotide motifs in the 5′ untranslated region (5′ UTR) of EIF4A3, which contained from 14 to 16 repeats in the affected individuals and from 3 to 12 repeats in 520 healthy individuals. A missense substitution of a highly conserved residue likely to affect the interaction of eIF4AIII with the UPF3B subunit of the exon junction complex in trans with an expanded allele was found in an unrelated individual with an atypical presentation, thus expanding mutational mechanisms and phenotypic diversity of RCPS. EIF4A3 transcript abundance was reduced in both white blood cells and mesenchymal cells of RCPS-affected individuals as compared to controls. Notably, targeting the orthologous eif4a3 in zebrafish led to underdevelopment of several craniofacial cartilage and bone structures, in agreement with the craniofacial alterations seen in RCPS. Our data thus suggest that RCPS is caused by mutations in EIF4A3 and show that EIF4A3, a gene involved in RNA metabolism, plays a role in mandible, laryngeal, and limb morphogenesis. PMID:24360810
Physical organisation of simple sequence repeats (SSRs) in Triticeae: structural, functional and evolutionary implications.

PubMed

Cuadrado, A; Cardoso, M; Jouve, N

2008-01-01

A significant fraction of the nuclear DNA of all eukaryotes is occupied by simple sequence repeats (SSRs) or microsatellites. This type of sequence has sparked great interest as a means of studying genetic variation, linkage mapping, gene tagging and evolution. Although SSRs at different positions in a gene help determine the regulation of expression and the function of the protein produced, little attention has been paid to the chromosomal organisation and distribution of these sequences, even in model species. This review discusses the main achievements in the characterisation of long-range SSR organisation in the chromosomes of Triticum aestivum L., Secale cereale L., and Hordeum vulgare L. (all members of Triticeae). We have detected SSRs using an improved FISH technique based on the random primer labelling of synthetic oligonucleotides (15-24 bases) in multi-colour experiments. Detailed information on the presence and distribution of AC, AG and all the possible classes of trinucleotide repeats has been acquired. These data have revealed the motif-dependent and non-random chromosome distributions of SSRs in the different genomes, and allowed the correlation of particular SSRs with chromosome areas characterised by specific features (e.g., heterochromatin, euchromatin and centromeres) in all three species. The present review provides a detailed comparative study of the distribution of these SSRs in each of the seven chromosomes of the genomes A, B and D of wheat, H of barley and R of rye. The importance of SSRs in plant breeding and their possible role in chromosome structure, function and evolution is discussed. 2008 S. Karger AG, Basel
Survey and Analysis of Microsatellites in the Silkworm, Bombyx mori

PubMed Central

Prasad, M. Dharma; Muthulakshmi, M.; Madhu, M.; Archak, Sunil; Mita, K.; Nagaraju, J.

2005-01-01

We studied microsatellite frequency and distribution in 21.76-Mb random genomic sequences, 0.67-Mb BAC sequences from the Z chromosome, and 6.3-Mb EST sequences of Bombyx mori. We mined microsatellites of ≥15 bases of mononucleotide repeats and ≥5 repeat units of other classes of repeats. We estimated that microsatellites account for 0.31% of the genome of B. mori. Microsatellite tracts of A, AT, and ATT were the most abundant whereas their number drastically decreased as the length of the repeat motif increased. In general, tri- and hexanucleotide repeats were overrepresented in the transcribed sequences except TAA, GTA, and TGA, which were in excess in genomic sequences. The Z chromosome sequences contained shorter repeat types than the rest of the chromosomes in addition to a higher abundance of AT-rich repeats. Our results showed that base composition of the flanking sequence has an influence on the origin and evolution of microsatellites. Transitions/transversions were high in microsatellites of ESTs, whereas the genomic sequence had an equal number of substitutions and indels. The average heterozygosity value for 23 polymorphic microsatellite loci surveyed in 13 diverse silkmoth strains having 2–14 alleles was 0.54. Only 36 (18.2%) of 198 microsatellite loci were polymorphic between the two divergent silkworm populations and 10 (5%) loci revealed null alleles. The microsatellite map generated using these polymorphic markers resulted in 8 linkage groups. B. mori microsatellite loci were the most conserved in its immediate ancestor, B. mandarina, followed by the wild saturniid silkmoth, Antheraea assama. PMID:15371363
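The mining thresholds stated above (mononucleotide runs of at least 15 bases, at least 5 repeat units for other classes) can be sketched with regular expressions. This Python example is an illustration, not the authors' software: it finds perfect repeats only, and the maximum unit length of 6 and the function name are assumptions.

```python
import re

def find_ssrs(seq, max_unit=6, min_mono_len=15, min_units=5):
    """Return sorted (start, unit, copies) for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k in range(1, max_unit + 1):
        min_copies = min_mono_len if k == 1 else min_units
        # One unit of length k, then at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are repeats of a shorter unit, so that
            # "ATAT" is reported as (AT)n rather than (ATAT)n.
            if any(unit == unit[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return sorted(hits)

# Made-up sequence: (A)15, (AT)6 and (CAG)5 pass; the final (AT)3 does not.
example = "GG" + "A" * 15 + "C" + "AT" * 6 + "C" + "CAG" * 5 + "T" + "AT" * 3
ssrs = find_ssrs(example)
```

Dividing the total length of the reported tracts by the sequence length would give a genome fraction comparable in kind to the 0.31% estimate, though imperfect repeats would need a more tolerant matcher.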
Multiple functions of the leucine-rich repeat protein LrrA of Treponema denticola.

PubMed

Ikegami, Akihiko; Honma, Kiyonobu; Sharma, Ashu; Kuramitsu, Howard K

2004-08-01

The gene lrrA, encoding a leucine-rich repeat protein, LrrA, that contains eight consensus tandem repeats of 23 amino acid residues, has been identified in Treponema denticola ATCC 35405. A leucine-rich repeat is a generally useful protein-binding motif, and proteins containing this repeat are typically involved in protein-protein interactions. Southern blot analysis demonstrated that T. denticola ATCC 35405 expresses the lrrA gene, but the gene was not identified in T. denticola ATCC 33520. In order to analyze the functions of LrrA in T. denticola, an lrrA-inactivated mutant of strain ATCC 35405 and an lrrA gene expression transformant of strain ATCC 33520 were constructed. Characterization of the mutant and transformant demonstrated that LrrA is associated with the extracytoplasmic fraction of T. denticola and expresses multifunctional properties. It was demonstrated that the attachment of strain ATCC 35405 to HEp-2 cell cultures and coaggregation with Tannerella forsythensis were attenuated by the lrrA mutation. In addition, an in vitro binding assay demonstrated specific binding of LrrA to a portion of the Tannerella forsythensis leucine-rich repeat protein, BspA, which is mediated by the N-terminal region of LrrA. It was also observed that the lrrA mutation caused a reduction of swarming in T. denticola ATCC 35405 and consequently attenuated tissue penetration. These results suggest that the leucine-rich repeat protein LrrA plays a role in the attachment and penetration of human epithelial cells and coaggregation with Tannerella forsythensis. These properties may play important roles in the virulence of T. denticola.
Characterization of genic microsatellite markers derived from expressed sequence tags in Pacific abalone ( Haliotis discus hannai)

NASA Astrophysics Data System (ADS)

Li, Qi; Shu, Jing; Zhao, Cui; Liu, Shikai; Kong, Lingfeng; Zheng, Xiaodong

2010-01-01

Simple sequence repeat (SSR) markers were developed from the expressed sequence tags (ESTs) of Pacific abalone ( Haliotis discus hannai). Repeat motifs were found in 4.95% of the ESTs at a frequency of one repeat every 10.04 kb of EST sequences, after redundancy elimination. Seventeen polymorphic EST-SSRs were developed. The number of alleles per locus varied from 2-17, with an average of 6.8 alleles per locus. The expected and observed heterozygosities ranged from 0.159 to 0.928 and from 0.132 to 0.922, respectively. Twelve of the 17 loci (70.6%) were successfully amplified in H. diversicolor. Seventeen loci segregated in three families, with three showing the presence of null alleles (17.6%). The adequate level of variability and low frequency of null alleles observed in H. discus hannai, together with the high rate of transportability across Haliotis species, make this set of EST-SSR markers an important tool for comparative mapping, marker-assisted selection, and evolutionary studies, not only in the Pacific abalone, but also in related species.
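Observed and expected heterozygosity, as reported above, are commonly computed per locus from diploid genotypes. The sketch below assumes the simple definitions (observed = fraction of heterozygous individuals, expected = 1 minus the sum of squared allele frequencies); the abstract does not state which estimator was used, so this is an illustration rather than the study's method, and the genotypes are invented.

```python
from collections import Counter

def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one diploid locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    Expected heterozygosity uses 1 - sum(p_i^2) without a small-sample
    correction (an assumption; unbiased estimators also exist).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")
    observed = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n  # two alleles per diploid individual
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected

# Hypothetical allele sizes (bp) at a single EST-SSR locus.
print(heterozygosity([(100, 100), (100, 104), (104, 104), (100, 104)]))
```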
ProGeRF: Proteome and Genome Repeat Finder Utilizing a Fast Parallel Hash Function

PubMed Central

Moraes, Walas Jhony Lopes; Rodrigues, Thiago de Souza; Bartholomeu, Daniella Castanheira

2015-01-01

Repetitive element sequences are adjacent, repeating patterns, also called motifs, and can be of different lengths; repetitions can involve their exact or approximate copies. They have been widely used as molecular markers in population biology. Given the sizes of sequenced genomes, various bioinformatics tools have been developed for the extraction of repetitive elements from DNA sequences. However, currently available tools do not provide options for identifying repetitive elements in the genome or proteome, displaying a user-friendly web interface, and performing exhaustive searches. ProGeRF is a web site for extracting repetitive regions from genome and proteome sequences. It was designed to be an efficient, fast, accurate and, above all, user-friendly web tool that offers many ways to view and analyse the results. ProGeRF (Proteome and Genome Repeat Finder) is freely available as a stand-alone program, from which the users can download the source code, and as a web tool. It was developed using the hash table approach to extract perfect and imperfect repetitive regions in a (multi)FASTA file, while allowing a linear time complexity. PMID:25811026
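To make the hash-table idea mentioned above concrete, the sketch below indexes every k-mer of a sequence in a dictionary and then walks the stored positions to find perfect tandem runs. It is not ProGeRF's implementation: the function names are hypothetical, only perfect repeats of a single fixed unit length are handled, and rotations of the same repeat (ACG, CGA, GAC) are reported separately.

```python
from collections import defaultdict

def index_kmers(seq, k):
    """Map each k-mer to the list of positions where it starts."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]].append(i)
    return table

def tandem_runs(seq, k, min_copies=2):
    """Return sorted (start, motif, copies) for perfect tandem runs of k-mers."""
    runs = []
    for motif, positions in index_kmers(seq, k).items():
        pos_set = set(positions)
        for p in positions:
            if p - k in pos_set:
                continue  # p lies inside a run that started earlier
            copies = 1
            while p + copies * k in pos_set:
                copies += 1
            if copies >= min_copies:
                runs.append((p, motif, copies))
    return sorted(runs)

# Works on nucleotide or amino-acid strings alike.
print(tandem_runs("TTACGACGACGTT", 3, min_copies=3))
print(tandem_runs("MQQQQQQA", 1, min_copies=6))
```

For a fixed k, building the table touches each position once and each stored position is visited a bounded number of times while extending runs, which is the sense in which a hash-based scan can stay close to linear time.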
Analysis of the genome sequence of the pathogenic Muscovy duck parvovirus strain YY reveals a 14-nucleotide-pair deletion in the inverted terminal repeats.

PubMed

Wang, Jianye; Huang, Yu; Zhou, Mingxu; Zhu, Guoqiang

2016-09-01

Genomic information about Muscovy duck parvovirus is still limited. In this study, the genome of the pathogenic MDPV strain YY was sequenced. The full-length genome of YY is 5075 nucleotides (nt) long, 57 nt shorter than that of strain FM. Sequence alignment indicates that the 5' and 3' inverted terminal repeats (ITR) of strain YY contain a 14-nucleotide-pair deletion in the stem of the palindromic hairpin structure in comparison to strains FM and FZ91-30. The deleted region contains one "E-box" site and one repeated motif with the sequence "TTCCGGT" or "ACCGGAA". Phylogenetic trees constructed based on the protein-coding genes concordantly showed that YY, together with nine other MDPV isolates from various places, clustered in a separate branch, distinct from the branch formed by goose parvovirus (GPV) strains. These results demonstrate that, despite the distinctive deletion, the YY strain still belongs to the classical MDPV group. Moreover, the deletion of ITR may contribute to the genome evolution of MDPV under immunization pressure.
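The two forms of the repeated motif quoted above, "TTCCGGT" and "ACCGGAA", are reverse complements of each other, which is what one expects for a motif sitting in the stem of a palindromic hairpin. A short check, with a hypothetical helper name:

```python
# Translation table for the four unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

print(reverse_complement("TTCCGGT"))
```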
Development of Novel SSR Markers for Flax (Linum usitatissimum L.) Using Reduced-Representation Genome Sequencing

PubMed Central

Wu, Jianzhong; Zhao, Qian; Wu, Guangwen; Zhang, Shuquan; Jiang, Tingbo

2017-01-01

Flax (Linum usitatissimum L.) is a major fiber and oil yielding crop grown in northeastern China. Identification of flax molecular markers is a key step toward improving flax yield and quality via marker-assisted breeding. Simple sequence repeat (SSR) markers, which are based on genomic structural variation, are considered the most valuable type of genetic marker for this purpose. In this study, we screened 1574 microsatellites from Linum usitatissimum L. obtained using reduced representation genome sequencing (RRGS) to systematically identify SSR markers. The resulting set of microsatellites consisted mainly of trinucleotide (56.10%) and dinucleotide (35.23%) repeats, with each motif consisting of 5–8 repeats. We then evaluated marker sensitivity and specificity based on samples of 48 flax isolates obtained from northeastern China. Using the new SSR panel, the results demonstrated that fiber flax and oilseed flax varieties clustered into two well separated groups. The novel SSR markers developed in this study show potential value for selection of varieties for use in flax breeding programs. PMID:28133461
DoOPSearch: a web-based tool for finding and analysing common conserved motifs in the promoter regions of different chordate and plant genes

PubMed Central

Sebestyén, Endre; Nagy, Tibor; Suhai, Sándor; Barta, Endre

2009-01-01

Background The comparative genomic analysis of a large number of orthologous promoter regions of the chordate and plant genes from the DoOP databases shows thousands of conserved motifs. Most of these motifs differ from any known transcription factor binding site (TFBS). To identify common conserved motifs, we need a specific tool to be able to search amongst them. Since conserved motifs from the DoOP databases are linked to genes, the result of such a search can give a list of genes that are potentially regulated by the same transcription factor(s). Results We have developed a new tool called DoOPSearch for the analysis of the conserved motifs in the promoter regions of chordate or plant genes. We used the orthologous promoters of the DoOP database to extract thousands of conserved motifs from different taxonomic groups. The advantage of this approach is that different sets of conserved motifs might be found depending on how broad the taxonomic coverage of the underlying orthologous promoter sequence collection is (consider e.g. primates vs. mammals or Brassicaceae vs. Viridiplantae). The DoOPSearch tool allows the users to search these motif collections or the promoter regions of DoOP with user supplied query sequences or any of the conserved motifs from the DoOP database. To find overrepresented gene ontologies, the gene lists obtained can be analysed further using a modified version of the GeneMerge program. Conclusion We present here a comparative genomics based promoter analysis tool. Our system is based on a unique collection of conserved promoter motifs characteristic of different taxonomic groups. We offer both a command line and a web-based tool for searching in these motif collections using user specified queries. These can be either short promoter sequences or consensus sequences of known transcription factor binding sites. 
The GeneMerge analysis of the search results allows the user to identify statistically overrepresented Gene Ontology terms that might provide a clue on the function of the motifs and genes. PMID:19534755
The glycine-rich motif of Pyrococcus abyssi DNA polymerase D is critical for protein stability.

PubMed

Castrec, Benoît; Laurent, Sébastien; Henneke, Ghislaine; Flament, Didier; Raffin, Jean-Paul

2010-03-05

A glycine-rich motif described as being involved in human polymerase delta proliferating cell nuclear antigen (PCNA) binding has also been identified in all euryarchaeal DNA polymerase D (Pol D) family members. We redefined the motif as the (G)-PYF box. In the present study, Pol D (G)-PYF box motif mutants from Pyrococcus abyssi were generated to investigate its role in functional interactions with the cognate PCNA. We demonstrated that this motif is not essential for interactions between PabPol D (P. abyssi Pol D) and PCNA, using surface plasmon resonance and primer extension studies. Interestingly, the (G)-PYF box is located in a hydrophobic region close to the active site. The (G)-PYF box mutants exhibited altered DNA binding properties. In addition, the thermal stability of all mutants was reduced compared to that of wild type, and this effect could be attributed to increased exposure of the hydrophobic region. These studies suggest that the (G)-PYF box motif mediates intersubunit interactions and that it may be crucial for the thermostability of PabPol D. (c) 2010 Elsevier Ltd. All rights reserved.
cWINNOWER algorithm for finding fuzzy DNA motifs

NASA Technical Reports Server (NTRS)

Liang, S.; Samanta, M. P.; Biegel, B. A.

2004-01-01

The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if a clique consisting of a sufficiently large number of mutated copies of the motif (i.e., the signals) is present in the DNA sequence. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum detectable clique size qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12,000 for (l, d) = (15, 4). Copyright Imperial College Press.
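The signal definition used above (a pattern of length l that differs from the motif in at most d positions) amounts to a Hamming-distance test. The sketch below illustrates only that definition: unlike cWINNOWER it assumes the motif is already known, and it does not build the clique graph. The second helper relies on the fact that two signals derived from the same motif can differ from each other in at most 2d positions. All names and the example sequence are illustrative.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))

def find_signals(sequence, motif, d):
    """Start positions of windows within d mismatches of a known motif."""
    l = len(motif)
    return [i for i in range(len(sequence) - l + 1)
            if hamming(sequence[i:i + l], motif) <= d]

def may_share_motif(signal_a, signal_b, d):
    """Necessary condition for two signals to stem from one motif."""
    return hamming(signal_a, signal_b) <= 2 * d

seq = "GGACGTTGGACCTAGG"  # made-up sequence
print(find_signals(seq, "ACGTT", 1))
print(find_signals(seq, "ACGTT", 2))
```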
Rules for the recognition of dilysine retrieval motifs by coatomer

PubMed Central

Ma, Wenfu; Goldberg, Jonathan

2013-01-01

Cytoplasmic dilysine motifs on transmembrane proteins are captured by coatomer α-COP and β′-COP subunits and packaged into COPI-coated vesicles for Golgi-to-ER retrieval. Numerous ER/Golgi proteins contain K(x)Kxx motifs, but the rules for their recognition are unclear. We present crystal structures of α-COP and β′-COP bound to a series of naturally occurring retrieval motifs—encompassing KKxx, KxKxx and non-canonical RKxx and viral KxHxx sequences. Binding experiments show that α-COP and β′-COP have generally the same specificity for KKxx and KxKxx, but only β′-COP recognizes the RKxx signal. Dilysine motif recognition involves lysine side-chain interactions with two acidic patches. Surprisingly, however, KKxx and KxKxx motifs bind differently, with their lysine residues transposed at the binding patches. We derive rules for retrieval motif recognition from key structural features: the reversed binding modes, the recognition of the C-terminal carboxylate group which enforces lysine positional context, and the tolerance of the acidic patches for non-lysine residues. PMID:23481256
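Because the carboxylate group fixes the register of these motifs, KKxx and KxKxx can be told apart purely by counting back from the C-terminus: lysines at positions -4 and -3, or at -5 and -3. The sketch below encodes just that positional rule; it does not model the non-canonical RKxx or KxHxx signals or any binding preference, and the function name and test sequences are invented.

```python
def classify_dilysine(protein):
    """Classify a C-terminal dilysine signal by lysine positions.

    Returns "KKxx" (K at -4 and -3), "KxKxx" (K at -5 and -3) or None.
    """
    s = protein.strip().upper()
    if len(s) >= 4 and s[-4] == "K" and s[-3] == "K":
        return "KKxx"
    if len(s) >= 5 and s[-5] == "K" and s[-3] == "K":
        return "KxKxx"
    return None

# Hypothetical cytoplasmic tails.
for tail in ("MAAAKKTN", "MAAAKTKLL", "MAAARKTN"):
    print(tail, classify_dilysine(tail))
```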
Crystal structure of the Xpo1p nuclear export complex bound to the SxFG/PxFG repeats of the nucleoporin Nup42p.

PubMed

Koyama, Masako; Hirano, Hidemi; Shirai, Natsuki; Matsuura, Yoshiyuki

2017-10-01

Xpo1p (yeast CRM1) is the major nuclear export receptor that carries a plethora of proteins and ribonucleoproteins from the nucleus to the cytoplasm. The passage of the Xpo1p nuclear export complex through nuclear pore complexes (NPCs) is facilitated by interactions with nucleoporins (Nups) containing extensive repeats of phenylalanine-glycine (so-called FG repeats), although the precise role of each Nup in the nuclear export reaction remains incompletely understood. Here we report structural and biochemical characterization of the interactions between the Xpo1p nuclear export complex and the FG repeats of Nup42p, a nucleoporin that is localized at the cytoplasmic face of yeast NPCs and has a characteristic SxFG/PxFG sequence repeat motif. The crystal structure of the Xpo1p-PKI-Nup42p-Gsp1p-GTP complex identified three binding sites for the SxFG/PxFG repeats on HEAT repeats 14-20 of Xpo1p. Mutational analyses of Nup42p showed that the conserved serines and prolines in the SxFG/PxFG repeats contribute to Xpo1p-Nup42p binding. Our structural and biochemical data suggest that SxFG/PxFG-Nups such as Nup42p and Nup159p at the cytoplasmic face of NPCs provide high-affinity docking sites for the Xpo1p nuclear export complex in the terminal stage of NPC passage and that subsequent disassembly of the nuclear export complex facilitates recycling of free Xpo1p back to the nucleus. © 2017 Molecular Biology Society of Japan and John Wiley & Sons Australia, Ltd.
Ca2+/Cation Antiporters (CaCA): Identification, Characterization and Expression Profiling in Bread Wheat (Triticum aestivum L.)

PubMed Central

Taneja, Mehak; Tyagi, Shivi; Sharma, Shailesh; Upadhyay, Santosh Kumar

2016-01-01

The Ca2+/cation antiporter (CaCA) superfamily proteins play a vital role in Ca2+ ion homeostasis, which is an important event during development and defense response. Molecular characterization of these proteins has been performed in certain plants, but they are still not characterized in Triticum aestivum (bread wheat). Herein, we identified 34 TaCaCA superfamily proteins, which were classified into TaCAX, TaCCX, TaNCL, and TaMHX protein families based on their structural organization and evolutionary relation with earlier reported proteins. Since T. aestivum has an allohexaploid genome, TaCaCA genes were derived from each A, B, and D subgenome and homeologous chromosome (HC), except chromosome-group 1. The majority of genes in each family were derived from more than one HC and were considered homeologous genes (HGs) due to their high similarity with each other. These HGs showed comparable gene and protein structures in terms of exon/intron organization and domain architecture. The majority of TaCaCA proteins comprised two Na_Ca_ex domains. However, TaNCLs had an additional EF-hand domain with calcium-binding motifs. Each TaCaCA protein family had about 10 transmembrane and two α-repeat regions with specifically conserved signature motifs, except TaNCL, which had a single α-repeat. Variable expression of most of the TaCaCA genes during various developmental stages suggested their specific roles in development. However, constitutively high expression of a few genes such as TaCAX1-A and TaNCL1-B indicated their role throughout plant growth and development. The modulated expression of certain genes during biotic (fungal infections) and abiotic stresses (heat, drought, salt) suggested their role in stress response. The majority of TaCCX and TaNCL family genes were found to be highly affected during various abiotic stresses. However, the role of each individual gene needs to be established. 
The present study opens the opportunity for detailed functional characterization of TaCaCA proteins and their utilization in future crop improvement programs. PMID:27965686
Ca2+/Cation Antiporters (CaCA): Identification, Characterization and Expression Profiling in Bread Wheat (Triticum aestivum L.).

PubMed

Taneja, Mehak; Tyagi, Shivi; Sharma, Shailesh; Upadhyay, Santosh Kumar

2016-01-01

The Ca2+/cation antiporter (CaCA) superfamily proteins play a vital role in Ca2+ ion homeostasis, which is an important event during development and defense response. Molecular characterization of these proteins has been performed in certain plants, but they are still not characterized in Triticum aestivum (bread wheat). Herein, we identified 34 TaCaCA superfamily proteins, which were classified into TaCAX, TaCCX, TaNCL, and TaMHX protein families based on their structural organization and evolutionary relation with earlier reported proteins. Since T. aestivum has an allohexaploid genome, TaCaCA genes were derived from each A, B, and D subgenome and homeologous chromosome (HC), except chromosome-group 1. The majority of genes in each family were derived from more than one HC and were considered homeologous genes (HGs) due to their high similarity with each other. These HGs showed comparable gene and protein structures in terms of exon/intron organization and domain architecture. The majority of TaCaCA proteins comprised two Na_Ca_ex domains. However, TaNCLs had an additional EF-hand domain with calcium-binding motifs. Each TaCaCA protein family had about 10 transmembrane and two α-repeat regions with specifically conserved signature motifs, except TaNCL, which had a single α-repeat. Variable expression of most of the TaCaCA genes during various developmental stages suggested their specific roles in development. However, constitutively high expression of a few genes such as TaCAX1-A and TaNCL1-B indicated their role throughout plant growth and development. The modulated expression of certain genes during biotic (fungal infections) and abiotic stresses (heat, drought, salt) suggested their role in stress response. The majority of TaCCX and TaNCL family genes were found to be highly affected during various abiotic stresses. However, the role of each individual gene needs to be established. 
The present study opens the opportunity for detailed functional characterization of TaCaCA proteins and their utilization in future crop improvement programs.
Identification, characterization, and functional analysis of Tube and Pelle homologs in the mud crab Scylla paramamosain.

PubMed

Li, Xin-Cang; Zhang, Xiao-Wen; Zhou, Jun-Fang; Ma, Hong-Yu; Liu, Zhi-Dong; Zhu, Lei; Yao, Xiao-Juan; Li, Lin-Gui; Fang, Wen-Hong

2013-01-01

Tube and Pelle are essential components in the Drosophila Toll signaling pathway. In this study, we characterized a pair of crustacean homologs of Tube and Pelle in Scylla paramamosain, namely, SpTube and SpPelle, and analyzed their immune functions. The full-length cDNA of SpTube had 2052 bp with a 1578 bp open reading frame (ORF) encoding a protein with 525 aa. A death domain (DD) and a kinase domain were predicted in the deduced protein. The full-length cDNA of SpPelle had 3825 bp with a 3420 bp ORF encoding a protein with 1140 aa. The protein contained a DD and a kinase domain. Two conserved repeat motifs previously called Tube repeat motifs present only in insect Tube or Tube-like sequences were found between these two domains. Alignments and structure predictions demonstrated that SpTubeDD and SpPelleDD significantly differed in sequence and 3D structure. Similar to TubeDD, SpTubeDD contained three common conserved residues (R, K, and R) on one surface that may mediate SpMyD88 binding and two common residues (A and A) on the other surface that may contribute to Pelle binding. By contrast, SpPelleDD lacked similar conserved residues. SpTube, insect Tube-like kinases, and human IRAK4 were found to be RD kinases with an RD dipeptide in the kinase domain. SpPelle, Pelle, insect Pelle-like kinases, and human IRAK1 were found to be non-RD kinases lacking an RD dipeptide. Both SpTube and SpPelle were highly expressed in hemocytes, gills, and hepatopancreas. Upon challenge, SpTube and SpPelle were significantly increased in hemocytes by Gram-negative or Gram-positive bacteria, whereas only SpPelle was elevated by White Spot Syndrome Virus. The pull-down assay showed that SpTube can bind to both SpMyD88 and SpPelle. These results suggest that SpTube, SpPelle, and SpMyD88 may form a trimeric complex involved in the immunity of mud crabs against both Gram-negative and Gram-positive bacteria.
Identification, Characterization, and Functional Analysis of Tube and Pelle Homologs in the Mud Crab Scylla paramamosain

PubMed Central

Zhou, Jun-Fang; Ma, Hong-Yu; Liu, Zhi-Dong; Zhu, Lei; Yao, Xiao-Juan; Li, Lin-Gui; Fang, Wen-Hong

2013-01-01

Tube and Pelle are essential components in the Drosophila Toll signaling pathway. In this study, we characterized a pair of crustacean homologs of Tube and Pelle in Scylla paramamosain, namely, SpTube and SpPelle, and analyzed their immune functions. The full-length cDNA of SpTube had 2052 bp with a 1578 bp open reading frame (ORF) encoding a protein with 525 aa. A death domain (DD) and a kinase domain were predicted in the deduced protein. The full-length cDNA of SpPelle had 3825 bp with a 3420 bp ORF encoding a protein with 1140 aa. The protein contained a DD and a kinase domain. Two conserved repeat motifs previously called Tube repeat motifs present only in insect Tube or Tube-like sequences were found between these two domains. Alignments and structure predictions demonstrated that SpTubeDD and SpPelleDD significantly differed in sequence and 3D structure. Similar to TubeDD, SpTubeDD contained three common conserved residues (R, K, and R) on one surface that may mediate SpMyD88 binding and two common residues (A and A) on the other surface that may contribute to Pelle binding. By contrast, SpPelleDD lacked similar conserved residues. SpTube, insect Tube-like kinases, and human IRAK4 were found to be RD kinases with an RD dipeptide in the kinase domain. SpPelle, Pelle, insect Pelle-like kinases, and human IRAK1 were found to be non-RD kinases lacking an RD dipeptide. Both SpTube and SpPelle were highly expressed in hemocytes, gills, and hepatopancreas. Upon challenge, SpTube and SpPelle were significantly increased in hemocytes by Gram-negative or Gram-positive bacteria, whereas only SpPelle was elevated by White Spot Syndrome Virus. The pull-down assay showed that SpTube can bind to both SpMyD88 and SpPelle. These results suggest that SpTube, SpPelle, and SpMyD88 may form a trimeric complex involved in the immunity of mud crabs against both Gram-negative and Gram-positive bacteria. PMID:24116143
Full mitochondrial genome sequences of two endemic Philippine hornbill species (Aves: Bucerotidae) provide evidence for pervasive mitochondrial DNA recombination.

PubMed

Sammler, Svenja; Bleidorn, Christoph; Tiedemann, Ralph

2011-01-14

Although it is now broadly accepted that mitochondrial DNA (mtDNA) may undergo recombination, the frequency of such recombination remains controversial. Its estimation is not straightforward, as recombination under homoplasmy (i.e., among identical mt genomes) is likely to be overlooked. In species with tandem duplications of large mtDNA fragments the detection of recombination can be facilitated, as it can lead to gene conversion among duplicates. Although the mechanisms for concerted evolution in mtDNA are not fully understood yet, recombination rates have been estimated from "one per speciation event" down to 850 years or even "during every replication cycle". Here we present the first complete mt genomes of the avian family Bucerotidae, i.e., those of two Philippine hornbills, Aceros waldeni and Penelopides panini. The mt genomes are characterized by a tandemly duplicated region encompassing part of cytochrome b, 3 tRNAs, NADH6, and the control region. The duplicated fragments are identical to each other except for a short section in domain I and for the length of repeat motifs in domain III of the control region. Due to the heteroplasmy with regard to the number of these repeat motifs, there is some size variation in both genomes; with around 21,657 bp (A. waldeni) and 22,737 bp (P. panini), they significantly exceed the hitherto longest known avian mt genomes, those of the albatrosses. We discovered concerted evolution between the duplicated fragments within individuals. The existence of differences between individuals in coding genes as well as in the control region, which are maintained between duplicates, indicates that recombination apparently occurs frequently, i.e., in every generation. The homogenised duplicates are interspersed by a short fragment which shows no sign of recombination. We hypothesize that this region corresponds to the so-called Replication Fork Barrier (RFB), which has been described in the chicken mitochondrial genome. 
As this RFB is supposed to halt replication, it offers a potential mechanistic explanation for frequent recombination in mitochondrial genomes.

Full mitochondrial genome sequences of two endemic Philippine hornbill species (Aves: Bucerotidae) provide evidence for pervasive mitochondrial DNA recombination

PubMed Central

2011-01-01

Background Although it is now broadly accepted that mitochondrial DNA (mtDNA) may undergo recombination, the frequency of such recombination remains controversial. Its estimation is not straightforward, as recombination under homoplasmy (i.e., among identical mt genomes) is likely to be overlooked. In species with tandem duplications of large mtDNA fragments the detection of recombination can be facilitated, as it can lead to gene conversion among duplicates. Although the mechanisms for concerted evolution in mtDNA are not fully understood yet, recombination rates have been estimated from "one per speciation event" down to 850 years or even "during every replication cycle". Results Here we present the first complete mt genomes of the avian family Bucerotidae, i.e., those of two Philippine hornbills, Aceros waldeni and Penelopides panini. The mt genomes are characterized by a tandemly duplicated region encompassing part of cytochrome b, 3 tRNAs, NADH6, and the control region. The duplicated fragments are identical to each other except for a short section in domain I and for the length of repeat motifs in domain III of the control region. Due to the heteroplasmy with regard to the number of these repeat motifs, there is some size variation in both genomes; with around 21,657 bp (A. waldeni) and 22,737 bp (P. panini), they significantly exceed the hitherto longest known avian mt genomes, those of the albatrosses. We discovered concerted evolution between the duplicated fragments within individuals. The existence of differences between individuals in coding genes as well as in the control region, which are maintained between duplicates, indicates that recombination apparently occurs frequently, i.e., in every generation. Conclusions The homogenised duplicates are interspersed by a short fragment which shows no sign of recombination. 
We hypothesize that this region corresponds to the so-called Replication Fork Barrier (RFB), which has been described from the chicken mitochondrial genome. As this RFB is supposed to halt replication, it offers a potential mechanistic explanation for frequent recombination in mitochondrial genomes. PMID:21235758
Properties of the [NiFe]-hydrogenase maturation protein HypD.

PubMed

Blokesch, Melanie; Böck, August

2006-07-24

A mutational screen of amino acid residues of hydrogenase maturation protein HypD from Escherichia coli disclosed that seven conserved cysteine residues located in three different motifs in HypD are essential. Evidence is presented for potential functions of these motifs in the maturation process.
Plasma Chemokines in Patients with Alcohol Use Disorders: Association of CCL11 (Eotaxin-1) with Psychiatric Comorbidity.

PubMed

García-Marchena, Nuria; Araos, Pedro Fernando; Barrios, Vicente; Sánchez-Marín, Laura; Chowen, Julie A; Pedraz, María; Castilla-Ortega, Estela; Romero-Sanchiz, Pablo; Ponce, Guillermo; Gavito, Ana L; Decara, Juan; Silva, Daniel; Torrens, Marta; Argente, Jesús; Rubio, Gabriel; Serrano, Antonia; de Fonseca, Fernando Rodríguez; Pavón, Francisco Javier

2016-01-01

Recent studies have linked changes in peripheral chemokine concentrations to the presence of both addictive behaviors and psychiatric disorders. The present study further explores this link by analyzing the potential association of psychiatric comorbidity with alterations in the concentrations of circulating plasma chemokines in patients of both sexes diagnosed with alcohol use disorders (AUD). To this end, 85 abstinent subjects with AUD from an outpatient setting and 55 healthy subjects were evaluated for substance and mental disorders. Plasma samples were obtained to quantify chemokine concentrations [C-C motif (CC), C-X-C motif (CXC), and C-X3-C motif (CX3C) chemokines]. Abstinent AUD patients displayed a high prevalence of comorbid mental disorders (72%) and other substance use disorders (45%). Plasma concentrations of chemokines CXCL12/stromal cell-derived factor-1 (p < 0.001) and CX3CL1/fractalkine (p < 0.05) were lower in AUD patients compared to controls, whereas CCL11/eotaxin-1 concentrations were strongly decreased in female AUD patients (p < 0.001). In the alcohol group, CXCL8 concentrations were increased in patients with liver and pancreas diseases and there was a significant correlation to aspartate transaminase (r = +0.456, p < 0.001) and gamma-glutamyltransferase (r = +0.647, p < 0.001). Focusing on comorbid psychiatric disorders, we distinguished between patients with additional mental disorders (N = 61) and other substance use disorders (N = 38). Only CCL11 concentrations were found to be altered in AUD patients diagnosed with mental disorders (p < 0.01) with a strong main effect of sex. Thus, patients with mood disorders (N = 42) and/or anxiety (N = 16) had lower CCL11 concentrations than non-comorbid patients, a difference that was more evident in women. The alcohol-induced alterations in circulating chemokines were also explored in preclinical models of alcohol use with male Wistar rats. 
Rats exposed to repeated ethanol (3 g/kg, gavage) had lower CXCL12 (p < 0.01) concentrations and higher CCL11 concentrations (p < 0.001) relative to vehicle-treated rats. Additionally, the increased CCL11 concentrations in rats exposed to ethanol were enhanced by the prior exposure to restraint stress (p < 0.01). Concordantly, acute ethanol exposure induced changes in CXCL12, CX3CL1, and CCL11 in the same direction as repeated exposure. These results clearly indicate a contribution of specific chemokines to the phenotype of AUD and a strong effect of sex, revealing a link of CCL11 to alcohol and anxiety/stress.
Plasma Chemokines in Patients with Alcohol Use Disorders: Association of CCL11 (Eotaxin-1) with Psychiatric Comorbidity

PubMed Central

García-Marchena, Nuria; Araos, Pedro Fernando; Barrios, Vicente; Sánchez-Marín, Laura; Chowen, Julie A.; Pedraz, María; Castilla-Ortega, Estela; Romero-Sanchiz, Pablo; Ponce, Guillermo; Gavito, Ana L.; Decara, Juan; Silva, Daniel; Torrens, Marta; Argente, Jesús; Rubio, Gabriel; Serrano, Antonia; de Fonseca, Fernando Rodríguez; Pavón, Francisco Javier

2017-01-01

Recent studies have linked changes in peripheral chemokine concentrations to the presence of both addictive behaviors and psychiatric disorders. The present study further explores this link by analyzing the potential association of psychiatric comorbidity with alterations in the concentrations of circulating plasma chemokines in patients of both sexes diagnosed with alcohol use disorders (AUD). To this end, 85 abstinent subjects with AUD from an outpatient setting and 55 healthy subjects were evaluated for substance and mental disorders. Plasma samples were obtained to quantify chemokine concentrations [C–C motif (CC), C–X–C motif (CXC), and C–X3–C motif (CX3C) chemokines]. Abstinent AUD patients displayed a high prevalence of comorbid mental disorders (72%) and other substance use disorders (45%). Plasma concentrations of chemokines CXCL12/stromal cell-derived factor-1 (p < 0.001) and CX3CL1/fractalkine (p < 0.05) were lower in AUD patients compared to controls, whereas CCL11/eotaxin-1 concentrations were strongly decreased in female AUD patients (p < 0.001). In the alcohol group, CXCL8 concentrations were increased in patients with liver and pancreas diseases and there was a significant correlation to aspartate transaminase (r = +0.456, p < 0.001) and gamma-glutamyltransferase (r = +0.647, p < 0.001). Focusing on comorbid psychiatric disorders, we distinguished between patients with additional mental disorders (N = 61) and other substance use disorders (N = 38). Only CCL11 concentrations were found to be altered in AUD patients diagnosed with mental disorders (p < 0.01), with a strong main effect of sex. Thus, patients with mood disorders (N = 42) and/or anxiety (N = 16) had lower CCL11 concentrations than non-comorbid patients, a difference that was more evident in women. The alcohol-induced alterations in circulating chemokines were also explored in preclinical models of alcohol use with male Wistar rats. Rats exposed to repeated ethanol (3 g/kg, gavage) had lower CXCL12 concentrations (p < 0.01) and higher CCL11 concentrations (p < 0.001) relative to vehicle-treated rats. Additionally, the increased CCL11 concentrations in rats exposed to ethanol were enhanced by prior exposure to restraint stress (p < 0.01). Concordantly, acute ethanol exposure induced changes in CXCL12, CX3CL1, and CCL11 in the same direction as repeated exposure. These results clearly indicate a contribution of specific chemokines to the phenotype of AUD and a strong effect of sex, revealing a link of CCL11 to alcohol and anxiety/stress. PMID:28149283
[The horror story--a contribution of horror literature to psychoanalysis].

PubMed

Pohl, H

1985-01-01

Using the example of the vampire motif, the origin and psychic function of the ghost story are presented in the context of the relevant historical situation. A comparison to the development and function of nightmares is drawn. The vampire motif in Europe originally developed at the end of the Middle Ages. As with the collective madness of witch persecution, it was a superstition that was in fact supported by the church of the time. The belief in vampires was used for the splitting off, projection and acting out of taboo aggressive, oral and sexual drives. Although the era of Enlightenment quenched this superstition, the motif started to crop up in sublimated form: the vampire became a favourite motif in serious as well as trivial literature of the 19th century. At first it can be seen as a symbolic expression of the rejected anxiety and guilt feelings of the bourgeoisie after it had thrown the absolutist institution from power. In the course of the century the motif developed increasingly into an enciphered representation of sexuality perverted by repression and rejection. Our current social and literary life has now changed, also under the influence of Freud's work. The hypothesis is presented that science fiction literature has meanwhile become the successor of the ghost story, since it allows the preconscious presentation of contemporary anxiety and conflict and their rejection. Science fiction can therefore be viewed as a serious collective nightmare of the second half of the 20th century.
Chemical Space Mapping and Structure-Activity Analysis of the ChEMBL Antiviral Compound Set.

PubMed

Klimenko, Kyrylo; Marcou, Gilles; Horvath, Dragos; Varnek, Alexandre

2016-08-22

Curation, standardization and data fusion of the antiviral information present in the ChEMBL public database led to the definition of a robust data set, providing an association of antiviral compounds to seven broadly defined antiviral activity classes. Generative topographic mapping (GTM) subjected to evolutionary tuning was then used to produce maps of the antiviral chemical space, providing an optimal separation of compound families associated with the different antiviral classes. The ability to pinpoint the specific spots occupied (responsibility patterns) on a map by various classes of antiviral compounds opened the way for a GTM-supported search for privileged structural motifs, typical for each antiviral class. The privileged locations of antiviral classes were analyzed in order to highlight underlying privileged common structural motifs. Unlike in classical medicinal chemistry, where privileged structures are, almost always, predefined scaffolds, privileged structural motif detection based on GTM responsibility patterns has the decisive advantage of being able to automatically capture the nature ("resolution detail"-scaffold, detailed substructure, pharmacophore pattern, etc.) of the relevant structural motifs. Responsibility patterns were found to represent underlying structural motifs of various natures-from very fuzzy (groups of various "interchangeable" similar scaffolds), to the classical scenario in medicinal chemistry (underlying motif actually being the scaffold), to very precisely defined motifs (specifically substituted scaffolds).
Discovery of T Cell Receptor β Motifs Specific to HLA-B27-Positive Ankylosing Spondylitis by Deep Repertoire Sequence Analysis.

PubMed

Faham, Malek; Carlton, Victoria; Moorhead, Martin; Zheng, Jianbiao; Klinger, Mark; Pepin, Francois; Asbury, Thomas; Vignali, Marissa; Emerson, Ryan O; Robins, Harlan S; Ireland, James; Baechler-Gillespie, Emily; Inman, Robert D

2017-04-01

Ankylosing spondylitis (AS), a chronic inflammatory disorder, has a notable association with HLA-B27. One hypothesis suggests that a common antigen that binds to HLA-B27 is important for AS disease pathogenesis. This study was undertaken to determine sequences and motifs that are shared among HLA-B27-positive AS patients, using T cell repertoire next-generation sequencing. To identify motifs enriched among B27-positive AS patients, we performed T cell receptor β (TCRβ) repertoire sequencing on samples from 191 B27-positive AS patients, 43 B27-negative AS patients, and 227 controls, and we obtained >77 million TCRβ clonotype sequences. First, we assessed whether any of 50 previously published sequences were enriched in B27-positive AS patients. We then used training and test cohorts to identify discovered motifs that were enriched in B27-positive AS patients versus controls. Six previously published and 11 discovered motifs were enriched in the B27-positive AS samples as compared to controls. After combining motifs related by sequence, we identified a total of 15 independent motifs. Both the full set of 15 motifs and a set of 6 published motifs were enriched in the B27-positive AS patients as compared to B27-positive healthy individuals (P = 0.049 and P = 0.001, respectively). Using an independent cohort, we validated that at least some of these motifs were associated with AS, and not simply with B27-positive status. We identified TCRβ motifs that are enriched in B27-positive AS patients as compared to B27-positive healthy controls. This suggests that a common antigen, presented by HLA-B27 and detected by CD8+ T cells, may be associated with AS disease pathogenesis. © 2016, American College of Rheumatology.
Functional conservation and structural diversification of silk sericins in two moth species.

PubMed

Zurovec, Michal; Kludkiewicz, Barbara; Fedic, Robert; Sulitkova, Jitka; Mach, Vaclav; Kucerova, Lucie; Sehnal, Frantisek

2013-06-10

Sericins are hydrophilic structural proteins produced by caterpillars in the middle section of silk glands and layered over fibroin proteins secreted in the posterior section. In the process of spinning, fibroins form strong solid filaments, while sericins seal the pair of filaments into a single fiber and glue the fiber into a cocoon. Galleria mellonella and the previously examined Bombyx mori harbor three sericin genes that encode proteins containing long repetitive regions. Galleria sericin genes are similar to each other and the protein repeats are built from short and extremely serine-rich motifs, while Bombyx sericin genes are diversified and encode proteins with long and complex repeats. Developmental changes in sericin properties are controlled at the level of gene expression and splicing. In Galleria , MG-1 sericin is produced throughout larval life until the wandering stage, while the production of MG-2 and MG-3 reaches a peak during cocoon spinning.
Dithiopheneindenofluorene (TIF) Semiconducting Polymers with Very High Mobility in Field-Effect Transistors.

PubMed

Chen, Hu; Hurhangee, Michael; Nikolka, Mark; Zhang, Weimin; Kirkus, Mindaugas; Neophytou, Marios; Cryer, Samuel J; Harkin, David; Hayoz, Pascal; Abdi-Jalebi, Mojtaba; McNeill, Christopher R; Sirringhaus, Henning; McCulloch, Iain

2017-09-01

The charge-carrier mobility of organic semiconducting polymers is known to be enhanced when the energetic disorder of the polymer is minimized. Fused, planar aromatic ring structures contribute to reducing the polymer conformational disorder, as demonstrated by polymers containing the indacenodithiophene (IDT) repeat unit, which have both a low Urbach energy and a high mobility in thin-film-transistor (TFT) devices. Expanding on this design motif, copolymers containing the dithiopheneindenofluorene repeat unit are synthesized, which extends the fused aromatic structure with two additional phenyl rings, further rigidifying the polymer backbone. A range of copolymers are prepared and their electrical properties and thin-film morphology evaluated, with the co-benzothiadiazole polymer having a twofold increase in hole mobility when compared to the IDT analog, reaching values of almost 3 cm² V⁻¹ s⁻¹ in bottom-gate top-contact organic field-effect transistors. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Telomerase Mechanism of Telomere Synthesis

PubMed Central

Wu, R. Alex; Upton, Heather E.; Vogan, Jacob M.; Collins, Kathleen

2017-01-01

Telomerase is the essential reverse transcriptase required for linear chromosome maintenance in most eukaryotes. Telomerase supplements the tandem array of simple-sequence repeats at chromosome ends to compensate for the DNA erosion inherent in genome replication. The template for telomerase reverse transcriptase is within the RNA subunit of the ribonucleoprotein complex, which in cells contains additional telomerase holoenzyme proteins that assemble the active ribonucleoprotein and promote its function at telomeres. Telomerase is distinct among polymerases in its reiterative reuse of an internal template. The template is precisely defined, processively copied, and regenerated by release of single-stranded product DNA. New specificities of nucleic acid handling that underlie the catalytic cycle of repeat synthesis derive from both active site specialization and new motif elaborations in protein and RNA subunits. Studies of telomerase provide unique insights into cellular requirements for genome stability, tissue renewal, and tumorigenesis as well as new perspectives on dynamic ribonucleoprotein machines. PMID:28141967
ATtRACT-a database of RNA-binding proteins and associated motifs.

PubMed

Giudice, Girolamo; Sánchez-Cabo, Fátima; Torroja, Carlos; Lara-Pezzi, Enrique

2016-01-01

RNA-binding proteins (RBPs) play a crucial role in key cellular processes, including RNA transport, splicing, polyadenylation and stability. Understanding the interaction between RBPs and RNA is key to improving our knowledge of RNA processing, localization and regulation in a global manner. Despite advances in recent years, a unified non-redundant resource that includes information on experimentally validated motifs, RBPs and integrated tools to exploit this information is lacking. Here, we developed a database named ATtRACT (available at http://attract.cnic.es) that compiles information on 370 RBPs and 1583 RBP consensus binding motifs, 192 of which are not present in any other database. To populate ATtRACT we (i) extracted and hand-curated experimentally validated data from the CISBP-RNA, SpliceAid-F and RBPDB databases, (ii) integrated and updated the unavailable ASD database and (iii) extracted information from protein-RNA complexes present in the Protein Data Bank database through computational analyses. ATtRACT also provides efficient algorithms to search for a specific motif and to scan one or more RNA sequences at a time. It also allows discovering de novo motifs enriched in a set of related sequences and comparing them with the motifs included in the database. Database URL: http://attract.cnic.es. © The Author(s) 2016. Published by Oxford University Press.
Structural Insights into the Quadruplex-Duplex 3' Interface Formed from a Telomeric Repeat: A Potential Molecular Target.

PubMed

Russo Krauss, Irene; Ramaswamy, Sneha; Neidle, Stephen; Haider, Shozeb; Parkinson, Gary N

2016-02-03

We report here on an X-ray crystallographic and molecular modeling investigation into the complex 3' interface formed between putative parallel stranded G-quadruplexes and a duplex DNA sequence constructed from the human telomeric repeat sequence TTAGGG. Our crystallographic approach provides a detailed snapshot of a telomeric 3' quadruplex-duplex junction: a junction that appears to have the potential to form a unique molecular target for small molecule binding and interference with telomere-related functions. This unique target is particularly relevant as current high-affinity compounds that bind putative G-quadruplex forming sequences only rarely have a high degree of selectivity for a particular quadruplex. Here DNA junctions were assembled using different putative quadruplex-forming scaffolds linked at the 3' end to a telomeric duplex sequence and annealed to a complementary strand. We successfully generated a series of G-quadruplex-duplex containing crystals, both alone and in the presence of ligands. The structures demonstrate the formation of a parallel folded G-quadruplex and a B-form duplex DNA stacked coaxially. Most strikingly, structural data reveals the consistent formation of a TAT triad platform between the two motifs. This triad allows for a continuous stack of bases to link the quadruplex motif with the duplex region. For these crystal structures formed in the absence of ligands, the TAT triad interface occludes ligand binding at the 3' quadruplex-duplex interface, in agreement with in silico docking predictions. However, with the rearrangement of a single nucleotide, a stable pocket can be produced, thus providing an opportunity for the binding of selective molecules at the interface.
Tuning the Cavity Size and Chirality of Self-Assembling 3D DNA Crystals

DOE Office of Scientific and Technical Information (OSTI.GOV)

Simmons, Chad R.; Zhang, Fei; MacCulloch, Tara

The foundational goal of structural DNA nanotechnology (the field that uses oligonucleotides as a molecular building block for the programmable self-assembly of nanostructured systems) was to use DNA to construct three-dimensional (3D) lattices for solving macromolecular structures. The programmable nature of DNA makes it an ideal system for rationally constructing self-assembled crystals and immobilizing guest molecules in a repeating 3D array through their specific stereospatial interactions with the scaffold. In this work, we have extended a previously described motif (4 × 5) by expanding the structure to a system that links four double-helical layers; we use a central weaving oligonucleotide containing a sequence of four six-base repeats (4 × 6), forming a matrix of layers that are organized and dictated by a series of Holliday junctions. In addition, we have assembled mirror image crystals (l-DNA) with the identical sequence that are completely resistant to nucleases. Bromine and selenium derivatives were obtained for the l- and d-DNA forms, respectively, allowing phase determination for both forms and solution of the resulting structures to 3.0 and 3.05 Å resolution. Both right- and left-handed forms crystallized in the trigonal space groups with mirror image 3-fold helical screw axes P3₂ and P3₁ for each motif, respectively. The structures reveal a highly organized array of discrete and well-defined cavities that are suitable for hosting guest molecules and allow us to dictate a priori the assembly of guest–DNA conjugates with a specified crystalline hand.
Proline: The Distribution, Frequency, Positioning, and Common Functional Roles of Proline and Polyproline Sequences in the Human Proteome

PubMed Central

Morgan, Alexander A.; Rubenstein, Edward

2013-01-01

Proline is an anomalous amino acid. Its nitrogen atom is covalently locked within a ring, thus it is the only proteinogenic amino acid with a constrained phi angle. Sequences of three consecutive prolines can fold into polyproline helices, structures that join alpha helices and beta pleats as architectural motifs in protein configuration. Triproline helices are participants in protein-protein signaling interactions. Longer spans of repeat prolines also occur, containing as many as 27 consecutive proline residues. Little is known about the frequency, positioning, and functional significance of these proline sequences. Therefore we have undertaken a systematic bioinformatics study of proline residues in proteins. We analyzed the distribution and frequency of 687,434 proline residues among 18,666 human proteins, identifying single residues, dimers, trimers, and longer repeats. Proline accounts for 6.3% of the 10,882,808 protein amino acids. Of all proline residues, 4.4% are in trimers or longer spans. We detected patterns that influence function based on proline location, spacing, and concentration. We propose a classification based on proline-rich, polyproline-rich, and proline-poor status. Whereas singlet proline residues are often found in proteins that display recurring architectural patterns, trimers or longer proline sequences tend to be associated with the absence of repetitive structural motifs. Spans of 6 or more are associated with DNA/RNA processing, actin, and developmental processes. We also suggest a role for proline in Kruppel-type zinc finger protein control of DNA expression, and in the nucleation and translocation of actin by the formin complex. PMID:23372670
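The counting described in this abstract (single prolines, dimers, trimers and longer spans) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names and the toy sequence are hypothetical, and only the two totals in the final check are taken from the abstract.

```python
import re
from collections import Counter


def proline_run_lengths(sequence):
    """Return the length of every maximal run of consecutive prolines (P)."""
    return [len(m.group()) for m in re.finditer(r"P+", sequence.upper())]


def classify_proline_residues(sequence):
    """Count proline residues by the kind of run they sit in."""
    residues = Counter()
    for n in proline_run_lengths(sequence):
        if n == 1:
            residues["singlet"] += n
        elif n == 2:
            residues["dimer"] += n
        else:
            residues["trimer_or_longer"] += n
    return residues


# Hypothetical toy sequence, for illustration only (not a real protein).
toy = "MAPKPPLSPPPPGAPPPQ"
runs = proline_run_lengths(toy)
counts = classify_proline_residues(toy)
print(runs, dict(counts))

# Consistency check of the abstract's own figures:
# 687,434 prolines out of 10,882,808 residues.
proline_percent = round(687_434 / 10_882_808 * 100, 1)
print(proline_percent)
```

Applied to a full proteome, the same run-length tally would give the share of proline residues that fall in trimers or longer spans; the abstract reports that share as 4.4%.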
Analyses of carnivore microsatellites and their intimate association with tRNA-derived SINEs

PubMed Central

López-Giráldez, Francesc; Andrés, Olga; Domingo-Roura, Xavier; Bosch, Montserrat

2006-01-01

Background: The popularity of microsatellites has greatly increased in the last decade on account of their many applications. However, little is currently understood about the factors that influence their genesis and distribution among and within species genomes. In this work, we analyzed carnivore microsatellite clones from GenBank to study their association with interspersed repeats and elucidate the role of the latter in microsatellite genesis and distribution. Results: We constructed a comprehensive carnivore microsatellite database comprising 1236 clones from GenBank. Thirty-three species of 11 out of 12 carnivore families were represented, although two distantly related species, the domestic dog and cat, were clearly overrepresented. Of these clones, 330 contained tRNALys-derived SINEs and 357 contained other interspersed repeats. Our rough estimates of tRNA SINE copies per haploid genome were much higher than published ones. Our results also revealed a distinct juxtaposition of AG and A-rich repeats and tRNALys-derived SINEs, suggesting their coevolution. Both microsatellites arose repeatedly in two regions of the interspersed repeat. Moreover, microsatellites associated with tRNALys-derived SINEs showed the highest complexity and less potential instability. Conclusion: Our results suggest that tRNALys-derived SINEs are a significant source for microsatellite generation in carnivores, especially for AG and A-rich repeat motifs. These observations indicate two modes of microsatellite generation: the expansion and variation of pre-existing tandem repeats and the conversion of sequences with high cryptic simplicity into a repeat array; these mechanisms are not specific to tRNALys-derived SINEs. Microsatellite and interspersed repeat coevolution could also explain the different distribution of repeat types among and within species genomes. Finally, due to the higher complexity and lower potential informative content of microsatellites associated with tRNALys-derived SINEs, we recommend avoiding their use as genetic markers. PMID:17059596
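As a rough illustration of how perfect microsatellites such as the AG and A-rich repeats mentioned above can be located in a sequence, the following Python sketch uses a backreference regular expression. It is an assumption-laden toy, not the method used in the study: the function name, the thresholds (unit length 1 to 6, at least 5 copies) and the example sequence are all hypothetical.

```python
import re


def find_microsatellites(seq, min_unit=1, max_unit=6, min_repeats=5):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats.

    Thresholds are illustrative defaults, not values from the study.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # One captured unit followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves built from a shorter unit
            # (e.g. "AA" is reported once, as a run of "A").
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), unit, len(m.group()) // unit_len))
    return sorted(hits)


# Hypothetical sequence: an (AG)8 array followed by a poly-A tract.
example = "TTGC" + "AG" * 8 + "CCT" + "A" * 10 + "GT"
print(find_microsatellites(example))
```

Only perfect repeats are reported; interrupted or compound repeats, which matter for the complexity measures discussed in the abstract, would need a more tolerant search.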
Computational Analyses of Synergism in Small Molecular Network Motifs

PubMed Central

Zhang, Yili; Smolen, Paul; Baxter, Douglas A.; Byrne, John H.

2014-01-01

Cellular functions and responses to stimuli are controlled by complex regulatory networks that comprise a large diversity of molecular components and their interactions. However, achieving an intuitive understanding of the dynamical properties and responses to stimuli of these networks is hampered by their large scale and complexity. To address this issue, analyses of regulatory networks often focus on reduced models that depict distinct, reoccurring connectivity patterns referred to as motifs. Previous modeling studies have begun to characterize the dynamics of small motifs, and to describe ways in which variations in parameters affect their responses to stimuli. The present study investigates how variations in pairs of parameters affect responses in a series of ten common network motifs, identifying concurrent variations that act synergistically (or antagonistically) to alter the responses of the motifs to stimuli. Synergism (or antagonism) was quantified using degrees of nonlinear blending and additive synergism. Simulations identified concurrent variations that maximized synergism, and examined the ways in which it was affected by stimulus protocols and the architecture of a motif. Only a subset of architectures exhibited synergism following paired changes in parameters. The approach was then applied to a model describing interlocked feedback loops governing the synthesis of the CREB1 and CREB2 transcription factors. The effects of motifs on synergism for this biologically realistic model were consistent with those for the abstract models of single motifs. These results have implications for the rational design of combination drug therapies with the potential for synergistic interactions. PMID:24651495
SVM2Motif—Reconstructing Overlapping DNA Sequence Motifs by Mimicking an SVM Predictor

PubMed Central

Vidovic, Marina M. -C.; Görnitz, Nico; Müller, Klaus-Robert; Rätsch, Gunnar; Kloft, Marius

2015-01-01

Identifying discriminative motifs underlying the functionality and evolution of organisms is a major challenge in computational biology. Machine learning approaches such as support vector machines (SVMs) achieve state-of-the-art performance in genomic discrimination tasks, but, owing to their black-box character, the motifs underlying their decision functions are largely unknown. As a remedy, positional oligomer importance matrices (POIMs) allow us to visualize the significance of position-specific subsequences. Although they are a major step towards the explanation of trained SVM models, they suffer from the fact that their size grows exponentially in the length of the motif, which renders their manual inspection feasible only for comparably small motif sizes, typically k ≤ 5. In this work, we extend the work on positional oligomer importance matrices by presenting a new machine-learning methodology, entitled motifPOIM, to extract the truly relevant motifs, regardless of their length and complexity, underlying the predictions of a trained SVM model. Our framework thereby considers the motifs as free parameters in a probabilistic model, a task which can be phrased as a non-convex optimization problem. The exponential dependence of the POIM size on the oligomer length poses a major numerical challenge, which we address by an efficient optimization framework that allows us to find possibly overlapping motifs consisting of up to hundreds of nucleotides. We demonstrate the efficacy of our approach on a synthetic data set as well as a real-world human splice site data set. PMID:26690911
Sequence- and Interactome-Based Prediction of Viral Protein Hotspots Targeting Host Proteins: A Case Study for HIV Nef

PubMed Central

Sarmady, Mahdi; Dampier, William; Tozeren, Aydin

2011-01-01

Virus proteins alter protein pathways of the host toward the synthesis of viral particles by breaking and making edges via binding to host proteins. In this study, we developed a computational approach to predict viral sequence hotspots for binding to host proteins based on sequences of viral and host proteins and literature-curated virus-host protein interactome data. We use a motif discovery algorithm repeatedly on collections of sequences of viral proteins and immediate binding partners of their host targets and choose only those motifs that are conserved on viral sequences and highly statistically enriched among binding partners of host proteins targeted by virus proteins. Our results match experimental data on binding sites of Nef to host proteins such as MAPK1, VAV1, LCK, HCK, HLA-A, CD4, FYN, and GNB2L1 with high statistical significance, but the approach is a poor predictor of Nef binding sites on highly flexible, hoop-like regions. Predicted hotspots recapture CD8 cell epitopes of HIV Nef, highlighting their importance in modulating virus-host interactions. Host proteins potentially targeted or outcompeted by Nef appear to crowd the T cell receptor, natural killer cell mediated cytotoxicity, and neurotrophin signaling pathways. Scanning of HIV Nef motifs on multiple alignments of hepatitis C protein NS5A produces results consistent with the literature, indicating the potential value of hotspot discovery in advancing our understanding of virus-host crosstalk. PMID:21738584
The group B streptococcal alpha C protein binds alpha1beta1-integrin through a novel KTD motif that promotes internalization of GBS within human epithelial cells.

PubMed

Bolduc, Gilles R; Madoff, Lawrence C

2007-12-01

Group B Streptococcus (GBS) is the leading cause of bacterial pneumonia, sepsis and meningitis among neonates and a cause of morbidity among pregnant women and immunocompromised adults. GBS epithelial cell invasion is associated with expression of alpha C protein (ACP). Loss of ACP expression results in a decrease in GBS internalization and translocation across human cervical epithelial cells (ME180). Soluble ACP and its 170 amino acid N-terminal region (NtACP), but not the repeat protein RR', bind to ME180 cells and reduce internalization of wild-type GBS to levels obtained with an ACP-deficient isogenic mutant. In the current study, ACP colocalized with alpha(1)beta(1)-integrin, resulting in integrin clustering as determined by laser scanning confocal microscopy. NtACP contains two structural domains, D1 and D2. D1 is structurally similar to fibronectin's integrin-binding region (FnIII10). D1's (KT)D146 motif is structurally similar to the FnIII10 (RG)D1495 integrin-binding motif, suggesting that ACP binds alpha(1)beta(1)-integrin via the D1 domain. The (KT)D146A mutation within soluble NtACP reduced its ability to bind alpha(1)beta(1)-integrin and inhibit GBS internalization within ME180 cells. Thus ACP binding to human epithelial cell integrins appears to contribute to GBS internalization within epithelial cells.
Repeated functional convergent effects of NaV1.7 on acid insensitivity in hibernating mammals

PubMed Central

Liu, Zhen; Wang, Wei; Zhang, Tong-Zuo; Li, Gong-Hua; He, Kai; Huang, Jing-Fei; Jiang, Xue-Long; Murphy, Robert W.; Shi, Peng

2014-01-01

Hibernating mammals need to be insensitive to acid in order to cope with conditions of high CO2; however, the molecular basis of acid tolerance remains largely unknown. The African naked mole-rat (Heterocephalus glaber) and hibernating mammals share similar environments and physiological features. In the naked mole-rat, acid insensitivity has been shown to be conferred by the functional motif of the sodium ion channel NaV1.7. There is now an opportunity to evaluate acid insensitivity in other taxa. In this study, we tested for functional convergence of NaV1.7 in 71 species of mammals, including 22 species that hibernate. Our analyses revealed a functional convergence of amino acid sequences, which occurred at least six times independently in mammals that hibernate. Evolutionary analyses determined that the convergence results from both parallel and divergent evolution of residues in the functional motif. Our findings not only identify the functional molecules responsible for acid insensitivity in hibernating mammals, but also open new avenues to elucidate the molecular underpinnings of acid insensitivity in mammals. PMID:24352952

Thermodynamic characterization of the multivalent interactions underlying rapid and selective translocation through the nuclear pore complex

PubMed Central

Hayama, Ryo; Sparks, Samuel; Hecht, Lee M.; Dutta, Kaushik; Karp, Jerome M.; Cabana, Christina M.; Rout, Michael P.; Cowburn, David

2018-01-01

Intrinsically disordered proteins (IDPs) play important roles in many biological systems. Given the vast conformational space that IDPs can explore, the thermodynamics of the interactions with their partners is closely linked to their biological functions. Intrinsically disordered regions of Phe–Gly nucleoporins (FG Nups) that contain multiple phenylalanine–glycine repeats are of particular interest, as their interactions with transport factors (TFs) underlie the paradoxically rapid yet also highly selective transport of macromolecules mediated by the nuclear pore complex. Here, we used NMR and isothermal titration calorimetry to thermodynamically characterize these multivalent interactions. These analyses revealed that a combination of low per-FG motif affinity and the enthalpy–entropy balance prevents high-avidity interaction between FG Nups and TFs, whereas the large number of FG motifs promotes frequent FG–TF contacts, resulting in enhanced selectivity. Our thermodynamic model underlines the importance of functional disorder of FG Nups. It helps explain the rapid and selective translocation of TFs through the nuclear pore complex and further expands our understanding of the mechanisms of “fuzzy” interactions involving IDPs. PMID:29374059
DNA sequence-dependent compartmentalization and silencing of chromatin at the nuclear lamina.

PubMed

Zullo, Joseph M; Demarco, Ignacio A; Piqué-Regi, Roger; Gaffney, Daniel J; Epstein, Charles B; Spooner, Chauncey J; Luperchio, Teresa R; Bernstein, Bradley E; Pritchard, Jonathan K; Reddy, Karen L; Singh, Harinder

2012-06-22

A large fraction of the mammalian genome is organized into inactive chromosomal domains along the nuclear lamina. The mechanism by which these lamina-associated domains (LADs) are established remains to be elucidated. Using genomic repositioning assays, we show that LADs spanning the developmentally regulated IgH and Cyp3a loci contain discrete DNA regions that associate chromatin with the nuclear lamina and repress gene activity in fibroblasts. Lamina interaction is established during mitosis and likely involves the localized recruitment of Lamin B during late anaphase. Fine-scale mapping of LADs reveals numerous lamina-associating sequences (LASs), which are enriched for a GAGA motif. This repeated motif directs lamina association and is bound by the transcriptional repressor cKrox, in a complex with HDAC3 and Lap2β. Knockdown of cKrox or HDAC3 results in dissociation of LASs/LADs from the nuclear lamina. These results reveal a mechanism that couples nuclear compartmentalization of chromatin domains with the control of gene activity. Copyright © 2012 Elsevier Inc. All rights reserved.
Repeated functional convergent effects of NaV1.7 on acid insensitivity in hibernating mammals.

PubMed

Liu, Zhen; Wang, Wei; Zhang, Tong-Zuo; Li, Gong-Hua; He, Kai; Huang, Jing-Fei; Jiang, Xue-Long; Murphy, Robert W; Shi, Peng

2014-02-07

Hibernating mammals need to be insensitive to acid in order to cope with conditions of high CO2; however, the molecular basis of acid tolerance remains largely unknown. The African naked mole-rat (Heterocephalus glaber) and hibernating mammals share similar environments and physiological features. In the naked mole-rat, acid insensitivity has been shown to be conferred by the functional motif of the sodium ion channel NaV1.7. There is now an opportunity to evaluate acid insensitivity in other taxa. In this study, we tested for functional convergence of NaV1.7 in 71 species of mammals, including 22 species that hibernate. Our analyses revealed a functional convergence of amino acid sequences, which occurred at least six times independently in mammals that hibernate. Evolutionary analyses determined that the convergence results from both parallel and divergent evolution of residues in the functional motif. Our findings not only identify the functional molecules responsible for acid insensitivity in hibernating mammals, but also open new avenues to elucidate the molecular underpinnings of acid insensitivity in mammals.
Highly Effective Serodiagnosis for Chagas' Disease

PubMed Central

Hernández, Pilar; Heimann, Michael; Riera, Cristina; Solano, Marco; Santalla, José; Luquetti, Alejandro O.; Beck, Ewald

2010-01-01

Many proteins of Trypanosoma cruzi, the causative agent of Chagas' disease, contain characteristic arrays of highly repetitive immunogenic amino acid motifs. Diagnostic tests using these motifs in monomeric or dimeric form have proven to provide markedly improved specificity compared to conventional tests based on crude parasite extracts. However, in many cases the available tests still suffer from limited sensitivity. In this study we produced stable synthetic genes with maximal codon variability for the four diagnostic antigens, B13, CRA, TcD, and TcE, each containing between three and nine identical amino acid repeats. These genes were combined by linker sequences encoding short proline-rich peptides, giving rise to a 24-kDa fusion protein which was used as a novel diagnostic antigen in an enzyme-linked immunosorbent assay setup. Validation of the assay with a large number of well-characterized patient sera from Bolivia and Brazil revealed excellent diagnostic performance. The high sensitivity of the new test may allow future studies to use blood collected by finger prick and dried on filter paper, thus dramatically reducing the costs and effort for the detection of T. cruzi infection. PMID:20668136
Crystal structure of P58(IPK) TPR fragment reveals the mechanism for its molecular chaperone activity in UPR

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tao, Jiahui; Petrova, Kseniya; Ron, David

2010-05-25

P58(IPK) might function as an endoplasmic reticulum molecular chaperone to maintain protein folding homeostasis during unfolded protein responses. P58(IPK) contains nine tetratricopeptide repeat (TPR) motifs and a C-terminal J-domain within its primary sequence. To investigate the mechanism by which P58(IPK) functions to promote protein folding within the endoplasmic reticulum, we have determined the crystal structure of P58(IPK) TPR fragment to 2.5 Å resolution by the SAD method. The crystal structure of P58(IPK) revealed three domains (I-III) with similar folds and each domain contains three TPR motifs. An ELISA assay indicated that P58(IPK) acts as a molecular chaperone by interacting with misfolded proteins such as luciferase and rhodanese. The P58(IPK) structure reveals a conserved hydrophobic patch located in domain I that might be involved in binding the misfolded polypeptides. Structure-based mutagenesis for the conserved hydrophobic residues located in domain I significantly reduced the molecular chaperone activity of P58(IPK).
Characterization of the targeting signal in mitochondrial β-barrel proteins

PubMed Central

Jores, Tobias; Klinger, Anna; Groß, Lucia E.; Kawano, Shin; Flinner, Nadine; Duchardt-Ferner, Elke; Wöhnert, Jens; Kalbacher, Hubert; Endo, Toshiya; Schleiff, Enrico; Rapaport, Doron

2016-01-01

Mitochondrial β-barrel proteins are synthesized on cytosolic ribosomes and must be specifically targeted to the organelle before their integration into the mitochondrial outer membrane. The signal that assures such precise targeting and its recognition by the organelle remained obscure. In the present study we show that a specialized β-hairpin motif is this long-sought signal. We demonstrate that a synthetic β-hairpin peptide competes with the import of mitochondrial β-barrel proteins and that proteins harbouring a β-hairpin peptide fused to passenger domains are targeted to mitochondria. Furthermore, a β-hairpin motif from mitochondrial proteins targets chloroplast β-barrel proteins to mitochondria. The mitochondrial targeting depends on the hydrophobicity of the β-hairpin motif. Finally, this motif interacts with the mitochondrial import receptor Tom20. Collectively, we reveal that β-barrel proteins are targeted to mitochondria by a dedicated β-hairpin element, and this motif is recognized at the organelle surface by the outer membrane translocase. PMID:27345737
Sequence, Structure, and Context Preferences of Human RNA Binding Proteins.

PubMed

Dominguez, Daniel; Freese, Peter; Alexis, Maria S; Su, Amanda; Hochman, Myles; Palden, Tsultrim; Bazile, Cassandra; Lambert, Nicole J; Van Nostrand, Eric L; Pratt, Gabriel A; Yeo, Gene W; Graveley, Brenton R; Burge, Christopher B

2018-06-07

RNA binding proteins (RBPs) orchestrate the production, processing, and function of mRNAs. Here, we present the affinity landscapes of 78 human RBPs using an unbiased assay that determines the sequence, structure, and context preferences of these proteins in vitro by deep sequencing of bound RNAs. These data enable construction of "RNA maps" of RBP activity without requiring crosslinking-based assays. We found an unexpectedly low diversity of RNA motifs, implying frequent convergence of binding specificity toward a relatively small set of RNA motifs, many with low compositional complexity. Offsetting this trend, however, we observed extensive preferences for contextual features distinct from short linear RNA motifs, including spaced "bipartite" motifs, biased flanking nucleotide composition, and bias away from or toward RNA structure. Our results emphasize the importance of contextual features in RNA recognition, which likely enable targeting of distinct subsets of transcripts by different RBPs that recognize the same linear motif. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Searching RNA motifs and their intermolecular contacts with constraint networks.

PubMed

Thébault, P; de Givry, S; Schiex, T; Gaspin, C

2006-09-01

Searching RNA gene occurrences in genomic sequences is a task whose importance has been renewed by the recent discovery of numerous functional RNAs, often interacting with other ligands. Although several programs exist for RNA motif search, none can represent and solve the problem of searching for occurrences of RNA motifs in interaction with other molecules. We present a constraint network formulation of this problem. RNAs are represented as structured motifs that can occur on more than one sequence and that are related by possible hybridization. The implemented tool MilPat is used to search for several sRNA families in genomic sequences. Results show that MilPat searches efficiently for interacting motifs in large genomic sequences and offers a simple and extensible framework to solve such problems. New and known sRNAs are identified as H/ACA candidates in Methanocaldococcus jannaschii. http://carlit.toulouse.inra.fr/MilPaT/MilPat.pl.
Rediscovering Medicinal Plants' Potential with OMICS: Microsatellite Survey in Expressed Sequence Tags of Eleven Traditional Plants with Potent Antidiabetic Properties

PubMed Central

Sahu, Jagajjit; Sen, Priyabrata; Choudhury, Manabendra Dutta; Dehury, Budheswar; Barooah, Madhumita; Modi, Mahendra Kumar

2014-01-01

Herbal medicines and traditionally used medicinal plants present an untapped potential for novel molecular target discovery using systems science and OMICS biotechnology driven strategies. Since up to 40% of the world's poor people have no access to government health services, traditional and folk medicines are often the only therapeutics available to them. In this vein, North East (NE) India is recognized for its rich bioresources. As part of the Indo-Burma hotspot, it is regarded as an epicenter of biodiversity for several plants having myriad traditional uses, including medicinal use. However, the improvement of these valuable bioresources through molecular breeding strategies, for example, using genic microsatellites or Simple Sequence Repeats (SSRs) or Expressed Sequence Tags (ESTs)-derived SSRs has not been fully utilized on a large scale to date. In this study, we identified a total of 47,700 microsatellites from 109,609 ESTs of 11 medicinal plants (pineapple, papaya, noyontara, bitter orange, bermuda grass, ratalu, barbados nut, mango, mulberry, lotus, and guduchi) having proven antidiabetic properties. A total of 58,159 primer pairs were designed for the non-redundant 8060 SSR-positive ESTs and putative functions were assigned to 4483 unique contigs. Among the identified microsatellites, excluding mononucleotide repeats, di-/trinucleotides are predominant, among which repeat motifs of AG/CT and AAG/CTT were most abundant. Similarity search of SSR-containing ESTs and antidiabetic gene sequences revealed 11 microsatellites linked to antidiabetic genes in five plants. GO term enrichment analysis revealed a total of 80 enriched GO terms widely distributed in 53 biological processes, 17 molecular functions, and 10 cellular components associated with the 11 markers. The present study therefore provides concrete insights into the frequency and distribution of SSRs in important medicinal resources. The microsatellite markers reported here markedly add to the genetic stock for cross transferability in these plants and the literature on biomarkers and novel drug discovery for common chronic diseases such as diabetes. PMID:24802971
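The microsatellite survey summarized above rests on scanning sequences for short tandem repeats and grouping complementary motifs (such as AG/CT). A minimal Python sketch of that idea follows. It is not the authors' pipeline: the function names and the minimum-repeat thresholds in `MIN_REPEATS` are hypothetical choices made for illustration, and mononucleotide repeats are skipped only because the abstract excludes them from its motif ranking.

```python
import re
from collections import Counter

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(motif):
    """Smallest rotation of the motif or of its reverse complement,
    so that AG, GA, CT and TC all map to the same class 'AG'."""
    rev = motif.translate(_COMPLEMENT)[::-1]
    rotations = [m[i:] + m[:i] for m in (motif, rev) for i in range(len(m))]
    return min(rotations)


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # One captured k-mer followed by at least n-1 further exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. ATAT, AAA);
            # those are reported, if long enough, under the shorter motif length.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    est = "GGC" + "AG" * 7 + "TTCA" + "CTT" * 5 + "G"
    ssrs = find_ssrs(est)
    print(ssrs)
    print(Counter(canonical(motif) for _, motif, _ in ssrs))
```

Grouping by `canonical` is what lets counts be reported per complementary class, as in the abstract's "AG/CT" and "AAG/CTT"; dedicated SSR tools additionally handle imperfect and compound repeats, which this sketch does not.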
Rediscovering medicinal plants' potential with OMICS: microsatellite survey in expressed sequence tags of eleven traditional plants with potent antidiabetic properties.

PubMed

Sahu, Jagajjit; Sen, Priyabrata; Choudhury, Manabendra Dutta; Dehury, Budheswar; Barooah, Madhumita; Modi, Mahendra Kumar; Talukdar, Anupam Das

2014-05-01

Herbal medicines and traditionally used medicinal plants present an untapped potential for novel molecular target discovery using systems science and OMICS biotechnology driven strategies. Since up to 40% of the world's poor people have no access to government health services, traditional and folk medicines are often the only therapeutics available to them. In this vein, North East (NE) India is recognized for its rich bioresources. As part of the Indo-Burma hotspot, it is regarded as an epicenter of biodiversity for several plants having myriad traditional uses, including medicinal use. However, the improvement of these valuable bioresources through molecular breeding strategies, for example, using genic microsatellites or Simple Sequence Repeats (SSRs) or Expressed Sequence Tags (ESTs)-derived SSRs has not been fully utilized on a large scale to date. In this study, we identified a total of 47,700 microsatellites from 109,609 ESTs of 11 medicinal plants (pineapple, papaya, noyontara, bitter orange, bermuda grass, ratalu, barbados nut, mango, mulberry, lotus, and guduchi) having proven antidiabetic properties. A total of 58,159 primer pairs were designed for the non-redundant 8060 SSR-positive ESTs and putative functions were assigned to 4483 unique contigs. Among the identified microsatellites, excluding mononucleotide repeats, di-/trinucleotides are predominant, among which repeat motifs of AG/CT and AAG/CTT were most abundant. Similarity search of SSR-containing ESTs and antidiabetic gene sequences revealed 11 microsatellites linked to antidiabetic genes in five plants. GO term enrichment analysis revealed a total of 80 enriched GO terms widely distributed in 53 biological processes, 17 molecular functions, and 10 cellular components associated with the 11 markers. The present study therefore provides concrete insights into the frequency and distribution of SSRs in important medicinal resources. The microsatellite markers reported here markedly add to the genetic stock for cross transferability in these plants and the literature on biomarkers and novel drug discovery for common chronic diseases such as diabetes.
How pathogens use linear motifs to perturb host cell networks.

PubMed

Via, Allegra; Uyar, Bora; Brun, Christine; Zanzoni, Andreas

2015-01-01

Molecular mimicry is one of the powerful stratagems that pathogens employ to colonise their hosts and take advantage of host cell functions to guarantee their replication and dissemination. In particular, several viruses have evolved the ability to interact with host cell components through protein short linear motifs (SLiMs) that mimic host SLiMs, thus facilitating their internalisation and the manipulation of a wide range of cellular networks. Here we present convincing evidence from the literature that motif mimicry also represents an effective, widespread hijacking strategy in prokaryotic and eukaryotic parasites. Further insights into host motif mimicry would be of great help in the elucidation of the molecular mechanisms behind host cell invasion and the development of anti-infective therapeutic strategies. Copyright © 2014 Elsevier Ltd. All rights reserved.
A dinucleotide motif in oligonucleotides shows potent immunomodulatory activity and overrides species-specific recognition observed with CpG motif.

PubMed

Kandimalla, Ekambar R; Bhagat, Lakshmi; Zhu, Fu-Gang; Yu, Dong; Cong, Yan-Ping; Wang, Daqing; Tang, Jimmy X; Tang, Jin-Yan; Knetter, Cathrine F; Lien, Egil; Agrawal, Sudhir

2003-11-25

Bacterial and synthetic DNAs containing CpG dinucleotides in specific sequence contexts activate the vertebrate immune system through Toll-like receptor 9 (TLR9). In the present study, we used a synthetic nucleoside with a bicyclic heterobase [1-(2'-deoxy-beta-d-ribofuranosyl)-2-oxo-7-deaza-8-methyl-purine; R] to replace the C in CpG, resulting in an RpG dinucleotide. The RpG dinucleotide was incorporated in mouse- and human-specific motifs in oligodeoxynucleotides (oligos) and 3'-3'-linked oligos, referred to as immunomers. Oligos containing the RpG motif induced cytokine secretion in mouse spleen-cell cultures. Immunomers containing RpG dinucleotides showed activity in transfected-HEK293 cells stably expressing mouse TLR9, suggesting direct involvement of TLR9 in the recognition of the RpG motif. In J774 macrophages, RpG motifs activated NF-kappa B and mitogen-activated protein kinase pathways. Immunomers containing the RpG dinucleotide induced high levels of IL-12 and IFN-gamma, but lower IL-6, in a time- and concentration-dependent fashion in mouse spleen-cell cultures costimulated with IL-2. Importantly, immunomers containing GTRGTT and GARGTT motifs were recognized to a similar extent by both mouse and human immune systems. Additionally, both mouse- and human-specific RpG immunomers potently stimulated proliferation of peripheral blood mononuclear cells obtained from diverse vertebrate species, including monkey, pig, horse, sheep, goat, rat, and chicken. An immunomer containing the GTRGTT motif prevented conalbumin-induced and ragweed allergen-induced allergic inflammation in mice. We show that a synthetic bicyclic nucleotide is recognized in the C position of a CpG dinucleotide by immune cells from diverse vertebrate species without bias for flanking sequences, suggesting a divergent nucleotide motif recognition pattern of TLR9.
DNA nanotechnology based on i-motif structures.

PubMed

Dong, Yuanchen; Yang, Zhongqiang; Liu, Dongsheng

2014-06-17

CONSPECTUS: Most biological processes happen at the nanometer scale, and understanding the energy transformations and material transportation mechanisms within living organisms has proved challenging. To better understand the secrets of life, researchers have investigated artificial molecular motors and devices over the past decade because such systems can mimic certain biological processes. DNA nanotechnology based on i-motif structures is one system that has played an important role in these investigations. In this Account, we summarize recent advances in functional DNA nanotechnology based on i-motif structures. The i-motif is a DNA quadruplex that occurs as four stretches of cytosine repeat sequences form C·CH(+) base pairs, and their stabilization requires slightly acidic conditions. This unique property has produced the first DNA molecular motor driven by pH changes. The motor is reliable, and studies show that it is capable of millisecond running speeds, comparable to the speed of natural protein motors. With careful design, the output of these types of motors was combined to drive micrometer-sized cantilevers to bend. Using established DNA nanostructure assembly and functionalization methods, researchers can easily integrate the motor within other DNA assembled structures and functional units, producing DNA molecular devices with new functions such as suprahydrophobic/suprahydrophilic smart surfaces that switch, intelligent nanopores triggered by pH changes, molecular logic gates, and DNA nanosprings. Recently, researchers have produced motors driven by light and electricity, which have allowed DNA motors to be integrated within silicon-based nanodevices. Moreover, some devices based on i-motif structures have proven useful for investigating processes within living cells. The pH-responsiveness of the i-motif structure also provides a way to control the stepwise assembly of DNA nanostructures. In addition, because of the stability of the i-motif, this structure can serve as the stem of one-dimensional nanowires, and a four-strand stem can provide a new basis for three-dimensional DNA structures such as pillars. By sacrificing some accuracy in assembly, we used these properties to prepare the first fast-responding pure DNA supramolecular hydrogel. This hydrogel does not swell and cannot encapsulate small molecules. These unique properties could lead to new developments in smart materials based on DNA assembly and support important applications in fields such as tissue engineering. We expect that DNA nanotechnology will continue to develop rapidly. At a fundamental level, further studies should lead to greater understanding of the energy transformation and material transportation mechanisms at the nanometer scale. In terms of applications, we expect that many of these elegant molecular devices will soon be used in vivo. These further studies could demonstrate the power of DNA nanotechnology in biology, material science, chemistry, and physics.
The crystal structure of the regulatory domain of the human sodium-driven chloride/bicarbonate exchanger.

PubMed

Alvadia, Carolina M; Sommer, Theis; Bjerregaard-Andersen, Kaare; Damkier, Helle Hasager; Montrasio, Michele; Aalkjaer, Christian; Morth, J Preben

2017-09-21

The sodium-driven chloride/bicarbonate exchanger (NDCBE) is essential for maintaining homeostatic pH in neurons. The crystal structure at 2.8 Å resolution of the regulatory N-terminal domain of human NDCBE represents the first crystal structure of an electroneutral sodium-bicarbonate cotransporter. The crystal structure forms an equivalent dimeric interface as observed for the cytoplasmic domain of Band 3, and thus establishes that the consensus motif VTVLP is the key minimal dimerization motif. The VTVLP motif is highly conserved and likely to be the physiologically relevant interface for all other members of the SLC4 family. A novel conserved Zn2+-binding motif present in the N-terminal domain of NDCBE is identified and characterized in vitro. Cellular studies confirm the Zn2+-dependent transport of two electroneutral bicarbonate transporters, NCBE and NBCn1. The Zn2+ site is mapped to a cluster of histidines close to the conserved ETARWLKFEE motif and likely plays a role in the regulation of this important motif. The combined structural and bioinformatics analysis provides a model that predicts with additional confidence the physiologically relevant interface between the cytoplasmic domain and the transmembrane domain.
Dynamic changes in Sox2 spatio-temporal expression promote the second cell fate decision through Fgf4/Fgfr2 signaling in preimplantation mouse embryos.

PubMed

Mistri, Tapan Kumar; Arindrarto, Wibowo; Ng, Wei Ping; Wang, Choayang; Lim, Leng Hiong; Sun, Lili; Chambers, Ian; Wohland, Thorsten; Robson, Paul

2018-03-20

Oct4 and Sox2 regulate the expression of target genes such as Nanog, Fgf4, and Utf1 by binding to their respective regulatory motifs. Their functional cooperation is reflected in their ability to heterodimerize on adjacent cis regulatory motifs, the composite Sox/Oct motif. Given that Oct4 and Sox2 regulate many developmental genes, a quantitative analysis of their synergistic action on different Sox/Oct motifs would yield valuable insights into the mechanisms of early embryonic development. In the present study, we measured binding affinities of Oct4 and Sox2 to different Sox/Oct motifs using fluorescence correlation spectroscopy. We found that the synergistic binding interaction is driven mainly by the level of Sox2 in the case of the Fgf4 Sox/Oct motif. Taking into account that Sox2 expression levels fluctuate more than those of Oct4, our finding provides an explanation of how Sox2 controls the segregation of the epiblast and primitive endoderm populations within the inner cell mass of the developing rodent blastocyst. © 2018 The Author(s). Published by Portland Press Limited on behalf of the Biochemical Society.
Insights into Structural and Mechanistic Features of Viral IRES Elements

PubMed Central

Martinez-Salas, Encarnacion; Francisco-Velilla, Rosario; Fernandez-Chamorro, Javier; Embarek, Azman M.

2018-01-01

Internal ribosome entry site (IRES) elements are cis-acting RNA regions that promote internal initiation of protein synthesis using cap-independent mechanisms. However, distinct types of IRES elements present in the genome of various RNA viruses perform the same function despite lacking conservation of sequence and secondary RNA structure. Likewise, IRES elements differ in host factor requirement to recruit the ribosomal subunits. In spite of this diversity, evolutionarily conserved motifs in each family of RNA viruses preserve sequences impacting on RNA structure and RNA–protein interactions important for IRES activity. Indeed, IRES elements adopting remarkable different structural organizations contain RNA structural motifs that play an essential role in recruiting ribosomes, initiation factors and/or RNA-binding proteins using different mechanisms. Therefore, given that a universal IRES motif remains elusive, it is critical to understand how diverse structural motifs deliver functions relevant for IRES activity. This will be useful for understanding the molecular mechanisms beyond cap-independent translation, as well as the evolutionary history of these regulatory elements. Moreover, it could improve the accuracy to predict IRES-like motifs hidden in genome sequences. This review summarizes recent advances on the diversity and biological relevance of RNA structural motifs for viral IRES elements. PMID:29354113
Divergence and Conservative Evolution of XTNX Genes in Land Plants.

PubMed

Zhang, Yan-Mei; Xue, Jia-Yu; Liu, Li-Wei; Sun, Xiao-Qin; Zhou, Guang-Can; Chen, Min; Shao, Zhu-Qing; Hang, Yue-Yu

2017-01-01

The Toll-interleukin-1 receptor (TIR) and Nucleotide-binding site (NBS) domains are two major components of the TIR-NBS-leucine-rich repeat family plant disease resistance genes. Extensive functional and evolutionary studies have been performed on these genes; however, the characterization of a small group of genes that are composed of atypical TIR and NBS domains, namely XTNX genes, is limited. The present study investigated this specific gene family by conducting genome-wide analyses of 59 green plant genomes. A total of 143 XTNX genes were identified in 51 of the 52 land plant genomes, whereas no XTNX gene was detected in any green algae genomes, which indicated that XTNX genes originated upon emergence of land plants. Phylogenetic analysis revealed that the ancestral XTNX gene underwent two rounds of ancient duplications in land plants, which resulted in the formation of clades I/II and clades IIa/IIb successively. Although clades I and IIb have evolved conservatively in angiosperms, the motif composition difference and sequence divergence at the amino acid level suggest that functional divergence may have occurred since the separation of the two clades. In contrast, several features of the clade IIa genes, including the absence in the majority of dicots, the long branches in the tree, the frequent loss of ancestral motifs, and the loss of expression in all detected tissues of Zea mays, all suggest that the genes in this lineage might have undergone pseudogenization. This study highlights that XTNX genes constitute a gene family of ancient origin in land plants that has followed a distinct, conservative pattern of evolution.
A 170kDa multi-domain cystatin of Fasciola gigantica is active in the male reproductive system.

PubMed

Geadkaew, Amornrat; Kosa, Nanthawat; Siricoon, Sinee; Grams, Suksiri Vichasri; Grams, Rudi

2014-09-01

Cystatins are functional as intra- and extracellular inhibitors of cysteine proteases and are expressed as single or multi-domain proteins. We have previously described two single domain type 1 cystatins in the trematode Fasciola gigantica that are released into the parasite's intestinal tract and exhibit inhibitory activity against endogenous and host cathepsin L and B proteases. In contrast, the 170kDa multi-domain cystatin (FgMDC) presented here comprises a signal peptide and 12 tandem repeated cystatin-like domains with similarity to type 2 single domain cystatins. The domains show high sequence divergence with identity values often <20% and at only 26.8% between the highest matching domains 6 and 10. Several domains contain degenerated QVVAG core motifs and/or lack other important residues of active type 2 cystatins. Domain-specific antisera detected multiple forms of FgMDC ranging from <10 to >120kDa molecular mass in immunoblots of parasite crude extracts and ES product, with different banding patterns for each antiserum, demonstrating complex processing of the proprotein. The four domains with the highest conserved QVVAG motifs were expressed in Escherichia coli and the refolded recombinant proteins blocked cysteine protease activity in the parasite's ES product. Strikingly, immunohistochemical analysis using seven domain-specific antisera localized FgMDC in testis lobes and sperm. It is speculated that the processed cystatin-like domains have a function analogous to the mammalian group of male reproductive tissue-specific type 2 cystatins and are functional in spermiogenesis and fertilization. Copyright © 2014 Elsevier B.V. All rights reserved.
Bioinformatic analyses implicate the collaborating meiotic crossover/chiasma proteins Zip2, Zip3, and Spo22/Zip4 in ubiquitin labeling

PubMed Central

Perry, Jason; Kleckner, Nancy; Börner, G. Valentin

2005-01-01

Zip2 and Zip3 are meiosis-specific proteins that, in collaboration with several partners, act at the sites of crossover-designated, axis-associated recombinational interactions to mediate crossover/chiasma formation. Here, Spo22 (also called Zip4) is identified as a probable functional collaborator of Zip2/3. The molecular roles of Zip2, Zip3, and Spo22/Zip4 are unknown. All three proteins are part of a small evolutionary cohort comprising similar homologs in four related yeasts. Zip3 is shown to contain a RING finger whose structural features most closely match those of known ubiquitin E3s. Further, Zip3 exhibits major domainal homologies to Rad18, a known DNA-binding ubiquitin E3. Also described is an approach to the identification and mapping of repeated protein sequence motifs, Alignment Based Repeat Annotation (ABRA), that we have developed. When ABRA is applied to Zip2 and Spo22/Zip4, they emerge as a 14-blade WD40-like repeat protein and a 22-unit tetratricopeptide repeat protein, respectively. WD40 repeats of Cdc20, Cdh1, and Cdc16 and tetratricopeptide repeats of Cdc16, Cdc23, and Cdc27, all components of the anaphase-promoting complex, are also analyzed. These and other findings suggest that Zip2, Zip3, and Zip4 act together to mediate a process that involves Zip3-mediated ubiquitin labeling, potentially as a unique type of ubiquitin-conjugating complex. PMID:16314568
Determination of the chemical structure of the capsular polysaccharide of strain B33, a fast-growing soya bean-nodulating bacterium isolated from an arid region of China.

PubMed Central

Rodríguez-Carvajal, M A; Tejero-Mateo, P; Espartero, J L; Ruiz-Sainz, J E; Buendía-Clavería, A M; Ollero, F J; Yang, S S; Gil-Serrano, A M

2001-01-01

We have determined the structure of a polysaccharide from strain B33, a fast-growing bacterium that forms nitrogen-fixing nodules with Asiatic and American soya bean cultivars. On the basis of monosaccharide analysis, methylation analysis, one-dimensional 1H- and 13C-NMR and two-dimensional NMR experiments, the structure was shown to consist of a polymer having the repeating unit -->6)-4-O-methyl-alpha-D-Glcp-(1-->4)-3-O-methyl-beta-D-GlcpA-(1--> (where GlcpA is glucopyranuronic acid and Glcp is glucopyranose). Strain B33 produces a K-antigen polysaccharide repeating unit that does not have the structural motif sugar-Kdx [where Kdx is 3-deoxy-D-manno-2-octulosonic acid (Kdo) or a Kdo-related acid] proposed for different Sinorhizobium fredii strains, all of them being effective with Asiatic soya bean cultivars but unable to form nitrogen-fixing nodules with American soya bean cultivars. Instead, it resembles the K-antigen of S. fredii strain HH303 (rhamnose, galacturonic acid)n, which is also effective with both groups of soya bean cultivars. Only the capsular polysaccharide from strains B33 and HH303 have monosaccharide components that are also present in the surface polysaccharide of Bradyrhizobium elkanii strains, which consists of a 4-O-methyl-D-glucurono-L-rhamnan. PMID:11439101
