unique motifs identify: Topics by Science.gov

Sample records for unique motifs identify

Analysis of secondary structural elements in human microRNA hairpin precursors.

PubMed

Liu, Biao; Childs-Disney, Jessica L; Znosko, Brent M; Wang, Dan; Fallahi, Mohammad; Gallo, Steven M; Disney, Matthew D

2016-03-01

MicroRNAs (miRNAs) regulate gene expression by targeting complementary mRNAs for destruction or translational repression. Aberrant expression of miRNAs has been associated with various diseases including cancer, thus making them interesting therapeutic targets. The composite of secondary structural elements that comprise miRNAs could aid the design of small molecules that modulate their function. We analyzed the secondary structural elements, or motifs, present in all human miRNA hairpin precursors and compared them to highly expressed human RNAs with known structures and other RNAs from various organisms. Amongst human miRNAs, there are 3808 unique motifs, many residing in processing sites. Further, we identified motifs in miRNAs that are not present in other highly expressed human RNAs, desirable targets for small molecules. MiRNA motifs were incorporated into a searchable database that is freely available. We also analyzed the most frequently occurring bulges and internal loops for each RNA class and found that the smallest loops possible prevail. However, the distribution of loops and the preferred closing base pairs were unique to each class. Collectively, we have completed a broad survey of motifs found in human miRNA precursors, highly expressed human RNAs, and RNAs from other organisms. Interestingly, unique motifs were identified in human miRNA processing sites, binding to which could inhibit miRNA maturation and hence function.
Identification of high-efficiency 3'GG gRNA motifs in indexed FASTA files with ngg2.

PubMed

Roberson, Elisha D O

CRISPR/Cas9 is emerging as one of the most-used methods of genome modification in organisms ranging from bacteria to human cells. However, the efficiency of editing varies tremendously site-to-site. A recent report identified a novel motif, called the 3'GG motif, which substantially increases the efficiency of editing at all sites tested in C. elegans. Furthermore, they highlighted that previously published gRNAs with high editing efficiency also had this motif. I designed a Python command-line tool, ngg2, to identify 3'GG gRNA sites from indexed FASTA files. As a proof-of-concept, I screened for these motifs in six model genomes: Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus, and Homo sapiens. I also scanned the genomes of pig (Sus scrofa) and African elephant (Loxodonta africana) to demonstrate the utility in non-model organisms. I identified more than 60 million single match 3'GG motifs in these genomes. Greater than 61% of all protein coding genes in the reference genomes had at least one unique 3'GG gRNA site overlapping an exon. In particular, more than 96% of mouse and 93% of human protein coding genes have at least one unique, overlapping 3'GG gRNA. These identified sites can be used as a starting point in gRNA selection, and the ngg2 tool provides an important ability to identify 3'GG editing sites in any species with an available genome sequence.
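The kind of scan described in this abstract can be illustrated with a minimal Python sketch. It is not the ngg2 implementation and does not use its command-line interface. The site definition is an assumption made here for illustration: a 20-nt protospacer whose last two bases are GG, followed directly by an NGG PAM. The names `SITE`, `reverse_complement`, and `find_3gg_sites` are hypothetical, and the input is a plain string rather than an indexed FASTA file.

```python
import re

# Assumed site definition (not taken from ngg2 itself): a 20-nt protospacer
# ending in GG, immediately followed by an NGG PAM. The lookahead lets
# overlapping sites be reported; windows containing N are skipped.
SITE = re.compile(r"(?=([ACGT]{18}GG)[ACGT]GG)")
SITE_LENGTH = 23  # 20-nt protospacer + 3-nt PAM

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_3gg_sites(seq):
    """Return (strand, start, protospacer) for every candidate site.

    start is the 0-based position of the 23-nt window on the forward strand.
    """
    seq = seq.upper()
    hits = [("+", m.start(), m.group(1)) for m in SITE.finditer(seq)]
    rc = reverse_complement(seq)
    for m in SITE.finditer(rc):
        # Map the window found on the reverse strand back to forward coordinates.
        start = len(seq) - (m.start() + SITE_LENGTH)
        hits.append(("-", start, m.group(1)))
    return hits
```

Deciding whether a site is a "single match" in the sense of the abstract would additionally require counting each protospacer across the whole genome, which this sketch leaves out.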
Identifying the preferred RNA motifs and chemotypes that interact by probing millions of combinations.

PubMed

Tran, Tuan; Disney, Matthew D

2012-01-01

RNA is an important therapeutic target but information about RNA-ligand interactions is limited. Here, we report a screening method that probes over 3,000,000 combinations of RNA motif-small molecule interactions to identify the privileged RNA structures and chemical spaces that interact. Specifically, a small molecule library biased for binding RNA was probed for binding to over 70,000 unique RNA motifs in a high throughput solution-based screen. The RNA motifs that specifically bind each small molecule were identified by microarray-based selection. In this library-versus-library or multidimensional combinatorial screening approach, hairpin loops (among a variety of RNA motifs) were the preferred RNA motif space that binds small molecules. Furthermore, it was shown that indole, 2-phenyl indole, 2-phenyl benzimidazole and pyridinium chemotypes allow for specific recognition of RNA motifs. As targeting RNA with small molecules is an extremely challenging area, these studies provide new information on RNA-ligand interactions that has many potential uses.
Identifying the Preferred RNA Motifs and Chemotypes that Interact by Probing Millions of Combinations

PubMed Central

Tran, Tuan; Disney, Matthew D.

2012-01-01

RNA is an important therapeutic target but information about RNA-ligand interactions is limited. Here we report a screening method that probes over 3,000,000 combinations of RNA motif-small molecule interactions to identify the privileged RNA structures and chemical spaces that interact. Specifically, a small molecule library biased for binding RNA was probed for binding to over 70,000 unique RNA motifs in a high throughput solution-based screen. The RNA motifs that specifically bind each small molecule were identified by microarray-based selection. In this library-versus-library or multidimensional combinatorial screening approach, hairpin loops (amongst a variety of RNA motifs) were the preferred RNA motif space that binds small molecules. Furthermore, it was shown that indole, 2-phenyl indole, 2-phenyl benzimidazole, and pyridinium chemotypes allow for specific recognition of RNA motifs. Since targeting RNA with small molecules is an extremely challenging area, these studies provide new information on RNA-ligand interactions that has many potential uses. PMID:23047683
An intracellular motif of GLUT4 regulates fusion of GLUT4-containing vesicles.

PubMed

Heyward, Catherine A; Pettitt, Trevor R; Leney, Sophie E; Welsh, Gavin I; Tavaré, Jeremy M; Wakelam, Michael J O

2008-05-20

Insulin stimulates glucose uptake by adipocytes through increasing translocation of the glucose transporter GLUT4 from an intracellular compartment to the plasma membrane. Fusion of GLUT4-containing vesicles at the cell surface is thought to involve phospholipase D activity, generating the signalling lipid phosphatidic acid, although the mechanism of action is not yet clear. Here we report the identification of a putative phosphatidic acid-binding motif in a GLUT4 intracellular loop. Mutation of this motif causes a decrease in the insulin-induced exposure of GLUT4 at the cell surface of 3T3-L1 adipocytes via an effect on vesicle fusion. The potential phosphatidic acid-binding motif identified in this study is unique to GLUT4 among the sugar transporters, therefore this motif may provide a unique mechanism for regulating insulin-induced translocation by phospholipase D signalling.
Quantitative statistical analysis of cis-regulatory sequences in ABA/VP1- and CBF/DREB1-regulated genes of Arabidopsis.

PubMed

Suzuki, Masaharu; Ketterling, Matthew G; McCarty, Donald R

2005-09-01

We have developed a simple quantitative computational approach for objective analysis of cis-regulatory sequences in promoters of coregulated genes. The program, designated MotifFinder, identifies oligo sequences that are overrepresented in promoters of coregulated genes. We used this approach to analyze promoter sequences of Viviparous1 (VP1)/abscisic acid (ABA)-regulated genes and cold-regulated genes, respectively, of Arabidopsis (Arabidopsis thaliana). We detected significantly enriched sequences in up-regulated genes but not in down-regulated genes. This result suggests that gene activation but not repression is mediated by specific and common sequence elements in promoters. The enriched motifs include several known cis-regulatory sequences as well as previously unidentified motifs. With respect to known cis-elements, we dissected the flanking nucleotides of the core sequences of Sph element, ABA response elements (ABREs), and the C repeat/dehydration-responsive element. This analysis identified the motif variants that may correlate with qualitative and quantitative differences in gene expression. While both VP1 and cold responses are mediated in part by ABA signaling via ABREs, these responses correlate with unique ABRE variants distinguished by nucleotides flanking the ACGT core. ABRE and Sph motifs are tightly associated uniquely in the coregulated set of genes showing a strict dependence on VP1 and ABA signaling. Finally, analysis of distribution of the enriched sequences revealed a striking concentration of enriched motifs in a proximal 200-base region of VP1/ABA and cold-regulated promoters. Overall, each class of coregulated genes possesses a discrete set of the enriched motifs with unique distributions in their promoters that may account for the specificity of gene regulation.
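The idea of finding oligo sequences overrepresented in promoters of coregulated genes can be sketched as follows. This is not the MotifFinder program, whose statistic is not given in the abstract. As a stand-in, the sketch counts how many promoters contain each k-mer and applies a one-sided hypergeometric test; the function names, the default `k`, and the `alpha` cutoff are all assumptions, and no multiple-testing correction is applied.

```python
from collections import Counter
from math import comb


def kmers_in(seq, k):
    """Set of distinct k-mers present in one promoter sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hypergeom_tail(hits, n_draws, n_success, n_pop):
    """P(X >= hits) when drawing n_draws items from n_pop, n_success of which are successes."""
    upper = min(n_draws, n_success)
    total = comb(n_pop, n_draws)
    return sum(
        comb(n_success, i) * comb(n_pop - n_success, n_draws - i)
        for i in range(hits, upper + 1)
    ) / total


def enriched_oligos(coregulated, others, k=6, alpha=0.01):
    """Return (oligo, promoters_hit_in_set, promoters_hit_overall, p) sorted by p."""
    n_fg = len(coregulated)
    n_all = n_fg + len(others)
    fg_counts, all_counts = Counter(), Counter()
    for seq in coregulated:
        found = kmers_in(seq.upper(), k)
        fg_counts.update(found)
        all_counts.update(found)
    for seq in others:
        all_counts.update(kmers_in(seq.upper(), k))
    results = []
    for oligo, hits in fg_counts.items():
        p = hypergeom_tail(hits, n_fg, all_counts[oligo], n_all)
        if p <= alpha:
            results.append((oligo, hits, all_counts[oligo], p))
    return sorted(results, key=lambda r: r[3])
```

Counting promoters rather than total occurrences keeps a single repeat-rich promoter from dominating the result; restricting the input to the proximal 200-base region mentioned in the abstract would be a simple slice of each sequence before the call.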
Onco-Regulon: an integrated database and software suite for site specific targeting of transcription factors of cancer genes

PubMed Central

Tomar, Navneet; Mishra, Akhilesh; Mrinal, Nirotpal; Jayaram, B.

2016-01-01

Transcription factors (TFs) bind at multiple sites in the genome and regulate expression of many genes. Regulating TF binding in a gene-specific manner remains a formidable challenge in drug discovery because the same binding motif may be present at multiple locations in the genome. Here, we present Onco-Regulon (http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm), an integrated database of regulatory motifs of cancer genes coupled with Unique Sequence-Predictor (USP), a software suite that identifies unique sequences for each of these regulatory DNA motifs at the specified position in the genome. USP works by extending a given DNA motif in the 5′→3′ direction, the 3′→5′ direction, or both, adding one nucleotide at each step, and calculating the frequency of each extended motif in the genome with the Frequency Counter program. This step is iterated until the frequency of the extended motif becomes unity in the genome. Thus, for each given motif, we get three possible unique sequences. The Closest Sequence Finder program predicts off-target drug binding in the genome. Inclusion of DNA-protein structural information further makes Onco-Regulon a highly informative repository for gene-specific drug development. We believe that Onco-Regulon will help researchers to design drugs that, in theory, bind to an exclusive site in the genome with no off-target effects. Database URL: http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm PMID:27515825
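The extension procedure described for USP can be sketched in a few lines of Python. This is an illustrative reading of the abstract, not the Onco-Regulon code: the function names are hypothetical, the genome is a single in-memory string, only the given strand is counted, and the direction labels "left" and "right" are assumptions standing in for the two extension directions named in the text.

```python
def count_occurrences(genome, pattern):
    """Count occurrences of pattern in genome, including overlapping ones."""
    count = 0
    pos = genome.find(pattern)
    while pos != -1:
        count += 1
        pos = genome.find(pattern, pos + 1)
    return count


def extend_until_unique(genome, start, end, direction="both"):
    """Grow the motif genome[start:end] one nucleotide per step until it occurs once.

    Returns the (start, end) slice bounds of the unique sequence, or None if the
    motif reaches the end of the sequence before becoming unique.
    """
    if direction not in ("left", "right", "both"):
        raise ValueError("direction must be 'left', 'right' or 'both'")
    while count_occurrences(genome, genome[start:end]) > 1:
        grew = False
        if direction in ("left", "both") and start > 0:
            start -= 1
            grew = True
        if direction in ("right", "both") and end < len(genome):
            end += 1
            grew = True
        if not grew:
            return None
    return start, end
```

Calling the function once per direction for the same motif position yields the three candidate unique sequences mentioned in the abstract. Recounting the whole genome at every step is quadratic in the worst case, so a real tool would presumably use an index; that choice is outside what the abstract states.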
Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas

PubMed Central

Petrov, Anton I.; Zirbel, Craig L.; Leontis, Neocles B.

2013-01-01

The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access. PMID:23970545
Emotion Regulation through Movement: Unique Sets of Movement Characteristics are Associated with and Enhance Basic Emotions.

PubMed

Shafir, Tal; Tsachor, Rachelle P; Welch, Kathleen B

2015-01-01

We have recently demonstrated that motor execution, observation, and imagery of movements expressing certain emotions can enhance corresponding affective states and therefore could be used for emotion regulation. But which specific movement(s) should one use in order to enhance each emotion? This study aimed to identify, using Laban Movement Analysis (LMA), the Laban motor elements (motor characteristics) that characterize movements whose execution enhances each of the basic emotions: anger, fear, happiness, and sadness. LMA provides a system of symbols describing its motor elements, which gives a written instruction (motif) for the execution of a movement or movement-sequence over time. Six senior LMA experts analyzed a validated set of video clips showing whole body dynamic expressions of anger, fear, happiness and sadness, and identified the motor elements that were common to (appeared in) all clips expressing the same emotion. For each emotion, we created motifs of different combinations of the motor elements common to all clips of the same emotion. Eighty subjects from around the world read and moved those motifs, to identify the emotion evoked when moving each motif and to rate the intensity of the evoked emotion. All subjects together moved and rated 1241 motifs, which were produced from 29 different motor elements. Using logistic regression, we found a set of motor elements associated with each emotion which, when moved, predicted the feeling of that emotion. Each emotion was predicted by a unique set of motor elements and each motor element predicted only one emotion. Knowledge of which specific motor elements enhance specific emotions can enable emotional self-regulation through adding some desired motor qualities to one's personal everyday movements (rather than mimicking others' specific movements) and through decreasing motor behaviors which include elements that enhance negative emotions.
Emotion Regulation through Movement: Unique Sets of Movement Characteristics are Associated with and Enhance Basic Emotions

PubMed Central

Shafir, Tal; Tsachor, Rachelle P.; Welch, Kathleen B.

2016-01-01

We have recently demonstrated that motor execution, observation, and imagery of movements expressing certain emotions can enhance corresponding affective states and therefore could be used for emotion regulation. But which specific movement(s) should one use in order to enhance each emotion? This study aimed to identify, using Laban Movement Analysis (LMA), the Laban motor elements (motor characteristics) that characterize movements whose execution enhances each of the basic emotions: anger, fear, happiness, and sadness. LMA provides a system of symbols describing its motor elements, which gives a written instruction (motif) for the execution of a movement or movement-sequence over time. Six senior LMA experts analyzed a validated set of video clips showing whole body dynamic expressions of anger, fear, happiness and sadness, and identified the motor elements that were common to (appeared in) all clips expressing the same emotion. For each emotion, we created motifs of different combinations of the motor elements common to all clips of the same emotion. Eighty subjects from around the world read and moved those motifs, to identify the emotion evoked when moving each motif and to rate the intensity of the evoked emotion. All subjects together moved and rated 1241 motifs, which were produced from 29 different motor elements. Using logistic regression, we found a set of motor elements associated with each emotion which, when moved, predicted the feeling of that emotion. Each emotion was predicted by a unique set of motor elements and each motor element predicted only one emotion. Knowledge of which specific motor elements enhance specific emotions can enable emotional self-regulation through adding some desired motor qualities to one's personal everyday movements (rather than mimicking others' specific movements) and through decreasing motor behaviors which include elements that enhance negative emotions. PMID:26793147
Analysis decorating design on Perahu Buatan Barat, the Malay traditional boat by using frieze pattern

NASA Astrophysics Data System (ADS)

Wahab, Mohd Rohaizat Abdul; Ramli, Zuliskandar; Zakaria, Ros Mahwati Ahmad; Samad, Mohammad Anis Abdul

2017-01-01

Boat building is one of the skills mastered by Malay craftsmen. The decoration on the Perahu Buatan Barat, a traditional Malay boat, is one of the distinctive features of traditional boat production on the East Coast of Malaysia. In the Malay boat-building tradition, each plank is given a specific name based on its line of planks. One line, called `papan tarik' or `papan cantik', was usually decorated with paintings in a variety of motifs and patterns running from the bow to the stern of the boat. The motifs, usually taken from the surrounding environment as well as flora and fauna, are painted repeatedly but in differing formations. The aim of this study is to identify the motifs and analyze their formation using the mathematical method of frieze patterns.
Characteristic motifs for families of allergenic proteins

PubMed Central

Ivanciuc, Ovidiu; Garcia, Tzintzuni; Torres, Miguel; Schein, Catherine H.; Braun, Werner

2008-01-01

The identification of potential allergenic proteins is usually done by scanning a database of allergenic proteins and locating known allergens with a high sequence similarity. However, there is no universally accepted cut-off value for sequence similarity to indicate potential IgE cross-reactivity. Further, overall sequence similarity may be less important than discrete areas of similarity in proteins with homologous structure. To identify such areas, we first classified all allergens and their subdomains in the Structural Database of Allergenic Proteins (SDAP, http://fermi.utmb.edu/SDAP/) to their closest protein families as defined in Pfam, and identified conserved physicochemical property motifs characteristic of each group of sequences. Allergens populate only a small subset of all known Pfam families, as all allergenic proteins in SDAP could be grouped to only 130 (of 9318 total) Pfams, and 31 families contain more than four allergens. Conserved physicochemical property motifs for the aligned sequences of the most populated Pfam families were identified with the PCPMer program suite and catalogued in the webserver Motif-Mate (http://born.utmb.edu/motifmate/summary.php). We also determined specific motifs for allergenic members of a family that could distinguish them from non-allergenic ones. These allergen specific motifs should be most useful in database searches for potential allergens. We found that sequence motifs unique to the allergens in three families (seed storage proteins, Bet v 1, and tropomyosin) overlap with known IgE epitopes, thus providing evidence that our motif based approach can be used to assess the potential allergenicity of novel proteins. PMID:18951633
Process-based network decomposition reveals backbone motif structure

PubMed Central

Wang, Guanyu; Du, Chenghang; Chen, Hao; Simha, Rahul; Rong, Yongwu; Xiao, Yi; Zeng, Chen

2010-01-01

A central challenge in systems biology today is to understand the network of interactions among biomolecules and, especially, the organizing principles underlying such networks. Recent analysis of known networks has identified small motifs that occur ubiquitously, suggesting that larger networks might be constructed in the manner of electronic circuits by assembling groups of these smaller modules. Using a unique process-based approach to analyzing such networks, we show for two cell-cycle networks that each of these networks contains a giant backbone motif spanning all the network nodes that provides the main functional response. The backbone is in fact the smallest network capable of providing the desired functionality. Furthermore, the remaining edges in the network form smaller motifs whose role is to confer stability properties rather than provide function. The process-based approach used in the above analysis has additional benefits: It is scalable, analytic (resulting in a single analyzable expression that describes the behavior), and computationally efficient (all possible minimal networks for a biological process can be identified and enumerated). PMID:20498084
Isolation and characterization of microsatellite loci in the intertidal sponge Halichondria panicea

USGS Publications Warehouse

Knowlton, Anne L.; Pierson, Barbara J.; Talbot, S.L.; Highsmith, Ray C.

2003-01-01

GA- and CA-enriched genomic libraries were constructed for the intertidal sponge Halichondria panicea. Unique repeat motifs identified varied from the expected simple dinucleotide repeats to more complex repeat units. All sequences tended to be highly repetitive but did not necessarily contain the targeted motifs. Seven microsatellite loci were evaluated on sponges from the clone source population. All seven were polymorphic with 5.43 ± 0.92 mean number of alleles. Six of the seven loci that could be resolved had mean heterozygosities of 0.14–0.68. The loci identified here will be useful for population studies.
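As a rough illustration of how repeat motifs such as GA or CA runs can be located in a sequenced clone, the following Python sketch uses a regular expression with a backreference. It is not the procedure used in the study. The function name and the thresholds (`min_unit`, `max_unit`, `min_copies`) are assumptions chosen for the example.

```python
import re


def find_tandem_repeats(seq, min_unit=2, max_unit=6, min_copies=5):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    The shortest repeat unit that fits is reported, so a homopolymer run
    such as AAAAAAAAAA is reported with the two-base unit 'AA'.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(seq.upper())
    ]
```

Only perfect repeats are found; the more complex or interrupted repeat units mentioned in the abstract would need a tolerant matcher.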
RSAT matrix-clustering: dynamic exploration and redundancy reduction of transcription factor binding motif collections

PubMed Central

Jaeger, Sébastien; Thieffry, Denis

2017-01-01

Transcription factor (TF) databases contain multitudes of binding motifs (TFBMs) from various sources, from which non-redundant collections are derived by manual curation. The advent of high-throughput methods stimulated the production of novel collections with increasing numbers of motifs. Meta-databases, built by merging these collections, contain redundant versions, because available tools are not suited to automatically identify and explore biologically relevant clusters among thousands of motifs. Motif discovery from genome-scale data sets (e.g. ChIP-seq) also produces redundant motifs, hampering the interpretation of results. We present matrix-clustering, a versatile tool that clusters similar TFBMs into multiple trees, and automatically creates non-redundant TFBM collections. A feature unique to matrix-clustering is its dynamic visualisation of aligned TFBMs, and its capability to simultaneously treat multiple collections from various sources. We demonstrate that matrix-clustering considerably simplifies the interpretation of combined results from multiple motif discovery tools, and highlights biologically relevant variations of similar motifs. We also ran a large-scale application to cluster ∼11 000 motifs from 24 entire databases, showing that matrix-clustering correctly groups motifs belonging to the same TF families, and drastically reduced motif redundancy. matrix-clustering is integrated within the RSAT suite (http://rsat.eu/), accessible through a user-friendly web interface or command-line for its integration in pipelines. PMID:28591841
FoldMiner and LOCK 2: protein structure comparison and motif discovery on the web.

PubMed

Shapiro, Jessica; Brutlag, Douglas

2004-07-01

The FoldMiner web server (http://foldminer.stanford.edu/) provides remote access to methods for protein structure alignment and unsupervised motif discovery. FoldMiner is unique among such algorithms in that it improves both the motif definition and the sensitivity of a structural similarity search by combining the search and motif discovery methods and using information from each process to enhance the other. In a typical run, a query structure is aligned to all structures in one of several databases of single domain targets in order to identify its structural neighbors and to discover a motif that is the basis for the similarity among the query and statistically significant targets. This process is fully automated, but options for manual refinement of the results are available as well. The server uses the Chime plugin and customized controls to allow for visualization of the motif and of structural superpositions. In addition, we provide an interface to the LOCK 2 algorithm for rapid alignments of a query structure to smaller numbers of user-specified targets.
Comprehensive Identification of Glycated Peptides and Their Glycation Motifs in Plasma and Erythrocytes of Control and Diabetic Subjects

PubMed Central

Zhang, Qibin; Monroe, Matthew E.; Schepmoes, Athena A.; Clauss, Therese R. W.; Gritsenko, Marina A.; Meng, Da; Petyuk, Vladislav A.; Smith, Richard D.; Metz, Thomas O.

2011-01-01

Non-enzymatic glycation of proteins sets the stage for formation of advanced glycation end-products and development of chronic complications of diabetes. In this report, we extended our previous methods on proteomics analysis of glycated proteins to comprehensively identify glycated proteins in control and diabetic human plasma and erythrocytes. Using immunodepletion, enrichment, and fractionation strategies, we identified 7749 unique glycated peptides, corresponding to 3742 unique glycated proteins. Semi-quantitative comparisons showed that glycation levels of a number of proteins were significantly increased in diabetes and that erythrocyte proteins were more extensively glycated than plasma proteins. A glycation motif analysis revealed that some amino acids were favored more than others in the protein primary structures in the vicinity of the glycation sites in both sample types. The glycated peptides and corresponding proteins reported here provide a foundation for potential identification of novel markers for diabetes, hyperglycemia, and diabetic complications in future studies. PMID:21612289
Inforna 2.0: A Platform for the Sequence-Based Design of Small Molecules Targeting Structured RNAs.

PubMed

Disney, Matthew D; Winkelsas, Audrey M; Velagapudi, Sai Pradeep; Southern, Mark; Fallahi, Mohammad; Childs-Disney, Jessica L

2016-06-17

The development of small molecules that target RNA is challenging yet, if successful, could advance the development of chemical probes to study RNA function or precision therapeutics to treat RNA-mediated disease. Previously, we described Inforna, an approach that can mine motifs (secondary structures) within target RNAs, which are deduced from the RNA sequence, and compare them to a database of known RNA motif-small molecule binding partners. Output generated by Inforna includes the motif found in both the database and the desired RNA target, lead small molecules for that target, and other related meta-data. Lead small molecules can then be tested for binding and affecting cellular (dys)function. Herein, we describe Inforna 2.0, which incorporates all known RNA motif-small molecule binding partners reported in the scientific literature, a chemical similarity searching feature, and an improved user interface and is freely available via an online web server. By incorporation of interactions identified by other laboratories, the database has been doubled, containing 1936 RNA motif-small molecule interactions, including 244 unique small molecules and 1331 motifs. Interestingly, chemotype analysis of the compounds that bind RNA in the database reveals features in small molecule chemotypes that are privileged for binding. Further, this updated database expanded the number of cellular RNAs for which lead compounds can be identified.
RSAT matrix-clustering: dynamic exploration and redundancy reduction of transcription factor binding motif collections.

PubMed

Castro-Mondragon, Jaime Abraham; Jaeger, Sébastien; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

2017-07-27

Transcription factor (TF) databases contain multitudes of binding motifs (TFBMs) from various sources, from which non-redundant collections are derived by manual curation. The advent of high-throughput methods stimulated the production of novel collections with increasing numbers of motifs. Meta-databases, built by merging these collections, contain redundant versions, because available tools are not suited to automatically identify and explore biologically relevant clusters among thousands of motifs. Motif discovery from genome-scale data sets (e.g. ChIP-seq) also produces redundant motifs, hampering the interpretation of results. We present matrix-clustering, a versatile tool that clusters similar TFBMs into multiple trees, and automatically creates non-redundant TFBM collections. A feature unique to matrix-clustering is its dynamic visualisation of aligned TFBMs, and its capability to simultaneously treat multiple collections from various sources. We demonstrate that matrix-clustering considerably simplifies the interpretation of combined results from multiple motif discovery tools, and highlights biologically relevant variations of similar motifs. We also ran a large-scale application to cluster ∼11 000 motifs from 24 entire databases, showing that matrix-clustering correctly groups motifs belonging to the same TF families, and drastically reduced motif redundancy. matrix-clustering is integrated within the RSAT suite (http://rsat.eu/), accessible through a user-friendly web interface or command-line for its integration in pipelines. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Members of the Meloidogyne avirulence protein family contain multiple plant ligand-like motifs.

PubMed

Rutter, William B; Hewezi, Tarek; Maier, Tom R; Mitchum, Melissa G; Davis, Eric L; Hussey, Richard S; Baum, Thomas J

2014-08-01

Sedentary plant-parasitic nematodes engage in complex interactions with their host plants by secreting effector proteins. Some effectors of both root-knot nematodes (Meloidogyne spp.) and cyst nematodes (Heterodera and Globodera spp.) mimic plant ligand proteins. Most prominently, cyst nematodes secrete effectors that mimic plant CLAVATA3/ESR-related (CLE) ligand proteins. However, only cyst nematodes have been shown to secrete such effectors and to utilize CLE ligand mimicry in their interactions with host plants. Here, we document the presence of ligand-like motifs in bona fide root-knot nematode effectors that are most similar to CLE peptides from plants and cyst nematodes. We have identified multiple tandem CLE-like motifs conserved within the previously identified Meloidogyne avirulence protein (MAP) family that are secreted from root-knot nematodes and have been shown to function in planta. By searching all 12 MAP family members from multiple Meloidogyne spp., we identified 43 repetitive CLE-like motifs composing 14 unique variants. At least one CLE-like motif was conserved in each MAP family member. Furthermore, we documented the presence of other conserved sequences that resemble the variable domains described in Heterodera and Globodera CLE effectors. These findings document that root-knot nematodes appear to use CLE ligand mimicry and point toward a common host node targeted by two evolutionarily diverse groups of nematodes. As a consequence, it is likely that CLE signaling pathways are important in other phytonematode pathosystems as well.

DoOPSearch: a web-based tool for finding and analysing common conserved motifs in the promoter regions of different chordate and plant genes

PubMed Central

Sebestyén, Endre; Nagy, Tibor; Suhai, Sándor; Barta, Endre

2009-01-01

Background The comparative genomic analysis of a large number of orthologous promoter regions of the chordate and plant genes from the DoOP databases shows thousands of conserved motifs. Most of these motifs differ from any known transcription factor binding site (TFBS). To identify common conserved motifs, we need a specific tool to be able to search amongst them. Since conserved motifs from the DoOP databases are linked to genes, the result of such a search can give a list of genes that are potentially regulated by the same transcription factor(s). Results We have developed a new tool called DoOPSearch for the analysis of the conserved motifs in the promoter regions of chordate or plant genes. We used the orthologous promoters of the DoOP database to extract thousands of conserved motifs from different taxonomic groups. The advantage of this approach is that different sets of conserved motifs might be found depending on how broad the taxonomic coverage of the underlying orthologous promoter sequence collection is (consider e.g. primates vs. mammals or Brassicaceae vs. Viridiplantae). The DoOPSearch tool allows the users to search these motif collections or the promoter regions of DoOP with user supplied query sequences or any of the conserved motifs from the DoOP database. To find overrepresented gene ontologies, the gene lists obtained can be analysed further using a modified version of the GeneMerge program. Conclusion We present here a comparative genomics based promoter analysis tool. Our system is based on a unique collection of conserved promoter motifs characteristic of different taxonomic groups. We offer both a command line and a web-based tool for searching in these motif collections using user specified queries. These can be either short promoter sequences or consensus sequences of known transcription factor binding sites. 
The GeneMerge analysis of the search results allows the user to identify statistically overrepresented Gene Ontology terms that might provide a clue on the function of the motifs and genes. PMID:19534755
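As a rough illustration of searching a motif collection with a consensus sequence of a known binding site, the sketch below expands IUPAC ambiguity codes into a regular expression and reports which stored motifs contain a match. The motif identifiers, sequences, and function names are hypothetical and are not taken from DoOPSearch or the DoOP database.

```python
import re

# Standard IUPAC nucleotide ambiguity codes as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))

def search_motifs(consensus, motifs):
    """Return the ids of stored motifs that contain a match to the consensus."""
    pattern = consensus_to_regex(consensus)
    return [motif_id for motif_id, seq in motifs.items()
            if pattern.search(seq.upper())]

# Hypothetical conserved motifs (made-up ids and sequences).
motifs = {
    "motif_1": "TTCACGTGAA",
    "motif_2": "GGGATTTCC",
    "motif_3": "AACAGCTGTT",
}

# CANNTG is used here only as an example of a degenerate consensus.
hits = search_motifs("CANNTG", motifs)
print(hits)
```

Because each motif would be linked to a gene, the list of matching ids could then be mapped to a gene list for downstream ontology analysis; that mapping is not shown here.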
A Unique T-Cell Receptor Amino Acid Sequence Selected by Human T-Cell Lymphotropic Virus Type 1 Tax301-309-Specific Cytotoxic T Cells in HLA-A24:02-Positive Asymptomatic Carriers and Adult T-Cell Leukemia/Lymphoma Patients.

PubMed

Ishihara, Yuko; Tanaka, Yukie; Kobayashi, Seiichiro; Kawamura, Koji; Nakasone, Hideki; Gomyo, Ayumi; Hayakawa, Jin; Tamaki, Masaharu; Akahoshi, Yu; Harada, Naonori; Kusuda, Machiko; Kameda, Kazuaki; Ugai, Tomotaka; Wada, Hidenori; Sakamoto, Kana; Sato, Miki; Terasako-Saito, Kiriko; Kikuchi, Misato; Kimura, Shun-Ichi; Tanihara, Aki; Kako, Shinichi; Uchimaru, Kaoru; Kanda, Yoshinobu

2017-10-01

We previously reported that the T-cell receptor (TCR) repertoire of human T-cell lymphotropic virus type 1 (HTLV-1) Tax301-309-specific CD8+ cytotoxic T cells (Tax301-309-CTLs) was highly restricted and a particular amino acid sequence motif, the PDR motif, was conserved among HLA-A*24:02-positive (HLA-A*24:02+) adult T-cell leukemia/lymphoma (ATL) patients who had undergone allogeneic hematopoietic cell transplantation (allo-HSCT). Furthermore, we found that donor-derived PDR+ CTLs selectively expanded in ATL long-term HSCT survivors with strong CTL activity against HTLV-1. On the other hand, the TCR repertoires in Tax301-309-CTLs of asymptomatic HTLV-1 carriers (ACs) remain unclear. In this study, we directly identified the DNA sequence of complementarity-determining region 3 (CDR3) of the TCR-β chain of Tax301-309-CTLs at the single-cell level and compared not only the TCR repertoires but also the frequencies and phenotypes of Tax301-309-CTLs between ACs and ATL patients. We did not observe any essential difference in the frequencies of Tax301-309-CTLs between ACs and ATL patients. In the single-cell TCR repertoire analysis of Tax301-309-CTLs, 1,458 Tax301-309-CTLs and 140 clones were identified in this cohort. Tax301-309-CTLs showed highly restricted TCR repertoires with a strongly biased usage of BV7, and PDR, the unique motif in TCR-β CDR3, was exclusively observed in all ACs and ATL patients. However, there was no correlation between PDR+ CTL frequencies and HTLV-1 proviral load (PVL). In conclusion, we have identified, for the first time, a unique amino acid sequence, PDR, as a public TCR-CDR3 motif against Tax in HLA-A*24:02+ HTLV-1-infected individuals. Further investigations are warranted to elucidate the role of the PDR+ CTL response in the progression from carrier state to ATL. IMPORTANCE ATL is an aggressive T-cell malignancy caused by HTLV-1 infection. 
The HTLV-1 regulatory protein Tax aggressively promotes the proliferation of HTLV-1-infected lymphocytes and is also a major target antigen for CD8+ CTLs. In our previous evaluation of Tax301-309-CTLs, we found that a unique amino acid sequence motif, PDR, in CDR3 of the TCR-β chain of Tax301-309-CTLs was conserved among ATL patients after allo-HSCT. Furthermore, the PDR+ Tax301-309-CTL clones selectively expanded and showed strong cytotoxic activities against HTLV-1. On the other hand, it remains unclear how Tax301-309-CTL repertoire exists in ACs. In this study, we comprehensively compared Tax-specific TCR repertoires at the single-cell level between ACs and ATL patients. Tax301-309-CTLs showed highly restricted TCR repertoires with a strongly biased usage of BV7, and PDR, the unique motif in TCR-β CDR3, was conserved in all ACs and ATL patients, regardless of clinical subtype in HTLV-1 infection. Copyright © 2017 American Society for Microbiology.
A Unique T-Cell Receptor Amino Acid Sequence Selected by Human T-Cell Lymphotropic Virus Type 1 Tax301-309-Specific Cytotoxic T Cells in HLA-A24:02-Positive Asymptomatic Carriers and Adult T-Cell Leukemia/Lymphoma Patients

PubMed Central

Ishihara, Yuko; Tanaka, Yukie; Kobayashi, Seiichiro; Kawamura, Koji; Nakasone, Hideki; Gomyo, Ayumi; Hayakawa, Jin; Tamaki, Masaharu; Akahoshi, Yu; Harada, Naonori; Kusuda, Machiko; Kameda, Kazuaki; Ugai, Tomotaka; Wada, Hidenori; Sakamoto, Kana; Sato, Miki; Terasako-Saito, Kiriko; Kikuchi, Misato; Kimura, Shun-ichi; Tanihara, Aki; Kako, Shinichi; Uchimaru, Kaoru

2017-01-01

ABSTRACT We previously reported that the T-cell receptor (TCR) repertoire of human T-cell lymphotropic virus type 1 (HTLV-1) Tax301-309-specific CD8+ cytotoxic T cells (Tax301-309-CTLs) was highly restricted and a particular amino acid sequence motif, the PDR motif, was conserved among HLA-A*24:02-positive (HLA-A*24:02+) adult T-cell leukemia/lymphoma (ATL) patients who had undergone allogeneic hematopoietic cell transplantation (allo-HSCT). Furthermore, we found that donor-derived PDR+ CTLs selectively expanded in ATL long-term HSCT survivors with strong CTL activity against HTLV-1. On the other hand, the TCR repertoires in Tax301-309-CTLs of asymptomatic HTLV-1 carriers (ACs) remain unclear. In this study, we directly identified the DNA sequence of complementarity-determining region 3 (CDR3) of the TCR-β chain of Tax301-309-CTLs at the single-cell level and compared not only the TCR repertoires but also the frequencies and phenotypes of Tax301-309-CTLs between ACs and ATL patients. We did not observe any essential difference in the frequencies of Tax301-309-CTLs between ACs and ATL patients. In the single-cell TCR repertoire analysis of Tax301-309-CTLs, 1,458 Tax301-309-CTLs and 140 clones were identified in this cohort. Tax301-309-CTLs showed highly restricted TCR repertoires with a strongly biased usage of BV7, and PDR, the unique motif in TCR-β CDR3, was exclusively observed in all ACs and ATL patients. However, there was no correlation between PDR+ CTL frequencies and HTLV-1 proviral load (PVL). In conclusion, we have identified, for the first time, a unique amino acid sequence, PDR, as a public TCR-CDR3 motif against Tax in HLA-A*24:02+ HTLV-1-infected individuals. Further investigations are warranted to elucidate the role of the PDR+ CTL response in the progression from carrier state to ATL. IMPORTANCE ATL is an aggressive T-cell malignancy caused by HTLV-1 infection. 
The HTLV-1 regulatory protein Tax aggressively promotes the proliferation of HTLV-1-infected lymphocytes and is also a major target antigen for CD8+ CTLs. In our previous evaluation of Tax301-309-CTLs, we found that a unique amino acid sequence motif, PDR, in CDR3 of the TCR-β chain of Tax301-309-CTLs was conserved among ATL patients after allo-HSCT. Furthermore, the PDR+ Tax301-309-CTL clones selectively expanded and showed strong cytotoxic activities against HTLV-1. On the other hand, it remains unclear how Tax301-309-CTL repertoire exists in ACs. In this study, we comprehensively compared Tax-specific TCR repertoires at the single-cell level between ACs and ATL patients. Tax301-309-CTLs showed highly restricted TCR repertoires with a strongly biased usage of BV7, and PDR, the unique motif in TCR-β CDR3, was conserved in all ACs and ATL patients, regardless of clinical subtype in HTLV-1 infection. PMID:28724766
Unique Structural Features and Sequence Motifs of Proline Utilization A (PutA)

PubMed Central

Singh, Ranjan K.; Tanner, John J.

2013-01-01

Proline utilization A proteins (PutAs) are bifunctional enzymes that catalyze the oxidation of proline to glutamate using spatially separated proline dehydrogenase and pyrroline-5-carboxylate dehydrogenase active sites. Here we use the crystal structure of the minimalist PutA from Bradyrhizobium japonicum (BjPutA) along with sequence analysis to identify unique structural features of PutAs. This analysis shows that PutAs have secondary structural elements and domains not found in the related monofunctional enzymes. Some of these extra features are predicted to be important for substrate channeling in BjPutA. Multiple sequence alignment analysis shows that some PutAs have a 17-residue conserved motif in the C-terminal 20–30 residues of the polypeptide chain. The BjPutA structure shows that this motif helps seal the internal substrate-channeling cavity from the bulk medium. Finally, it is shown that some PutAs have a 100–200 residue domain of unknown function in the C-terminus that is not found in minimalist PutAs. Remote homology detection suggests that this domain is homologous to the oligomerization beta-hairpin and Rossmann fold domain of BjPutA. PMID:22201760
Ethnomathematics elements in Batik Bali using backpropagation method

NASA Astrophysics Data System (ADS)

Lestari, Mei; Irawan, Ari; Rahayu, Wanti; Wayan Parwati, Ni

2018-05-01

Batik is one of the traditional arts that UNESCO has recognized as part of Indonesia’s cultural heritage. Batik comes in many varieties and motifs; each motif is unique, yet the motifs look similar, which makes them difficult to identify. This study aims to develop an application that can identify typical Batik Bali containing ethnomathematics elements. Ethnomathematics is the study of the relationship between culture and mathematical concepts. Ethnomathematics in Batik Bali relates mainly to geometrical concepts, in line with strong elements of Balinese culture. The identification process uses the backpropagation method. The first step is image processing (including scaling and thresholding of the image). The next step is to feed the processed image into an artificial neural network. The study yields an identification accuracy for Batik Bali that contains ethnomathematics elements.
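The preprocessing described in this abstract (scaling, then thresholding, then passing the result to a neural network) might look roughly like the following. This is a minimal sketch under stated assumptions: the image is a grayscale grid of 0-255 values, scaling is nearest-neighbour, and the cutoff of 128 is arbitrary. None of these choices, nor the function names, are reported by the authors.

```python
def scale_nearest(image, new_h, new_w):
    """Resize a grayscale image (list of rows) by nearest-neighbour sampling."""
    old_h, old_w = len(image), len(image[0])
    return [[image[r * old_h // new_h][c * old_w // new_w]
             for c in range(new_w)]
            for r in range(new_h)]

def threshold(image, cutoff=128):
    """Binarize: 1 where the pixel is at or above the cutoff, else 0."""
    return [[1 if px >= cutoff else 0 for px in row] for row in image]

def flatten(image):
    """Turn the 2-D grid into the flat vector a network input layer expects."""
    return [px for row in image for px in row]

# Made-up 4x4 grayscale patch standing in for a batik image.
toy = [
    [10, 200, 200, 10],
    [10, 200, 200, 10],
    [250, 30, 30, 250],
    [250, 30, 30, 250],
]

small = scale_nearest(toy, 2, 2)
features = flatten(threshold(small))
print(small, features)
```

The flat binary vector is what would be fed to the input layer; the network itself and its training by backpropagation are outside this sketch.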
Leucine-rich Repeats of Bacterial Surface Proteins Serve as Common Pattern Recognition Motifs of Human Scavenger Receptor gp340*

PubMed Central

Loimaranta, Vuokko; Hytönen, Jukka; Pulliainen, Arto T.; Sharma, Ashu; Tenovuo, Jorma; Strömberg, Nicklas; Finne, Jukka

2009-01-01

Scavenger receptors are innate immune molecules recognizing and inducing the clearance of non-host as well as modified host molecules. To recognize a wide pattern of invading microbes, many scavenger receptors bind to common pathogen-associated molecular patterns, such as lipopolysaccharides and lipoteichoic acids. Similarly, the gp340/DMBT1 protein, a member of the human scavenger receptor cysteine-rich protein family, displays a wide ligand repertoire. The peptide motif VEVLXXXXW derived from its scavenger receptor cysteine-rich domains is involved in some of these interactions, but most of the recognition mechanisms are unknown. In this study, we used mass spectrometry sequencing, gene inactivation, and recombinant proteins to identify Streptococcus pyogenes protein Spy0843 as a recognition receptor of gp340. Antibodies against Spy0843 are shown to protect against S. pyogenes infection, but no function or host receptor have been identified for the protein. Spy0843 belongs to the leucine-rich repeat (Lrr) family of eukaryotic and prokaryotic proteins. Experiments with truncated forms of the recombinant proteins confirmed that the Lrr region is needed in the binding of Spy0843 to gp340. The same motif of two other Lrr proteins, LrrG from the Gram-positive S. agalactiae and BspA from the Gram-negative Tannerella forsythia, also mediated binding to gp340. Moreover, inhibition of Spy0843 binding occurred with peptides containing the VEVLXXXXW motif, but also peptides devoid of the XXXXW motif inhibited binding of Lrr proteins. These results thus suggest that the conserved Lrr motif in bacterial proteins serves as a novel pattern recognition motif for unique core peptides of human scavenger receptor gp340. PMID:19465482
Hyperactive antifreeze proteins from longhorn beetles: some structural insights.

PubMed

Kristiansen, Erlend; Wilkens, Casper; Vincents, Bjarne; Friis, Dennis; Lorentzen, Anders Blomkild; Jenssen, Håvard; Løbner-Olesen, Anders; Ramløv, Hans

2012-11-01

This study reports on structural characteristics of hyperactive antifreeze proteins (AFPs) from two species of longhorn beetles. In Rhagium mordax, eight unique mRNAs coding for five different mature AFPs were identified from cold-hardy individuals. These AFPs are apparently homologues to a previously characterized AFP from the closely related species Rhagium inquisitor, and consist of six identifiable repeats of a putative ice binding motif TxTxTxT spaced irregularly apart by segments varying in length from 13 to 20 residues. Circular dichroism spectra show that the AFPs from both species have a high content of β-sheet and low levels of α-helix and random coil. Theoretical predictions of residue-specific secondary structure locate these β-sheets within the putative ice-binding motifs and the central parts of the segments separating them, consistent with an overall β-helical structure with the ice-binding motifs stacked in a β-sheet on one side of the coil. Molecular dynamics models based on these findings show that these AFPs would be energetically stable in a β-helical conformation. Copyright © 2012 Elsevier Ltd. All rights reserved.
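To make the repeat layout concrete, the sketch below locates occurrences of the TxTxTxT pattern (x being any residue) and measures the spacer lengths between consecutive repeats. The sequence is made up for illustration and is not a Rhagium AFP; the function names are likewise hypothetical.

```python
import re

# "x" in TxTxTxT means any residue, so each x becomes "." in the regex.
# The lookahead lets overlapping occurrences be reported as well.
ICE_BINDING = re.compile(r"(?=(T.T.T.T))")
MOTIF_LEN = 7

def find_repeats(protein):
    """Return (start, matched_text) for every TxTxTxT occurrence."""
    return [(m.start(), m.group(1)) for m in ICE_BINDING.finditer(protein)]

def spacer_lengths(hits):
    """Residues between the end of one repeat and the start of the next."""
    return [nxt - (cur + MOTIF_LEN)
            for (cur, _), (nxt, _) in zip(hits, hits[1:])]

# Made-up toy sequence with three repeats separated by 13 and 20 residues.
toy = "M" + "TATCTGT" + "G" * 13 + "TSTQTNT" + "A" * 20 + "TKTATCT" + "K"

hits = find_repeats(toy)
print(hits)
print(spacer_lengths(hits))
```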
CompariMotif: quick and easy comparisons of sequence motifs.

PubMed

Edwards, Richard J; Davey, Norman E; Shields, Denis C

2008-05-15

CompariMotif is a novel tool for making motif-motif comparisons, identifying and describing similarities between regular expression motifs. CompariMotif can identify a number of different relationships between motifs, including exact matches, variants of degenerate motifs and complex overlapping motifs. Motif relationships are scored using shared information content, allowing the best matches to be easily identified in large comparisons. Many input and search options are available, enabling a list of motifs to be compared to itself (to identify recurring motifs) or to datasets of known motifs. CompariMotif can be run online at http://bioware.ucd.ie/ and is freely available for academic use as a set of open source Python modules under a GNU General Public License from http://bioinformatics.ucd.ie/shields/software/comparimotif/
Tissue phosphoproteomics with PolyMAC identifies potential therapeutic targets in a transgenic mouse model of HER2 positive breast cancer

PubMed Central

Searleman, Adam C.; Iliuk, Anton B.; Collier, Timothy S.; Chodosh, Lewis A.; Tao, W. Andy; Bose, Ron

2014-01-01

Altered protein phosphorylation is a feature of many human cancers that can be targeted therapeutically. Phosphopeptide enrichment is a critical step for maximizing the depth of phosphoproteome coverage by MS, but remains challenging for tissue specimens because of their high complexity. We describe the first analysis of a tissue phosphoproteome using polymer-based metal ion affinity capture (PolyMAC), a nanopolymer that has excellent yield and specificity for phosphopeptide enrichment, on a transgenic mouse model of HER2-driven breast cancer. By combining phosphotyrosine immunoprecipitation with PolyMAC, 411 unique peptides with 139 phosphotyrosine, 45 phosphoserine, and 29 phosphothreonine sites were identified from five LC-MS/MS runs. Combining reverse phase liquid chromatography fractionation at pH 8.0 with PolyMAC identified 1571 unique peptides with 1279 phosphoserine, 213 phosphothreonine, and 21 phosphotyrosine sites from eight LC-MS/MS runs. Linear motif analysis indicated that many of the phosphosites correspond to well-known phosphorylation motifs. Analysis of the tyrosine phosphoproteome with the Drug Gene Interaction database uncovered a network of potential therapeutic targets centered on Src family kinases with inhibitors that are either FDA-approved or in clinical development. These results demonstrate that PolyMAC is well suited for phosphoproteomic analysis of tissue specimens. PMID:24723360
DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Guo Qiang; Luo, Lingyun; Ogbuji, Chime

The interaction of multiple types of relationships among anatomical classes in the Foundational Model of Anatomy (FMA) can provide inferred information valuable for quality assurance. This paper introduces a method called Motif Checking (MOCH) to study the effects of such multi-relation type interactions. MOCH represents patterns of multitype interaction as small labeled sub-graph motifs, whose nodes represent class variables, and labeled edges represent relational types. By representing FMA as an RDF graph and motifs as SPARQL queries, fragments of FMA are automatically obtained as auditing candidates. Leveraging the scalability and reconfigurability of Semantic Web Technology (OWL, RDF and SPARQL) and Virtuoso, we performed exhaustive analyses of three 2-node motifs, resulting in 638 matching FMA configurations; twelve 3-node motifs, resulting in 202,960 configurations. Using the Principal Ideal Explorer (PIE) methodology as an extension of MOCH, we were able to identify 755 root nodes with 4,100 respective descendants with opposing antonyms in their class names for arbitrary-length motifs. With possible disjointness implied by antonyms, we performed manual inspection of a subset of the resulting FMA fragments and tracked down a source of abnormal inferred conclusions (captured by the motifs), coming from a gender-neutral class being modeled as a part of gender-specific class, such as “Urinary system” is a part of “Female human body.” Our results demonstrate that MOCH and PIE provide a unique source of valuable information for quality assurance. Since our approach is general, it is applicable to any ontological system with an OWL representation.
Efficient Identification of Murine M2 Macrophage Peptide Targeting Ligands by Phage Display and Next-Generation Sequencing.

PubMed

Liu, Gary W; Livesay, Brynn R; Kacherovsky, Nataly A; Cieslewicz, Maryelise; Lutz, Emi; Waalkes, Adam; Jensen, Michael C; Salipante, Stephen J; Pun, Suzie H

2015-08-19

Peptide ligands are used to increase the specificity of drug carriers to their target cells and to facilitate intracellular delivery. One method to identify such peptide ligands, phage display, enables high-throughput screening of peptide libraries for ligands binding to therapeutic targets of interest. However, conventional methods for identifying target binders in a library by Sanger sequencing are low-throughput, labor-intensive, and provide a limited perspective (<0.01%) of the complete sequence space. Moreover, the small sample space can be dominated by nonspecific, preferentially amplifying "parasitic sequences" and plastic-binding sequences, which may lead to the identification of false positives or exclude the identification of target-binding sequences. To overcome these challenges, we employed next-generation Illumina sequencing to couple high-throughput screening and high-throughput sequencing, enabling more comprehensive access to the phage display library sequence space. In this work, we define the hallmarks of binding sequences in next-generation sequencing data, and develop a method that identifies several target-binding phage clones for murine, alternatively activated M2 macrophages with a high (100%) success rate: sequences and binding motifs were reproducibly present across biological replicates; binding motifs were identified across multiple unique sequences; and an unselected, amplified library accurately filtered out parasitic sequences. In addition, we validate the Multiple Em for Motif Elicitation tool as an efficient and principled means of discovering binding sequences.
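Two of the hallmarks listed above, reproducibility across biological replicates and filtering against an unselected, amplified library, can be expressed as simple set operations. The sketch below assumes each sequencing run has already been reduced to a list of peptide sequences; the peptides and the function name are made up, and the published method may involve additional criteria such as read counts.

```python
def candidate_binders(replicates, unselected_library):
    """Peptides found in every selected replicate but absent from the
    unselected, amplified library (used here as a proxy for parasitic
    sequences)."""
    shared = set(replicates[0]).intersection(*replicates[1:])
    return sorted(shared - set(unselected_library))

# Made-up peptide reads from three hypothetical biological replicates.
replicate_1 = ["AKLMWT", "GHWTYP", "PRSTVQ"]
replicate_2 = ["GHWTYP", "PRSTVQ", "DDEEFF"]
replicate_3 = ["PRSTVQ", "GHWTYP", "AKLMWT"]

# Made-up reads from an unselected library amplified without a target.
unselected = ["PRSTVQ", "NNQQSS"]

candidates = candidate_binders([replicate_1, replicate_2, replicate_3],
                               unselected)
print(candidates)
```

Here PRSTVQ is reproducible but also appears without selection, so it is discarded as a likely amplification artifact, leaving GHWTYP as the only candidate.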
The amino acid motif L/IIxxFE defines a novel actin-binding sequence in PDZ-RhoGEF

PubMed Central

Banerjee, Jayashree; Fischer, Christopher C.; Wedegaertner, Philip B.

2009-01-01

PDZ-RhoGEF is a member of the regulator of G protein signaling (RGS) domain-containing RhoGEFs (RGS-RhoGEFs) that link activated heterotrimeric G protein α subunits of the G12 family to activation of the small GTPase RhoA. Unique among the RGS-RhoGEFs, PDZ-RhoGEF contains a short sequence that localizes the protein to the actin cytoskeleton. In this report, we demonstrate that the actin-binding domain, located between amino acids 561–585, directly binds to F-actin in vitro. Extensive mutagenesis identifies isoleucine 568, isoleucine 569, phenylalanine 572, and glutamic acid 573 as necessary for binding to actin and for co-localization with the actin cytoskeleton in cells. These results define a novel actin-binding sequence in PDZ-RhoGEF with a critical amino acid motif of IIxxFE. Moreover, sequence analysis identifies a similar actin-binding motif in the N-terminus of the RhoGEF frabin, and, as with PDZ-RhoGEF, mutagenesis and actin interaction experiments demonstrate a motif of LIxxFE, consisting of the key amino acids leucine 23, isoleucine 24, phenylalanine 27, and glutamic acid 28. Taken together, results with PDZ-RhoGEF and frabin identify a novel actin binding sequence. Lastly, inducible dimerization of the actin-binding region of PDZ-RhoGEF revealed a dimerization-dependent actin bundling activity in vitro. PDZ-RhoGEF exists in cells as a dimer, raising the possibility that PDZ-RhoGEF could influence actin structure independent of its ability to activate RhoA. PMID:19618964
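The L/IIxxFE motif can be written as the regular expression `[LI]I..FE`. The sketch below scans a sequence for it and then mimics an alanine scan at the pattern level, showing that substitutions at the four specified positions remove the match while substitutions at the two x positions do not. The peptide is a made-up toy sequence rather than the real PDZ-RhoGEF or frabin sequence, and a regex match says nothing about actual actin binding.

```python
import re

# L or I, then I, then any two residues, then F and E.
ACTIN_MOTIF = re.compile(r"[LI]I..FE")

def has_motif(seq):
    """True if the sequence contains the L/IIxxFE pattern."""
    return ACTIN_MOTIF.search(seq) is not None

def alanine_scan(seq):
    """Substitute alanine at each of the six motif positions in turn and
    report whether the pattern still matches (keyed by offset in the motif)."""
    match = ACTIN_MOTIF.search(seq)
    if match is None:
        return {}
    results = {}
    for offset in range(6):
        pos = match.start() + offset
        mutant = seq[:pos] + "A" + seq[pos + 1:]
        results[offset] = has_motif(mutant)
    return results

toy = "GSIIKDFEAA"  # made-up peptide containing IIxxFE at offset 2
scan = alanine_scan(toy)
print(has_motif(toy), has_motif("MLIRSFEQ"), scan)
```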
A novel motif in the yeast mitochondrial dynamin Dnm1 is essential for adaptor binding and membrane recruitment

PubMed Central

Bui, Huyen T.; Karren, Mary A.; Bhar, Debjani

2012-01-01

To initiate mitochondrial fission, dynamin-related proteins (DRPs) must bind specific adaptors on the outer mitochondrial membrane. The structural features underlying this interaction are poorly understood. Using yeast as a model, we show that the Insert B domain of the Dnm1 guanosine triphosphatase (a DRP) contains a novel motif required for association with the mitochondrial adaptor Mdv1. Mutation of this conserved motif specifically disrupted Dnm1–Mdv1 interactions, blocking Dnm1 recruitment and mitochondrial fission. Suppressor mutations in Mdv1 that restored Dnm1–Mdv1 interactions and fission identified potential protein-binding interfaces on the Mdv1 β-propeller domain. These results define the first known function for Insert B in DRP–adaptor interactions. Based on the variability of Insert B sequences and adaptor proteins, we propose that Insert B domains and mitochondrial adaptors have coevolved to meet the unique requirements for mitochondrial fission of different organisms. PMID:23148233
An Ancient Fingerprint Indicates the Common Ancestry of Rossmann-Fold Enzymes Utilizing Different Ribose-Based Cofactors

PubMed Central

Laurino, Paola; Tóth-Petróczy, Ágnes; Meana-Pañeda, Rubén; Lin, Wei; Truhlar, Donald G.; Tawfik, Dan S.

2016-01-01

Nucleoside-based cofactors are presumed to have preceded proteins. The Rossmann fold is one of the most ancient and functionally diverse protein folds, and most Rossmann enzymes utilize nucleoside-based cofactors. We analyzed an omnipresent Rossmann ribose-binding interaction: a carboxylate side chain at the tip of the second β-strand (β2-Asp/Glu). We identified a canonical motif, defined by the β2-topology and unique geometry. The latter relates to the interaction being bidentate (both ribose hydroxyls interacting with the carboxylate oxygens), to the angle between the carboxylate and the ribose, and to the ribose’s ring configuration. We found that this canonical motif exhibits hallmarks of divergence rather than convergence. It is uniquely found in Rossmann enzymes that use different cofactors, primarily SAM (S-adenosyl methionine), NAD (nicotinamide adenine dinucleotide), and FAD (flavin adenine dinucleotide). Ribose-carboxylate bidentate interactions in other folds are not only rare but also have a different topology and geometry. We further show that the canonical geometry is not dictated by a physical constraint—geometries found in noncanonical interactions have similar calculated bond energies. Overall, these data indicate the divergence of several major Rossmann-fold enzyme classes, with different cofactors and catalytic chemistries, from a common pre-LUCA (last universal common ancestor) ancestor that possessed the β2-Asp/Glu motif. PMID:26938925
Design of character-based DNA barcode motif for species identification: A computational approach and its validation in fishes.

PubMed

Chakraborty, Mohua; Dhar, Bishal; Ghosh, Sankar Kumar

2017-11-01

The DNA barcodes are generally interpreted using distance-based and character-based methods. The former uses clustering of comparable groups, based on the relative genetic distance, while the latter is based on the presence or absence of discrete nucleotide substitutions. The distance-based approach has a limitation in defining a universal species boundary across the taxa as the rate of mtDNA evolution is not constant throughout the taxa. However, character-based approach more accurately defines this using a unique set of nucleotide characters. The character-based analysis of full-length barcode has some inherent limitations, like sequencing of the full-length barcode, use of a sparse-data matrix and lack of a uniform diagnostic position for each group. A short continuous stretch of a fragment can be used to resolve the limitations. Here, we observe that a 154-bp fragment, from the transversion-rich domain of 1367 COI barcode sequences can successfully delimit species in the three most diverse orders of freshwater fishes. This fragment is used to design species-specific barcode motifs for 109 species by the character-based method, which successfully identifies the correct species using a pattern-matching program. The motifs also correctly identify geographically isolated population of the Cypriniformes species. Further, this region is validated as a species-specific mini-barcode for freshwater fishes by successful PCR amplification and sequencing of the motif (154 bp) using the designed primers. We anticipate that use of such motifs will enhance the diagnostic power of DNA barcode, and the mini-barcode approach will greatly benefit the field-based system of rapid species identification. © 2017 John Wiley & Sons Ltd.
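One simple way to picture a character-based barcode motif is as the string of bases found at the alignment positions that vary between species. The sketch below builds such motifs from aligned reference fragments and identifies a query by exact pattern matching. The 10-bp sequences and species names are invented stand-ins for the 154-bp COI fragment, and the authors' actual motif design and matching program may differ.

```python
def variable_positions(aligned):
    """Alignment columns where the reference sequences are not all identical."""
    seqs = list(aligned.values())
    return [i for i in range(len(seqs[0]))
            if len({seq[i] for seq in seqs}) > 1]

def build_motifs(aligned):
    """Return the diagnostic columns and each species' bases at those columns."""
    positions = variable_positions(aligned)
    motifs = {species: "".join(seq[i] for i in positions)
              for species, seq in aligned.items()}
    return positions, motifs

def identify(query, positions, motifs):
    """Species whose motif matches the query at every diagnostic column."""
    key = "".join(query[i] for i in positions)
    return [species for species, motif in motifs.items() if motif == key]

# Made-up aligned reference fragments (equal length, no gaps).
references = {
    "species_a": "ACGTACGTAC",
    "species_b": "ACGAACGTCC",
    "species_c": "ACTTACGTAC",
}

positions, motifs = build_motifs(references)
match = identify("ACGAACCTCC", positions, motifs)
print(positions, motifs, match)
```

The query differs from species_b at a non-diagnostic column, yet it is still assigned to species_b because only the diagnostic characters are compared.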
Modeling and analysis of molecular interaction between Smurf1-WW2 domain and various isoforms of LIM mineralization protein.

PubMed

Sangadala, Sreedhara; Boden, Scott D; Metpally, Raghu Prasad Rao; Reddy, Boojala Vijay B

2007-08-15

LIM Mineralization Protein-1 (LMP-1) has been cloned and shown to be osteoinductive. Our efforts to understand the mode of action of LMP-1 led to the determination that LMP-1 interacts with Smad Ubiquitin Regulatory Factor-1 (Smurf1). Smurf1 targets osteogenic Smads, Smad1/5, for ubiquitin-mediated proteasomal degradation. Smurf1 interaction with LMP-1 or Smads is based on the presence of unique WW-domain interacting motif in these target molecules. By performing site-directed mutagenesis and binding studies in vitro on purified recombinant proteins, we identified a specific motif within the osteogenic region of several LMP isoforms that is necessary for Smurf1 interaction. Similarly, we have identified that the WW2 domain of Smurf1 is necessary for target protein interaction. Here, we present a homology-based modeling of the Smurf1 WW2 domain and its interacting motif of LMP-1. We performed computational docking of the interacting domains in Smurf1 and LMPs to identify the key amino acid residues involved in their binding regions. In support of the computational predictions, we also present biochemical evidence supporting the hypothesis that the physical interaction of Smurf1 and osteoinductive forms of LMP may prevent Smurf1 from targeting osteogenic Smads by ubiquitin-mediated proteasomal degradation.
Extensive T-Cell Epitope Repertoire Sharing among Human Proteome, Gastrointestinal Microbiome, and Pathogenic Bacteria: Implications for the Definition of Self

PubMed Central

Bremel, Robert D.; Homan, E. Jane

2015-01-01

T-cell receptor binding to MHC-bound peptides plays a key role in discrimination between self and non-self. Only a subset, typically a pentamer, of amino acids in an MHC-bound peptide form the motif exposed to the T-cell receptor. We categorize and compare the T-cell exposed amino acid motif repertoire of the total proteomes of two groups of bacteria, comprising pathogens and gastrointestinal microbiome organisms, with the human proteome and immunoglobulins. Given the maximum of 20^5, or 3.2 million, such motifs that bind T-cell receptors, there is considerable overlap in motif usage. We show that the human proteome, exclusive of immunoglobulins, only comprises three quarters of the possible motifs, of which 65.3% are also present in both composite bacterial proteomes. Very few motifs are unique to the human proteome. Immunoglobulin variable regions carry a broad diversity of T-cell exposed motifs (TCEMs) that provides a stratified random sample of the motifs found in pathogens, microbiome, and the human proteome. Individual bacterial genera and species vary in the content of immunoglobulin and human proteome matched motifs that they carry. Mycobacteria and Burkholderia spp. carry a particularly high content of such matched motifs. Some bacteria retain a unique motif signature and motif sharing pattern with the human proteome. The implication is that distinguishing self from non-self does not depend on individual TCEMs, but on a complex and dynamic overlay of signals wherein the same TCEM may play different roles in different organisms, and the frequency with which a particular TCEM appears influences its effect. The patterns observed provide clues to bacterial immune evasion and to strategies for intervention, including vaccine design. The breadth and distinct frequency patterns of the immunoglobulin-derived peptides suggest a role of immunoglobulins in maintaining a broadly responsive T-cell repertoire. PMID:26557118
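The motif count quoted in this abstract follows from 20 amino acids at each of 5 exposed positions. The short calculation below reproduces that figure and turns the stated fractions into approximate counts. Since "three quarters" is a rounded description, the derived counts are order-of-magnitude estimates and are not values reported by the authors.

```python
AMINO_ACIDS = 20   # standard amino acid alphabet
MOTIF_LENGTH = 5   # pentamer exposed to the T-cell receptor

# Every position can hold any of the 20 residues independently.
total_motifs = AMINO_ACIDS ** MOTIF_LENGTH

# Fractions quoted in the abstract; "three quarters" is treated as 0.75.
human_fraction = 0.75
shared_fraction = 0.653

human_motifs = total_motifs * human_fraction
shared_with_bacteria = human_motifs * shared_fraction

print(total_motifs)                 # 3200000
print(round(human_motifs))          # roughly 2.4 million
print(round(shared_with_bacteria))  # roughly 1.57 million
```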
Finding specific RNA motifs: Function in a zeptomole world?

PubMed Central

Knight, Rob; Yarus, Michael

2003-01-01

We have developed a new method for estimating the abundance of any modular (piecewise) RNA motif within a longer random region. We have used this method to estimate the size of the active motifs available to modern SELEX experiments (picomoles of unique sequences) and to a plausible RNA World (zeptomoles of unique sequences: 1 zmole = 602 sequences). Unexpectedly, activities such as specific isoleucine binding are almost certainly present in zeptomoles of molecules, and even ribozymes such as self-cleavage motifs may appear (depending on assumptions about the minimal structures). The number of specified nucleotides is not the only important determinant of a motif’s rarity: The number of modules into which it is divided, and the details of this division, are also crucial. We propose three maxims for easily isolated motifs: the Maxim of Minimization, the Maxim of Multiplicity, and the Maxim of the Median. These maxims together state that selected motifs should be small and composed of as many separate, equally sized modules as possible. For evenly divided motifs with four modules, the largest accessible activity in picomole scale (1–1000 pmole) pools of length 100 is about 34 nucleotides; while for zeptomole scale (1–1000 zmole) pools it is about 20 specific nucleotides (50% probability of occurrence). This latter figure includes some ribozymes and aptamers. Consequently, an RNA metabolism apparently could have begun with only zeptomoles of RNA molecules. PMID:12554865
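The conversion "1 zmole = 602 sequences" follows directly from the Avogadro constant. The sketch below checks it and compares the picomole and zeptomole scales; the helper name and unit table are illustrative only.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)

# SI prefixes used in the abstract, expressed in moles.
UNIT_IN_MOLES = {"pmol": 1e-12, "zmol": 1e-21}

def molecules(amount, unit):
    """Number of molecules in the given amount of substance."""
    return amount * UNIT_IN_MOLES[unit] * AVOGADRO

per_zmol = molecules(1, "zmol")
per_pmol = molecules(1, "pmol")

print(round(per_zmol))      # 602
print(per_pmol / per_zmol)  # about 1e9: a picomole pool is a billion-fold larger
```

A 1-1000 zmole pool therefore holds only about 600 to 600,000 molecules, which is why the largest accessible motif is smaller at that scale than in picomole pools.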
Identification of aryl hydrocarbon receptor binding targets in mouse hepatic tissue treated with 2,3,7,8-tetrachlorodibenzo-p-dioxin

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lo, Raymond; Celius, Trine; Forgacs, Agnes L.

2011-11-15

Genome-wide, promoter-focused ChIP-chip analysis of hepatic aryl hydrocarbon receptor (AHR) binding sites was conducted in 8-week old female C57BL/6 treated with 30 μg/kg body weight 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) for 2 h and 24 h. These studies identified 1642 and 508 AHR-bound regions at 2 h and 24 h, respectively. A total of 430 AHR-bound regions were common between the two time points, corresponding to 403 unique genes. Comparison with previous AHR ChIP-chip studies in mouse hepatoma cells revealed that only 62 of the putative target genes overlapped with the 2 h AHR-bound regions in vivo. Transcription factor binding site analysis revealed an over-representation of aryl hydrocarbon response elements (AHREs) in AHR-bound regions with 53% (2 h) and 68% (24 h) of them containing at least one AHRE. In addition to AHREs, E2f-Myc activator motifs previously implicated in AHR function, as well as a number of other motifs, including Sp1, nuclear receptor subfamily 2 factor, and early growth response factor motifs were also identified. Expression microarray studies identified 133 unique genes differentially regulated after 4 h treatment with TCDD. Of these, 39 were identified as AHR-bound genes at 2 h. Ingenuity Pathway Analysis on the 39 AHR-bound TCDD responsive genes identified potential perturbation in biological processes such as lipid metabolism, drug metabolism, and endocrine system development as a result of TCDD-mediated AHR activation. Our findings identify direct AHR target genes in vivo, highlight in vitro and in vivo differences in AHR signaling and show that AHR recruitment does not necessarily result in changes in target gene expression. Highlights: ► ChIP-chip analysis of hepatic AHR binding after 2 h and 24 h of TCDD. ► We identified 1642 and 508 AHR-bound regions at 2 h and 24 h. ► 430 regions were common to both time points and highly enriched with AHREs. ► Only 62 putative target regions overlapped AHR-bound regions in hepatoma cells. ► Microarrays identified 133 TCDD-regulated genes, of which 39 were also bound by AHR.
WebMOTIFS: automated discovery, filtering and scoring of DNA sequence motifs using multiple programs and Bayesian approaches

PubMed Central

Romer, Katherine A.; Kayombya, Guy-Richard; Fraenkel, Ernest

2007-01-01

WebMOTIFS provides a web interface that facilitates the discovery and analysis of DNA-sequence motifs. Several studies have shown that the accuracy of motif discovery can be significantly improved by using multiple de novo motif discovery programs and using randomized control calculations to identify the most significant motifs or by using Bayesian approaches. WebMOTIFS makes it easy to apply these strategies. Using a single submission form, users can run several motif discovery programs and score, cluster and visualize the results. In addition, the Bayesian motif discovery program THEME can be used to determine the class of transcription factors that is most likely to regulate a set of sequences. Input can be provided as a list of gene or probe identifiers. Used with the default settings, WebMOTIFS accurately identifies biologically relevant motifs from diverse data in several species. WebMOTIFS is freely available at http://fraenkel.mit.edu/webmotifs. PMID:17584794

Structural insights into species-specific features of the ribosome from the pathogen Staphylococcus aureus

PubMed Central

Eyal, Zohar; Matzov, Donna; Krupkin, Miri; Wekselman, Itai; Paukner, Susanne; Zimmerman, Ella; Rozenberg, Haim; Bashan, Anat; Yonath, Ada

2015-01-01

The emergence of bacterial multidrug resistance to antibiotics threatens to cause regression to the preantibiotic era. Here we present the crystal structure of the large ribosomal subunit from Staphylococcus aureus, a versatile Gram-positive aggressive pathogen, and its complexes with the known antibiotics linezolid and telithromycin, as well as with a new, highly potent pleuromutilin derivative, BC-3205. These crystal structures shed light on specific structural motifs of the S. aureus ribosome and the binding modes of the aforementioned antibiotics. Moreover, by analyzing the ribosome structure and comparing it with those of nonpathogenic bacterial models, we identified some unique internal and peripheral structural motifs that may be potential candidates for improving known antibiotics and for use in the design of selective antibiotic drugs against S. aureus. PMID:26464510
Phosphoproteomic analysis of chromoplasts from sweet orange during fruit ripening.

PubMed

Zeng, Yunliu; Pan, Zhiyong; Wang, Lun; Ding, Yuduan; Xu, Qiang; Xiao, Shunyuan; Deng, Xiuxin

2014-02-01

Like other types of plastids, chromoplasts have essential biosynthetic and metabolic activities which may be regulated via post-translational modifications, such as phosphorylation, of their resident proteins. We here report a proteome-wide mapping of in vivo phosphorylation sites in chromoplast-enriched samples prepared from sweet orange [Citrus sinensis (L.) Osbeck] at different ripening stages by titanium dioxide-based affinity chromatography for phosphoprotein enrichment with LC-MS/MS. A total of 109 plastid-localized phosphoprotein candidates were identified that correspond to 179 unique phosphorylation sites in 135 phosphopeptides. On the basis of Motif-X analysis, two distinct types of phosphorylation sites, one as proline-directed phosphorylation motif and the other as casein kinase II motif, can be generalized from these identified phosphopeptides. While most identified phosphoproteins show high homology to those already identified in plastids, approximately 22% of them are novel based on BLAST search using the public databases PhosPhAt and P(3) DB. A close comparative analysis showed that approximately 50% of the phosphoproteins identified in citrus chromoplasts find obvious counterparts in the chloroplast phosphoproteome, suggesting a rather high-level of conservation in basic metabolic activities in these two types of plastids. Not surprisingly, the phosphoproteome of citrus chromoplasts is also characterized by the lack of phosphoproteins involved in photosynthesis and by the presence of more phosphoproteins implicated in stress/redox responses. This study presents the first comprehensive phosphoproteomic analysis of chromoplasts and may help to understand how phosphorylation regulates differentiation of citrus chromoplasts during fruit ripening. © 2013 Scandinavian Plant Physiology Society.
Using peptide array to identify binding motifs and interaction networks for modular domains.

PubMed

Li, Shawn S-C; Wu, Chenggang

2009-01-01

Specific protein-protein interactions underlie all essential biological processes and form the basis of cellular signal transduction. The recognition of a short, linear peptide sequence in one protein by a modular domain in another represents a common theme of macromolecular recognition in cells, and the importance of this mode of protein-protein interaction is highlighted by the large number of peptide-binding domains encoded by the human genome. This phenomenon also provides a unique opportunity to identify protein-protein binding events using peptide arrays and complementary biochemical assays. Accordingly, high-density peptide array has emerged as a useful tool by which to map domain-mediated protein-protein interaction networks at the proteome level. Using the Src-homology 2 (SH2) and 3 (SH3) domains as examples, we describe the application of oriented peptide array libraries in uncovering specific motifs recognized by an SH2 domain and the use of high-density peptide arrays in identifying interaction networks mediated by the SH3 domain. Methods reviewed here could also be applied to other modular domains, including catalytic domains, that recognize linear peptide sequences.
Signatures of Pleiotropy, Economy and Convergent Evolution in a Domain-Resolved Map of Human–Virus Protein–Protein Interaction Networks

PubMed Central

Garamszegi, Sara; Franzosa, Eric A.; Xia, Yu

2013-01-01

A central challenge in host-pathogen systems biology is the elucidation of general, systems-level principles that distinguish host-pathogen interactions from within-host interactions. Current analyses of host-pathogen and within-host protein-protein interaction networks are largely limited by their resolution, treating proteins as nodes and interactions as edges. Here, we construct a domain-resolved map of human-virus and within-human protein-protein interaction networks by annotating protein interactions with high-coverage, high-accuracy, domain-centric interaction mechanisms: (1) domain-domain interactions, in which a domain in one protein binds to a domain in a second protein, and (2) domain-motif interactions, in which a domain in one protein binds to a short, linear peptide motif in a second protein. Analysis of these domain-resolved networks reveals, for the first time, significant mechanistic differences between virus-human and within-human interactions at the resolution of single domains. While human proteins tend to compete with each other for domain binding sites by means of sequence similarity, viral proteins tend to compete with human proteins for domain binding sites in the absence of sequence similarity. Independent of their previously established preference for targeting human protein hubs, viral proteins also preferentially target human proteins containing linear motif-binding domains. Compared to human proteins, viral proteins participate in more domain-motif interactions, target more unique linear motif-binding domains per residue, and contain more unique linear motifs per residue. Together, these results suggest that viruses surmount genome size constraints by convergently evolving multiple short linear motifs in order to effectively mimic, hijack, and manipulate complex host processes for their survival. 
Our domain-resolved analyses reveal unique signatures of pleiotropy, economy, and convergent evolution in viral-host interactions that are otherwise hidden in the traditional binary network, highlighting the power and necessity of high-resolution approaches in host-pathogen systems biology. PMID:24339775
Using SCOPE to identify potential regulatory motifs in coregulated genes.

PubMed

Martyanov, Viktor; Gross, Robert H

2011-05-31

SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data. In this article, we utilize a web version of SCOPE to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs and has been used in other studies. The three algorithms that comprise SCOPE are BEAM, which finds non-degenerate motifs (ACCGGT), PRISM, which finds degenerate motifs (ASCGWT), and SPACER, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well. Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif and that, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor. Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site, which also includes a "Sample Search" button that allows the user to perform a trial run. SCOPE has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. 
These can be entered in browser text fields, or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail.
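The three motif classes named in the abstract above (non-degenerate, degenerate, and bipartite) can all be written in IUPAC nucleotide notation. The following is a minimal sketch of how such motifs might be matched against a sequence; it is not SCOPE's algorithm (BEAM, PRISM, and SPACER score motifs by over-representation and position preference, which is not reproduced here). The function names `motif_to_regex` and `find_motif` are hypothetical and chosen for illustration only.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate an IUPAC motif (lowercase letters allowed) into a regex string."""
    return "".join(IUPAC[ch] for ch in motif.upper())


def find_motif(motif, sequence):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead lets the scan report matches that overlap each other.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    seq = "TTACCGGTAAAGCGATCC"  # toy sequence, not real data
    print(find_motif("ACCGGT", seq))   # non-degenerate motif
    print(find_motif("ASCGWT", seq))   # degenerate motif (S = C/G, W = A/T)
    print(find_motif("ACCnnnnnnnnGGT", "ACCTTTTTTTTGGT"))  # bipartite motif
```

Only the forward strand is scanned in this sketch; a fuller treatment would also search the reverse complement.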
SMARTIV: combined sequence and structure de-novo motif discovery for in-vivo RNA binding data.

PubMed

Polishchuk, Maya; Paz, Inbal; Yakhini, Zohar; Mandel-Gutfreund, Yael

2018-05-25

Gene expression regulation is highly dependent on binding of RNA-binding proteins (RBPs) to their RNA targets. Growing evidence supports the notion that both RNA primary sequence and its local secondary structure play a role in specific Protein-RNA recognition and binding. Despite the great advance in high-throughput experimental methods for identifying sequence targets of RBPs, predicting the specific sequence and structure binding preferences of RBPs remains a major challenge. We present a novel webserver, SMARTIV, designed for discovering and visualizing combined RNA sequence and structure motifs from high-throughput RNA-binding data, generated from in-vivo experiments. The uniqueness of SMARTIV is that it predicts motifs from enriched k-mers that combine information from ranked RNA sequences and their predicted secondary structure, obtained using various folding methods. Consequently, SMARTIV generates Position Weight Matrices (PWMs) in a combined sequence and structure alphabet with assigned P-values. SMARTIV concisely represents the sequence and structure motif content as a single graphical logo, which is informative and easy for visual perception. SMARTIV was examined extensively on a variety of high-throughput binding experiments for RBPs from different families, generated from different technologies, showing consistent and accurate results. Finally, SMARTIV is a user-friendly webserver, highly efficient in run-time and freely accessible via http://smartiv.technion.ac.il/.
A novel Met-to-Thr mutation in the YMDD motif of reverse transcriptase from feline immunodeficiency virus confers resistance to oxathiolane nucleosides.

PubMed Central

Smith, R A; Remington, K M; Lloyd, R M; Schinazi, R F; North, T W

1997-01-01

Variants of feline immunodeficiency virus (FIV) that possess a unique methionine-to-threonine mutation within the YMDD motif of reverse transcriptase (RT) were selected by culturing virus in the presence of inhibitory concentrations of (-)-beta-L-2',3'-dideoxy-5-fluoro-3'-thiacytidine [(-)-FTC]. The mutants were resistant to (-)-FTC and (-)-beta-L-2',3'-dideoxy-3'-thiacytidine (3TC) and additionally exhibited low-level resistance to 2',3'-dideoxycytidine (ddC). DNA sequence analysis of the RT-encoding region of the pol gene amplified from resistant viruses consistently identified a Met-to-Thr mutation in the YMDD motif. Purified RT from the mutants was also resistant to the 5'-triphosphate forms of 3TC, (-)-FTC, and ddC. Site-directed mutants of FIV were engineered which contain either the novel Met-to-Thr mutation or the Met-to-Val mutation seen in oxathiolane nucleoside-resistant HIV-1. Both site-directed mutants displayed resistance to 3TC, thus confirming the role of these mutations in the resistance of FIV to beta-L-3'-thianucleosides. PMID:9032372
Distribution of CpG Motifs in Upstream Gene Domains in a Reef Coral and Sea Anemone: Implications for Epigenetics in Cnidarians.

PubMed

Marsh, Adam G; Hoadley, Kenneth D; Warner, Mark E

2016-01-01

Coral reefs are under assault from stressors including global warming, ocean acidification, and urbanization. Knowing how these factors impact the future fate of reefs requires delineating stress responses across ecological, organismal and cellular scales. Recent advances in coral reef biology have integrated molecular processes with ecological fitness and have identified putative suites of temperature acclimation genes in a Scleractinian coral Acropora hyacinthus. We wondered what unique characteristics of these genes determined their coordinate expression in response to temperature acclimation, and whether or not other corals and cnidarians would likewise possess these features. Here, we focus on cytosine methylation as an epigenetic DNA modification that is responsive to environmental stressors. We identify common conserved patterns of cytosine-guanosine dinucleotide (CpG) motif frequencies in upstream promoter domains of different functional gene groups in two cnidarian genomes: a coral (Acropora digitifera) and an anemone (Nematostella vectensis). Our analyses show that CpG motif frequencies are prominent in the promoter domains of functional genes associated with environmental adaptation, particularly those identified in A. hyacinthus. Densities of CpG sites in upstream promoter domains near the transcriptional start site (TSS) are 1.38x higher than genomic background levels upstream of -2000 bp from the TSS. The increase in CpG usage suggests selection to allow for DNA methylation events to occur more frequently within 1 kb of the TSS. In addition, observed shifts in CpG densities among functional groups of genes suggest a potential role for epigenetic DNA methylation within promoter domains to impact functional gene expression responses in A. digitifera and N. vectensis. 
Identifying promoter epigenetic sequence motifs among genes within specific functional groups establishes an approach to describe integrated cellular responses to environmental stress in reef corals and potential roles of epigenetics on survival and fitness in the face of global climate change.
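The abstract above reports a ratio of CpG density near the TSS to genomic background. As a rough illustration of that kind of comparison, the sketch below counts CpG dinucleotides per base pair in two sequences and divides one density by the other. The abstract does not specify how density was normalized or how windows were chosen, so the per-base-pair definition and the function names `cpg_density` and `cpg_enrichment` are assumptions made for this example.

```python
def cpg_density(seq):
    """CpG dinucleotides per base pair (assumed definition, for illustration)."""
    seq = seq.upper()
    if len(seq) < 2:
        return 0.0
    count = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")
    return count / len(seq)


def cpg_enrichment(promoter, background):
    """Ratio of promoter CpG density to background CpG density."""
    bg = cpg_density(background)
    if bg == 0:
        return float("inf")  # no CpG in the background window
    return cpg_density(promoter) / bg


if __name__ == "__main__":
    promoter = "ACGTCGAA"    # toy window near a TSS, not real data
    background = "ACGTAAAA"  # toy distal window, not real data
    print(cpg_enrichment(promoter, background))
```

With real data, the promoter and background arguments would be sequences extracted at chosen offsets from each annotated TSS, and the ratio would be averaged over a gene group.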
Mechanism underlying selective regulation of G protein-gated inwardly rectifying potassium channels by the psychostimulant-sensitive sorting nexin 27

PubMed Central

Balana, Bartosz; Maslennikov, Innokentiy; Kwiatkowski, Witek; Stern, Kalyn M.; Bahima, Laia; Choe, Senyon; Slesinger, Paul A.

2011-01-01

G protein-gated inwardly rectifying potassium (GIRK) channels are important gatekeepers of neuronal excitability. The surface expression of neuronal GIRK channels is regulated by the psychostimulant-sensitive sorting nexin 27 (SNX27) protein through a class I (-X-Ser/Thr-X-Φ, where X is any residue and Φ is a hydrophobic amino acid) PDZ-binding interaction. The G protein-insensitive inward rectifier channel (IRK1) contains the same class I PDZ-binding motif but associates with a different synaptic PDZ protein, postsynaptic density protein 95 (PSD95). The mechanism by which SNX27 and PSD95 discriminate these channels was previously unclear. Using high-resolution structures coupled with biochemical and functional analyses, we identified key amino acids upstream of the channel's canonical PDZ-binding motif that associate electrostatically with a unique structural pocket in the SNX27-PDZ domain. Changing specific charged residues in the channel's carboxyl terminus or in the PDZ domain converts the selective association and functional regulation by SNX27. Elucidation of this unique interaction site between ion channels and PDZ-containing proteins could provide a therapeutic target for treating brain diseases. PMID:21422294
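The class I PDZ-binding pattern quoted in the abstract above (-X-Ser/Thr-X-Φ at the carboxyl terminus) maps naturally onto a regular expression. The sketch below checks a protein sequence for that C-terminal pattern and returns the residues just upstream of it, since the abstract attributes selectivity to upstream amino acids. The residue set used for Φ and the function names are assumptions for illustration; definitions of "hydrophobic" vary between sources, and the peptides shown are toy strings rather than channel sequences.

```python
import re

# Assumption: the hydrophobic class Φ is taken here as V, I, L, M, F, W, A or C.
HYDROPHOBIC = "VILMFWAC"

# Class I PDZ-binding motif anchored at the carboxyl terminus: X-[S/T]-X-Φ.
CLASS_I_PDZ = re.compile(r".[ST].[" + HYDROPHOBIC + r"]$")


def has_class_i_pdz_motif(protein):
    """True if the last four residues fit X-[S/T]-X-Φ."""
    return CLASS_I_PDZ.search(protein.upper()) is not None


def upstream_residues(protein, n=4):
    """Return up to n residues immediately upstream of the four-residue motif."""
    return protein.upper()[-(4 + n):-4]


if __name__ == "__main__":
    peptide = "MKTAYRRESKV"  # toy peptide, hypothetical
    print(has_class_i_pdz_motif(peptide))
    print(upstream_residues(peptide))
```

Two proteins can pass `has_class_i_pdz_motif` and still differ in `upstream_residues`, which is the situation the abstract describes for channels that share the canonical motif but bind different PDZ proteins.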
DOE Office of Scientific and Technical Information (OSTI.GOV)

Schürpf, Thomas; Chen, Qiang; Liu, Jin-huan

Developmental endothelial cell locus-1 (Del-1) glycoprotein is secreted by endothelial cells and a subset of macrophages. Del-1 plays a regulatory role in vascular remodeling and functions in innate immunity through interaction with integrin αVβ3. Del-1 contains 3 epidermal growth factor (EGF)-like repeats and 2 discoidin-like domains. An Arg-Gly-Asp (RGD) motif in the second EGF domain (EGF2) mediates adhesion by endothelial cells and phagocytes. We report the crystal structure of its 3 EGF domains. The RGD motif of EGF2 forms a type II' β turn at the tip of a long protruding loop, dubbed the RGD finger. Whereas EGF2 and EGF3 constitute a rigid rod via an interdomain calcium ion binding site, the long linker between EGF1 and EGF2 lends considerable flexibility to EGF1. Two unique O-linked glycans and 1 N-linked glycan locate to the opposite side of EGF2 from the RGD motif. These structural features favor integrin binding of the RGD finger. Mutagenesis data confirm the importance of having the RGD motif at the tip of the RGD finger. A database search for EGF domain sequences shows that this RGD finger is likely an evolutionary insertion and unique to the EGF domain of Del-1 and its homologue milk fat globule-EGF 8. The RGD finger of Del-1 is a unique structural feature critical for integrin binding.
Classic Nuclear Localization Signals and a Novel Nuclear Localization Motif Are Required for Nuclear Transport of Porcine Parvovirus Capsid Proteins

PubMed Central

Boisvert, Maude; Bouchard-Lévesque, Véronique; Fernandes, Sandra

2014-01-01

ABSTRACT Nuclear targeting of capsid proteins (VPs) is important for genome delivery and precedes assembly in the replication cycle of porcine parvovirus (PPV). Clusters of basic amino acids, corresponding to potential nuclear localization signals (NLS), were found only in the unique region of VP1 (VP1up, for VP1 unique part). Of the five identified basic regions (BR), three were important for nuclear localization of VP1up: BR1 was a classic Pat7 NLS, and the combination of BR4 and BR5 was a classic bipartite NLS. These NLS were essential for viral replication. VP2, the major capsid protein, lacked these NLS and contained no region with more than two basic amino acids in proximity. However, three regions of basic clusters were identified in the folded protein, assembled into a trimeric structure. Mutagenesis experiments showed that only one of these three regions was involved in VP2 transport to the nucleus. This structural NLS, termed the nuclear localization motif (NLM), is located inside the assembled capsid and thus can be used to transport trimers to the nucleus in late steps of infection but not for virions in initial infection steps. The two NLS of VP1up are located in the N-terminal part of the protein, externalized from the capsid during endosomal transit, exposing them for nuclear targeting during early steps of infection. Globally, the determinants of nuclear transport of structural proteins of PPV were different from those of closely related parvoviruses. IMPORTANCE Most DNA viruses use the nucleus for their replication cycle. Thus, structural proteins need to be targeted to this cellular compartment at two distinct steps of the infection: in early steps to deliver viral genomes to the nucleus and in late steps to assemble new viruses. Nuclear targeting of proteins depends on the recognition of a stretch of basic amino acids by cellular transport proteins. 
This study reports the identification of two classic nuclear localization signals in the minor capsid protein (VP1) of porcine parvovirus. The major protein (VP2) nuclear localization was shown to depend on a complex structural motif. This motif can be used as a strategy by the virus to avoid transport of incorrectly folded proteins and to selectively import assembled trimers into the nucleus. Structural nuclear localization motifs can also be important for nuclear proteins without a classic basic amino acid stretch, including multimeric cellular proteins. PMID:25078698
Single-step inline hydroxyapatite enrichment facilitates identification and quantitation of phosphopeptides from mass-limited proteomes with MudPIT

PubMed Central

Fonslow, Bryan R.; Niessen, Sherry M.; Singh, Meha; Wong, Catherine C.; Xu, Tao; Carvalho, Paulo C.; Choi, Jeong; Park, Sung Kyu; Yates, John R.

2012-01-01

Herein we report the characterization and optimization of single-step inline enrichment of phosphopeptides directly from small amounts of whole cell and tissue lysates (100 – 500 μg) using a hydroxyapatite (HAP) microcolumn and Multidimensional Protein Identification Technology (MudPIT). In comparison to a triplicate HILIC-IMAC phosphopeptide enrichment study, ~80% of the phosphopeptides identified using HAP-MudPIT were unique. Similarly, analysis of the consensus phosphorylation motifs between the two enrichment methods illustrates the complementarity of calcium- and iron-based enrichment methods and the higher sensitivity and selectivity of HAP-MudPIT for acidic motifs. We demonstrate how the identification of more multiply phosphorylated peptides from HAP-MudPIT can be used to quantify phosphorylation cooperativity. Through optimization of HAP-MudPIT on a whole cell lysate we routinely achieved identification and quantification of ca. 1000 phosphopeptides from a ~1 hr enrichment and 12 hr MudPIT analysis on small quantities of material. Finally, we applied this optimized method to identify phosphorylation sites from a mass-limited mouse brain region, the amygdala (200 – 500 μg), identifying up to 4000 phosphopeptides per run. PMID:22509746
Dynamic motif occupancy (DynaMO) analysis identifies transcription factors and their binding sites driving dynamic biological processes

PubMed Central

Kuang, Zheng; Ji, Zhicheng

2018-01-01

Abstract Biological processes are usually associated with genome-wide remodeling of transcription driven by transcription factors (TFs). Identifying key TFs and their spatiotemporal binding patterns is indispensable to understanding how dynamic processes are programmed. However, most methods are designed to predict TF binding sites only. We present a computational method, dynamic motif occupancy analysis (DynaMO), to infer important TFs and their spatiotemporal binding activities in dynamic biological processes using chromatin profiling data from multiple biological conditions such as time-course histone modification ChIP-seq data. In the first step, DynaMO predicts TF binding sites with a random forests approach. Next and uniquely, DynaMO infers dynamic TF binding activities at predicted binding sites using their local chromatin profiles from multiple biological conditions. Another distinguishing feature of DynaMO is that it identifies key TFs in a dynamic process using a clustering and enrichment analysis of dynamic TF binding patterns. Application of DynaMO to the yeast ultradian cycle, mouse circadian clock and human neural differentiation demonstrates its accuracy and versatility. We anticipate DynaMO will be generally useful for elucidating transcriptional programs in dynamic processes. PMID:29325176
Phosphoproteomic analysis of the non-seed vascular plant model Selaginella moellendorffii

PubMed Central

2014-01-01

Background Selaginella (Selaginella moellendorffii) is a lycophyte which diverged from other vascular plants approximately 410 million years ago. As the first reported non-seed vascular plant genome, Selaginella genome data allow comparative analysis of genetic changes that may be associated with land plant evolution. Proteomics investigations on this lycophyte model have not been extensively reported. Phosphorylation represents the most common post-translational modification and is a ubiquitous regulatory mechanism controlling the functional expression of proteins inside living organisms. Results In this study, polyethylene glycol fractionation and immobilized metal ion affinity chromatography were employed to isolate phosphopeptides from wild-growing Selaginella. Using liquid chromatography-tandem mass spectrometry analysis, 1593 unique phosphopeptides spanning 1104 non-redundant phosphosites with confirmed localization on 716 phosphoproteins were identified. Analysis of the Selaginella dataset revealed features that are consistent with other plant phosphoproteomes, such as the relative proportions of phosphorylated Ser, Thr, and Tyr residues, the highest occurrence of phosphosites in the C-terminal regions of proteins, and the localization of phosphorylation events outside protein domains. In addition, a total of 97 highly conserved phosphosites in evolutionarily conserved proteins were identified, indicating the conservation of phosphorylation-dependent regulatory mechanisms in phylogenetically distinct plant species. On the other hand, close examination of proteins involved in photosynthesis revealed phosphorylation events which may be unique to Selaginella evolution. Furthermore, phosphorylation motif analyses identified Pro-directed, acidic, and basic signatures which are recognized by typical protein kinases in plants. A group of Selaginella-specific phosphoproteins was found to be enriched in the Pro-directed motif class. 
Conclusions Our work provides the first large-scale atlas of phosphoproteins in Selaginella which occupies a unique position in the evolution of terrestrial plants. Future research into the functional roles of Selaginella-specific phosphorylation events in photosynthesis and other processes may offer insight into the molecular mechanisms leading to the distinct evolution of lycophytes. PMID:24628833
DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Qibin; Monroe, Matthew E.; Schepmoes, Athena A.

Non-enzymatic glycation of proteins is implicated in diabetes mellitus and its related complications. In this report, we extend our previous development and refinement of proteomics-based methods for the analysis of non-enzymatically glycated proteins to comprehensively identify glycated proteins in normal and diabetic human plasma and erythrocytes. Using immunodepletion, enrichment, and fractionation strategies, we identified 7749 unique glycated peptides, corresponding to 3742 unique glycated proteins. Semi-quantitative comparisons revealed a number of proteins with glycation levels significantly increased in diabetes relative to control samples and that erythrocyte proteins are more extensively glycated than plasma proteins. A glycation motif analysis revealed amino acids that are favored more than others in the protein primary structures in the vicinity of the glycation sites in both sample types. The glycated peptides and corresponding proteins reported here provide a foundation for the potential identification of novel markers for diabetes, glycemia, or diabetic complications.
The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats

PubMed Central

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2007-01-01

Background In Archaea and Bacteria, the repeated elements called CRISPRs for "clustered regularly interspaced short palindromic repeats" are believed to participate in the defence against viruses. Short sequences called spacers are stored in-between repeated elements. In the current model, motifs comprising spacers and repeats may target an invading DNA and lead to its degradation through a proposed mechanism similar to RNA interference. Analysis of intra-species polymorphism shows that new motifs (one spacer and one repeated element) are added in a polarised fashion. Although their principal characteristics have been described, a lot remains to be discovered about the way CRISPRs are created and evolve. As new genome sequences become available it appears necessary to develop automated scanning tools to make available CRISPRs related information and to facilitate additional investigations. Description We have produced a program, CRISPRFinder, which identifies CRISPRs and extracts the repeated and unique sequences. Using this software, a database is constructed which is automatically updated monthly from newly released genome sequences. Additional tools were created to allow the alignment of flanking sequences in search for similarities between different loci and to build dictionaries of unique sequences. To date, almost six hundred CRISPRs have been identified in 475 published genomes. Two Archaea out of thirty-seven and about half of Bacteria do not possess a CRISPR. Fine analysis of repeated sequences strongly supports the current view that new motifs are added at one end of the CRISPR adjacent to the putative promoter. Conclusion It is hoped that availability of a public database, regularly updated and which can be queried on the web will help in further dissecting and understanding CRISPR structure and flanking sequences evolution. Subsequent analyses of the intra-species CRISPR polymorphism will be facilitated by CRISPRFinder and the dictionary creator. 
CRISPRdb is accessible at PMID:17521438
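To make the repeat/spacer terminology in this record concrete, the following is a minimal Python sketch that splits a CRISPR-like array into its spacers when the repeat sequence is already known. It is not the CRISPRFinder algorithm, which has to detect the repeats itself; the function name `extract_spacers`, the exact-match assumption, and the toy sequence are all hypothetical.

```python
def extract_spacers(array_seq: str, repeat: str) -> list[str]:
    """Return the sequences found between consecutive exact copies of `repeat`.

    Assumption: repeats are perfectly conserved; real CRISPR repeats may
    carry mismatches, which this sketch does not handle.
    """
    spacers = []
    pos = array_seq.find(repeat)
    while pos != -1:
        start = pos + len(repeat)            # first base after this repeat
        nxt = array_seq.find(repeat, start)  # next repeat, if any
        if nxt == -1:
            break
        if nxt > start:                      # skip directly adjacent repeats
            spacers.append(array_seq[start:nxt])
        pos = nxt
    return spacers


# Toy array (made up for illustration): flank + repeat/spacer units + flank.
repeat = "GTTTTAGAGC"
toy = "AAA" + repeat + "ACGTACGTAC" + repeat + "TTGACCGGTA" + repeat + "CCC"
print(extract_spacers(toy, repeat))
```

Collecting the returned spacers across loci into a set would give a simple "dictionary of unique sequences" in the sense used above.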
Sulfur-induced structural motifs on copper and gold surfaces

DOE Office of Scientific and Technical Information (OSTI.GOV)

Walen, Holly

The interaction of sulfur with copper and gold surfaces plays a fundamental role in important phenomena that include coarsening of surface nanostructures and self-assembly of alkanethiols. Here, we identify and analyze unique sulfur-induced structural motifs observed on the low-index surfaces of these two metals. We seek out these structures in an effort to better understand the fundamental interactions between these metals and sulfur that lend to the stability and favorability of metal-sulfur complexes vs. chemisorbed atomic sulfur. The experimental observations presented here, made under identical conditions, together with extensive DFT analyses, allow comparisons and insights into factors that favor the existence of metal-sulfur complexes, vs. chemisorbed atomic sulfur, on metal terraces. We believe this data will be instrumental in better understanding the complex phenomena occurring between the surfaces of coinage metals and sulfur.
Motif-based analysis of large nucleotide data sets using MEME-ChIP

PubMed Central

Ma, Wenxiu; Noble, William S; Bailey, Timothy L

2014-01-01

MEME-ChIP is a web-based tool for analyzing motifs in large DNA or RNA data sets. It can analyze peak regions identified by ChIP-seq, cross-linking sites identified by cLIP-seq and related assays, as well as sets of genomic regions selected using other criteria. MEME-ChIP performs de novo motif discovery, motif enrichment analysis, motif location analysis and motif clustering, providing a comprehensive picture of the DNA or RNA motifs that are enriched in the input sequences. MEME-ChIP performs two complementary types of de novo motif discovery: weight matrix–based discovery for high accuracy; and word-based discovery for high sensitivity. Motif enrichment analysis using DNA or RNA motifs from human, mouse, worm, fly and other model organisms provides even greater sensitivity. MEME-ChIP’s interactive HTML output groups and aligns significant motifs to ease interpretation. This protocol takes less than 3 h, and it provides motif discovery approaches that are distinct and complementary to other online methods. PMID:24853928
SARM1-specific motifs in the TIR domain enable NAD+ loss and regulate injury-induced SARM1 activation.

PubMed

Summers, Daniel W; Gibson, Daniel A; DiAntonio, Aaron; Milbrandt, Jeffrey

2016-10-11

Axon injury in response to trauma or disease stimulates a self-destruction program that promotes the localized clearance of damaged axon segments. Sterile alpha and Toll/interleukin receptor (TIR) motif-containing protein 1 (SARM1) is an evolutionarily conserved executioner of this degeneration cascade, also known as Wallerian degeneration; however, the mechanism of SARM1-dependent neuronal destruction is still obscure. SARM1 possesses a TIR domain that is necessary for SARM1 activity. In other proteins, dimerized TIR domains serve as scaffolds for innate immune signaling. In contrast, dimerization of the SARM1 TIR domain promotes consumption of the essential metabolite NAD+ and induces neuronal destruction. This activity is unique to the SARM1 TIR domain, yet the structural elements that enable this activity are unknown. In this study, we identify fundamental properties of the SARM1 TIR domain that promote NAD+ loss and axon degeneration. Dimerization of the TIR domain from the Caenorhabditis elegans SARM1 ortholog TIR-1 leads to NAD+ loss and neuronal death, indicating these activities are an evolutionarily conserved feature of SARM1 function. Detailed analysis of sequence homology identifies canonical TIR motifs as well as a SARM1-specific (SS) loop that are required for NAD+ loss and axon degeneration. Furthermore, we identify a residue in the SARM1 BB loop that is dispensable for TIR activity yet required for injury-induced activation of full-length SARM1, suggesting that SARM1 function requires multidomain interactions. Indeed, we identify a physical interaction between the autoinhibitory N terminus and the TIR domain of SARM1, revealing a previously unrecognized direct connection between these domains that we propose mediates autoinhibition and activation upon injury.

Screening of repetitive motifs inside the genome of the flat oyster (Ostrea edulis): Transposable elements and short tandem repeats.

PubMed

Vera, Manuel; Bello, Xabier; Álvarez-Dios, Jose-Antonio; Pardo, Belen G; Sánchez, Laura; Carlsson, Jens; Carlsson, Jeanette E L; Bartolomé, Carolina; Maside, Xulio; Martinez, Paulino

2015-12-01

The flat oyster (Ostrea edulis) is one of the most appreciated molluscs in Europe, but its production has been greatly reduced by the parasite Bonamia ostreae. Here, new generation genomic resources were used to analyse the repetitive fraction of the oyster genome, with the aim of developing molecular markers to face this main oyster production challenge. The resulting oyster database consists of two sets of 10,318 and 7159 unique contigs (4.8 Mbp and 6.8 Mbp in total length) representing the oyster's genome (WG) and haemocyte transcriptome (HT), respectively. A total of 1083 sequences were identified as TE-derived, which corresponded to 4.0% of WG and 1.1% of HT. They were clustered into 142 homology groups, most of which were assigned to the Penelope order of retrotransposons, and to the Helitron and TIR DNA-transposons. Simple repeats and rRNA pseudogenes also made a significant contribution to the oyster's genome (0.5% and 0.3% of WG and HT, respectively). The most frequent short tandem repeats identified in WG were tetranucleotide motifs, while trinucleotide motifs were the most frequent in HT. Forty identified microsatellite loci, 20 from each database, were selected for technical validation. Success was much lower among WG than HT microsatellites (15% vs 55%), which could reflect higher variation in anonymous regions interfering with primer annealing. All microsatellites developed adjusted to Hardy-Weinberg proportions and represent a useful tool to support future breeding programmes and to manage genetic resources of natural flat oyster beds. Copyright © 2015 Elsevier B.V. All rights reserved.
Methods for Identifying Ligands that Target Nucleic Acid Molecules and Nucleic Acid Structural Motifs

NASA Technical Reports Server (NTRS)

Childs-Disney, Jessica L. (Inventor); Disney, Matthew D. (Inventor)

2017-01-01

Disclosed are methods for identifying a nucleic acid (e.g., RNA, DNA, etc.) motif which interacts with a ligand. The method includes providing a plurality of ligands immobilized on a support, wherein each particular ligand is immobilized at a discrete location on the support; contacting the plurality of immobilized ligands with a nucleic acid motif library under conditions effective for one or more members of the nucleic acid motif library to bind with the immobilized ligands; and identifying members of the nucleic acid motif library that are bound to a particular immobilized ligand. Also disclosed are methods for selecting, from a plurality of candidate ligands, one or more ligands that have increased likelihood of binding to a nucleic acid molecule comprising a particular nucleic acid motif, as well as methods for identifying a nucleic acid which interacts with a ligand.
[Prediction of Promoter Motifs in Virophages].

PubMed

Gong, Chaowen; Zhou, Xuewen; Pan, Yingjie; Wang, Yongjie

2015-07-01

Virophages have crucial roles in ecosystems and are the transport vectors of genetic materials. To shed light on regulation and control mechanisms in virophage-host systems as well as evolution between virophages and their hosts, the promoter motifs of virophages were predicted on the upstream regions of start codons using an analytical tool for prediction of promoter motifs: Multiple EM for Motif Elicitation. Seventeen potential promoter motifs were identified based on the E-value, location, number and length of promoters in genomes. Sputnik and zamilon motif 2 with AT-rich regions were distributed widely on genomes, suggesting that these motifs may be associated with regulation of the expression of various genes. Motifs containing the TCTA box were predicted to be late promoter motifs in mavirus; motifs containing the ATCT box were the potential late promoter motifs in the Ace Lake mavirus. AT-rich regions were identified on motif 2 in the Organic Lake virophage, motif 3 in Yellowstone Lake virophage (YSLV)1 and 2, motif 1 in YSLV3, and motif 1 and 2 in YSLV4, respectively. AT-rich regions were distributed widely on the genomes of virophages. All of these motifs may be promoter motifs of virophages. Our results provide insights into further exploration of temporal expression of genes in virophages as well as associations between virophages and giant viruses.
Unique ζ-chain motifs mediate a direct TCR-actin linkage critical for immunological synapse formation and T-cell activation.

PubMed

Klieger, Yair; Almogi-Hazan, Osnat; Ish-Shalom, Eliran; Pato, Aviad; Pauker, Maor H; Barda-Saad, Mira; Wang, Lynn; Baniyash, Michal

2014-01-01

TCR-mediated activation induces receptor microclusters that evolve to a defined immune synapse (IS). Many studies showed that actin polymerization and remodeling, which create a scaffold critical to IS formation and stabilization, are TCR mediated. However, the mechanisms controlling simultaneous TCR and actin dynamic rearrangement in the IS are not yet fully understood. Herein, we identify two novel TCR ζ-chain motifs, mediating the TCR's direct interaction with actin and inducing actin bundling. While T cells expressing the ζ-chain mutated in these motifs lack cytoskeleton (actin) associated (cska)-TCRs, they express normal levels of non-cska and surface TCRs, as do cells expressing wild-type ζ-chain. However, such mutant cells are unable to display activation-dependent TCR clustering, IS formation, expression of CD25/CD69 activation markers, or cytokine production/secretion, effects also seen in the corresponding APCs. We are the first to show a direct TCR-actin linkage, providing the missing link between TCR-mediated Ag recognition, specific cytoskeleton orientation toward the T-cell-APC interacting pole and long-lived IS maintenance. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
SCOPE: a web server for practical de novo motif discovery.

PubMed

Carlson, Jonathan M; Chakravarty, Arijit; DeZiel, Charles E; Gross, Robert H

2007-07-01

SCOPE is a novel parameter-free method for the de novo identification of potential regulatory motifs in sets of coordinately regulated genes. The SCOPE algorithm combines the output of three component algorithms, each designed to identify a particular class of motifs. Using an ensemble learning approach, SCOPE identifies the best candidate motifs from its component algorithms. In tests on experimentally determined datasets, SCOPE identified motifs with a significantly higher level of accuracy than a number of other web-based motif finders run with their default parameters. Because SCOPE has no adjustable parameters, the web server has an intuitive interface, requiring only a set of gene names or FASTA sequences and a choice of species. The most significant motifs found by SCOPE are displayed graphically on the main results page with a table containing summary statistics for each motif. Detailed motif information, including the sequence logo, PWM, consensus sequence and specific matching sites can be viewed through a single click on a motif. SCOPE's efficient, parameter-free search strategy has enabled the development of a web server that is readily accessible to the practising biologist while providing results that compare favorably with those of other motif finders. The SCOPE web server is at .
Identifying DNA-binding proteins using structural motifs and the electrostatic potential

PubMed Central

Shanahan, Hugh P.; Garcia, Mario A.; Jones, Susan; Thornton, Janet M.

2004-01-01

Robust methods to detect DNA-binding proteins from structures of unknown function are important for structural biology. This paper describes a method for identifying such proteins that have (i) a solvent-accessible structural motif necessary for DNA binding and (ii) a positive electrostatic potential in the vicinity of the binding region. We focus on three structural motifs: helix–turn–helix (HTH), helix–hairpin–helix (HhH) and helix–loop–helix (HLH). We find that the combination of these variables detects 78% of proteins with an HTH motif, which is a substantial improvement over previous work based purely on structural templates and is comparable to more complex methods of identifying DNA-binding proteins. Similar true positive fractions are achieved for the HhH and HLH motifs. We see evidence of wide evolutionary diversity for DNA-binding proteins with an HTH motif, and much smaller diversity for those with an HhH or HLH motif. PMID:15356290
BayesMotif: de novo protein sorting motif discovery from impure datasets.

PubMed

Hu, Jianjun; Zhang, Fan

2010-01-18

Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals are amino acid sub-sequences usually located at the N-termini or C-termini of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms. We formulated the protein sorting motif discovery problem as a classification problem and proposed a Bayesian classifier based algorithm (BayesMotif) for de novo identification of a common type of protein sorting motif in which a highly conserved anchor is present along with a less conserved motif region. A false positive removal procedure is developed to iteratively remove sequences that are unlikely to contain true motifs so that the algorithm can identify motifs from impure input sequences. Experiments on both implanted motif datasets and real-world datasets showed that the enhanced BayesMotif algorithm can identify anchored sorting motifs from pure or impure protein sequence datasets. They also show that the false positive removal procedure can help to identify true motifs even when only 20% of the input sequences contain true motif instances. We proposed BayesMotif, a novel Bayesian classification based algorithm for de novo discovery of a special category of anchored protein sorting motifs from impure datasets. Compared to conventional motif discovery algorithms such as MEME, our algorithm can find less-conserved motifs with short highly conserved anchors. Our algorithm also has the advantage of easy incorporation of additional meta-sequence features such as hydrophobicity or charge of the motifs, which may help to overcome the limitations of the PWM (position weight matrix) motif model.
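For readers unfamiliar with the PWM (position weight matrix) model that this record contrasts with, here is a minimal Python sketch of building a log-odds PWM from aligned motif instances and scanning a sequence with it. It illustrates the conventional PWM approach only, not BayesMotif; the uniform background, the pseudocount, the function names and the toy instances are assumptions made for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pwm(instances, pseudocount=1.0):
    """Log2-odds PWM from equal-length motif instances.

    Assumption: uniform background frequency (1/20) for every residue.
    """
    background = 1.0 / len(AMINO_ACIDS)
    pwm = []
    for i in range(len(instances[0])):
        column = [seq[i] for seq in instances]
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        pwm.append({
            aa: math.log2(((column.count(aa) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pwm


def best_match(pwm, sequence):
    """Return (score, start) of the highest-scoring window in `sequence`."""
    width = len(pwm)
    scored = [
        (sum(pwm[j][sequence[i + j]] for j in range(width)), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)


# Toy C-terminal motif instances and a made-up protein sequence.
instances = ["KDEL", "KDEL", "HDEL", "KEEL"]
pwm = build_pwm(instances)
score, pos = best_match(pwm, "MASTLLKDEL")
print(pos, round(score, 2))
```

Because each column is scored independently, a PWM cannot express features that span positions, such as overall hydrophobicity or charge, which is the limitation the abstract refers to.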
SARS-unique fold in the Rousettus bat coronavirus HKU9.

PubMed

Hammond, Robert G; Tan, Xuan; Johnson, Margaret A

2017-09-01

The coronavirus nonstructural protein 3 (nsp3) is a multifunctional protein that comprises multiple structural domains. This protein assists viral polyprotein cleavage, host immune interference, and may play other roles in genome replication or transcription. Here, we report the solution NMR structure of a protein from the "SARS-unique region" of the bat coronavirus HKU9. The protein contains a frataxin fold or double-wing motif, which is an α + β fold that is associated with protein/protein interactions, DNA binding, and metal ion binding. High structural similarity to the human severe acute respiratory syndrome (SARS) coronavirus nsp3 is present. A possible functional site that is conserved among some betacoronaviruses has been identified using bioinformatics and biochemical analyses. This structure provides strong experimental support for the recent proposal advanced by us and others that the "SARS-unique" region is not unique to the human SARS virus, but is conserved among several different phylogenetic groups of coronaviruses and provides essential functions. © 2017 The Protein Society.
Insights into the Activity and Substrate Binding of Xylella fastidiosa Polygalacturonase by Modification of a Unique QMK Amino Acid Motif Using Protein Chimeras

PubMed Central

Warren, Jeremy G.; Lincoln, James E.; Kirkpatrick, Bruce C.

2015-01-01

Polygalacturonases (EC 3.2.1.15) catalyze the random hydrolysis of 1,4-alpha-D-galactosiduronic linkages in pectate and other galacturonans. Xylella fastidiosa possesses a single polygalacturonase gene, pglA (PD1485), and X. fastidiosa mutants deficient in the production of polygalacturonase are non-pathogenic and show a compromised ability to systemically infect grapevines. These results suggested that grapevines expressing sufficient amounts of an inhibitor of X. fastidiosa polygalacturonase might be protected from disease. Previous work in our laboratory and others has tried without success to produce soluble active X. fastidiosa polygalacturonase for use in inhibition assays. In this study, we created two enzymatically active X. fastidiosa / A. vitis polygalacturonase chimeras, AX1A and AX2A, to explore the functionality of X. fastidiosa polygalacturonase in vitro. The AX1A chimera was constructed to specifically test if recombinant chimeric protein, produced in Escherichia coli, is soluble and if the X. fastidiosa polygalacturonase catalytic amino acids are able to hydrolyze polygalacturonic acid. The AX2A chimera was constructed to evaluate the ability of a unique QMK motif of X. fastidiosa polygalacturonase (most polygalacturonases have an R(I/L)K motif) to bind to and allow the hydrolysis of polygalacturonic acid. Furthermore, the AX2A chimera was also used to explore what effect modification of the QMK motif of X. fastidiosa polygalacturonase to a conserved RIK motif has on enzymatic activity. These experiments showed that both the AX1A and AX2A polygalacturonase chimeras were soluble and able to hydrolyze the polygalacturonic acid substrate. Additionally, the modification of the QMK motif to the conserved RIK motif eliminated hydrolytic activity, suggesting that the QMK motif is important for the activity of X. fastidiosa polygalacturonase. This result suggests X. fastidiosa polygalacturonase may preferentially hydrolyze a different pectic substrate or, alternatively, it has a different mechanism of substrate binding than other polygalacturonases characterized to date. PMID:26571265
Structural characterization of Helicobacter pylori dethiobiotin synthetase reveals differences between family members

DOE Office of Scientific and Technical Information (OSTI.GOV)

Porebski, Przemyslaw J.; Klimecka, Maria; Chruszcz, Maksymilian

2012-07-11

Dethiobiotin synthetase (DTBS) is involved in the biosynthesis of biotin in bacteria, fungi, and plants. As humans lack this pathway, DTBS is a promising antimicrobial drug target. We determined structures of DTBS from Helicobacter pylori (hpDTBS) bound with cofactors and a substrate analog, and described its unique characteristics relative to other DTBS proteins. Comparison with bacterial DTBS orthologs revealed considerable structural differences in nucleotide recognition. The C-terminal region of DTBS proteins, which contains two nucleotide-recognition motifs, differs greatly among DTBS proteins from different species. The structure of hpDTBS revealed that this protein is unique and does not contain a C-terminal region containing one of the motifs. The single nucleotide-binding motif in hpDTBS is similar to its counterpart in GTPases; however, isothermal titration calorimetry binding studies showed that hpDTBS has a strong preference for ATP. The structural determinants of ATP specificity were assessed with X-ray crystallographic studies of hpDTBS-ATP and hpDTBS-GTP complexes. The unique mode of nucleotide recognition in hpDTBS makes this protein a good target for H. pylori-specific inhibitors of the biotin synthesis pathway.
De Novo Regulatory Motif Discovery Identifies Significant Motifs in Promoters of Five Classes of Plant Dehydrin Genes.

PubMed

Zolotarov, Yevgen; Strömvik, Martina

2015-01-01

Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved.
The Methionine-aromatic Motif Plays a Unique Role in Stabilizing Protein Structure

PubMed Central

Valley, Christopher C.; Cembran, Alessandro; Perlmutter, Jason D.; Lewis, Andrew K.; Labello, Nicholas P.; Gao, Jiali; Sachs, Jonathan N.

2012-01-01

Of the 20 amino acids, the precise function of methionine (Met) remains among the least well understood. To establish a determining characteristic of methionine that fundamentally differentiates it from purely hydrophobic residues, we have used in vitro cellular experiments, molecular simulations, quantum calculations, and a bioinformatics screen of the Protein Data Bank. We show that approximately one-third of all known protein structures contain an energetically stabilizing Met-aromatic motif and, remarkably, that greater than 10,000 structures contain this motif more than 10 times. Critically, we show that as compared with a purely hydrophobic interaction, the Met-aromatic motif yields an additional stabilization of 1–1.5 kcal/mol. To highlight its importance and to dissect the energetic underpinnings of this motif, we have studied two clinically relevant TNF ligand-receptor complexes, namely TRAIL-DR5 and LTα-TNFR1. In both cases, we show that the motif is necessary for high affinity ligand binding as well as function. Additionally, we highlight previously overlooked instances of the motif in several disease-related Met mutations. Our results strongly suggest that the Met-aromatic motif should be exploited in the rational design of therapeutics targeting a range of proteins. PMID:22859300
NullSeq: A Tool for Generating Random Coding Sequences with Desired Amino Acid and GC Contents.

PubMed

Liu, Sophia S; Hockenberry, Adam J; Lancichinetti, Andrea; Jewett, Michael C; Amaral, Luís A N

2016-11-01

The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. In order to accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. While many tools have been developed to create random nucleotide sequences, protein coding sequences are subject to a unique set of constraints that complicates the process of generating appropriate null models. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content for the purpose of hypothesis testing. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content, which we have developed into a python package. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. Furthermore, this approach can easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even di-nucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs which will lead to a better understanding of biological processes as well as more effective engineering of biological systems.
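The maximum-entropy idea in this record can be sketched in a few lines of Python. Under a constraint on mean GC content, the maximum-entropy distribution over synonymous codons weights each codon by exp(λ · GC count), with λ tuned to hit the target. The sketch below keeps the primary amino acid sequence fixed and is only an illustration of that principle; it is not the NullSeq package or its API, and every function name here is hypothetical.

```python
import math
import random

# Standard genetic code, codons enumerated in TCAG order (NCBI table 1).
BASES = "TCAG"
AA_BY_CODON = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
SYNONYMS = {}
for idx, aa in enumerate(AA_BY_CODON):
    codon = BASES[idx // 16] + BASES[(idx // 4) % 4] + BASES[idx % 4]
    SYNONYMS.setdefault(aa, []).append(codon)


def gc_count(codon):
    return sum(base in "GC" for base in codon)


def codon_weights(aa, lam):
    """Synonymous codons of `aa` with probabilities proportional to exp(lam * GC)."""
    codons = SYNONYMS[aa]
    weights = [math.exp(lam * gc_count(c)) for c in codons]
    total = sum(weights)
    return codons, [w / total for w in weights]


def expected_gc(protein, lam):
    """Expected GC fraction of a coding sequence for `protein` at a given lam."""
    gc = 0.0
    for aa in protein:
        codons, probs = codon_weights(aa, lam)
        gc += sum(p * gc_count(c) for c, p in zip(codons, probs))
    return gc / (3 * len(protein))


def solve_lambda(protein, target_gc, lo=-20.0, hi=20.0, iters=60):
    """Bisection on lam; expected GC is non-decreasing in lam.

    Assumption: target_gc is attainable for this protein.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2
        if expected_gc(protein, mid) < target_gc:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def random_cds(protein, target_gc, rng=random):
    """Sample one coding sequence whose expected GC fraction equals target_gc."""
    lam = solve_lambda(protein, target_gc)
    picked = []
    for aa in protein:
        codons, probs = codon_weights(aa, lam)
        picked.append(rng.choices(codons, weights=probs)[0])
    return "".join(picked)


protein = "MKTAYIAKQR"  # made-up peptide for illustration
print(random_cds(protein, 0.5, random.Random(0)))
```

To randomize amino acid composition as well, one could first draw the residues from the desired amino acid frequencies and then pass the result to `random_cds`.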
Transcriptome-Wide Survey and Expression Profile Analysis of Putative Chrysanthemum HD-Zip I and II Genes

PubMed Central

Song, Aiping; Li, Peiling; Xin, Jingjing; Chen, Sumei; Zhao, Kunkun; Wu, Dan; Fan, Qingqing; Gao, Tianwei; Chen, Fadi; Guan, Zhiyong

2016-01-01

The homeodomain-leucine zipper (HD-Zip) transcription factor family is a key transcription factor family and unique to the plant kingdom. It consists of a homeodomain and a leucine zipper that serve in combination as a dimerization motif. The family can be classified into four subfamilies, and these subfamilies participate in the development of hormones and mediation of hormone action and are involved in plant responses to environmental conditions. However, limited information on this gene family is available for the important chrysanthemum ornamental species (Chrysanthemum morifolium). Here, we characterized 17 chrysanthemum HD-Zip genes based on transcriptome sequences. Phylogenetic analyses revealed that 17 CmHB genes were distributed in the HD-Zip subfamilies I and II and identified two pairs of putative orthologous proteins in Arabidopsis and chrysanthemum and four pairs of paralogous proteins in chrysanthemum. The software MEME was used to identify 7 putative motifs with E values less than 1e-3 in the chrysanthemum HD-Zip factors, and they can be clearly classified into two groups based on the composition of the motifs. A bioinformatics analysis predicted that 8 CmHB genes could be targeted by 10 miRNA families, and the expression of these 17 genes in response to phytohormone treatments and abiotic stresses was characterized. The results presented here will promote research on the various functions of the HD-Zip gene family members in plant hormones and stress responses. PMID:27196930
In silico analysis of surface structure variation of PCV2 capsid resulting from loop mutations of its capsid protein (Cap)

PubMed Central

Wang, Aibing; Zhang, Lijie; Khayat, Reza

2016-01-01

Outbreaks of porcine circovirus (PCV) type 2 (PCV2)-associated diseases have caused substantial economic losses worldwide in the last 20 years. The PCV capsid protein (Cap) is the sole structural protein and main antigenic determinant of this virus. In this study, not only were phylogenetic trees reconstructed, but variations of surface structure of the PCV capsid were analysed in the course of evolution. Unique surface patterns of the icosahedral fivefold axes of the PCV2 capsid were identified and characterized, all of which were absent in PCV type 1 (PCV1). Icosahedral fivefold axes, decorated with Loops BC, HI and DE, were distinctly different between PCV2 and PCV1. Loops BC, determining the outermost surface around the fivefold axes of PCV capsids, had limited homology between Caps of PCV1 and PCV2. A conserved tyrosine phosphorylation motif in Loop HI that might be recognized by non-receptor tyrosine kinase(s) in vivo was present only in PCV2. Particularly, the concurrent presence of 60 pairs of the conserved tyrosine and a canonical PXXP motif on the PCV2 capsid surface could be a mechanism for PXXP motif binding to and activation of an SH3-domain-containing tyrosine kinase in host cells. Additionally, a conserved cysteine in Loop DE of the PCV2 Cap was substituted by an arginine in PCV1, indicating potentially distinct assembly mechanisms of the capsid in vitro between PCV1 and PCV2. Therefore, these unique patterns on the PCV2 capsid surface, absent in PCV1 isolates, might be related to cell entry, virus function and pathogenesis. PMID:27902320
In silico analysis of surface structure variation of PCV2 capsid resulting from loop mutations of its capsid protein (Cap).

PubMed

Wang, Naidong; Zhan, Yang; Wang, Aibing; Zhang, Lijie; Khayat, Reza; Yang, Yi

2016-12-01

Outbreaks of porcine circovirus (PCV) type 2 (PCV2)-associated diseases have caused substantial economic losses worldwide in the last 20 years. The PCV capsid protein (Cap) is the sole structural protein and main antigenic determinant of this virus. In this study, not only were phylogenetic trees reconstructed, but variations of surface structure of the PCV capsid were analysed in the course of evolution. Unique surface patterns of the icosahedral fivefold axes of the PCV2 capsid were identified and characterized, all of which were absent in PCV type 1 (PCV1). Icosahedral fivefold axes, decorated with Loops BC, HI and DE, were distinctly different between PCV2 and PCV1. Loops BC, determining the outermost surface around the fivefold axes of PCV capsids, had limited homology between Caps of PCV1 and PCV2. A conserved tyrosine phosphorylation motif in Loop HI that might be recognized by non-receptor tyrosine kinase(s) in vivo was present only in PCV2. Particularly, the concurrent presence of 60 pairs of the conserved tyrosine and a canonical PXXP motif on the PCV2 capsid surface could be a mechanism for PXXP motif binding to and activation of an SH3-domain-containing tyrosine kinase in host cells. Additionally, a conserved cysteine in Loop DE of the PCV2 Cap was substituted by an arginine in PCV1, indicating potentially distinct assembly mechanisms of the capsid in vitro between PCV1 and PCV2. Therefore, these unique patterns on the PCV2 capsid surface, absent in PCV1 isolates, might be related to cell entry, virus function and pathogenesis.
An evolutionarily conserved motif in the TAB1 C-terminal region is necessary for interaction with and activation of TAK1 MAPKKK.

PubMed

Ono, K; Ohtomo, T; Sato, S; Sugamata, Y; Suzuki, M; Hisamoto, N; Ninomiya-Tsuji, J; Tsuchiya, M; Matsumoto, K

2001-06-29

TAK1, a member of the MAPKKK family, is involved in the intracellular signaling pathways mediated by transforming growth factor beta, interleukin 1, and Wnt. TAK1 kinase activity is specifically activated by the TAK1-binding protein TAB1. The C-terminal 68-amino acid sequence of TAB1 (TAB1-C68) is sufficient for TAK1 interaction and activation. Analysis of various truncated versions of TAB1-C68 defined a C-terminal 30-amino acid sequence (TAB1-C30) necessary for TAK1 binding and activation. NMR studies revealed that the TAB1-C30 region has a unique alpha-helical structure. We identified a conserved sequence motif, PYVDXA/TXF, in the C-terminal domain of mammalian TAB1, Xenopus TAB1, and its Caenorhabditis elegans homolog TAP-1, suggesting that this motif constitutes a specific TAK1 docking site. Alanine substitution mutagenesis showed that TAB1 Phe-484, located in the conserved motif, is crucial for TAK1 binding and activation. The C. elegans homolog of TAB1, TAP-1, was able to interact with and activate the C. elegans homolog of TAK1, MOM-4. However, the site in TAP-1 corresponding to Phe-484 of TAB1 is an alanine residue (Ala-364), and changing this residue to Phe abrogates the ability of TAP-1 to interact with and activate MOM-4. These results suggest that the Phe or Ala residue within the conserved motif of the TAB1-related proteins is important for interaction with and activation of specific TAK1 MAPKKK family members in vivo.
A Chromatin Insulator-Like Element in the Herpes Simplex Virus Type 1 Latency-Associated Transcript Region Binds CCCTC-Binding Factor and Displays Enhancer-Blocking and Silencing Activities

PubMed Central

Amelio, Antonio L.; McAnany, Peterjon K.; Bloom, David C.

2006-01-01

A previous study demonstrated that the latency-associated transcript (LAT) promoter and the LAT enhancer/reactivation critical region (rcr) are enriched in acetyl histone H3 (K9, K14) during herpes simplex virus type 1 (HSV-1) latency, whereas all lytic genes analyzed (ICP0, UL54, ICP4, and DNA polymerase) are not (N. J. Kubat, R. K. Tran, P. McAnany, and D. C. Bloom, J. Virol. 78:1139-1149, 2004). This suggests that the HSV-1 latent genome is organized into histone H3 (K9, K14) hyperacetylated and hypoacetylated regions corresponding to transcriptionally permissive and transcriptionally repressed chromatin domains, respectively. Such an organization implies that chromatin insulators, similar to those of cellular chromosomes, may separate distinct transcriptional domains of the HSV-1 latent genome. In the present study, we sought to identify cis elements that could partition the HSV-1 genome into distinct chromatin domains. Sequence analysis coupled with chromatin immunoprecipitation and luciferase reporter assays revealed that (i) the long and short repeats and the unique-short region of the HSV-1 genome contain clustered CTCF (CCCTC-binding factor) motifs, (ii) CTCF motif clusters similar to those in HSV-1 are conserved in other alphaherpesviruses, (iii) CTCF binds to these motifs on latent HSV-1 genomes in vivo, and (iv) a 1.5-kb region containing the CTCF motif cluster in the LAT region possesses insulator activities, specifically, enhancer blocking and silencing. The finding that CTCF, a cellular protein associated with chromatin insulators, binds to motifs on the latent genome and insulates the LAT enhancer suggests that CTCF may facilitate the formation of distinct chromatin boundaries during herpesvirus latency. PMID:16474142
New insights into the targeting of a subset of tail-anchored proteins to the outer mitochondrial membrane

PubMed Central

Marty, Naomi J.; Teresinski, Howard J.; Hwang, Yeen Ting; Clendening, Eric A.; Gidda, Satinder K.; Sliwinska, Elwira; Zhang, Daiyuan; Miernyk, Ján A.; Brito, Glauber C.; Andrews, David W.; Dyer, John M.; Mullen, Robert T.

2014-01-01

Tail-anchored (TA) proteins are a unique class of functionally diverse membrane proteins defined by their single C-terminal membrane-spanning domain and their ability to insert post-translationally into specific organelles with an N(cytoplasm)-C(organelle interior) orientation. The molecular mechanisms by which TA proteins are sorted to the proper organelles are not well understood. Herein we present results indicating that a dibasic targeting motif (i.e., -R-R/K/H-X{X≠E}), identified previously in the C terminus of the mitochondrial isoform of the TA protein cytochrome b5, also exists in many other A. thaliana outer mitochondrial membrane (OMM)-TA proteins. This motif is conspicuously absent, however, in all but one of the TA protein subunits of the translocon at the outer membrane of mitochondria (TOM), suggesting that these two groups of proteins utilize distinct biogenetic pathways. Consistent with this premise, we show that the TA sequences of the dibasic-containing proteins are both necessary and sufficient for targeting to mitochondria, and are interchangeable, while the TA regions of TOM proteins lacking a dibasic motif are necessary, but not sufficient for localization, and cannot be functionally exchanged. We also present results from a comprehensive mutational analysis of the dibasic motif and surrounding sequences that not only greatly expands the functional definition and context-dependent properties of this targeting signal, but also led to the identification of other novel putative OMM-TA proteins. Collectively, these results provide important insight into the complexity of the targeting pathways involved in the biogenesis of OMM-TA proteins and help define a consensus targeting motif that is utilized by at least a subset of these proteins. PMID:25237314
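As a rough illustration of the dibasic signal described in this abstract, the pattern -R-R/K/H-X (X≠E) can be written as a regular expression. The Python sketch below is not the authors' method: the function name, the length of the C-terminal window that is searched, and the example sequences are all hypothetical.

```python
import re

# -R-R/K/H-X{X≠E}: Arg, then Arg/Lys/His, then any residue except Glu.
# The lookahead lets overlapping occurrences (e.g. in "RRKA") all be reported.
DIBASIC = re.compile(r"(?=(R[RKH][^E]))")


def find_dibasic(protein: str, tail: int = 10) -> list[tuple[int, str]]:
    """Return (1-based position, matched residues) within the last `tail` residues.

    The `tail` window is an assumption made for illustration only.
    """
    seq = protein.upper()
    offset = max(len(seq) - tail, 0)
    window = seq[offset:]
    return [(offset + m.start() + 1, m.group(1)) for m in DIBASIC.finditer(window)]


if __name__ == "__main__":
    # Hypothetical tail sequences, not real A. thaliana proteins.
    print(find_dibasic("AAAAAAAAAARKS"))  # motif present near the C terminus
    print(find_dibasic("AAAAAAAAAARKE"))  # Glu in the X position, so no hit
```

Restricting the search to a short C-terminal window reflects the abstract's statement that the motif lies in the C terminus; the exact window size would need to be chosen from the experimental data.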

Unique scorpion toxin with a putative ancestral fold provides insight into evolution of the inhibitor cystine knot motif.

PubMed

Smith, Jennifer J; Hill, Justine M; Little, Michelle J; Nicholson, Graham M; King, Glenn F; Alewood, Paul F

2011-06-28

The three-disulfide inhibitor cystine knot (ICK) motif is a fold common to venom peptides from spiders, scorpions, and aquatic cone snails. Over a decade ago it was proposed that the ICK motif is an elaboration of an ancestral two-disulfide fold coined the disulfide-directed β-hairpin (DDH). Here we report the isolation, characterization, and structure of a novel toxin [U(1)-liotoxin-Lw1a (U(1)-LITX-Lw1a)] from the venom of the scorpion Liocheles waigiensis that is the first example of a native peptide that adopts the DDH fold. U(1)-LITX-Lw1a not only represents the discovery of a missing link in venom protein evolution, it is the first member of a fourth structural fold to be adopted by scorpion-venom peptides. Additionally, we show that U(1)-LITX-Lw1a has potent insecticidal activity across a broad range of insect pest species, thereby providing a unique structural scaffold for bioinsecticide development.
Dynamic motif occupancy (DynaMO) analysis identifies transcription factors and their binding sites driving dynamic biological processes.

PubMed

Kuang, Zheng; Ji, Zhicheng; Boeke, Jef D; Ji, Hongkai

2018-01-09

Biological processes are usually associated with genome-wide remodeling of transcription driven by transcription factors (TFs). Identifying key TFs and their spatiotemporal binding patterns are indispensable to understanding how dynamic processes are programmed. However, most methods are designed to predict TF binding sites only. We present a computational method, dynamic motif occupancy analysis (DynaMO), to infer important TFs and their spatiotemporal binding activities in dynamic biological processes using chromatin profiling data from multiple biological conditions such as time-course histone modification ChIP-seq data. In the first step, DynaMO predicts TF binding sites with a random forests approach. Next and uniquely, DynaMO infers dynamic TF binding activities at predicted binding sites using their local chromatin profiles from multiple biological conditions. Another landmark of DynaMO is to identify key TFs in a dynamic process using a clustering and enrichment analysis of dynamic TF binding patterns. Application of DynaMO to the yeast ultradian cycle, mouse circadian clock and human neural differentiation exhibits its accuracy and versatility. We anticipate DynaMO will be generally useful for elucidating transcriptional programs in dynamic processes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Intragenic motifs regulate the transcriptional complexity of Pkhd1/PKHD1

PubMed Central

Boddu, Ravindra; Yang, Chaozhe; O’Connor, Amber K.; Hendrickson, Robert Curtis; Boone, Braden; Cui, Xiangqin; Garcia-Gonzalez, Miguel; Igarashi, Peter; Onuchic, Luiz F.; Germino, Gregory G.

2014-01-01

Autosomal recessive polycystic kidney disease (ARPKD) results from mutations in the human PKHD1 gene. Both this gene, and its mouse ortholog, Pkhd1, are primarily expressed in renal and biliary ductal structures. The mouse protein product, fibrocystin/polyductin complex (FPC), is a 445-kDa protein encoded by a 67-exon transcript that spans >500 kb of genomic DNA. In the current study, we observed multiple alternatively spliced Pkhd1 transcripts that varied in size and exon composition in embryonic mouse kidney, liver, and placenta samples, as well as among adult mouse pancreas, brain, heart, lung, testes, liver, and kidney. Using reverse transcription PCR and RNASeq, we identified 22 novel Pkhd1 kidney transcripts with unique exon junctions. Various mechanisms of alternative splicing were observed, including exon skipping, use of alternate acceptor/donor splice sites, and inclusion of novel exons. Bioinformatic analyses identified, and exon-trapping minigene experiments validated, consensus binding sites for serine/arginine-rich proteins that modulate alternative splicing. Using site-directed mutagenesis, we examined the functional importance of selected splice enhancers. In addition, we demonstrated that many of the novel transcripts were polysome bound, thus likely translated. Finally, we determined that the human PKHD1 R760H missense variant alters a splice enhancer motif that disrupts exon splicing in vitro and is predicted to truncate the protein. Taken together, these data provide evidence of the complex transcriptional regulation of Pkhd1/PKHD1 and identified motifs that regulate its splicing. Our studies indicate that Pkhd1/PKHD1 transcription is modulated, in part by intragenic factors, suggesting that aberrant PKHD1 splicing represents an unappreciated pathogenic mechanism in ARPKD. PMID:24984783
Identification and characterization of novel and potent transcription promoters of Francisella tularensis.

PubMed

Zaide, Galia; Grosfeld, Haim; Ehrlich, Sharon; Zvi, Anat; Cohen, Ofer; Shafferman, Avigdor

2011-03-01

Two alternative promoter trap libraries, based on the green fluorescence protein (gfp) reporter and on the chloramphenicol acetyltransferase (cat) cassette, were constructed for isolation of potent Francisella tularensis promoters. Of the 26,000 F. tularensis strain LVS gfp library clones, only 3 exhibited visible fluorescence following UV illumination and all appeared to carry the bacterioferritin promoter (Pbfr). Out of a total of 2,000 chloramphenicol-resistant LVS clones isolated from the cat promoter library, we arbitrarily selected 40 for further analysis. Over 80% of these clones carry unique F. tularensis DNA sequences which appear to drive a wide range of protein expression, as determined by specific chloramphenicol acetyltransferase (CAT) Western dot blot and enzymatic assays. The DNA sequence information for the 33 unique and novel F. tularensis promoters reported here, along with the results of in silico and primer extension analyses, suggest that F. tularensis possesses classical Escherichia coli σ(70)-related promoter motifs. These motifs include the -10 (TATAAT) and -35 [TTGA(C/T)A] domains and an AT-rich region upstream from -35, reminiscent of but distinct from the E. coli upstream region that is termed the UP element. The most efficient promoter identified (Pbfr) appears to be about 10 times more potent than the F. tularensis groEL promoter and is probably among the strongest promoters in F. tularensis. The battery of promoters identified in this work will be useful, among other things, for genetic manipulation in the background of F. tularensis intended to gain better understanding of the mechanisms involved in pathogenesis and virulence, as well as for vaccine development studies.
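The consensus elements named in this abstract, -10 (TATAAT) and -35 [TTGA(C/T)A], lend themselves to a simple sequence scan. The following Python sketch is illustrative only: the spacer range (16 to 18 bp), the one-mismatch tolerance per box, the function names, and the test sequence are assumptions, and none of them come from the study.

```python
MINUS_35 = ("TTGACA", "TTGATA")  # TTGA(C/T)A expanded into its two variants
MINUS_10 = "TATAAT"
BOX = 6  # length of each hexamer


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def scan_promoters(dna, spacers=range(16, 19), max_mismatch=1):
    """Return (start of -35 box, start of -10 box, total mismatches) tuples.

    Coordinates are 0-based. `spacers` and `max_mismatch` are assumed
    values chosen for illustration.
    """
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - BOX + 1):
        m35 = min(mismatches(dna[i:i + BOX], c) for c in MINUS_35)
        if m35 > max_mismatch:
            continue
        for gap in spacers:
            j = i + BOX + gap
            box10 = dna[j:j + BOX]
            if len(box10) < BOX:
                break  # ran off the end of the sequence
            m10 = mismatches(box10, MINUS_10)
            if m10 <= max_mismatch:
                hits.append((i, j, m35 + m10))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence: perfect boxes separated by a 17-bp spacer.
    demo = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC"
    print(scan_promoters(demo))
```

A mismatch-tolerant scan is used instead of an exact match because natural promoters rarely match the consensus at every position; a position weight matrix would be the more usual choice once enough confirmed promoters are available.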
Cooperative Tertiary Interaction Network Guides RNA Folding

DOE Office of Scientific and Technical Information (OSTI.GOV)

Behrouzi, Reza; Roh, Joon Ho; Kilburn, Duncan

2013-04-08

Noncoding RNAs form unique 3D structures, which perform many regulatory functions. To understand how RNAs fold uniquely despite a small number of tertiary interaction motifs, we mutated the major tertiary interactions in a group I ribozyme by single-base substitutions. The resulting perturbations to the folding energy landscape were measured using SAXS, ribozyme activity, hydroxyl radical footprinting, and native PAGE. Double- and triple-mutant cycles show that most tertiary interactions have a small effect on the stability of the native state. Instead, the formation of core and peripheral structural motifs is cooperatively linked in near-native folding intermediates, and this cooperativity depends on the native helix orientation. The emergence of a cooperative interaction network at an early stage of folding suppresses nonnative structures and guides the search for the native state. We suggest that cooperativity in noncoding RNAs arose from natural selection of architectures conducive to forming a unique, stable fold.
Placement of molecules in (not out of) the cell

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dauter, Zbigniew, E-mail: dauter@anl.gov

2013-01-01

The importance of presenting macromolecular structures in unified, standard ways is discussed. To uniquely describe a crystal structure, it is sufficient to specify the crystal unit cell and symmetry, and describe the unique structural motif which is repeated by the space-group symmetry throughout the whole crystal. It is somewhat arbitrary how such a unique motif can be defined and positioned with respect to the unit-cell origin. As a result of such freedom, some isomorphous structures are presented in the Protein Data Bank in different locations and appear as if they have different atomic coordinates, despite being completely equivalent structurally. This may easily confuse those users of the PDB who are less familiar with crystallographic symmetry transformations. It would therefore be beneficial for the community of PDB users to introduce standard rules for locating crystal structures of macromolecules in the unit cells of various space groups.
A Rare SNP Identified a TCP Transcription Factor Essential for Tendril Development in Cucumber.

PubMed

Wang, Shenhao; Yang, Xueyong; Xu, Mengnan; Lin, Xingzhong; Lin, Tao; Qi, Jianjian; Shao, Guangjin; Tian, Nana; Yang, Qing; Zhang, Zhonghua; Huang, Sanwen

2015-12-07

Rare genetic variants are abundant in genomes but less tractable in genome-wide association study. Here we exploit a strategy of rare variation mapping to discover a gene essential for tendril development in cucumber (Cucumis sativus L.). In a collection of >3000 lines, we discovered a unique tendril-less line that forms branches instead of tendrils and, therefore, loses its climbing ability. We hypothesized that this unusual phenotype was caused by a rare variation and subsequently identified the causative single nucleotide polymorphism. The affected gene TEN encodes a TCP transcription factor conserved within the cucurbits and is expressed specifically in tendrils, representing a new organ identity gene. The variation occurs within a protein motif unique to the cucurbits and impairs its function as a transcriptional activator. Analyses of transcriptomes from near-isogenic lines identified downstream genes required for the tendril's capability to sense and climb a support. This study provides an example to explore rare functional variants in plant genomes. Copyright © 2015 The Author. Published by Elsevier Inc. All rights reserved.
Improving the Accuracy and Scalability of Discriminative Learning Methods for Markov Logic Networks

DTIC Science & Technology

2011-05-01

...positive examples. Aleph allows users to customize each of these steps, and thereby supports a variety of specific algorithms. ...tural motifs. By limiting the search to each unique motif, LSM is able to find good clauses in an efficient manner. Alchemy (Kok, Singla, Richardson
An experimental test of a fundamental food web motif.

PubMed

Rip, Jason M K; McCann, Kevin S; Lynn, Denis H; Fawcett, Sonia

2010-06-07

Large-scale changes to the world's ecosystem are resulting in the deterioration of biostructure, the complex web of species interactions that make up ecological communities. A difficult, yet crucial task is to identify food web structures, or food web motifs, that are the building blocks of this baroque network of interactions. Once identified, these food web motifs can then be examined through experiments and theory to provide mechanistic explanations for how structure governs ecosystem stability. Here, we synthesize recent ecological research to show that generalist consumers coupling resources with different interaction strengths is one such motif. This motif occurs across an enormous range of spatial scales, and so acts to distribute coupled weak and strong interactions throughout food webs. We then perform an experiment that illustrates the importance of this motif to ecological stability. We find that weak interactions coupled to strong interactions by generalist consumers dampen strong interaction strengths and increase community stability. This study takes a critical step by isolating a common food web motif and, through clear experimental manipulation, identifies the fundamental stabilizing consequences of this structure for ecological communities.
Auxiliary KChIP4a Suppresses A-type K+ Current through Endoplasmic Reticulum (ER) Retention and Promoting Closed-state Inactivation of Kv4 Channels*

PubMed Central

Tang, Yi-Quan; Liang, Ping; Zhou, Jingheng; Lu, Yanxin; Lei, Lei; Bian, Xiling; Wang, KeWei

2013-01-01

In the brain and heart, auxiliary Kv channel-interacting proteins (KChIPs) co-assemble with pore-forming Kv4 α-subunits to form a native K+ channel complex and regulate the expression and gating properties of Kv4 currents. Among the KChIP1–4 members, KChIP4a exhibits a unique N terminus that is known to suppress Kv4 function, but the underlying mechanism of Kv4 inhibition remains unknown. Using a combination of confocal imaging, surface biotinylation, and electrophysiological recordings, we identified a novel endoplasmic reticulum (ER) retention motif, consisting of six hydrophobic and aliphatic residues, 12–17 (LIVIVL), within the KChIP4a N-terminal KID, that functions to reduce surface expression of Kv4-KChIP complexes. This ER retention capacity is transferable and depends on its flanking location. In addition, adjacent to the ER retention motif, the residues 19–21 (VKL motif) directly promote closed-state inactivation of Kv4.3, thus leading to an inhibition of channel current. Taken together, our findings demonstrate that KChIP4a suppresses A-type Kv4 current via ER retention and enhancement of Kv4 closed-state inactivation. PMID:23576435
Auxiliary KChIP4a suppresses A-type K+ current through endoplasmic reticulum (ER) retention and promoting closed-state inactivation of Kv4 channels.

PubMed

Tang, Yi-Quan; Liang, Ping; Zhou, Jingheng; Lu, Yanxin; Lei, Lei; Bian, Xiling; Wang, KeWei

2013-05-24

In the brain and heart, auxiliary Kv channel-interacting proteins (KChIPs) co-assemble with pore-forming Kv4 α-subunits to form a native K(+) channel complex and regulate the expression and gating properties of Kv4 currents. Among the KChIP1-4 members, KChIP4a exhibits a unique N terminus that is known to suppress Kv4 function, but the underlying mechanism of Kv4 inhibition remains unknown. Using a combination of confocal imaging, surface biotinylation, and electrophysiological recordings, we identified a novel endoplasmic reticulum (ER) retention motif, consisting of six hydrophobic and aliphatic residues, 12-17 (LIVIVL), within the KChIP4a N-terminal KID, that functions to reduce surface expression of Kv4-KChIP complexes. This ER retention capacity is transferable and depends on its flanking location. In addition, adjacent to the ER retention motif, the residues 19-21 (VKL motif) directly promote closed-state inactivation of Kv4.3, thus leading to an inhibition of channel current. Taken together, our findings demonstrate that KChIP4a suppresses A-type Kv4 current via ER retention and enhancement of Kv4 closed-state inactivation.
Linear motif-mediated interactions have contributed to the evolution of modularity in complex protein interaction networks.

PubMed

Kim, Inhae; Lee, Heetak; Han, Seong Kyu; Kim, Sanguk

2014-10-01

The modular architecture of protein-protein interaction (PPI) networks is evident in diverse species with a wide range of complexity. However, the molecular components that lead to the evolution of modularity in PPI networks have not been clearly identified. Here, we show that weak domain-linear motif interactions (DLIs) are more likely to connect different biological modules than strong domain-domain interactions (DDIs). This molecular division of labor is essential for the evolution of modularity in the complex PPI networks of diverse eukaryotic species. In particular, DLIs may compensate for the reduction in module boundaries that originate from increased connections between different modules in complex PPI networks. In addition, we show that the identification of biological modules can be greatly improved by including molecular characteristics of protein interactions. Our findings suggest that transient interactions have played a unique role in shaping the architecture and modularity of biological networks over the course of evolution.
Neurogliaform cortical interneurons derive from cells in the preoptic area

PubMed Central

Cadilhac, Christelle; Prados, Julien; Holtmaat, Anthony

2018-01-01

Delineating the basic cellular components of cortical inhibitory circuits remains a fundamental issue in order to understand their specific contributions to microcircuit function. It is still unclear how current classifications of cortical interneuron subtypes relate to biological processes such as their developmental specification. Here we identified the developmental trajectory of neurogliaform cells (NGCs), the main effectors of a powerful inhibitory motif recruited by long-range connections. Using in vivo genetic lineage-tracing in mice, we report that NGCs originate from a specific pool of 5-HT3AR-expressing Hmx3+ cells located in the preoptic area (POA). Hmx3-derived 5-HT3AR+ cortical interneurons (INs) expressed the transcription factors PROX1, NR2F2, the marker reelin but not VIP and exhibited the molecular, morphological and electrophysiological profile of NGCs. Overall, these results indicate that NGCs are a distinct class of INs with a unique developmental trajectory and open the possibility to study their specific functional contribution to cortical inhibitory microcircuit motifs. PMID:29557780
Functional cis-regulatory modules encoded by mouse-specific endogenous retrovirus

PubMed Central

Sundaram, Vasavi; Choudhary, Mayank N. K.; Pehrsson, Erica; Xing, Xiaoyun; Fiore, Christopher; Pandey, Manishi; Maricque, Brett; Udawatta, Methma; Ngo, Duc; Chen, Yujie; Paguntalan, Asia; Ray, Tammy; Hughes, Ava; Cohen, Barak A.; Wang, Ting

2017-01-01

Cis-regulatory modules contain multiple transcription factor (TF)-binding sites and integrate the effects of each TF to control gene expression in specific cellular contexts. Transposable elements (TEs) are uniquely equipped to deposit their regulatory sequences across a genome, which could also contain cis-regulatory modules that coordinate the control of multiple genes with the same regulatory logic. We provide the first evidence of mouse-specific TEs that encode a module of TF-binding sites in mouse embryonic stem cells (ESCs). The majority (77%) of the individual TEs tested exhibited enhancer activity in mouse ESCs. By mutating individual TF-binding sites within the TE, we identified a module of TF-binding motifs that cooperatively enhanced gene expression. Interestingly, we also observed the same motif module in the in silico constructed ancestral TE that also acted cooperatively to enhance gene expression. Our results suggest that ancestral TE insertions might have brought in cis-regulatory modules into the mouse genome. PMID:28348391
SARNAclust: Semi-automatic detection of RNA protein binding motifs from immunoprecipitation data

PubMed Central

Dotu, Ivan; Adamson, Scott I.; Coleman, Benjamin; Fournier, Cyril; Ricart-Altimiras, Emma; Eyras, Eduardo

2018-01-01

RNA-protein binding is critical to gene regulation, controlling fundamental processes including splicing, translation, localization and stability, and aberrant RNA-protein interactions are known to play a role in a wide variety of diseases. However, molecular understanding of RNA-protein interactions remains limited; in particular, identification of RNA motifs that bind proteins has long been challenging, especially when such motifs depend on both sequence and structure. Moreover, although RNA binding proteins (RBPs) often contain more than one binding domain, algorithms capable of identifying more than one binding motif simultaneously have not been developed. In this paper we present a novel pipeline to determine binding peaks in crosslinking immunoprecipitation (CLIP) data, to discover multiple possible RNA sequence/structure motifs among them, and to experimentally validate such motifs. At the core is a new semi-automatic algorithm SARNAclust, the first unsupervised method to identify and deconvolve multiple sequence/structure motifs simultaneously. SARNAclust computes similarity between sequence/structure objects using a graph kernel, providing the ability to isolate the impact of specific features through the bulge graph formalism. Application of SARNAclust to synthetic data shows its capability of clustering 5 motifs at once with a V-measure value of over 0.95, while GraphClust achieves only a V-measure of 0.083 and RNAcontext cannot detect any of the motifs. When applied to existing eCLIP sets, SARNAclust finds known motifs for SLBP and HNRNPC and novel motifs for several other RBPs such as AGGF1, AKAP8L and ILF3. We demonstrate an experimental validation protocol, a targeted Bind-n-Seq-like high-throughput sequencing approach that relies on RNA inverse folding for oligo pool design, that can validate the components within the SLBP motif. Finally, we use this protocol to experimentally interrogate the SARNAclust motif predictions for protein ILF3. Our results support a newly identified partially double-stranded UUUUUGAGA motif similar to that known for the splicing factor HNRNPC. PMID:29596423
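The V-measure quoted in this abstract is the harmonic mean of homogeneity (each cluster holds members of a single class) and completeness (each class falls in a single cluster). A minimal Python sketch of that definition follows; it is not SARNAclust's evaluation code, and the function names and toy labels are hypothetical.

```python
from collections import Counter
from math import log


def _entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target, given):
    """H(target | given), computed from joint and marginal counts."""
    n = len(target)
    joint = Counter(zip(given, target))
    marginal = Counter(given)
    return -sum((c / n) * log(c / marginal[g]) for (g, _), c in joint.items())


def v_measure(true_labels, cluster_labels):
    """Harmonic mean of homogeneity and completeness (equal weighting)."""
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label lists must have the same length")
    h_class = _entropy(true_labels)
    h_cluster = _entropy(cluster_labels)
    # By convention, a single class (or a single cluster) scores 1.0.
    homogeneity = 1.0 if h_class == 0 else max(
        0.0, 1.0 - _conditional_entropy(true_labels, cluster_labels) / h_class)
    completeness = 1.0 if h_cluster == 0 else max(
        0.0, 1.0 - _conditional_entropy(cluster_labels, true_labels) / h_cluster)
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


if __name__ == "__main__":
    truth = [0, 0, 1, 1]                      # toy motif classes
    print(v_measure(truth, ["a", "a", "b", "b"]))  # clusters mirror the classes
    print(v_measure(truth, ["a", "b", "a", "b"]))  # clusters ignore the classes
```

Because the score depends only on how labels co-occur, cluster names do not need to match class names, which suits unsupervised motif clustering where cluster identifiers are arbitrary.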
Regioselective Aziridination of Silyl Allenes and Application for the Synthesis of New Heterocycles

NASA Astrophysics Data System (ADS)

Burke, Eileen Grace

Rhodium-catalyzed aziridination of homoallenic sulfamates has proven to be a successful first step in the synthesis of a diverse array of complex nitrogenated motifs. Previously, however, the resultant methyleneaziridine was limited to the exocyclic isomer. In this work, a reliable direction strategy for the formation of the endocyclic isomer was identified. Placement of a silicon group on the allene so that its C-Si bond is co-planar to the distal pi-bond allowed for stabilization of the developing positive charge during aziridination, and therefore selective activation to form the endocyclic methyleneaziridine. This strategy proved robust, and endocyclic methyleneaziridines were formed in high yields with exclusive formation of the desired isomer regardless of the substitution of the allene. With the endocyclic isomer now readily available, its reactivity could be explored. First, the endocyclic methyleneaziridine was applied to the synthesis of densely functionalized, nitrogen-containing motifs. In comparison with their exocyclic counterparts, the endocyclic methyleneaziridines were found to have differing reactivity. The olefin could be epoxidized using meta-chloroperoxybenzoic acid (mCPBA), and the resulting spirocyclic intermediate rapidly rearranged to an azetidin-3-one. This synthesis of the highly substituted four-membered heterocycle represented a novel approach to these motifs, and was found both to be flexible and to selectively form a single diastereomer. Additional derivatization of these scaffolds gave a diverse array of complex products. Further use of the endocyclic methyleneaziridine focused not on the complexity of the product motif but rather on its utility. The remaining silyl group could be eliminated upon reaction with a fluoride source, triggering the formation of an alkyne and resultant opening of the aziridine. This strained heterocyclic alkyne and its synthesis represent a new addition to the field of strained alkyne synthesis. Uniquely, the arrangement of heteroatoms activated the alkyne without allowing for detrimental relaxation of ring strain, giving a strained alkyne that balanced reactivity and stability. These alkynes were applied to post-polymerization modification, wherein their unique capability of opening the strain-inducing ring after reaction of the alkyne was successfully demonstrated.
RNA 3D Structural Motifs: Definition, Identification, Annotation, and Database Searching

NASA Astrophysics Data System (ADS)

Nasalean, Lorena; Stombaugh, Jesse; Zirbel, Craig L.; Leontis, Neocles B.

Structured RNA molecules resemble proteins in the hierarchical organization of their global structures, folding and broad range of functions. Structured RNAs are composed of recurrent modular motifs that play specific functional roles. Some motifs direct the folding of the RNA or stabilize the folded structure through tertiary interactions. Others bind ligands or proteins or catalyze chemical reactions. Therefore, it is desirable, starting from the RNA sequence, to be able to predict the locations of recurrent motifs in RNA molecules. Conversely, the potential occurrence of one or more known 3D RNA motifs may indicate that a genomic sequence codes for a structured RNA molecule. To identify known RNA structural motifs in new RNA sequences, precise structure-based definitions are needed that specify the core nucleotides of each motif and their conserved interactions. By comparing instances of each recurrent motif and applying base pair isostericity relations, one can identify neutral mutations that preserve its structure and function in the contexts in which it occurs.
Discovery of T Cell Receptor β Motifs Specific to HLA-B27-Positive Ankylosing Spondylitis by Deep Repertoire Sequence Analysis.

PubMed

Faham, Malek; Carlton, Victoria; Moorhead, Martin; Zheng, Jianbiao; Klinger, Mark; Pepin, Francois; Asbury, Thomas; Vignali, Marissa; Emerson, Ryan O; Robins, Harlan S; Ireland, James; Baechler-Gillespie, Emily; Inman, Robert D

2017-04-01

Ankylosing spondylitis (AS), a chronic inflammatory disorder, has a notable association with HLA-B27. One hypothesis suggests that a common antigen that binds to HLA-B27 is important for AS disease pathogenesis. This study was undertaken to determine sequences and motifs that are shared among HLA-B27-positive AS patients, using T cell repertoire next-generation sequencing. To identify motifs enriched among B27-positive AS patients, we performed T cell receptor β (TCRβ) repertoire sequencing on samples from 191 B27-positive AS patients, 43 B27-negative AS patients, and 227 controls, and we obtained >77 million TCRβ clonotype sequences. First, we assessed whether any of 50 previously published sequences were enriched in B27-positive AS patients. We then used training and test cohorts to identify discovered motifs that were enriched in B27-positive AS patients versus controls. Six previously published and 11 discovered motifs were enriched in the B27-positive AS samples as compared to controls. After combining motifs related by sequence, we identified a total of 15 independent motifs. Both the full set of 15 motifs and a set of 6 published motifs were enriched in the B27-positive AS patients as compared to B27-positive healthy individuals (P = 0.049 and P = 0.001, respectively). Using an independent cohort, we validated that at least some of these motifs were associated with AS, and not simply with B27-positive status. We identified TCRβ motifs that are enriched in B27-positive AS patients as compared to B27-positive healthy controls. This suggests that a common antigen, presented by HLA-B27 and detected by CD8+ T cells, may be associated with AS disease pathogenesis. © 2016, American College of Rheumatology.
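Enrichment of a motif in one cohort relative to another can be assessed with a one-sided Fisher's exact test on carrier counts. The abstract does not state which test the authors used, so the Python sketch below is only one plausible way to do it. The cohort sizes (191 and 227) are taken from the abstract; the carrier counts and the function name are hypothetical.

```python
from math import comb


def fisher_enrichment_p(case_hits, case_total, ctrl_hits, ctrl_total):
    """One-sided Fisher's exact test for enrichment in the case group.

    Returns P(X >= case_hits), where X is hypergeometric: the number of
    motif carriers among `case_total` individuals drawn from the pooled cohort.
    """
    total = case_total + ctrl_total
    hits = case_hits + ctrl_hits
    denominator = comb(total, case_total)
    upper = min(hits, case_total)
    numerator = sum(
        comb(hits, k) * comb(total - hits, case_total - k)
        for k in range(case_hits, upper + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical carrier counts: 30 of 191 patients versus 10 of 227 controls.
    p = fisher_enrichment_p(30, 191, 10, 227)
    print(f"one-sided p = {p:.3g}")
```

Exact integer arithmetic via `math.comb` avoids the underflow that floating-point factorials would cause at these cohort sizes. When many motifs are screened at once, a multiple-testing correction would also be needed.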
Transcriptional mapping of the varicella-zoster virus regulatory genes encoding open reading frames 4 and 63.

PubMed Central

Kinchington, P R; Vergnes, J P; Defechereux, P; Piette, J; Turse, S E

1994-01-01

Four of the 68 varicella-zoster virus (VZV) unique open reading frames (ORFs), i.e., ORFs 4, 61, 62, and 63, encode proteins that influence viral transcription and are considered to be positional homologs of herpes simplex virus type 1 (HSV-1) immediate-early (IE) proteins. In order to identify the elements that regulate transcription of VZV ORFs 4 and 63, the encoded mRNAs were mapped in detail. For ORF 4, a major 1.8-kb and a minor 3.0-kb polyadenylated [poly(A)+] RNA were identified, whereas ORF 63-specific probes recognized 1.3- and 1.9-kb poly(A)+ RNAs. Probes specific for sequences adjacent to the ORFs and mapping of the RNA 3' ends indicated that the ORF 4 RNAs were 3' coterminal, whereas the RNAs for ORF 63 represented two different termination sites. S1 nuclease mapping and primer extension analyses indicated a single transcription initiation site for ORF 4 at 38 bp upstream of the ORF start codon. For ORF 63, multiple transcriptional start sites at 87 to 95, 151 to 153, and (tentatively) 238 to 243 bp upstream of the ORF start codon were identified. TATA box motifs at good positional locations were found upstream of all mapped transcription initiation sites. However, no sequences resembling the TAATGARAT motif, which confers IE regulation upon HSV-1 IE genes, were found. The absence of this motif was supported by analyses of the regulatory sequences of ORFs 4 and 63 in transient transfection assays alongside those of ORFs 61 and 62. Sequences representing the promoters for ORFs 4, 61, and 63 were all stimulated by VZV infection but failed to be stimulated by coexpression with the HSV-1 transactivator Vmw65. In contrast, the promoter for ORF 62, which contains TAATGARAT motifs, was activated by VZV infection and coexpression with Vmw65. These results extend the transcriptional knowledge for VZV and suggest that ORFs 4 and 63 contain regulatory signals different from those of the ORF 62 and HSV-1 IE genes. PMID:8189496
Discovery of Critical Residues for Viral Entry and Inhibition through Structural Insight of HIV-1 Fusion Inhibitor CP621–652

PubMed Central

Chong, Huihui; Yao, Xue; Qiu, Zonglin; Qin, Bo; Han, Ruiyun; Waltersperger, Sandro; Wang, Meitian; Cui, Sheng; He, Yuxian

2012-01-01

The core structure of HIV-1 gp41 is a stable six-helix bundle (6-HB) folded by its trimeric N- and C-terminal heptad repeats (NHR and CHR). We previously identified that the 621QIWNNMT627 motif located at the upstream region of gp41 CHR plays critical roles in the stabilization of the 6-HB core and that peptide CP621–652 containing this motif is a potent HIV-1 fusion inhibitor; however, the molecular determinants underlying the stability and anti-HIV activity remained elusive. In this study, we determined the high-resolution crystal structure of CP621–652 complexed with T21. We find that the 621QIWNNMT627 motif does not maintain the α-helical conformation. Instead, residues Met626 and Thr627 form a unique hook-like structure (denoted as M-T hook), in which Thr627 redirects the peptide chain to position Met626 above the left side of the hydrophobic pocket on the NHR trimer. The side chain of Met626 caps the hydrophobic pocket, stabilizing the interaction between the pocket and the pocket-binding domain. Our mutagenesis studies demonstrate that mutations of the M-T hook residues could completely abolish HIV-1 Env-mediated cell fusion and virus entry, significantly destabilize the interaction of NHR and CHR peptides, and reduce the anti-HIV activity of CP621–652. Our results identify an unusual structural feature that stabilizes the six-helix bundle, providing novel insights into the mechanisms of HIV-1 fusion and inhibition. PMID:22511760

Identification of a Novel Penicillin-Binding Protein from Helicobacter pylori

PubMed Central

Krishnamurthy, Partha; Parlow, Mary H.; Schneider, John; Burroughs, Stephanie; Wickland, Catherine; Vakil, Nimish B.; Dunn, Bruce E.; Phadnis, Suhas H.

1999-01-01

The Helicobacter pylori genome encodes four penicillin-binding proteins (PBPs). PBPs 1, 2, and 3 exhibit similarities to known PBPs. The sequence of PBP 4 is unique in that it displays a novel combination of two highly conserved PBP motifs and an absence of a third motif. Expression of PBP 4, but not PBP 1, 2, or 3, is significantly increased during mid- to late-log-phase growth. PMID:10438788
An effective approach for annotation of protein families with low sequence similarity and conserved motifs: identifying GDSL hydrolases across the plant kingdom.

PubMed

Vujaklija, Ivan; Bielen, Ana; Paradžik, Tina; Biđin, Siniša; Goldstein, Pavle; Vujaklija, Dušica

2016-02-18

The massive accumulation of protein sequences arising from the rapid development of high-throughput sequencing, coupled with automatic annotation, results in high levels of incorrect annotations. In this study, we describe an approach to decrease annotation errors of protein families characterized by low overall sequence similarity. The GDSL lipolytic family comprises proteins with multifunctional properties and high potential for pharmaceutical and industrial applications. The number of proteins assigned to this family has increased rapidly over the last few years. In particular, the natural abundance of GDSL enzymes reported recently in plants indicates that they could be a good source of novel GDSL enzymes. We noticed that a significant proportion of annotated sequences lack specific GDSL motif(s) or catalytic residue(s). Here, we applied motif-based sequence analyses to identify enzymes possessing conserved GDSL motifs in selected proteomes across the plant kingdom. Motif-based HMM scanning (Viterbi decoding, VD, and posterior decoding, PD) and the PD/VD protocol described here were successfully applied to 12 selected plant proteomes to identify sequences with GDSL motifs. A significant number of identified GDSL sequences were novel. Moreover, our scanning approach successfully detected protein sequences lacking at least one of the essential motifs (171/820) annotated by Pfam profile search (PfamA) as GDSL. Based on these analyses we provide a curated list of GDSL enzymes from the selected plants. CLANS clustering and phylogenetic analysis helped us to gain a better insight into the evolutionary relationship of all identified GDSL sequences. Three novel GDSL subfamilies as well as unreported variations in GDSL motifs were discovered in this study. In addition, analyses of selected proteomes showed a remarkable expansion of GDSL enzymes in the lycophyte, Selaginella moellendorffii. Finally, we provide a general motif-HMM scanner which is easily accessible through the graphical user interface ( http://compbio.math.hr/ ). Our results show that scanning with a carefully parameterized motif-HMM is an effective approach for annotation of protein families with low sequence similarity and conserved motifs. The results of this study expand current knowledge and provide new insights into the evolution of the large GDSL-lipase family in land plants.
Reversible conformational switching of i-motif DNA studied by fluorescence spectroscopy.

PubMed

Choi, Jungkweon; Majima, Tetsuro

2013-01-01

Non-B DNAs, which can form unique structures other than the double helix of B-DNA, have attracted considerable attention from scientists in various fields including biology, chemistry and physics. Among them, i-motif DNA, which is formed from cytosine (C)-rich sequences found in telomeric DNA and the promoter region of oncogenes, has been extensively investigated as a signpost and controller for oncogene expression at the transcription level and as a promising material in nanotechnology. Fluorescence techniques such as fluorescence resonance energy transfer (FRET) and fluorescence quenching are important for studying DNA and in particular for the visualization of the reversible conformational switching of i-motif DNA that is triggered by protonation. Here, we review the latest studies on the conformational dynamics of i-motif DNA as well as the application of FRET and fluorescence quenching techniques to the visualization of reversible conformational switching of i-motif DNA in nano-biotechnology. © 2013 Wiley Periodicals, Inc. Photochemistry and Photobiology © 2013 The American Society of Photobiology.
Venomics of Remipede Crustaceans Reveals Novel Peptide Diversity and Illuminates the Venom's Biological Role.

PubMed

von Reumont, Björn M; Undheim, Eivind A B; Jauss, Robin-Tobias; Jenner, Ronald A

2017-07-26

We report the first integrated proteomic and transcriptomic investigation of a crustacean venom. Remipede crustaceans are the venomous sister group of hexapods, and the venom glands of the remipede Xibalbanus tulumensis express a considerably more complex cocktail of proteins and peptides than previously thought. We identified 32 venom protein families, including 13 novel peptide families that we name xibalbins, four of which lack similarities to any known structural class. Our proteomic data confirm the presence in the venom of 19 of the 32 families. The most highly expressed venom components are serine peptidases, chitinase and six of the xibalbins. The xibalbins represent Inhibitory Cystine Knot peptides (ICK), a double ICK peptide, peptides with a putative Cystine-stabilized α-helix/β-sheet motif, a peptide similar to hairpin-like β-sheet forming antimicrobial peptides, two peptides related to different hormone families, and four peptides with unique structural motifs. Remipede venom components represent the full range of evolutionary recruitment frequencies, from families that have been recruited into many animal venoms (serine peptidases, ICKs), to those having a very narrow taxonomic range (double ICKs), to those unique for remipedes. We discuss the most highly expressed venom components to shed light on their possible functional significance in the predatory and defensive use of remipede venom, and to provide testable ideas for any future bioactivity studies.
Venomics of Remipede Crustaceans Reveals Novel Peptide Diversity and Illuminates the Venom’s Biological Role

PubMed Central

von Reumont, Björn M.; Undheim, Eivind A. B.; Jauss, Robin-Tobias; Jenner, Ronald A.

2017-01-01

We report the first integrated proteomic and transcriptomic investigation of a crustacean venom. Remipede crustaceans are the venomous sister group of hexapods, and the venom glands of the remipede Xibalbanus tulumensis express a considerably more complex cocktail of proteins and peptides than previously thought. We identified 32 venom protein families, including 13 novel peptide families that we name xibalbins, four of which lack similarities to any known structural class. Our proteomic data confirm the presence in the venom of 19 of the 32 families. The most highly expressed venom components are serine peptidases, chitinase and six of the xibalbins. The xibalbins represent Inhibitory Cystine Knot peptides (ICK), a double ICK peptide, peptides with a putative Cystine-stabilized α-helix/β-sheet motif, a peptide similar to hairpin-like β-sheet forming antimicrobial peptides, two peptides related to different hormone families, and four peptides with unique structural motifs. Remipede venom components represent the full range of evolutionary recruitment frequencies, from families that have been recruited into many animal venoms (serine peptidases, ICKs), to those having a very narrow taxonomic range (double ICKs), to those unique for remipedes. We discuss the most highly expressed venom components to shed light on their possible functional significance in the predatory and defensive use of remipede venom, and to provide testable ideas for any future bioactivity studies. PMID:28933727
Detection and Preliminary Analysis of Motifs in Promoters of Anaerobically Induced Genes of Different Plant Species

PubMed Central

MOHANTY, BIJAYALAXMI; KRISHNAN, S. P. T.; SWARUP, SANJAY; BAJIC, VLADIMIR B.

2005-01-01

• Background and Aims Plants can suffer from oxygen limitation during flooding or more complete submergence and may therefore switch from Krebs cycle respiration to fermentation in association with the expression of anaerobically inducible genes coding for enzymes involved in glycolysis and fermentation. The aim of this study was to clarify mechanisms of transcriptional regulation of these anaerobic genes by identifying motifs shared by their promoter regions. • Methods Statistically significant motifs were detected by an in silico method from 13 promoters of anaerobic genes. The selected motifs were common to the majority of analysed promoters. Their significance was evaluated by searching for their presence in transcription factor-binding site databases (TRANSFAC, PlantCARE and PLACE). Using several negative control data sets, it was tested whether the motifs found were specific to the anaerobic group. • Key Results Previously, anaerobic response elements have been identified in maize (Zea mays) and arabidopsis (Arabidopsis thaliana) genes. Known functional motifs were detected, such as GT and GC motifs, but also other motifs shared by most of the genes examined. Five motifs detected have not been found in plants hitherto but are present in the promoters of animal genes with various functions. The consensus sequences of these novel motifs are 5′-AAACAAA-3′, 5′-AGCAGC-3′, 5′-TCATCAC-3′, 5′-GTTT(A/C/T)GCAA-3′ and 5′-TTCCCTGTT-3′. • Conclusions It is believed that the promoter motifs identified could be functional by conferring anaerobic sensitivity to the genes that possess them. This proposal now requires experimental verification. PMID:16027132
Complete Genome Analysis of Three Novel Picornaviruses from Diverse Bat Species
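As an illustration of how the consensus sequences listed in this abstract could be matched against a promoter, the following minimal sketch converts each consensus into a regular expression and reports hit positions. It is not the in silico detection method used in the study; the function names and the example promoter string are hypothetical, and only the forward strand is scanned.

```python
import re

# Consensus sequences of the five novel motifs reported in the abstract.
NOVEL_MOTIFS = [
    "AAACAAA",
    "AGCAGC",
    "TCATCAC",
    "GTTT(A/C/T)GCAA",
    "TTCCCTGTT",
]

def consensus_to_regex(consensus):
    """Turn an alternative such as '(A/C/T)' into the character class '[ACT]'."""
    return re.sub(
        r"\(([ACGT/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )

def find_motifs(sequence, motifs=NOVEL_MOTIFS):
    """Return {consensus: [0-based start positions]} for the forward strand."""
    sequence = sequence.upper()
    hits = {}
    for consensus in motifs:
        # A lookahead is used so that overlapping occurrences are reported too.
        pattern = re.compile("(?=" + consensus_to_regex(consensus) + ")")
        positions = [m.start() for m in pattern.finditer(sequence)]
        if positions:
            hits[consensus] = positions
    return hits

# Hypothetical promoter fragment, invented for this example.
example_promoter = "GGAAACAAACAAATTGTTTCGCAATT"
print(find_motifs(example_promoter))
```

Exact string matching of this kind only locates candidate sites; it says nothing about statistical significance, which the study assessed separately with negative control data sets.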

PubMed Central

Lau, Susanna K. P.; Woo, Patrick C. Y.; Lai, Kenneth K. Y.; Huang, Yi; Yip, Cyril C. Y.; Shek, Chung-Tong; Lee, Paul; Lam, Carol S. F.; Chan, Kwok-Hung; Yuen, Kwok-Yung

2011-01-01

Although bats are important reservoirs of diverse viruses that can cause human epidemics, little is known about the presence of picornaviruses in these flying mammals. Among 1,108 bats of 18 species studied, three novel picornaviruses (groups 1, 2, and 3) were identified from alimentary specimens of 12 bats from five species and four genera. Two complete genomes from each of the three picornaviruses were sequenced. Phylogenetic analysis showed that they fell into three distinct clusters in the Picornaviridae family, with low homologies to known picornaviruses, especially in leader and 2A proteins. Moreover, group 1 and 2 viruses are more closely related to each other than to group 3 viruses, which exhibit genome features distinct from those of the former two virus groups. In particular, the group 3 virus genome contains the shortest leader protein within Picornaviridae, a putative type I internal ribosome entry site (IRES) in the 5′-untranslated region instead of the type IV IRES found in group 1 and 2 viruses, one instead of two GXCG motifs in 2A, an L→V substitution in the DDLXQ motif in 2C helicase, and a conserved GXH motif in 3C protease. Group 1 and 2 viruses are unique among picornaviruses in having AMH instead of the GXH motif in 3Cpro. These findings suggest that the three picornaviruses belong to two novel genera in the Picornaviridae family. This report describes the discovery and complete genome analysis of three picornaviruses in bats, and their presence in diverse bat genera/species suggests the ability to cross the species barrier. PMID:21697464
Functional Incompatibility between the Generic NF-κB Motif and a Subtype-Specific Sp1III Element Drives the Formation of the HIV-1 Subtype C Viral Promoter

PubMed Central

Verma, Anjali; Rajagopalan, Pavithra; Lotke, Rishikesh; Varghese, Rebu; Selvam, Deepak; Kundu, Tapas K.

2016-01-01

ABSTRACT Of the various genetic subtypes of human immunodeficiency virus types 1 and 2 (HIV-1 and HIV-2) and simian immunodeficiency virus (SIV), only in subtype C of HIV-1 is a genetically variant NF-κB binding site found at the core of the viral promoter in association with a subtype-specific Sp1III motif. How the subtype-associated variations in the core transcription factor binding sites (TFBS) influence gene expression from the viral promoter has not been examined previously. Using panels of infectious viral molecular clones, we demonstrate that subtype-specific NF-κB and Sp1III motifs have evolved for optimal gene expression, and neither of the motifs can be replaced by a corresponding TFBS variant. The variant NF-κB motif binds NF-κB with an affinity 2-fold higher than that of the generic NF-κB site. Importantly, in the context of an infectious virus, the subtype-specific Sp1III motif demonstrates a profound loss of function in association with the generic NF-κB motif. An additional substitution of the Sp1III motif fully restores viral replication, suggesting that the subtype C-specific Sp1III has evolved to function with the variant, but not generic, NF-κB motif. A change of only two base pairs in the central NF-κB motif completely suppresses viral transcription from the provirus and converts the promoter into heterochromatin refractory to tumor necrosis factor alpha (TNF-α) induction. The present work represents the first demonstration of functional incompatibility between an otherwise functional NF-κB motif and a unique Sp1 site in the context of an HIV-1 promoter. Our work provides important leads as to the evolution of the HIV-1 subtype C viral promoter with relevance for gene expression regulation and viral latency. IMPORTANCE Subtype-specific genetic variations provide a powerful tool to examine how these variations offer a replication advantage to specific viral subtypes, if any. 
Only in subtype C of HIV-1 are two genetically distinct transcription factor binding sites positioned at the most critical location of the viral promoter. Since a single promoter regulates viral gene expression, the promoter variations can play a critical role in determining the replication fitness of the viral strains. Our work for the first time provides a scientific explanation for the presence of a unique NF-κB binding motif in subtype C, a major HIV-1 genetic family responsible for half of the global HIV-1 infections. The results offer compelling evidence that the subtype C viral promoter not only is stronger but also is endowed with a qualitative gain-of-function advantage. The genetically variant NF-κB and the Sp1III motifs may respond differently to specific cell signal pathways, and these mechanisms must be examined. PMID:27194770
SALAD database: a motif-based database of protein annotations for plant comparative genomics

PubMed Central

Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi

2010-01-01

Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209 529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named ‘SALAD on ARRAYs’ to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis. PMID:19854933
SALAD database: a motif-based database of protein annotations for plant comparative genomics.

PubMed

Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi

2010-01-01

Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209,529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named 'SALAD on ARRAYs' to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis.
Unravelling daily human mobility motifs

PubMed Central

Schneider, Christian M.; Belik, Vitaly; Couronné, Thomas; Smoreda, Zbigniew; González, Marta C.

2013-01-01

Human mobility is differentiated by time scales. While the mechanism for long time scales has been studied, the underlying mechanism on the daily scale is still unrevealed. Here, we uncover the mechanism responsible for the daily mobility patterns by analysing the temporal and spatial trajectories of thousands of persons as individual networks. Using the concept of motifs from network theory, we find only 17 unique networks are present in daily mobility and they follow simple rules. These networks, called here motifs, are sufficient to capture up to 90 per cent of the population in surveys and mobile phone datasets for different countries. Each individual exhibits a characteristic motif, which seems to be stable over several months. Consequently, daily human mobility can be reproduced by an analytically tractable framework for Markov chains by modelling periods of high-frequency trips followed by periods of lower activity as the key ingredient. PMID:23658117
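To make the notion of a daily mobility motif concrete, here is a minimal sketch that turns one day's ordered list of visited locations into a directed trip network and reduces it to a label-independent form, so that two days with the same network shape compare equal. The input format, function names and the brute-force canonicalisation are assumptions made for illustration, not the procedure used in the paper.

```python
from itertools import permutations

def daily_network(visits):
    """Build the directed trip network for one day.

    visits: ordered list of location labels (hypothetical input format).
    Returns (nodes, edges), where edges are (origin, destination) pairs.
    """
    nodes = sorted(set(visits))
    edges = {(a, b) for a, b in zip(visits, visits[1:]) if a != b}
    return nodes, edges

def canonical_motif(visits):
    """Return a label-independent key for the day's network.

    Every relabelling of the nodes with 0..n-1 is tried and the
    lexicographically smallest sorted edge list is kept. This is brute
    force and only practical for networks with a handful of nodes.
    """
    nodes, edges = daily_network(visits)
    best = None
    for perm in permutations(range(len(nodes))):
        relabel = dict(zip(nodes, perm))
        candidate = tuple(sorted((relabel[a], relabel[b]) for a, b in edges))
        if best is None or candidate < best:
            best = candidate
    return len(nodes), best

# Two days with different places but the same shape share one motif key.
print(canonical_motif(["home", "work", "home"]) == canonical_motif(["home", "gym", "home"]))
```

Counting how often each key occurs across many person-days would give the motif frequencies; the place labels themselves are deliberately discarded.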
DOE Office of Scientific and Technical Information (OSTI.GOV)

Rajan, Rakhi; Taneja, Bhupesh; Mondragón, Alfonso

Topoisomerase V is an archaeal type I topoisomerase that is unique among topoisomerases due to the presence of both topoisomerase and DNA repair activities in the same protein. It is organized as an N-terminal topoisomerase domain followed by 24 tandem helix-hairpin-helix (HhH) motifs. Structural studies have shown that the active site is buried by the HhH motifs. Here we show that the N-terminal domain can relax DNA in the absence of any HhH motifs and that the HhH motifs are required for stable protein-DNA complex formation. Crystal structures of various topoisomerase V fragments show changes in the relative orientation of the domains mediated by a long bent linker helix, and these movements are essential for the DNA to enter the active site. Phosphate ions bound to the protein near the active site helped model DNA in the topoisomerase domain and show how topoisomerase V may interact with DNA.
Engineering the shape and structure of materials by fractal cut.

PubMed

Cho, Yigil; Shin, Joong-Ho; Costa, Avelino; Kim, Tae Ann; Kunin, Valentin; Li, Ju; Lee, Su Yeon; Yang, Shu; Han, Heung Nam; Choi, In-Suk; Srolovitz, David J

2014-12-09

In this paper we discuss the transformation of a sheet of material into a wide range of desired shapes and patterns by introducing a set of simple cuts in a multilevel hierarchy with different motifs. Each choice of hierarchical cut motif and cut level allows the material to expand into a unique structure with a unique set of properties. We can reverse-engineer the desired expanded geometries to find the requisite cut pattern to produce them without changing the physical properties of the initial material. The concept was experimentally realized and applied to create an electrode that expands to >800% of the original area with only very minor stretching of the underlying material. The generality of our approach greatly expands the design space for materials so that they can be tuned for diverse applications.
[Structure and evolution of the eukaryotic FANCJ-like proteins].

PubMed

Wuhe, Jike; Zefeng, Wu; Sanhong, Fan; Xuguang, Xi

2015-02-01

The FANCJ-like protein family is a class of ATP-dependent helicases that can catalytically unwind duplex DNA along the 5'-3' direction. It is involved in the processes of DNA damage repair, homologous recombination and G-quadruplex DNA unwinding, and plays a critical role in maintaining genome integrity. In this study, we systematically analyzed FANCJ-like proteins from 47 eukaryotic species and discussed their sequence diversity, origin and evolution, motif organization patterns and spatial structure differences. Four members of the FANCJ-like proteins, XPD, CHL1, RTEL1 and FANCJ, were found in eukaryotes, but some of them are missing in most fungi and some insects. For example, the Zygomycota fungi lost RTEL1, Basidiomycota and Ascomycota fungi lost RTEL1 and FANCJ, and Diptera insects lost FANCJ. FANCJ-like proteins contain the canonical motor domains HD1 and HD2, and the HD1 domain further integrates three unique domains, Fe-S, Arch and Extra-D. The Fe-S and Arch domains are relatively conserved in all members of the family, but the Extra-D domain is lost in XPD and differs among the remaining members. There are 7, 10 and 2 specific motifs in the three unique domains, respectively, while 5 and 12 specific motifs are found in the HD1 and HD2 domains in addition to the conserved motifs reported previously. By analyzing the arrangement pattern of these specific motifs, we found that RTEL1 and FANCJ are closer to each other and share two specific motifs, Vb2 and Vc, in the HD2 domain, which are likely related to their G-quadruplex DNA unwinding activity. The evolutionary evidence showed that FANCJ-like proteins originated from a helicase that has an HD1 domain with inserted Fe-S and Arch domains. Through three successive gene duplication events followed by specialization, eukaryotes came to possess the current four members of the FANCJ-like proteins.
Novel structural features drive DNA binding properties of Cmr, a CRP family protein in TB complex mycobacteria.

PubMed

Ranganathan, Sridevi; Cheung, Jonah; Cassidy, Michael; Ginter, Christopher; Pata, Janice D; McDonough, Kathleen A

2018-01-09

Mycobacterium tuberculosis (Mtb) encodes two CRP/FNR family transcription factors (TF) that contribute to virulence, Cmr (Rv1675c) and CRPMt (Rv3676). Prior studies identified distinct chromosomal binding profiles for each TF despite their recognizing overlapping DNA motifs. The present study shows that Cmr binding specificity is determined by discriminator nucleotides at motif positions 4 and 13. X-ray crystallography and targeted mutational analyses identified an arginine-rich loop that expands Cmr's DNA interactions beyond the classical helix-turn-helix contacts common to all CRP/FNR family members and facilitates binding to imperfect DNA sequences. Cmr binding to DNA results in a pronounced asymmetric bending of the DNA and its high level of cooperativity is consistent with DNA-facilitated dimerization. A unique N-terminal extension inserts between the DNA binding and dimerization domains, partially occluding the site where the canonical cAMP binding pocket is found. However, an unstructured region of this N-terminus may help modulate Cmr activity in response to cellular signals. Cmr's multiple levels of DNA interaction likely enhance its ability to integrate diverse gene regulatory signals, while its novel structural features establish Cmr as an atypical CRP/FNR family member. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Discovering Sequence Motifs with Arbitrary Insertions and Deletions

PubMed Central

Frith, Martin C.; Saunders, Neil F. W.; Kobe, Bostjan; Bailey, Timothy L.

2008-01-01

Biology is encoded in molecular sequences: deciphering this encoding remains a grand scientific challenge. Functional regions of DNA, RNA, and protein sequences often exhibit characteristic but subtle motifs; thus, computational discovery of motifs in sequences is a fundamental and much-studied problem. However, most current algorithms do not allow for insertions or deletions (indels) within motifs, and the few that do have other limitations. We present a method, GLAM2 (Gapped Local Alignment of Motifs), for discovering motifs allowing indels in a fully general manner, and a companion method GLAM2SCAN for searching sequence databases using such motifs. GLAM2 is a generalization of the gapless Gibbs sampling algorithm. It re-discovers variable-width protein motifs from the PROSITE database significantly more accurately than the alternative methods PRATT and SAM-T2K. Furthermore, it usefully refines protein motifs from the ELM database: in some cases, the refined motifs make orders of magnitude fewer overpredictions than the original ELM regular expressions. GLAM2 performs respectably on the BAliBASE multiple alignment benchmark, and may be superior to leading multiple alignment methods for “motif-like” alignments with N- and C-terminal extensions. Finally, we demonstrate the use of GLAM2 to discover protein kinase substrate motifs and a gapped DNA motif for the LIM-only transcriptional regulatory complex: using GLAM2SCAN, we identify promising targets for the latter. GLAM2 is especially promising for short protein motifs, and it should improve our ability to identify the protein cleavage sites, interaction sites, post-translational modification attachment sites, etc., that underlie much of biology. It may be equally useful for arbitrarily gapped motifs in DNA and RNA, although fewer examples of such motifs are known at present. GLAM2 is public domain software, available for download at http://bioinformatics.org.au/glam2. PMID:18437229
Identifying novel sequence variants of RNA 3D motifs

PubMed Central

Zirbel, Craig L.; Roll, James; Sweeney, Blake A.; Petrov, Anton I.; Pirrung, Meg; Leontis, Neocles B.

2015-01-01

Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723
Multiple Binding Modes between HNF4α and the LXXLL Motifs of PGC-1α Lead to Full Activation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rha, Geun Bae; Wu, Guangteng; Shoelson, Steven E.

2010-04-15

Hepatocyte nuclear factor 4α (HNF4α) is a novel nuclear receptor that participates in a hierarchical network of transcription factors regulating the development and physiology of such vital organs as the liver, pancreas, and kidney. Among the various transcriptional coregulators with which HNF4α interacts, peroxisome proliferation-activated receptor γ (PPARγ) coactivator 1α (PGC-1α) represents a novel coactivator whose activation is unusually robust and whose binding mode appears to be distinct from that of canonical coactivators such as NCoA/SRC/p160 family members. To elucidate the potentially unique molecular mechanism of PGC-1α recruitment, we have determined the crystal structure of HNF4α in complex with a fragment of PGC-1α containing all three of its LXXLL motifs. Despite the presence of all three LXXLL motifs available for interactions, only one is bound at the canonical binding site, with no additional contacts observed between the two proteins. However, a close inspection of the electron density map indicates that the bound LXXLL motif is not a selected one but an averaged structure of more than one LXXLL motif. Further biochemical and functional studies show that the individual LXXLL motifs can bind but drive only minimal transactivation. Only when more than one LXXLL motif is involved can significant transcriptional activity be measured, and full activation requires all three LXXLL motifs. These findings led us to propose a model wherein each LXXLL motif has an additive effect, and the multiple binding modes by HNF4α toward the LXXLL motifs of PGC-1α could account for the apparent robust activation by providing a flexible mechanism for combinatorial recruitment of additional coactivators and mediators.
Overlapping ETS and CRE Motifs (G/CCGGAAGTGACGTCA) Preferentially Bound by GABPα and CREB Proteins

PubMed Central

Chatterjee, Raghunath; Zhao, Jianfei; He, Ximiao; Shlyakhtenko, Andrey; Mann, Ishminder; Waterfall, Joshua J.; Meltzer, Paul; Sathyanarayana, B. K.; FitzGerald, Peter C.; Vinson, Charles

2012-01-01

Previously, we identified 8-bps long DNA sequences (8-mers) that localize in human proximal promoters and grouped them into known transcription factor binding sites (TFBS). We now examine split 8-mers consisting of two 4-mers separated by 1-bp to 30-bps (X4-N1-30-X4) to identify pairs of TFBS that localize in proximal promoters at a precise distance. These include two overlapping TFBS: the ETS⇔ETS motif (C/GCCGGAAGCGGAA) and the ETS⇔CRE motif (C/GCGGAAGTGACGTCAC). The nucleotides in bold are part of both TFBS. Molecular modeling shows that the ETS⇔CRE motif can be bound simultaneously by both the ETS and the B-ZIP domains without protein-protein clashes. The electrophoretic mobility shift assay (EMSA) shows that the ETS protein GABPα and the B-ZIP protein CREB preferentially bind to the ETS⇔CRE motif only when the two TFBS overlap precisely. In contrast, the ETS domain of ETV5 and CREB interfere with each other for binding the ETS⇔CRE. The 11-mer (CGGAAGTGACG), the conserved part of the ETS⇔CRE motif, occurs 226 times in the human genome and 83% are in known regulatory regions. In vivo GABPα and CREB ChIP-seq peaks identified the ETS⇔CRE as the most enriched motif occurring in promoters of genes involved in mRNA processing, cellular catabolic processes, and stress response, suggesting that a specific class of genes is regulated by this composite motif. PMID:23050235
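The two counting ideas in this abstract, enumerating split 8-mers of the form X4-N1-30-X4 and counting occurrences of the conserved 11-mer, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, occurrences are counted on both strands with overlaps allowed, and the toy sequence is made up.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def count_both_strands(genome, motif):
    """Count (possibly overlapping) occurrences of motif on either strand."""
    genome = genome.upper()
    targets = {motif.upper(), reverse_complement(motif)}
    k = len(motif)
    return sum(genome[i:i + k] in targets for i in range(len(genome) - k + 1))

def split_kmers(seq, half=4, min_gap=1, max_gap=30):
    """Yield (left k-mer, gap length, right k-mer) for every X4-N(gap)-X4 in seq."""
    for i in range(len(seq) - 2 * half - min_gap + 1):
        left = seq[i:i + half]
        for gap in range(min_gap, max_gap + 1):
            j = i + half + gap
            if j + half > len(seq):
                break
            yield left, gap, seq[j:j + half]

# Toy sequence: one forward and one reverse-complement copy of the 11-mer.
toy = "TTCGGAAGTGACGTT" + "AACGTCACTTCCGAA"
n_sites = count_both_strands(toy, "CGGAAGTGACG")
pairs = list(split_kmers("ACGTAACGT"))
```

Tallying the `(left, gap, right)` tuples across a set of promoter sequences would give, for each 4-mer pair, a distribution over spacings in which a precisely spaced pair stands out as a peak.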
Discriminative motif optimization based on perceptron training

PubMed Central

Patel, Ronak Y.; Stormo, Gary D.

2014-01-01

Motivation: Generating accurate transcription factor (TF) binding site motifs from data generated using next-generation sequencing, especially ChIP-seq, is challenging. The challenge arises because a typical experiment reports a large number of sequences bound by a TF, and the length of each sequence is relatively long. Most traditional motif finders are slow in handling such enormous amounts of data. To overcome this limitation, tools have been developed that compromise accuracy with speed by using heuristic discrete search strategies or limited optimization of identified seed motifs. However, such strategies may not fully use the information in input sequences to generate motifs. Such motifs often form good seeds and can be further improved with appropriate scoring functions and rapid optimization. Results: We report a tool named discriminative motif optimizer (DiMO). DiMO takes a seed motif along with a positive and a negative database and improves the motif based on a discriminative strategy. We use area under receiver-operating characteristic curve (AUC) as a measure of discriminating power of motifs and a strategy based on perceptron training that maximizes AUC rapidly in a discriminative manner. Using DiMO, on a large test set of 87 TFs from human, drosophila and yeast, we show that it is possible to significantly improve motifs identified by nine motif finders. The motifs are generated/optimized using training sets and evaluated on test sets. The AUC is improved for almost 90% of the TFs on test sets and the magnitude of increase is up to 39%. Availability and implementation: DiMO is available at http://stormo.wustl.edu/DiMO Contact: rpatel@genetics.wustl.edu, ronakypatel@gmail.com PMID:24369152
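The evaluation criterion named in this abstract, AUC as a measure of how well a motif separates a positive from a negative sequence set, can be illustrated with a short sketch. It covers only the scoring and AUC computation; the perceptron update used by DiMO is not reproduced here. The weight-matrix representation (a list of per-position dictionaries), the toy weights, and the sequences are all assumptions made for the example.

```python
def best_window_score(seq, pwm):
    """Highest sum of per-position weights over all windows of the motif's width.

    Assumes len(seq) >= len(pwm) and that every base in seq is a key in pwm.
    """
    width = len(pwm)
    return max(
        sum(pwm[j][seq[i + j]] for j in range(width))
        for i in range(len(seq) - width + 1)
    )

def auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical 3-column weight matrix that favors the word TGA.
toy_pwm = [
    {"A": -1, "C": -1, "G": -1, "T": 1},
    {"A": -1, "C": -1, "G": 1, "T": -1},
    {"A": 1, "C": -1, "G": -1, "T": -1},
]
positives = ["CCTGACC", "ATGAGG"]   # made-up bound sequences
negatives = ["CCCCCC", "TGCCCC"]    # made-up background sequences
pos_scores = [best_window_score(s, toy_pwm) for s in positives]
neg_scores = [best_window_score(s, toy_pwm) for s in negatives]
toy_auc = auc(pos_scores, neg_scores)
```

A discriminative optimizer would adjust the matrix weights to raise this AUC on a training set and then report the AUC on held-out test sets, which is the protocol the abstract describes.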

Space-related pharma-motifs for fast search of protein binding motifs and polypharmacological targets

PubMed Central

2012-01-01

Background To discover a compound inhibiting multiple proteins (i.e. polypharmacological targets) is a new paradigm for complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) is an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and protein functions. Results We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize the binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. The SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. Conclusions SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and drug discovery. PMID:23281852
Space-related pharma-motifs for fast search of protein binding motifs and polypharmacological targets.

PubMed

Chiu, Yi-Yuan; Lin, Chun-Yu; Lin, Chih-Ta; Hsu, Kai-Cheng; Chang, Li-Zen; Yang, Jinn-Moon

2012-01-01

To discover a compound inhibiting multiple proteins (i.e. polypharmacological targets) is a new paradigm for complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) is an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and protein functions. We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize the binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. The SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and drug discovery.
Comparative qualitative phosphoproteomics analysis identifies shared phosphorylation motifs and associated biological processes in evolutionary divergent plants.

PubMed

Al-Momani, Shireen; Qi, Da; Ren, Zhe; Jones, Andrew R

2018-06-15

Phosphorylation is one of the most prevalent post-translational modifications and plays a key role in regulating cellular processes. We carried out a bioinformatics analysis of pre-existing phosphoproteomics data, to profile two model species representing the largest subclasses in flowering plants, the dicot Arabidopsis thaliana and the monocot Oryza sativa, to understand the extent to which phosphorylation signaling and function is conserved across evolutionary divergent plants. We identified 6537 phosphopeptides from 3189 phosphoproteins in Arabidopsis and 2307 phosphopeptides from 1613 phosphoproteins in rice. We identified phosphorylation motifs, finding nineteen pS motifs and two pT motifs shared in rice and Arabidopsis. The majority of shared motif-containing proteins were mapped to the same biological processes with similar patterns of fold enrichment, indicating high functional conservation. We also identified shared patterns of crosstalk between phosphoserines with enrichment for motifs pSXpS, pSXXpS and pSXXXpS, where X is any amino acid. Lastly, our results identified several pairs of motifs that are significantly enriched to co-occur in Arabidopsis proteins, indicating cross-talk between different sites, but this was not observed in rice. Our results demonstrate that there are evolutionary conserved mechanisms of phosphorylation-mediated signaling in plants, via analysis of high-throughput phosphorylation proteomics data from key monocot and dicot species: rice and Arabidopsis thaliana. The results also suggest that there is increased crosstalk between phosphorylation sites in A. thaliana compared with rice. The results are important for our general understanding of cell signaling in plants, and the ability to use A. thaliana as a general model for plant biology. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
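The spacing motifs named in this abstract (pSXpS, pSXXpS and pSXXXpS) amount to pairs of phosphoserines separated by one, two or three residues. A minimal sketch of how such pairs could be tallied is given below. It is not the authors' workflow: the function name, the 0-based site positions, and the toy sequence are assumptions, and no enrichment statistics are computed.

```python
def ps_spacing_counts(seq, phospho_positions, max_gap=3):
    """Count phosphoserine pairs separated by 1..max_gap intervening residues.

    seq: protein sequence; phospho_positions: 0-based indices of phosphosites.
    Gap 1 corresponds to pSXpS, gap 2 to pSXXpS, gap 3 to pSXXXpS.
    """
    sites = {p for p in phospho_positions if seq[p] == "S"}  # keep phosphoserines only
    counts = {gap: 0 for gap in range(1, max_gap + 1)}
    for p in sites:
        for gap in counts:
            if p + gap + 1 in sites:
                counts[gap] += 1
    return counts

# Made-up sequence with phosphoserines at indices 1, 3, 6 and 8.
toy_counts = ps_spacing_counts("MSASAASKS", {1, 3, 6, 8})
```

Comparing such counts against the counts expected when phosphosites are shuffled among all serines would be one way to ask whether a given spacing is enriched.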
Differential Transcription Factor Use by the KIR2DL4 Promoter Under Constitutive and IL-2/15-Treated Conditions

PubMed Central

Presnell, Steven R.; Zhang, Lei; Chlebowy, Corrin N.; Al-Attar, Ahmad; Lutz, Charles T.

2012-01-01

KIR2DL4 is unique among human KIR genes in expression, cellular localization, structure, and function, yet the transcription factors required for its expression have not been identified. Using mutagenesis, electrophoretic mobility shift assay, and co-transfection assays, we identified two redundant Runx binding sites in the 2DL4 promoter as essential for constitutive 2DL4 transcription, with contributions by a CRE site and initiator elements. IL-2- and IL-15-stimulated human NK cell lines increased 2DL4 promoter activity, which required functional Runx, CRE, and Ets sites. Chromatin immunoprecipitation experiments show that Runx3 and Ets1 bind the 2DL4 promoter in situ. 2DL4 promoter activity had similar transcription factor requirements in T cells. Runx, CRE, and Ets binding motifs are present in 2DL4 promoters from across primate species, but other postulated transcription factor binding sites are not preserved. Differences between 2DL4 and clonally-restricted KIR promoters suggest a model that explains the unique 2DL4 expression pattern in human NK cells. PMID:22467658
An Amino Acid Packing Code for α-helical Structure and Protein Design

PubMed Central

Joo, Hyun; Chavan, Archana G.; Phan, Jamie; Day, Ryan; Tsai, Jerry

2012-01-01

This work demonstrates that all packing in α-helices can be simplified to repetitive patterns of a single motif: the knob-socket. Using the precision of Voronoi Polyhedra/Delaunay Tessellations to identify contacts, the knob-socket is a 4 residue tetrahedral motif: a knob residue on one α-helix packs into the 3 residue socket on another α-helix. The principle of the knob-socket model relates the packing between levels of protein structure: the intra-helical packing arrangements within secondary structure that permit inter-helix tertiary packing interactions. Within an α-helix, the 3 residue sockets arrange residues into a uniform packing lattice. Inter-helix packing results from a definable pattern of interdigitated knob-socket motifs between 2 α-helices. Furthermore, the knob-socket model classifies 3 types of sockets: 1) free: favoring only intra-helical packing, 2) filled: favoring inter-helical interactions and 3) non: disfavoring α-helical structure. The amino acid propensities in these 3 socket classes essentially represent an amino acid code for structure in α-helical packing. Using this code, a novel yet straightforward approach for the design of α-helical structure was used to validate the knob-socket model. Unique sequences for 3 peptides were created to produce a predicted amount of α-helical structure: mostly helical, some helical, and no-helix. These 3 peptides were synthesized and helical content assessed using CD spectroscopy. The measured α-helicity of each peptide was consistent with the expected predictions. These results and analysis demonstrate that the knob-socket motif functions as the basic unit of packing and presents an intuitive tool to decipher the rules governing packing in protein structure. PMID:22426125
The Hexahistidine Motif of Host-Defense Protein Human Calprotectin Contributes to Zinc Withholding and Its Functional Versatility.

PubMed

Nakashige, Toshiki G; Stephan, Jules R; Cunden, Lisa S; Brophy, Megan Brunjes; Wommack, Andrew J; Keegan, Brenna C; Shearer, Jason M; Nolan, Elizabeth M

2016-09-21

Human calprotectin (CP, S100A8/S100A9 oligomer, MRP-8/MRP-14 oligomer) is an abundant host-defense protein that is involved in the metal-withholding innate immune response. CP coordinates a variety of divalent first-row transition metal ions, which is implicated in its antimicrobial function, and its ability to sequester nutrient Zn(II) ions from microbial pathogens has been recognized for over two decades. CP has two distinct transition-metal-binding sites formed at the S100A8/S100A9 dimer interface, including a histidine-rich site composed of S100A8 residues His17 and His27 and S100A9 residues His91 and His95. In this study, we report that CP binds Zn(II) at this site using a hexahistidine motif, completed by His103 and His105 of the S100A9 C-terminal tail and previously identified as the high-affinity Mn(II) and Fe(II) coordination site. Zn(II) binding at this unique site shields the S100A9 C-terminal tail from proteolytic degradation by proteinase K. X-ray absorption spectroscopy and Zn(II) competition titrations support the formation of a Zn(II)-His6 motif. Microbial growth studies indicate that the hexahistidine motif is important for preventing microbial Zn(II) acquisition from CP by the probiotic Lactobacillus plantarum and the opportunistic human pathogen Candida albicans. The Zn(II)-His6 site of CP expands the known biological coordination chemistry of Zn(II) and provides new insight into how the human innate immune system starves microbes of essential metal nutrients.
The Sequence-specific Peptide-binding Activity of the Protein Sulfide Isomerase AGR2 Directs Its Stable Binding to the Oncogenic Receptor EpCAM.

PubMed

Mohtar, M Aiman; Hernychova, Lenka; O'Neill, J Robert; Lawrence, Melanie L; Murray, Euan; Vojtesek, Borek; Hupp, Ted R

2018-04-01

AGR2 is an oncogenic endoplasmic reticulum (ER)-resident protein disulfide isomerase. AGR2 protein has a relatively unique property for a chaperone in that it can bind sequence-specifically to a specific peptide motif (TTIYY). A synthetic TTIYY-containing peptide column was used to affinity-purify AGR2 from crude lysates highlighting peptide selectivity in complex mixtures. Hydrogen-deuterium exchange mass spectrometry localized the dominant region in AGR2 that interacts with the TTIYY peptide to within a structural loop from amino acids 131-135 (VDPSL). A peptide binding site consensus of Tx[IL][YF][YF] was developed for AGR2 by measuring its activity against a mutant peptide library. Screening the human proteome for proteins harboring this motif revealed an enrichment in transmembrane proteins and we focused on validating EpCAM as a potential AGR2-interacting protein. AGR2 and EpCAM proteins formed a dose-dependent protein-protein interaction in vitro. Proximity ligation assays demonstrated that endogenous AGR2 and EpCAM protein associate in cells. Introducing a single alanine mutation in EpCAM at Tyr251 attenuated its binding to AGR2 in vitro and in cells. Hydrogen-deuterium exchange mass spectrometry was used to identify a stable binding site for AGR2 on EpCAM, adjacent to the TLIYY motif and surrounding EpCAM's detergent binding site. These data define a dominant site on AGR2 that mediates its specific peptide-binding function. EpCAM forms a model client protein for AGR2 to study how an ER-resident chaperone can dock specifically to a peptide motif and regulate the trafficking of a protein destined for the secretory pathway. © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.
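The proteome screen described in this abstract rests on the consensus Tx[IL][YF][YF], which maps directly onto a regular expression. The sketch below shows one way such a scan could look. It is an illustration only: the function name is hypothetical, the input sequences are made-up peptides rather than real proteome entries, and a lookahead is used so that overlapping matches are not missed.

```python
import re

# Consensus Tx[IL][YF][YF]: T, any residue, I or L, then two residues that are Y or F.
# The lookahead wrapper lets finditer report overlapping matches.
CONSENSUS = re.compile(r"(?=(T.[IL][YF][YF]))")

def find_sites(seq):
    """Return (0-based start, matched peptide) for every consensus match in seq."""
    return [(m.start(), m.group(1)) for m in CONSENSUS.finditer(seq)]

# Made-up sequences keyed by hypothetical identifiers.
toy_proteome = {
    "toy1": "MKTTIYYAA",   # contains TTIYY
    "toy2": "GGTLIYYK",    # contains TLIYY
    "toy3": "MKTTAYYQ",    # third position is A, so no match
}
hits = {name: find_sites(seq) for name, seq in toy_proteome.items()}
```

In a real screen the dictionary would be filled from a proteome FASTA file, and the hit list would then be filtered or annotated (for example by membrane topology) before choosing candidates for validation.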
Simultaneous Drug Targeting of the Promoter MYC G-Quadruplex and BCL2 i-Motif in Diffuse Large B-Cell Lymphoma Delays Tumor Growth.

PubMed

Kendrick, Samantha; Muranyi, Andrea; Gokhale, Vijay; Hurley, Laurence H; Rimsza, Lisa M

2017-08-10

Secondary DNA structures are uniquely poised as therapeutic targets due to their molecular switch function in turning gene expression on or off and scaffold-like properties for protein and small molecule interaction. Strategies to alter gene transcription through these structures thus far involve targeting single DNA conformations. Here we investigate the feasibility of simultaneously targeting different secondary DNA structures to modulate two key oncogenes, cellular-myelocytomatosis (MYC) and B-cell lymphoma gene-2 (BCL2), in diffuse large B-cell lymphoma (DLBCL). Cotreatment with previously identified ellipticine and pregnanol derivatives that recognize the MYC G-quadruplex and BCL2 i-motif promoter DNA structures lowered mRNA levels and subsequently enhanced sensitivity to a standard chemotherapy drug, cyclophosphamide, in DLBCL cell lines. In vivo repression of MYC and BCL2 in combination with cyclophosphamide also significantly slowed tumor growth in DLBCL xenograft mice. Our findings demonstrate concurrent targeting of different DNA secondary structures offers an effective, precise, medicine-based approach to directly impede transcription and overcome aberrant pathways in aggressive malignancies.
Ligand and coactivator identity determines the requirement of the charge clamp for coactivation of the peroxisome proliferator-activated receptor gamma.

PubMed

Wu, Yifei; Chin, William W; Wang, Yong; Burris, Thomas P

2003-03-07

The activation function 2 (AF-2)-dependent recruitment of coactivator is essential for gene activation by nuclear receptors. We show that the peroxisome proliferator-activated receptor gamma (PPARgamma) (NR1C3) coactivator-1 (PGC-1) requires both the intact AF-2 domain of PPARgamma and the LXXLL domain of PGC-1 for ligand-dependent and ligand-independent interaction and coactivation. Although the AF-2 domain of PPARgamma is absolutely required for PGC-1-mediated coactivation, this coactivator displayed a unique lack of requirement for the charge clamp of the ligand-binding domain of the receptor that is thought to be essential for LXXLL motif recognition. The mutation of a single serine residue adjacent to the core LXXLL motif of PGC-1 led to restoration of the typical charge clamp requirement. Thus, the unique structural features of the PGC-1 LXXLL motif appear to mediate an atypical mode of interaction with PPARgamma. Unexpectedly, we discovered that various ligands display variability in terms of their requirement for the charge clamp of PPARgamma for coactivation by PGC-1. This ligand-selective variable requirement for the charge clamp was coactivator-specific. Thus, distinct structural determinants, which may be unique for a particular ligand, are utilized by the receptor to recognize the coactivator. Our data suggest that even subtle differences in ligand structure are perceived by the receptor and translated into a unique display of the coactivator-binding surface of the ligand-binding domain, allowing for differential recognition of coactivators that may underlie distinct pharmacological profiles observed for ligands of a particular nuclear receptor.
An analysis of multi-type relational interactions in FMA using graph motifs with disjointness constraints.

PubMed

Zhang, Guo-Qiang; Luo, Lingyun; Ogbuji, Chime; Joslyn, Cliff; Mejino, Jose; Sahoo, Satya S

2012-01-01

The interaction of multiple types of relationships among anatomical classes in the Foundational Model of Anatomy (FMA) can provide inferred information valuable for quality assurance. This paper introduces a method called Motif Checking (MOCH) to study the effects of such multi-relation type interactions for detecting logical inconsistencies as well as other anomalies represented by the motifs. MOCH represents patterns of multi-type interaction as small labeled (with multiple types of edges) sub-graph motifs, whose nodes represent class variables, and labeled edges represent relational types. By representing FMA as an RDF graph and motifs as SPARQL queries, fragments of FMA are automatically obtained as auditing candidates. Leveraging the scalability and reconfigurability of Semantic Web Technology, we performed exhaustive analyses of a variety of labeled sub-graph motifs. The quality assurance feature of MOCH comes from the distinct use of a subset of the edges of the graph motifs as constraints for disjointness, thereby bringing a rule-based flavor to the approach as well. With possible disjointness implied by antonyms, we performed manual inspection of the resulting FMA fragments and tracked down sources of abnormal inferred conclusions (logical inconsistencies), which are amenable to programmatic revision of the FMA. Our results demonstrate that MOCH provides a unique source of valuable information for quality assurance. Since our approach is general, it is applicable to any ontological system with an OWL representation.
An Analysis of Multi-type Relational Interactions in FMA Using Graph Motifs with Disjointness Constraints

PubMed Central

Zhang, Guo-Qiang; Luo, Lingyun; Ogbuji, Chime; Joslyn, Cliff; Mejino, Jose; Sahoo, Satya S

2012-01-01

The interaction of multiple types of relationships among anatomical classes in the Foundational Model of Anatomy (FMA) can provide inferred information valuable for quality assurance. This paper introduces a method called Motif Checking (MOCH) to study the effects of such multi-relation type interactions for detecting logical inconsistencies as well as other anomalies represented by the motifs. MOCH represents patterns of multi-type interaction as small labeled (with multiple types of edges) sub-graph motifs, whose nodes represent class variables, and labeled edges represent relational types. By representing FMA as an RDF graph and motifs as SPARQL queries, fragments of FMA are automatically obtained as auditing candidates. Leveraging the scalability and reconfigurability of Semantic Web Technology, we performed exhaustive analyses of a variety of labeled sub-graph motifs. The quality assurance feature of MOCH comes from the distinct use of a subset of the edges of the graph motifs as constraints for disjointness, thereby bringing a rule-based flavor to the approach as well. With possible disjointness implied by antonyms, we performed manual inspection of the resulting FMA fragments and tracked down sources of abnormal inferred conclusions (logical inconsistencies), which are amenable to programmatic revision of the FMA. Our results demonstrate that MOCH provides a unique source of valuable information for quality assurance. Since our approach is general, it is applicable to any ontological system with an OWL representation. PMID:23304382
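The idea of a sub-graph motif with a disjointness constraint can be illustrated with a small example: a class that has two parents declared disjoint from each other is a candidate logical inconsistency. The sketch below uses plain Python tuples in place of the RDF graph and SPARQL queries the abstract describes, so it is a simplified stand-in rather than MOCH itself. The relation name `is_a`, the class names, and the function name are all hypothetical.

```python
from itertools import combinations

def disjointness_violations(triples, disjoint_pairs, relation="is_a"):
    """Find nodes linked by `relation` to two classes that are declared disjoint.

    triples: iterable of (subject, relation, object) edges.
    disjoint_pairs: iterable of (class_a, class_b) pairs assumed to be disjoint.
    Returns a sorted list of (node, class_a, class_b) matches of the motif.
    """
    parents = {}
    for subject, rel, obj in triples:
        if rel == relation:
            parents.setdefault(subject, set()).add(obj)
    disjoint = {frozenset(pair) for pair in disjoint_pairs}
    hits = []
    for node, classes in sorted(parents.items()):
        for a, b in combinations(sorted(classes), 2):
            if frozenset((a, b)) in disjoint:
                hits.append((node, a, b))
    return hits

# Made-up ontology fragment: X is placed under two classes declared disjoint.
toy_triples = [
    ("X", "is_a", "Left thing"),
    ("X", "is_a", "Right thing"),
    ("Y", "is_a", "Left thing"),
    ("X", "part_of", "Z"),
]
candidates = disjointness_violations(toy_triples, [("Right thing", "Left thing")])
```

Each returned tuple is an auditing candidate in the abstract's sense: a fragment that a curator would inspect by hand to decide whether the ontology or the disjointness assumption is at fault.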
A unique PDZ domain and arrestin-like fold interaction reveals mechanistic details of endocytic recycling by SNX27-retromer.

PubMed

Gallon, Matthew; Clairfeuille, Thomas; Steinberg, Florian; Mas, Caroline; Ghai, Rajesh; Sessions, Richard B; Teasdale, Rohan D; Collins, Brett M; Cullen, Peter J

2014-09-02

The sorting nexin 27 (SNX27)-retromer complex is a major regulator of endosome-to-plasma membrane recycling of transmembrane cargos that contain a PSD95, Dlg1, zo-1 (PDZ)-binding motif. Here we describe the core interaction in SNX27-retromer assembly and its functional relevance for cargo sorting. Crystal structures and NMR experiments reveal that an exposed β-hairpin in the SNX27 PDZ domain engages a groove in the arrestin-like structure of the vacuolar protein sorting 26A (VPS26A) retromer subunit. The structure establishes how the SNX27 PDZ domain simultaneously binds PDZ-binding motifs and retromer-associated VPS26. Importantly, VPS26A binding increases the affinity of the SNX27 PDZ domain for PDZ-binding motifs by an order of magnitude, revealing cooperativity in cargo selection. With disruption of SNX27 and retromer function linked to synaptic dysfunction and neurodegenerative disease, our work provides the first step, to our knowledge, in the molecular description of this important sorting complex, and more broadly describes a unique interaction between a PDZ domain and an arrestin-like fold.
Unique chloride-sensing properties of WNK4 permit the distal nephron to modulate potassium homeostasis.

PubMed

Terker, Andrew S; Zhang, Chong; Erspamer, Kayla J; Gamba, Gerardo; Yang, Chao-Ling; Ellison, David H

2016-01-01

Dietary potassium deficiency activates thiazide-sensitive sodium chloride cotransport along the distal nephron. This may explain, in part, the hypertension and cardiovascular mortality observed in individuals who consume a low-potassium diet. Recent data suggest that plasma potassium affects the distal nephron directly by influencing intracellular chloride, an inhibitor of the with-no-lysine kinase (WNK)-Ste20p-related proline- and alanine-rich kinase (SPAK) pathway. As previous studies used extreme dietary manipulations, we sought to determine whether the relationship between potassium and NaCl cotransporter (NCC) is physiologically relevant and clarify the mechanisms involved. We report that modest changes in both dietary and plasma potassium affect NCC in vivo. Kinase assay studies showed that chloride inhibits WNK4 kinase activity at lower concentrations than it inhibits activity of WNK1 or WNK3. Also, chloride inhibited WNK4 within the range of distal cell chloride concentration. Mutation of a previously identified WNK chloride-binding motif converted WNK4 effects on SPAK from inhibitory to stimulatory in mammalian cells. Disruption of this motif in WNKs 1, 3, and 4 had different effects on NCC, consistent with the three WNKs having different chloride sensitivities. Thus, potassium effects on NCC are graded within the physiological range, which explains how unique chloride-sensing properties of WNK4 enable it to mediate effects of potassium on NCC in vivo. Copyright © 2015 International Society of Nephrology. Published by Elsevier Inc. All rights reserved.
DMINDA: an integrated web server for DNA motif identification and analyses

PubMed Central

Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

2014-01-01

DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. PMID:24753419
Common and unique elements of the ABA-regulated transcriptome of Arabidopsis guard cells

PubMed Central

2011-01-01

Background In the presence of drought and other desiccating stresses, plants synthesize and redistribute the phytohormone abscisic acid (ABA). ABA promotes plant water conservation by acting on specialized cells in the leaf epidermis, guard cells, which border and regulate the apertures of stomatal pores through which transpirational water loss occurs. Following ABA exposure, solute uptake into guard cells is rapidly inhibited and solute loss is promoted, resulting in inhibition of stomatal opening and promotion of stomatal closure, with consequent plant water conservation. There is a wealth of information on the guard cell signaling mechanisms underlying these rapid ABA responses. To investigate ABA regulation of gene expression in guard cells in a systematic genome-wide manner, we analyzed data from global transcriptomes of guard cells generated with Affymetrix ATH1 microarrays, and compared these results to ABA regulation of gene expression in leaves and other tissues. Results The 1173 ABA-regulated genes of guard cells identified by our study share significant overlap with ABA-regulated genes of other tissues, and are associated with well-defined ABA-related promoter motifs such as ABREs and DREs. However, we also computationally identified a unique cis-acting motif, GTCGG, associated with ABA-induction of gene expression specifically in guard cells. In addition, approximately 300 genes showing ABA-regulation unique to this cell type were newly uncovered by our study. Within the ABA-regulated gene set of guard cells, we found that many of the genes known to encode ion transporters associated with stomatal opening are down-regulated by ABA, providing one mechanism for long-term maintenance of stomatal closure during drought. 
We also found examples of both negative and positive feedback in the transcriptional regulation by ABA of known ABA-signaling genes, particularly with regard to the PYR/PYL/RCAR class of soluble ABA receptors and their downstream targets, the type 2C protein phosphatases. Our data also provide evidence for cross-talk at the transcriptional level between ABA and another hormonal inhibitor of stomatal opening, methyl jasmonate. Conclusions Our results engender new insights into the basic cell biology of guard cells, reveal common and unique elements of ABA-regulation of gene expression in guard cells, and set the stage for targeted biotechnological manipulations to improve plant water use efficiency. PMID:21554708
DynaMIT: the dynamic motif integration toolkit

PubMed Central

Dassi, Erik; Quattrone, Alessandro

2016-01-01

De-novo motif search is a frequently applied bioinformatics procedure to identify and prioritize recurrent elements in sequence sets for biological investigation, such as the ones derived from high-throughput differential expression experiments. Several algorithms have been developed to perform motif search, employing widely different approaches and often giving divergent results. In order to maximize the power of these investigations and ultimately be able to draft solid biological hypotheses, there is a need to apply multiple tools to the same sequences and merge the results obtained. However, motif reporting formats and statistical evaluation methods currently make such an integration task difficult to perform and mostly restricted to specific scenarios. We thus introduce here the Dynamic Motif Integration Toolkit (DynaMIT), an extremely flexible platform that allows users to identify motifs with multiple algorithms, integrate them by means of a user-selected strategy and visualize results in several ways; furthermore, the platform is user-extendible in all its aspects. DynaMIT is freely available at http://cibioltg.bitbucket.org. PMID:26253738
Motif finding in DNA sequences based on skipping nonconserved positions in background Markov chains.

PubMed

Zhao, Xiaoyan; Sze, Sing-Hoi

2011-05-01

One strategy to identify transcription factor binding sites is through motif finding in upstream DNA sequences of potentially co-regulated genes. Despite extensive efforts, none of the existing algorithms performs very well. We consider a string representation that allows arbitrary ignored positions within the nonconserved portion of single motifs, and use O(2^l) Markov chains to model the background distributions of motifs of length l while skipping these positions within each Markov chain. By focusing initially on positions that have fixed nucleotides to define core occurrences, we develop an algorithm to identify motifs of moderate lengths. We compare the performance of our algorithm to other motif finding algorithms on a few benchmark data sets, and show that significant improvement in accuracy can be obtained when the sites are sufficiently conserved within a given sample, while comparable performance is obtained when the site conservation rate is low. A software program (PosMotif) and detailed results are available online at http://faculty.cse.tamu.edu/shsze/posmotif.
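As a rough illustration of what a background Markov chain contributes to motif scoring, the sketch below estimates a single first-order chain from toy upstream sequences and computes the log-probability of a candidate word under it. This is a deliberately simplified stand-in and not the PosMotif method: it uses one chain instead of O(2^l) chains and does not skip ignored positions. The function names, the pseudocount, the uniform first-base probability and the toy sequences are all assumptions made for this example.

```python
import math

def train_background(seqs, alphabet="ACGT", pseudocount=1.0):
    """Estimate first-order transition probabilities P(b | a) from sequences."""
    counts = {a: {b: pseudocount for b in alphabet} for a in alphabet}
    for seq in seqs:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {a: {b: counts[a][b] / sum(counts[a].values()) for b in alphabet}
            for a in alphabet}

def background_log_prob(word, trans, alphabet="ACGT"):
    """Log-probability of a word: uniform first base, then Markov transitions."""
    logp = math.log(1.0 / len(alphabet))  # assumption: uniform start distribution
    for a, b in zip(word, word[1:]):
        logp += math.log(trans[a][b])
    return logp

# Invented toy data; sequences are assumed to contain only A, C, G, T.
toy_upstream = ["ATATATAT", "TATATATA"]
trans = train_background(toy_upstream)
```

A word that is unlikely under the background but occurs often in the input is a more interesting motif candidate than one the background already explains; here `background_log_prob("ATAT", trans)` is higher than `background_log_prob("AAAA", trans)` because the toy background is dominated by alternating A and T.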
MCAW-DB: A glycan profile database capturing the ambiguity of glycan recognition patterns.

PubMed

Hosoda, Masae; Takahashi, Yushi; Shiota, Masaaki; Shinmachi, Daisuke; Inomoto, Renji; Higashimoto, Shinichi; Aoki-Kinoshita, Kiyoko F

2018-05-11

Glycan-binding protein (GBP) interaction experiments, such as glycan microarrays, are often used to understand glycan recognition patterns. However, the ambiguity of glycan array data often makes it difficult to identify discrete GBP binding patterns. It is known that lectins, for example, are non-specific in their binding affinities; the same lectin can bind to different monosaccharides or even different glycan structures. In bioinformatics, several tools to mine the data generated from these sorts of experiments have been developed. These tools take a library of predefined motifs, which are commonly-found glycan patterns such as sialyl-Lewis X, and attempt to identify the motif(s) that are specific to the GBP being analyzed. In our previous work, as opposed to using predefined motifs, we developed the Multiple Carbohydrate Alignment with Weights (MCAW) tool to visualize the state of the glycans being recognized by the GBP under analysis. We previously reported on the effectiveness of our tool and algorithm by analyzing several glycan array datasets from the Consortium for Functional Glycomics (CFG). In this work, we report on our analysis of 1081 data sets which we collected from the CFG, the results of which we have made publicly and freely available as a database called MCAW-DB. We introduce this database and its usage, and describe several analysis results. We show how MCAW-DB can be used to analyze glycan-binding patterns of GBPs amidst their ambiguity. For example, the visualization of glycan-binding patterns in MCAW-DB shows how they correlate with the concentrations of the samples used in the array experiments. Using MCAW-DB, the patterns of glycans found to bind to various GBPs are visualized, indicating the binding "environment" of the glycans. Thus, the ambiguity of glycan recognition is numerically represented, along with the patterns of monosaccharides surrounding the binding region. 
The profiles in MCAW-DB could potentially be used as predictors of affinity of unknown or novel glycans to particular GBPs by comparing how well they match the existing profiles for those GBPs. Moreover, as the glycan profiles of diseased tissues become available, glycan alignments could also be used to identify glycan biomarkers unique to that tissue. Databases of these alignments may be of great use for drug discovery. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
Physical-chemical property based sequence motifs and methods regarding same

DOEpatents

Braun, Werner [Friendswood, TX]; Mathura, Venkatarajan S [Sarasota, FL]; Schein, Catherine H [Friendswood, TX]

2008-09-09

A data analysis system, program, and/or method, e.g., a data mining/data exploration method, using physical-chemical property motifs. For example, a sequence database may be searched for identifying segments thereof having physical-chemical properties similar to the physical-chemical property motifs.
Motif mismatches in microsatellites: insights from genome-wide investigation among 20 insect species.

PubMed

Behura, Susanta K; Severson, David W

2015-02-01

We present a detailed genome-wide comparative study of motif mismatches of microsatellites among 20 insect species representing five taxonomic orders. The results show that varying proportions (∼15-46%) of microsatellites identified in these species are imperfect in motif structure, and that they also vary in chromosomal distribution within genomes. It was observed that the genomic abundance of imperfect repeats is significantly associated with the length and number of motif mismatches of microsatellites. Furthermore, microsatellites with a higher number of mismatches tend to have lower abundance in the genome, suggesting that sequence heterogeneity of repeat motifs is a key determinant of genomic abundance of microsatellites. This relationship seems to be a general feature of microsatellites even in unrelated species such as yeast, roundworm, mouse and human. We provide a mechanistic explanation of the evolutionary link between motif heterogeneity and genomic abundance of microsatellites by examining the patterns of motif mismatches and allele sequences of single-nucleotide polymorphisms identified within microsatellite loci. Using Drosophila Reference Genetic Panel data, we further show that pattern of allelic variation modulates motif heterogeneity of microsatellites, and provide estimates of allele age of specific imperfect microsatellites found within protein-coding genes. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

Unitary circular code motifs in genomes of eukaryotes.

PubMed

El Soufi, Karim; Michel, Christian J

A set X of 20 trinucleotides was identified in genes of bacteria, eukaryotes, plasmids and viruses, which has on average the highest occurrence in reading frame compared to its two shifted frames (Michel, 2015; Arquès and Michel, 1996). This set X has an interesting mathematical property as X is a circular code (Arquès and Michel, 1996). Thus, the motifs from this circular code X, called X motifs, have the property to always retrieve, synchronize and maintain the reading frame in genes. The origin of this circular code X in genes has been an open problem since its discovery in 1996. Here, we first show that the unitary circular codes (UCC), i.e. sets of one word, make it possible to generate unitary circular code motifs (UCC motifs), i.e. a concatenation of the same motif (simple repeats) leading to low complexity DNA. Three classes of UCC motifs are studied here: repeated dinucleotides (D+ motifs), repeated trinucleotides (T+ motifs) and repeated tetranucleotides (T+ motifs). Thus, the D+, T+ and T+ motifs make it possible to retrieve, synchronize and maintain a frame modulo 2, modulo 3 and modulo 4, respectively, and their shifted frames (1 modulo 2; 1 and 2 modulo 3; 1, 2 and 3 modulo 4 according to the C2, C3 and C4 properties, respectively) in the DNA sequences. The statistical distribution of the D+, T+ and T+ motifs is analyzed in the genomes of eukaryotes. A UCC motif and its complementary UCC motif have the same distribution in the eukaryotic genomes. Furthermore, a UCC motif and its complementary UCC motif have increasing occurrences contrary to their number of hydrogen bonds, very significant with the T+ motifs. The longest D+, T+ and T+ motifs in the studied eukaryotic genomes are also given. Surprisingly, a scarcity of repeated trinucleotides (T+ motifs) in the large eukaryotic genomes is observed compared to the D+ and T+ motifs. This result has been investigated and may be explained by two outcomes. 
Repeated trinucleotides (T+ motifs) are identified in the X motifs of low composition (cardinality less than 10) in the genomes of eukaryotes. Furthermore, identical trinucleotide pairs of the circular code X are preferentially used in the gene sequences of eukaryotes. These two results suggest that the unitary circular codes of trinucleotides may have been involved in the formation of the trinucleotide circular code X. Indeed, repeated trinucleotides in the X motifs in the genomes of eukaryotes may represent an intermediary evolution from repeated trinucleotides of cardinality 1 (T+ motifs) in the genomes of eukaryotes up to the X motifs of cardinality 20 in the gene sequences of eukaryotes. Copyright © 2017 Elsevier B.V. All rights reserved.
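The simple repeats described in this abstract (a single di-, tri- or tetranucleotide concatenated with itself) can be located with a backreference regular expression. The sketch below is only an illustration and not the authors' pipeline: the function names, the minimum of two copies, and the filter that discards units which are themselves repeats of a shorter word (such as AA or ACAC) are assumptions made for this example.

```python
import re

def is_primitive(unit):
    """True if the unit is not itself a repeat of a shorter word (AA, ACAC, ...)."""
    n = len(unit)
    return not any(n % d == 0 and unit[:d] * (n // d) == unit
                   for d in range(1, n))

def tandem_repeats(dna, k, min_copies=2):
    """Return (0-based start, unit, copies) for runs of one repeated k-mer."""
    # ([ACGT]{k}) captures the unit; \1{m,} requires at least m further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
    hits = []
    for m in pattern.finditer(dna.upper()):
        unit = m.group(1)
        if is_primitive(unit):  # assumption: skip homopolymer-like units
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return hits

# Invented toy sequences.
di_hits = tandem_repeats("GGACACACTT", 2)      # repeated dinucleotide AC
tri_hits = tandem_repeats("TTCAGCAGCAGTT", 3)  # repeated trinucleotide CAG
```

Matches are non-overlapping and reported in the frame in which the scan first finds them, so the same run could be described with a shifted unit (ACAC... versus CACA...); a fuller analysis would normalise units before counting.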
A Bioinformatics Approach for Detecting Repetitive Nested Motifs using Pattern Matching.

PubMed

Romero, José R; Carballido, Jessica A; Garbus, Ingrid; Echenique, Viviana C; Ponzoni, Ignacio

2016-01-01

The identification of nested motifs in genomic sequences is a complex computational problem. The detection of these patterns is important to allow the discovery of transposable element (TE) insertions, incomplete reverse transcripts, deletions, and/or mutations. In this study, a de novo strategy for detecting patterns that represent nested motifs was designed based on exhaustive searches for pairs of motifs and combinatorial pattern analysis. These patterns can be grouped into three categories: motifs within other motifs, motifs flanked by other motifs, and motifs of large size. The methodology used in this study, applied to genomic sequences from the plant species Aegilops tauschii and Oryza sativa, revealed that it is possible to identify putative nested TEs by detecting these three types of patterns. The results were validated through BLAST alignments, which revealed the efficacy and usefulness of the new method, which is called Mamushka.
Structures of minimal catalytic fragments of topoisomerase V reveals conformational changes relevant for DNA binding

PubMed Central

Rajan, Rakhi; Taneja, Bhupesh; Mondragón, Alfonso

2010-01-01

Summary Topoisomerase V is an archaeal type I topoisomerase that is unique among topoisomerases due to the presence of both topoisomerase and DNA repair activities in the same protein. It is organized as an N-terminal topoisomerase domain followed by 24 tandem helix-hairpin-helix (HhH) motifs. Structural studies have shown that the active site is buried by the HhH motifs. Here we show that the N-terminal domain can relax DNA in the absence of any HhH motifs and that the HhH motifs are required for stable protein-DNA complex formation. Crystal structures of various topoisomerase V fragments show changes in the relative orientation of the domains mediated by a long bent linker helix, and these movements are essential for the DNA to enter the active site. Phosphate ions bound to the protein near the active site helped model DNA in the topoisomerase domain and show how topoisomerase V may interact with DNA. PMID:20637419
Identification and Characterization of a Novel Member of the Radical AdoMet Enzyme Superfamily and Implications for the Biosynthesis of the Hmd Hydrogenase Active Site Cofactor

PubMed Central

McGlynn, Shawn E.; Boyd, Eric S.; Shepard, Eric M.; Lange, Rachel K.; Gerlach, Robin; Broderick, Joan B.; Peters, John W.

2010-01-01

The genetic context, phylogeny, and biochemistry of a gene flanking the H2-forming methylene-H4-methanopterin dehydrogenase gene (hmdA), here designated hmdB, indicate that it is a new member of the radical S-adenosylmethionine enzyme superfamily. In contrast to the characteristic CX3CX2C or CX2CX4C motif defining this family, HmdB contains a unique CX5CX2C motif. PMID:19897660
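The cysteine motifs named in this abstract can be searched for with regular expressions, reading CXnC as a cysteine, n arbitrary residues, and another cysteine. The sketch below is an illustration under that reading; the function name and the toy sequence are hypothetical, and the toy sequence is not a real HmdB sequence.

```python
import re

# CXnC notation: a cysteine, n arbitrary residues, then the next cysteine.
# The lookahead (?=...) lets overlapping matches be reported.
MOTIFS = {
    "CX3CX2C": re.compile(r"(?=(C.{3}C.{2}C))"),
    "CX2CX4C": re.compile(r"(?=(C.{2}C.{4}C))"),
    "CX5CX2C": re.compile(r"(?=(C.{5}C.{2}C))"),
}

def find_cys_motifs(protein):
    """Return {motif name: [(0-based start, matched text), ...]}."""
    seq = protein.upper()
    return {name: [(m.start(), m.group(1)) for m in rx.finditer(seq)]
            for name, rx in MOTIFS.items()}

# Made-up toy sequence for illustration only.
toy = "MKTCAAAAACGGCLLK"
hits = find_cys_motifs(toy)
```

A pattern hit only marks a candidate; whether the cysteines actually coordinate an iron-sulfur cluster is a biochemical question that sequence matching cannot answer.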
Statistical Methods for Identifying Sequence Motifs Affecting Point Mutations

PubMed Central

Zhu, Yicheng; Neeman, Teresa; Yap, Von Bing; Huttley, Gavin A.

2017-01-01

Mutation processes differ between types of point mutation, genomic locations, cells, and biological species. For some point mutations, specific neighboring bases are known to be mechanistically influential. Beyond these cases, numerous questions remain unresolved, including: what are the sequence motifs that affect point mutations? How large are the motifs? Are they strand symmetric? And, do they vary between samples? We present new log-linear models that allow explicit examination of these questions, along with sequence logo style visualization to enable identifying specific motifs. We demonstrate the performance of these methods by analyzing mutation processes in human germline and malignant melanoma. We recapitulate the known CpG effect, and identify novel motifs, including a highly significant motif associated with A→G mutations. We show that major effects of neighbors on germline mutation lie within ±2 of the mutating base. Models are also presented for contrasting the entire mutation spectra (the distribution of the different point mutations). We show the spectra vary significantly between autosomes and X-chromosome, with a difference in T→C transition dominating. Analyses of malignant melanoma confirmed reported characteristic features of this cancer, including statistically significant strand asymmetry, and markedly different neighboring influences. The methods we present are made freely available as a Python library https://bitbucket.org/pycogent3/mutationmotif. PMID:27974498
Crystal structure of the C-terminal SH3 domain of the adaptor protein GADS in complex with SLP-76 motif peptide reveals a unique SH3-SH3 interaction.

PubMed

Dimasi, Nazzareno

2007-01-01

The Grb2-like adaptor protein GADS is essential for tyrosine kinase-dependent signaling in T lymphocytes. Following T cell receptor ligation, GADS interacts through its C-terminal SH3 domain with the adaptors SLP-76 and LAT, to form a multiprotein signaling complex that is crucial for T cell activation. To understand the structural basis for the selective recognition of GADS by SLP-76, herein is reported the crystal structure at 1.54 Angstrom of the C-terminal SH3 domain of GADS bound to the SLP-76 motif 233-PSIDRSTKP-241, which represents the minimal binding site. In addition to the unique structural features adopted by the bound SLP-76 peptide, the complex structure reveals a unique SH3-SH3 interaction. This homophilic interaction, which is observed in presence of the SLP-76 peptide and is present in solution, extends our understanding of the molecular mechanisms that could be employed by modular proteins to increase their signaling transduction specificity.
RNA Bricks—a database of RNA 3D motifs and their interactions

PubMed Central

Chojnowski, Grzegorz; Waleń, Tomasz; Bujnicki, Janusz M.

2014-01-01

The RNA Bricks database (http://iimcb.genesilico.pl/rnabricks) stores information about recurrent RNA 3D motifs and their interactions, found in experimentally determined RNA structures and in RNA–protein complexes. In contrast to other similar tools (RNA 3D Motif Atlas, RNA Frabase, Rloom), RNA motifs, i.e. ‘RNA bricks’, are presented in the molecular environment in which they were determined, including RNA, protein, metal ions, water molecules and ligands. All nucleotide residues in RNA bricks are annotated with structural quality scores that describe real-space correlation coefficients with the electron density data (if available), backbone geometry and possible steric conflicts, which can be used to identify poorly modeled residues. The database is also equipped with an algorithm for 3D motif search and comparison. The algorithm compares spatial positions of backbone atoms of the user-provided query structure and of stored RNA motifs, without relying on sequence or secondary structure information. This enables the identification of local structural similarities among evolutionarily related and unrelated RNA molecules. In addition, the search utility enables searching ‘RNA bricks’ according to sequence similarity, and makes it possible to identify motifs with modified ribonucleotide residues at specific positions. PMID:24220091
Improved prediction of MHC class I and class II epitopes using a novel Gibbs sampling approach.

PubMed

Nielsen, Morten; Lundegaard, Claus; Worning, Peder; Hvid, Christina Sylvester; Lamberth, Kasper; Buus, Søren; Brunak, Søren; Lund, Ole

2004-06-12

Prediction of which peptides will bind a specific major histocompatibility complex (MHC) constitutes an important step in identifying potential T-cell epitopes suitable as vaccine candidates. MHC class II binding peptides have a broad length distribution, complicating such predictions. Thus, identifying the correct alignment is a crucial part of identifying the core of an MHC class II binding motif. In this context, we wish to describe a novel Gibbs motif sampler method ideally suited for recognizing such weak sequence motifs. The method is based on the Gibbs sampling method, and it incorporates novel features optimized for the task of recognizing the binding motif of MHC classes I and II. The method locates the binding motif in a set of sequences and characterizes the motif in terms of a weight-matrix. Subsequently, the weight-matrix can be applied to effectively identifying potential MHC binding peptides and to guiding the process of rational vaccine design. We apply the motif sampler method to the complex problem of MHC class II binding. The input to the method is amino acid peptide sequences extracted from the public databases of SYFPEITHI and MHCPEP and known to bind to the MHC class II complex HLA-DR4(B1*0401). Prior identification of information-rich (anchor) positions in the binding motif is shown to improve the predictive performance of the Gibbs sampler. Similarly, a consensus solution obtained from an ensemble average over suboptimal solutions is shown to outperform the use of a single optimal solution. In a large-scale benchmark calculation, the performance is quantified using relative operating characteristic (ROC) curve plots and we make a detailed comparison of the performance with that of both the TEPITOPE method and a weight-matrix derived using the conventional alignment algorithm of ClustalW. 
The calculation demonstrates that the predictive performance of the Gibbs sampler is higher than that of ClustalW and in most cases also higher than that of the TEPITOPE method.
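The abstract describes characterising a binding motif as a weight-matrix and then applying that matrix to candidate peptides. The sketch below illustrates only that second half: it builds a log-odds matrix from binding cores that are assumed to be aligned already and reports the best-scoring window in a peptide. The Gibbs sampling step that finds the alignment is not implemented, and the uniform background, the pseudocount, the function names and the three-residue toy cores are all assumptions made for this example.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_weight_matrix(cores, pseudocount=1.0):
    """Log-odds matrix (one dict per position) from equal-length aligned cores."""
    length = len(cores[0])
    assert all(len(c) == length for c in cores)
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    total = len(cores) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for pos in range(length):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for core in cores:
            counts[core[pos]] += 1
        matrix.append({aa: math.log2(counts[aa] / total / background)
                       for aa in AMINO_ACIDS})
    return matrix

def best_window(peptide, matrix):
    """Return (score, start) of the highest-scoring window in the peptide.

    The peptide must be at least as long as the matrix is wide.
    """
    width = len(matrix)
    scored = [(sum(matrix[i][peptide[s + i]] for i in range(width)), s)
              for s in range(len(peptide) - width + 1)]
    return max(scored)

# Invented toy cores; real MHC class II cores would be nine residues long.
toy_matrix = build_weight_matrix(["FKA", "FRA", "YKA", "FKS"])
score, start = best_window("GGFKAGG", toy_matrix)
```

Scanning every window and keeping the best one mirrors why the alignment matters for class II peptides: the binding core can sit anywhere inside a peptide of variable length.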
DMINDA: an integrated web server for DNA motif identification and analyses.

PubMed

Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

2014-07-01

DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes.

PubMed

Liu, Bingqiang; Zhang, Hanyuan; Zhou, Chuan; Li, Guojun; Fennell, Anne; Wang, Guanghui; Kang, Yu; Liu, Qi; Ma, Qin

2016-08-09

Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, faces several difficulties in optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to the Escherichia coli K12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP(3) consistently outperformed other popular motif finding tools. We have integrated MP(3) into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. The performance evaluation indicated that MP(3) is effective for predicting regulatory motifs in prokaryotic genomes. 
Its application may enhance progress in elucidating transcription regulation mechanisms, and thus benefit the genomic research community and prokaryotic genome researchers in particular.
Co-regulation analysis of co-expressed modules under cold and pathogen stress conditions in tomato.

PubMed

Abedini, Davar; Rashidi Monfared, Sajad

2018-06-01

A primary mechanism for controlling the development of multicellular organisms is transcriptional regulation, which is carried out by transcription factors (TFs) that recognize and bind to their binding sites in the promoter region. The distance from the translation start site, and the order, orientation, and spacing of cis elements, are key factors in the concentration of active nuclear TFs and the transcriptional regulation of target genes. In this study, overrepresented motifs in cold and pathogenesis responsive genes were scanned via the Gibbs sampling method. This method detects overrepresented motifs by means of a stochastic optimization strategy that searches for all possible sets of short DNA segments. The identified motifs were then checked against the TRANSFAC, PLACE and Soft Berry databases in order to identify putative TFs that interact with the motifs. Several cis/trans regulatory elements were found using these databases. Moreover, cross-talk between cold and pathogenesis responsive genes was confirmed. Statistical analysis was used to determine the distribution of the identified motifs in the promoter region. In addition, co-regulation analysis showed that genes in the pathogenesis responsive module are divided into two main groups. The promoter region was also divided into six subareas in order to draw the pattern of motif distribution across promoter subareas. The results showed that the majority of motifs are concentrated within the 700 nucleotides upstream of the translational start site (ATG). In contrast, this was not the case in the other group. In other words, there was no difference between total and compartmentalized regions in cold responsive genes.
B Cell Receptor Activation Predominantly Regulates AKT-mTORC1/2 Substrates Functionally Related to RNA Processing

PubMed Central

Mohammad, Dara K.; Ali, Raja H.; Turunen, Janne J.; Nore, Beston F.; Smith, C. I. Edvard

2016-01-01

Protein kinase B (AKT) phosphorylates numerous substrates on the consensus motif RXRXXpS/T, a docking site for 14-3-3 interactions. To identify novel AKT-induced phosphorylation events following B cell receptor (BCR) activation, we performed proteomics, biochemical and bioinformatics analyses. Phosphorylated consensus motif-specific antibody enrichment, followed by tandem mass spectrometry, identified 446 proteins, containing 186 novel phosphorylation events. Moreover, we found 85 proteins with up regulated phosphorylation, while in 277 it was down regulated following stimulation. Up regulation was mainly in proteins involved in ribosomal and translational regulation, DNA binding and transcription regulation. Conversely, down regulation was preferentially in RNA binding, mRNA splicing and mRNP export proteins. Immunoblotting of two identified RNA regulatory proteins, RBM25 and MEF-2D, confirmed the proteomics data. Consistent with these findings, the AKT-inhibitor (MK-2206) dramatically reduced, while the mTORC-inhibitor PP242 totally blocked phosphorylation on the RXRXXpS/T motif. This demonstrates that this motif, previously suggested as an AKT target sequence, also is a substrate for mTORC1/2. Proteins with PDZ, PH and/or SH3 domains contained the consensus motif, whereas in those with an HMG-box, H15 domains and/or NF-X1-zinc-fingers, the motif was absent. Proteins carrying the consensus motif were found in all eukaryotic clades indicating that they regulate a phylogenetically conserved set of proteins. PMID:27487157
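The consensus motif RXRXXpS/T can be expressed as a regular expression to list candidate phospho-acceptor residues in a protein sequence. The sketch below is an illustration only: the function name and the toy sequence are hypothetical, X is read as any residue, and a sequence match says nothing about whether the site is actually phosphorylated.

```python
import re

# R-X-R-X-X-[S/T]: the phospho-acceptor is the final Ser or Thr.
# The lookahead (?=...) lets overlapping matches be reported.
AKT_CONSENSUS = re.compile(r"(?=(R.R..[ST]))")

def candidate_sites(protein):
    """Return (0-based index of the S/T residue, matched 6-mer) pairs."""
    seq = protein.upper()
    return [(m.start() + 5, m.group(1)) for m in AKT_CONSENSUS.finditer(seq)]

# Made-up toy sequence for illustration only.
toy = "MARERAASGGRKRLLTP"
sites = candidate_sites(toy)
```

In a workflow like the one described, such a scan would at most pre-filter sequences; the phosphorylation events themselves were identified experimentally by antibody enrichment and mass spectrometry.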
Cellular automata simulation of topological effects on the dynamics of feed-forward motifs

PubMed Central

Apte, Advait A; Cain, John W; Bonchev, Danail G; Fong, Stephen S

2008-01-01

Background Feed-forward motifs are important functional modules in biological and other complex networks. The functionality of feed-forward motifs and other network motifs is largely dictated by the connectivity of the individual network components. While studies on the dynamics of motifs and networks are usually devoted to the temporal or spatial description of processes, this study focuses on the relationship between the specific architecture and the overall rate of the processes of the feed-forward family of motifs, including double and triple feed-forward loops. The search for the most efficient network architecture could be of particular interest for regulatory or signaling pathways in biology, as well as in computational and communication systems. Results Feed-forward motif dynamics were studied using cellular automata and compared with differential equation modeling. The number of cellular automata iterations needed for a 100% conversion of a substrate into a target product was used as an inverse measure of the transformation rate. Several basic topological patterns were identified that order the specific feed-forward constructions according to the rate of dynamics they enable. At the same number of network nodes and constant other parameters, the bi-parallel and tri-parallel motifs provide higher network efficacy than single feed-forward motifs. Additionally, a topological property of isodynamicity was identified for feed-forward motifs where different network architectures resulted in the same overall rate of the target production. Conclusion It was shown for classes of structural motifs with feed-forward architecture that network topology affects the overall rate of a process in a quantitatively predictable manner. 
These fundamental results can be used as a basis for simulating larger networks as combinations of smaller network modules with implications on studying synthetic gene circuits, small regulatory systems, and eventually dynamic whole-cell models. PMID:18304325
A Unique Collection of Palaeolithic Painted Portable Art: Characterization of Red and Yellow Pigments from the Parpalló Cave (Spain)

PubMed Central

Villaverde Bonilla, Valentín; Ródenas Marín, Isabel; Murcia Mascarós, Sonia

2016-01-01

In this work we analyze the pigments used in the decoration of red and yellow motifs present in the portable art of the Parpalló Cave (Gandía, Spain), one of the most important Palaeolithic sites in the Spanish Mediterranean region. Energy dispersive X-ray fluorescence spectrometry (EDXRF) and spectrophotometry in the visible region (CIE L*a*b* color coordinates and spectral reflectance curves) were used to perform in situ fast analyses of the red and yellow motifs with portable equipment and to characterize their elemental composition and their colorimetric perception, respectively. According to the elemental composition, the intensity of the fluorescence iron signals in red and yellow motifs is higher than average values in the rock substrates. As expected, red motifs possess high values of the chromatic coordinate a* and yellow motifs possess high values of b*. This characterization was complemented with FT-IR analyses of microsamples detached from the red and yellow colored zones of a small set of plaquettes. Our results show that the artists used red and yellow pigments in the decoration likely derived from natural iron oxides such as hematite and goethite. PMID:27732605
A Unique Collection of Palaeolithic Painted Portable Art: Characterization of Red and Yellow Pigments from the Parpalló Cave (Spain).

PubMed

Roldán García, Clodoaldo; Villaverde Bonilla, Valentín; Ródenas Marín, Isabel; Murcia Mascarós, Sonia

2016-01-01

In this work we analyze the pigments used in the decoration of red and yellow motifs present in the portable art of the Parpalló Cave (Gandía, Spain), one of the most important Palaeolithic sites in the Spanish Mediterranean region. Energy dispersive X-ray fluorescence spectrometry (EDXRF) and spectrophotometry in the visible region (CIE L*a*b* color coordinates and spectral reflectance curves) were used to perform in situ fast analyses of the red and yellow motifs with portable equipment and to characterize their elemental composition and their colorimetric perception, respectively. According to the elemental composition, the intensity of the fluorescence iron signals in red and yellow motifs is higher than average values in the rock substrates. As expected, red motifs possess high values of the chromatic coordinate a* and yellow motifs possess high values of b*. This characterization was complemented with FT-IR analyses of microsamples detached from the red and yellow colored zones of a small set of plaquettes. Our results show that the artists used red and yellow pigments in the decoration likely derived from natural iron oxides such as hematite and goethite.
NAC transcription factor genes: genome-wide identification, phylogenetic, motif and cis-regulatory element analysis in pigeonpea (Cajanus cajan (L.) Millsp.).

PubMed

Satheesh, Viswanathan; Jagannadham, P Tej Kumar; Chidambaranathan, Parameswaran; Jain, P K; Srinivasan, R

2014-12-01

The NAC (NAM, ATAF and CUC) proteins are plant-specific transcription factors implicated in development and stress responses. In the present study 88 pigeonpea NAC genes were identified from the recently published draft genome of pigeonpea by using homology based and de novo prediction programmes. These sequences were further subjected to phylogenetic, motif and promoter analyses. In motif analysis, highly conserved motifs were identified in the NAC domain and also in the C-terminal region of the NAC proteins. A phylogenetic reconstruction using pigeonpea, Arabidopsis and soybean NAC genes revealed 33 putative stress-responsive pigeonpea NAC genes. Several stress-responsive cis-elements were identified through in silico analysis of the promoters of these putative stress-responsive genes. This analysis is the first report of NAC gene family in pigeonpea and will be useful for the identification and selection of candidate genes associated with stress tolerance.
Accurate quantification of microRNA via single strand displacement reaction on DNA origami motif.

PubMed

Zhu, Jie; Feng, Xiaolu; Lou, Jingyu; Li, Weidong; Li, Sheng; Zhu, Hongxin; Yang, Lun; Zhang, Aiping; He, Lin; Li, Can

2013-01-01

DNA origami is an emerging technology that assembles hundreds of staple strands and one single-strand DNA into a certain nanopattern. It has been widely used in various fields including detection of biological molecules such as DNA, RNA and proteins. MicroRNAs (miRNAs) play important roles in post-transcriptional gene repression as well as many other biological processes such as cell growth and differentiation. Alterations in miRNA expression contribute to many human diseases. However, it is still a challenge to quantitatively detect miRNAs by origami technology. In this study, we developed a novel approach based on a single strand displacement reaction on DNA origami, labeled with a streptavidin and quantum dots binding complex (STV-QDs), to quantitatively detect the concentration of miRNAs. We illustrated a linear relationship between the concentration of an exemplary miRNA, miRNA-133, and the STV-QDs hybridization efficiency; the results demonstrated that it is an accurate nano-scale miRNA quantifier motif. In addition, both a symmetrical rectangular motif and an asymmetrical China-map motif were tested. With significant linearity in both motifs, our experiments suggested that DNA origami motifs of arbitrary shape can be utilized in this method. Because this DNA origami-based method is simple, saves time and material, can potentially test multiple targets in one motif, and is relatively accurate for certain impure samples (binding events are counted directly by atomic force microscopy rather than by fluorescence signal detection), it may be widely used in quantification of miRNAs.
Seed storage protein gene promoters contain conserved DNA motifs in Brassicaceae, Fabaceae and Poaceae

PubMed Central

Fauteux, François; Strömvik, Martina V

2009-01-01

Background Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses) using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. Conclusion Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination of conserved motifs. 
The majority of discovered motifs match experimentally characterized cis-regulatory elements. These results provide a good starting point for further experimental analysis of plant seed-specific promoters and our methodology can be used to unravel more transcriptional regulatory mechanisms in plants and other eukaryotes. PMID:19843335
Transcriptional regulation of human eosinophil RNases by an evolutionary-conserved sequence motif in primate genome

PubMed Central

Wang, Hsiu-Yu; Chang, Hao-Teng; Pai, Tun-Wen; Wu, Chung-I; Lee, Yuan-Hung; Chang, Yen-Hsin; Tai, Hsiu-Ling; Tang, Chuan-Yi; Chou, Wei-Yao; Chang, Margaret Dah-Tsyr

2007-01-01

Background Human eosinophil-derived neurotoxin (edn) and eosinophil cationic protein (ecp) are members of a subfamily of primate ribonuclease (rnase) genes. Although they were generated by a gene duplication event, distinct edn and ecp expression profiles in various tissues have been reported. Results In this study, we obtained the upstream promoter sequences of several representative primate eosinophil rnases. Bioinformatic analysis revealed the presence of a shared 34-nucleotide (nt) sequence stretch located at -81 to -48 in all edn promoters and the macaque ecp promoter. Such a unique sequence motif constituted a region essential for transactivation of human edn in hepatocellular carcinoma cells. Gel electrophoretic mobility shift assay, transient transfection and scanning mutagenesis experiments allowed us to identify binding sites for two transcription factors, Myc-associated zinc finger protein (MAZ) and SV-40 protein-1 (Sp1), within the 34-nt segment. Subsequent in vitro and in vivo binding assays demonstrated a direct molecular interaction between this 34-nt region and MAZ and Sp1. Interestingly, overexpression of MAZ and Sp1 respectively repressed and enhanced edn promoter activity. The regulatory transactivation motif was mapped to the evolutionarily conserved -74/-65 region of the edn promoter, which was guanine-rich and critical for recognition by both transcription factors. Conclusion Our results provide the first direct evidence that MAZ and Sp1 play important roles in the transcriptional activation of the human edn promoter through specific binding to a 34-nt segment present in representative primate eosinophil rnase promoters. PMID:17927842
Characterization of cyclo-Acetoacetyl-L-Tryptophan Dimethylallyltransferase in Cyclopiazonic Acid Biosynthesis: Substrate Promiscuity and Site Directed Mutagenesis Studies

PubMed Central

Liu, Xinyu; Walsh, Christopher T.

2009-01-01

The fungal neurotoxin α-cyclopiazonic acid (CPA), a nanomolar inhibitor of Ca2+-ATPase with a unique pentacyclic indole tetramic acid scaffold, is assembled by a three-enzyme pathway, CpaS, CpaD and CpaO, in Aspergillus sp. We recently characterized the first pathway-specific enzyme CpaS, a hybrid two-module polyketide synthase-nonribosomal peptide synthetase (PKS-NRPS) that generates cyclo-acetoacetyl-L-tryptophan (cAATrp). Here we report the characterization of the second pathway-specific enzyme CpaD that regiospecifically dimethylallylates cAATrp to form β-cyclopiazonic acid. By exploring the tryptophan and tetramate moieties of cAATrp, we demonstrate that CpaD discriminates against free Trp but accepts tryptophan-containing thiohydantoins, diketopiperazines and linear peptides as substrates for C4-prenylation and also acts as a regiospecific O-dimethylallyltransferase (DMAT) on a tyrosine-derived tetramic acid. Comparative evaluation of CpaDs from A. oryzae RIB40 and A. flavus NRRL3357 indicated the importance of the N-terminal region for its activity. Sequence alignment of CpaD with eleven homologous fungal Trp-DMATs revealed five regions of conservation, suggesting the presence of critical motifs that could be diagnostic for discovering additional Trp-DMATs. Subsequent site-directed mutagenesis studies identified five polar/charged residues and five tyrosine residues within these motifs that are critical for CpaD activity. This motif characterization will enable a gene probe-based approach to discover additional biosynthetic Trp-DMATs. PMID:19877600

DLocalMotif: a discriminative approach for discovering local motifs in protein sequences.

PubMed

Mehdi, Ahmed M; Sehgal, Muhammad Shoaib B; Kobe, Bostjan; Bailey, Timothy L; Bodén, Mikael

2013-01-01

Local motifs are patterns of DNA or protein sequences that occur within a sequence interval relative to a biologically defined anchor or landmark. Current protein motif discovery methods do not adequately consider such constraints to identify biologically significant motifs that are only weakly over-represented but spatially confined. Using negatives, i.e. sequences known to not contain a local motif, can further increase the specificity of their discovery. This article introduces the method DLocalMotif that makes use of positional information and negative data for local motif discovery in protein sequences. DLocalMotif combines three scoring functions, measuring degrees of motif over-representation, entropy and spatial confinement, specifically designed to discriminatively exploit the availability of negative data. The method is shown to outperform current methods that use only a subset of these motif characteristics. We apply the method to several biological datasets. The analysis of peroxisomal targeting signals uncovers several novel motifs that occur immediately upstream of the dominant peroxisomal targeting signal-1 signal. The analysis of proline-tyrosine nuclear localization signals uncovers multiple novel motifs that overlap with C2H2 zinc finger domains. We also evaluate the method on classical nuclear localization signals and endoplasmic reticulum retention signals and find that DLocalMotif successfully recovers biologically relevant sequence properties. http://bioinf.scmb.uq.edu.au/dlocalmotif/
The heptanucleotide motif GAGACGC is a key component of a cis-acting promoter element that is critical for SnSAG1 expression in Sarcocystis neurona.

PubMed

Gaji, Rajshekhar Y; Howe, Daniel K

2009-07-01

The apicomplexan parasite Sarcocystis neurona undergoes a complex process of intracellular development, during which many genes are temporally regulated. The described study was undertaken to begin identifying the basic promoter elements that control gene expression in S. neurona. Sequence analysis of the 5'-flanking region of five S. neurona genes revealed a conserved heptanucleotide motif GAGACGC that is similar to the WGAGACG motif described upstream of multiple genes in Toxoplasma gondii. The promoter region for the major surface antigen gene SnSAG1, which contains three heptanucleotide motifs within 135 bases of the transcription start site, was dissected by functional analysis using a dual luciferase reporter assay. These analyses revealed that a minimal promoter fragment containing all three motifs was sufficient to drive reporter molecule expression, with the presence and orientation of the 5'-most heptanucleotide motif being absolutely critical for promoter function. Further studies should help to identify additional sequence elements important for promoter function and for controlling gene expression during intracellular development by this apicomplexan pathogen.
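As a rough illustration of the sequence analysis described in this abstract, the sketch below scans a DNA string for the heptanucleotide GAGACGC and reports both position and orientation, since the abstract notes that orientation matters for promoter function. It is a minimal sketch and not the authors' pipeline: the function names and the toy sequence are hypothetical, and the input is assumed to contain only A, C, G and T.

```python
# Illustrative sketch only; not the analysis code used in the study.
MOTIF = "GAGACGC"  # heptanucleotide motif named in the abstract

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str = MOTIF):
    """Return (0-based start, strand) for each motif occurrence.

    "+" means the motif itself is present at that position; "-" means its
    reverse complement is present (the motif lies on the opposite strand).
    """
    seq = seq.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


# Hypothetical toy sequence, not SnSAG1 promoter data.
toy = "TTGAGACGCAAGCGTCTCTT"
print(find_motif(toy))
```

On the toy sequence this reports one forward-strand and one reverse-strand occurrence; a real promoter scan would additionally record each hit's distance from the transcription start site.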
Histidine pairing at the metal transport site of mammalian ZnT transporters controls Zn2+ over Cd2+ selectivity.

PubMed

Hoch, Eitan; Lin, Wei; Chai, Jin; Hershfinkel, Michal; Fu, Dax; Sekler, Israel

2012-05-08

Zinc and cadmium are similar metal ions, but though Zn(2+) is an essential nutrient, Cd(2+) is a toxic and common pollutant linked to multiple disorders. Faster body turnover and ubiquitous distribution of Zn(2+) vs. Cd(2+) suggest that a mammalian metal transporter distinguishes between these metal ions. We show that the mammalian metal transporters, ZnTs, mediate cytosolic and vesicular Zn(2+) transport, but reject Cd(2+), thus constituting the first mammalian metal transporter with a refined selectivity against Cd(2+). Remarkably, the bacterial ZnT ortholog, YiiP, does not discriminate between Zn(2+) and Cd(2+). A phylogenetic comparison between the tetrahedral metal transport motif of YiiP and ZnTs identifies a histidine at the mammalian site that is critical for metal selectivity. Residue swapping at this position abolished metal selectivity of ZnTs, and fully reconstituted selective Zn(2+) transport of YiiP. Finally, we show that metal selectivity evolves through a reduction in binding but not the translocation of Cd(2+) by the transporter. Thus, our results identify a unique class of mammalian transporters and the structural motif required to discriminate between Zn(2+) and Cd(2+), and show that metal selectivity is tuned by a coordination-based mechanism that raises the thermodynamic barrier to Cd(2+) binding.
Histidine pairing at the metal transport site of mammalian ZnT transporters controls Zn2+ over Cd2+ selectivity

PubMed Central

Hoch, Eitan; Lin, Wei; Chai, Jin; Hershfinkel, Michal; Fu, Dax; Sekler, Israel

2012-01-01

Zinc and cadmium are similar metal ions, but though Zn2+ is an essential nutrient, Cd2+ is a toxic and common pollutant linked to multiple disorders. Faster body turnover and ubiquitous distribution of Zn2+ vs. Cd2+ suggest that a mammalian metal transporter distinguishes between these metal ions. We show that the mammalian metal transporters, ZnTs, mediate cytosolic and vesicular Zn2+ transport, but reject Cd2+, thus constituting the first mammalian metal transporter with a refined selectivity against Cd2+. Remarkably, the bacterial ZnT ortholog, YiiP, does not discriminate between Zn2+ and Cd2+. A phylogenetic comparison between the tetrahedral metal transport motif of YiiP and ZnTs identifies a histidine at the mammalian site that is critical for metal selectivity. Residue swapping at this position abolished metal selectivity of ZnTs, and fully reconstituted selective Zn2+ transport of YiiP. Finally, we show that metal selectivity evolves through a reduction in binding but not the translocation of Cd2+ by the transporter. Thus, our results identify a unique class of mammalian transporters and the structural motif required to discriminate between Zn2+ and Cd2+, and show that metal selectivity is tuned by a coordination-based mechanism that raises the thermodynamic barrier to Cd2+ binding. PMID:22529353
Motif discovery and motif finding from genome-mapped DNase footprint data.

PubMed

Kulakovskiy, Ivan V; Favorov, Alexander V; Makeev, Vsevolod J

2009-09-15

Footprint data is an important source of information on transcription factor recognition motifs. However, a footprinting fragment can contain no sequences similar to known protein recognition sites. Inspection of genome fragments nearby can help to identify missing site positions. Genome fragments containing footprints were supplied to a pipeline that constructed a position weight matrix (PWM) for different motif lengths and selected the optimal PWM. Fragments were aligned with the SeSiMCMC sampler and a new heuristic algorithm, Bigfoot. Footprints with missing hits were found for approximately 50% of factors. Adding only 2 bp on both sides of a footprinting fragment recovered most hits. We automatically constructed motifs for 41 Drosophila factors. New motifs can recognize footprints with a greater sensitivity at the same false positive rate than existing models. Also we discuss possible overfitting of constructed motifs. Software and the collection of regulatory motifs are freely available at http://line.imb.ac.ru/DMMPMM.
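The two ideas in this abstract, building a position weight matrix (PWM) from aligned sites and extending a footprint fragment by a few base pairs to recover a missing hit, can be sketched as follows. This is a simplified illustration under stated assumptions: the sites are already aligned and of equal length, the background is uniform, a pseudocount of 1 is used, and the function names and toy data are hypothetical. It does not reproduce the SeSiMCMC sampler or the Bigfoot algorithm.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds PWM (one dict per column) from equal-length aligned sites."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(width):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos].upper()] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in "ACGT"})
    return pwm


def best_hit(pwm, seq):
    """Return (score, start) of the best-scoring window, or None if seq is too short."""
    seq = seq.upper()
    width = len(pwm)
    best = None
    for i in range(len(seq) - width + 1):
        score = sum(pwm[j][seq[i + j]] for j in range(width))
        if best is None or score > best[0]:
            best = (score, i)
    return best


def extended_fragment(genome, start, end, flank=2):
    """Slice genome[start:end] widened by `flank` bases on each side."""
    return genome[max(0, start - flank):min(len(genome), end + flank)]


# Hypothetical toy data: four aligned sites and a short "genome".
sites = ["TGACTCA", "TGACTCA", "TGAGTCA", "TGACTAA"]
genome = "CCCCTGACTCAGGGG"
pwm = build_pwm(sites)
print(best_hit(pwm, genome[6:11]))                      # footprint shorter than the motif
print(best_hit(pwm, extended_fragment(genome, 6, 11)))  # widened by 2 bp per side
```

In the toy case the annotated footprint is shorter than the motif, so no window can be scored until the fragment is widened, which mirrors the abstract's observation that a small extension recovers most missing hits.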
A systems wide mass spectrometric based linear motif screen to identify dominant in-vivo interacting proteins for the ubiquitin ligase MDM2.

PubMed

Nicholson, Judith; Scherl, Alex; Way, Luke; Blackburn, Elizabeth A; Walkinshaw, Malcolm D; Ball, Kathryn L; Hupp, Ted R

2014-06-01

Linear motifs mediate protein-protein interactions (PPI) that allow expansion of a target protein interactome at a systems level. This study uses a proteomics approach and linear motif sub-stratifications to expand on PPIs of MDM2. MDM2 is a multi-functional protein with over one hundred known binding partners not stratified by hierarchy or function. A new linear motif based on a MDM2 interaction consensus is used to select novel MDM2 interactors based on Nutlin-3 responsiveness in a cell-based proteomics screen. MDM2 binds a subset of peptide motifs corresponding to real proteins with a range of allosteric responses to MDM2 ligands. We validate cyclophilin B as a novel protein with a consensus MDM2 binding motif that is stabilised by Nutlin-3 in vivo, thus identifying one of the few known interactors of MDM2 that is stabilised by Nutlin-3. These data invoke two modes of peptide binding at the MDM2 N-terminus that rely on a consensus core motif to control the equilibrium between MDM2 binding proteins. This approach stratifies MDM2 interacting proteins based on the linear motif feature and provides a new biomarker assay to define clinically relevant Nutlin-3 responsive MDM2 interactors.
Do motifs reflect evolved function?--No convergent evolution of genetic regulatory network subgraph topologies.

PubMed

Knabe, Johannes F; Nehaniv, Chrystopher L; Schilstra, Maria J

2008-01-01

Methods that analyse the topological structure of networks have recently become quite popular. Whether motifs (subgraph patterns that occur more often than in randomized networks) have specific functions as elementary computational circuits has been cause for debate. As the question is difficult to resolve with currently available biological data, we approach the issue using networks that abstractly model natural genetic regulatory networks (GRNs) which are evolved to show dynamical behaviors. Specifically one group of networks was evolved to be capable of exhibiting two different behaviors ("differentiation") in contrast to a group with a single target behavior. In both groups we find motif distribution differences within the groups to be larger than differences between them, indicating that evolutionary niches (target functions) do not necessarily mold network structure uniquely. These results show that variability operators can have a stronger influence on network topologies than selection pressures, especially when many topologies can create similar dynamics. Moreover, analysis of motif functional relevance by lesioning did not suggest that motifs were of greater importance to the functioning of the network than arbitrary subgraph patterns. Only when drastically restricting network size, so that one motif corresponds to a whole functionally evolved network, was preference for particular connection patterns found. This suggests that in non-restricted, bigger networks, entanglement with the rest of the network hinders topological subgraph analysis.
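The abstract defines motifs as subgraph patterns that occur more often than in randomized networks. A minimal sketch of that comparison, for one pattern only (the feed-forward loop), is given below. The null model here is an assumption chosen for brevity: random directed graphs with the same nodes and edge count, without self-loops. Published motif analyses usually use degree-preserving randomization instead, and nothing here reproduces the evolved GRN models of the study; all names and the toy network are hypothetical.

```python
import random
import statistics
from itertools import permutations


def count_ffl(edges):
    """Count feed-forward loops: a->b, b->c and a->c with a, b, c distinct."""
    edge_set = set(edges)
    nodes = {n for edge in edge_set for n in edge}
    return sum(
        1
        for a, b, c in permutations(nodes, 3)
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set
    )


def random_network(nodes, n_edges, rng):
    """Random directed graph on `nodes` with `n_edges` edges and no self-loops."""
    possible = [(a, b) for a in nodes for b in nodes if a != b]
    return rng.sample(possible, n_edges)


def ffl_zscore(edges, n_random=200, seed=0):
    """Return (observed count, null mean, z-score) for feed-forward loops."""
    rng = random.Random(seed)
    nodes = sorted({n for edge in edges for n in edge})
    n_edges = len(set(edges))
    observed = count_ffl(edges)
    null = [count_ffl(random_network(nodes, n_edges, rng)) for _ in range(n_random)]
    mean = statistics.mean(null)
    sd = statistics.pstdev(null)
    z = (observed - mean) / sd if sd > 0 else float("nan")
    return observed, mean, z


# Hypothetical toy network with a single feed-forward loop (a, b, c).
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
print(ffl_zscore(toy_edges))
```

The cubic-time triple enumeration is adequate only for small networks; it is used here because it makes the definition of the pattern explicit.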
Unique genome organization of non-mammalian papillomaviruses provides insights into the evolution of viral early proteins

PubMed Central

Ruoppolo, Valeria; Schmidt, Annie; Lescroël, Amelie; Jongsomjit, Dennis; Elrod, Megan; Kraberger, Simona; Stainton, Daisy; Dugger, Katie M; Ballard, Grant; Ainley, David G

2017-01-01

Abstract The family Papillomaviridae contains more than 320 papillomavirus types, with most having been identified as infecting skin and mucosal epithelium in mammalian hosts. To date, only nine non-mammalian papillomaviruses have been described from birds (n = 5), a fish (n = 1), a snake (n = 1), and turtles (n = 2). The identification of papillomaviruses in sauropsids and a sparid fish suggests that early ancestors of papillomaviruses were already infecting the earliest Euteleostomi. The Euteleostomi clade includes more than 90 per cent of the living vertebrate species, and progeny virus could have been passed on to all members of this clade, inhabiting virtually every habitat on the planet. As part of this study, we isolated a novel papillomavirus from a 16-year-old female Adélie penguin (Pygoscelis adeliae) from Cape Crozier, Ross Island (Antarctica). The new papillomavirus shares ∼64 per cent genome-wide identity to a previously described Adélie penguin papillomavirus. Phylogenetic analyses show that the non-mammalian viruses (except the python, Morelia spilota, associated papillomavirus) cluster near the base of the papillomavirus evolutionary tree. A papillomavirus isolated from an avian host (Northern fulmar; Fulmarus glacialis), like the two turtle papillomaviruses, lacks a putative E9 protein that is found in all other avian papillomaviruses. Furthermore, the Northern fulmar papillomavirus has an E7 more similar to the mammalian viruses than the other avian papillomaviruses. Typical E6 proteins of mammalian papillomaviruses have two Zinc finger motifs, whereas the sauropsid papillomaviruses only have one such motif. Furthermore, this motif is absent in the fish papillomavirus. Thus, it is highly likely that the most recent common ancestor of the mammalian and sauropsid papillomaviruses had a single motif E6. It appears that a motif duplication resulted in mammalian papillomaviruses having a double Zinc finger motif in E6.
We estimated the divergence time between Northern fulmar-associated papillomavirus and the other Sauropsid papillomaviruses to be around 250 million years ago, during the Paleozoic-Mesozoic transition, and our analysis dates the root of the papillomavirus tree between 400 and 600 million years ago. Our analysis shows evidence for niche adaptation and that these non-mammalian viruses have highly divergent E6 and E7 proteins, providing insights into the evolution of the early viral (onco-)proteins. PMID:29026649
Unique genome organization of non-mammalian papillomaviruses provides insights into the evolution of viral early proteins

USGS Publications Warehouse

Van Doorslaer, Koenraad; Ruoppolo, Valeria; Schmidt, Annie; Lescroël, Amelie; Jongsomjit, Dennis; Elrod, Megan; Kraberger, Simona; Stainton, Daisy; Dugger, Katie M.; Ballard, Grant; Ainley, David G.; Varsani, Arvind

2017-01-01

The family Papillomaviridae contains more than 320 papillomavirus types, with most having been identified as infecting skin and mucosal epithelium in mammalian hosts. To date, only nine non-mammalian papillomaviruses have been described from birds (n = 5), a fish (n = 1), a snake (n = 1), and turtles (n = 2). The identification of papillomaviruses in sauropsids and a sparid fish suggests that early ancestors of papillomaviruses were already infecting the earliest Euteleostomi. The Euteleostomi clade includes more than 90 per cent of the living vertebrate species, and progeny virus could have been passed on to all members of this clade, inhabiting virtually every habitat on the planet. As part of this study, we isolated a novel papillomavirus from a 16-year-old female Adélie penguin (Pygoscelis adeliae) from Cape Crozier, Ross Island (Antarctica). The new papillomavirus shares ∼64 per cent genome-wide identity to a previously described Adélie penguin papillomavirus. Phylogenetic analyses show that the non-mammalian viruses (except the python, Morelia spilota, associated papillomavirus) cluster near the base of the papillomavirus evolutionary tree. A papillomavirus isolated from an avian host (Northern fulmar; Fulmarus glacialis), like the two turtle papillomaviruses, lacks a putative E9 protein that is found in all other avian papillomaviruses. Furthermore, the Northern fulmar papillomavirus has an E7 more similar to the mammalian viruses than the other avian papillomaviruses. Typical E6 proteins of mammalian papillomaviruses have two Zinc finger motifs, whereas the sauropsid papillomaviruses only have one such motif. Furthermore, this motif is absent in the fish papillomavirus. Thus, it is highly likely that the most recent common ancestor of the mammalian and sauropsid papillomaviruses had a single motif E6. It appears that a motif duplication resulted in mammalian papillomaviruses having a double Zinc finger motif in E6.
We estimated the divergence time between Northern fulmar-associated papillomavirus and the other Sauropsid papillomaviruses to be around 250 million years ago, during the Paleozoic-Mesozoic transition, and our analysis dates the root of the papillomavirus tree between 400 and 600 million years ago. Our analysis shows evidence for niche adaptation and that these non-mammalian viruses have highly divergent E6 and E7 proteins, providing insights into the evolution of the early viral (onco-)proteins.
Unique genome organization of non-mammalian papillomaviruses provides insights into the evolution of viral early proteins.

PubMed

Van Doorslaer, Koenraad; Ruoppolo, Valeria; Schmidt, Annie; Lescroël, Amelie; Jongsomjit, Dennis; Elrod, Megan; Kraberger, Simona; Stainton, Daisy; Dugger, Katie M; Ballard, Grant; Ainley, David G; Varsani, Arvind

2017-07-01

The family Papillomaviridae contains more than 320 papillomavirus types, with most having been identified as infecting skin and mucosal epithelium in mammalian hosts. To date, only nine non-mammalian papillomaviruses have been described from birds (n = 5), a fish (n = 1), a snake (n = 1), and turtles (n = 2). The identification of papillomaviruses in sauropsids and a sparid fish suggests that early ancestors of papillomaviruses were already infecting the earliest Euteleostomi. The Euteleostomi clade includes more than 90 per cent of the living vertebrate species, and progeny virus could have been passed on to all members of this clade, inhabiting virtually every habitat on the planet. As part of this study, we isolated a novel papillomavirus from a 16-year-old female Adélie penguin (Pygoscelis adeliae) from Cape Crozier, Ross Island (Antarctica). The new papillomavirus shares ∼64 per cent genome-wide identity to a previously described Adélie penguin papillomavirus. Phylogenetic analyses show that the non-mammalian viruses (except the python, Morelia spilota, associated papillomavirus) cluster near the base of the papillomavirus evolutionary tree. A papillomavirus isolated from an avian host (Northern fulmar; Fulmarus glacialis), like the two turtle papillomaviruses, lacks a putative E9 protein that is found in all other avian papillomaviruses. Furthermore, the Northern fulmar papillomavirus has an E7 more similar to the mammalian viruses than the other avian papillomaviruses. Typical E6 proteins of mammalian papillomaviruses have two Zinc finger motifs, whereas the sauropsid papillomaviruses only have one such motif. Furthermore, this motif is absent in the fish papillomavirus. Thus, it is highly likely that the most recent common ancestor of the mammalian and sauropsid papillomaviruses had a single motif E6. It appears that a motif duplication resulted in mammalian papillomaviruses having a double Zinc finger motif in E6.
We estimated the divergence time between Northern fulmar-associated papillomavirus and the other Sauropsid papillomaviruses to be around 250 million years ago, during the Paleozoic-Mesozoic transition, and our analysis dates the root of the papillomavirus tree between 400 and 600 million years ago. Our analysis shows evidence for niche adaptation and that these non-mammalian viruses have highly divergent E6 and E7 proteins, providing insights into the evolution of the early viral (onco-)proteins.
Comparative genomics of metabolic capacities of regulons controlled by cis-regulatory RNA motifs in bacteria.

PubMed

Sun, Eric I; Leyn, Semen A; Kazanov, Marat D; Saier, Milton H; Novichkov, Pavel S; Rodionov, Dmitry A

2013-09-02

In silico comparative genomics approaches have been efficiently used for functional prediction and reconstruction of metabolic and regulatory networks. Riboswitches are metabolite-sensing structures often found in bacterial mRNA leaders controlling gene expression on transcriptional or translational levels. An increasing number of riboswitches and other cis-regulatory RNAs have been recently classified into numerous RNA families in the Rfam database. High conservation of these RNA motifs provides a unique advantage for their genomic identification and comparative analysis. A comparative genomics approach implemented in the RegPredict tool was used for reconstruction and functional annotation of regulons controlled by RNAs from 43 Rfam families in diverse taxonomic groups of Bacteria. The inferred regulons include ~5200 cis-regulatory RNAs and more than 12000 target genes in 255 microbial genomes. All predicted RNA-regulated genes were classified into specific and overall functional categories. Analysis of taxonomic distribution of these categories allowed us to establish major functional preferences for each analyzed cis-regulatory RNA motif family. Overall, most RNA motif regulons showed predictable functional content in accordance with their experimentally established effector ligands. Our results suggest that some RNA motifs (including thiamin pyrophosphate and cobalamin riboswitches that control the cofactor metabolism) are widespread and likely originated from the last common ancestor of all bacteria. However, many more analyzed RNA motifs are restricted to a narrow taxonomic group of bacteria and likely represent more recent evolutionary innovations. The reconstructed regulatory networks for major known RNA motifs substantially expand the existing knowledge of transcriptional regulation in bacteria. The inferred regulons can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and evolutionary analysis.
The obtained genome-wide collection of reference RNA motif regulons is available in the RegPrecise database (http://regprecise.lbl.gov/).
Top surface blade residues and the central channel water molecules are conserved in every repeat of the integrin-like β-propeller structures.

PubMed

Denesyuk, Alexander; Denessiouk, Konstantin; Johnson, Mark S

2018-02-01

An integrin-like β-propeller domain contains seven repeats of a four-stranded antiparallel β-sheet motif (blades). Previously we described a 3D structural motif within each blade of the integrin-type β-propeller. Here, we show unique structural links that join different blades of the β-propeller structure, which together with the structural motif for a single blade are repeated in a β-propeller to provide the functional top face of the barrel, found to be involved in protein-protein interactions and substrate recognition. We compare functional top face diagrams of the integrin-type β-propeller domain and two non-integrin type β-propeller domains of virginiamycin B lyase and WD Repeat-Containing Protein 5.
Identification of cancer-specific motifs in mimotope profiles of serum antibody repertoire.

PubMed

Gerasimov, Ekaterina; Zelikovsky, Alex; Măndoiu, Ion; Ionov, Yurij

2017-06-07

For fighting cancer, earlier detection is crucial. Circulating auto-antibodies produced by the patient's own immune system after exposure to cancer proteins are promising bio-markers for the early detection of cancer. Since an antibody recognizes not the whole antigen but 4-7 critical amino acids within the antigenic determinant (epitope), the whole proteome can be represented by a random peptide phage display library. This opens the possibility to develop an early cancer detection test based on a set of peptide sequences identified by comparing cancer patients' and healthy donors' global peptide profiles of antibody specificities. Due to the enormously large number of peptide sequences contained in global peptide profiles generated by next generation sequencing, a large number of cancer and control sera is required to identify cancer-specific peptides with a high degree of statistical significance. To decrease the number of peptides in profiles generated by next generation sequencing without losing cancer-specific sequences, we generated the profiles from a phage library enriched by panning on a pool of cancer sera. To further decrease the complexity of profiles, we used computational methods to transform the list of peptides constituting the mimotope profiles into a list of motifs formed by similar peptide sequences. We have shown that the amino-acid order is meaningful in mimotope motifs, since they contain significantly more peptides than motifs among peptides where amino acids are randomly permuted. Also, the single-sample motifs differ significantly from motifs in peptides drawn from multiple samples. Finally, multiple cancer-specific motifs have been identified.
Unique secreted–surface protein complex of Lactobacillus rhamnosus, identified by phage display

PubMed Central

Gagic, Dragana; Wen, Wesley; Collett, Michael A; Rakonjac, Jasna

2013-01-01

Proteins are the most diverse structures on bacterial surfaces; hence, they are candidates for species- and strain-specific interactions of bacteria with the host, environment, and other microorganisms. Genomics has decoded thousands of bacterial surface and secreted proteins, yet the function of most cannot be predicted because of the enormous variability and a lack of experimental data that would allow deduction of function through homology. Here, we used phage display to identify a pair of interacting extracellular proteins in the probiotic bacterium Lactobacillus rhamnosus HN001. A secreted protein, SpcA, containing two bacterial immunoglobulin-like domains type 3 (Big-3) and a domain distantly related to plant pathogen response domain 1 (PR-1-like) was identified by screening of an L. rhamnosus HN001 library using HN001 cells as bait. The SpcA-“docking” protein, SpcB, was in turn detected by another phage display library screening, using purified SpcA as bait. SpcB is a 3275-residue cell-surface protein that contains general features of large glycosylated Serine-rich adhesins/fibrils from gram-positive bacteria, including the hallmark signal sequence motif KxYKxGKxW. Both proteins are encoded by genes within a L. rhamnosus-unique gene cluster that distinguishes this species from other lactobacilli. To our knowledge, this is the first example of a secreted-docking protein pair identified in lactobacilli. PMID:23233310
Identification of GATC- and CCGG- recognizing Type II REases and their putative specificity-determining positions using Scan2S—a novel motif scan algorithm with optional secondary structure constraints

PubMed Central

Niv, Masha Y.; Skrabanek, Lucy; Roberts, Richard J.; Scheraga, Harold A.; Weinstein, Harel

2008-01-01

Restriction endonucleases (REases) are DNA-cleaving enzymes that have become indispensable tools in molecular biology. Type II REases are highly divergent in sequence despite their common structural core, function and, in some cases, common specificities towards DNA sequences. This makes it difficult to identify and classify them functionally based on sequence, and has hampered the efforts of specificity-engineering. Here, we define novel REase sequence motifs, which extend beyond the PD-(D/E)XK hallmark, and incorporate secondary structure information. The automated search using these motifs is carried out with a newly developed fast regular expression matching algorithm that accommodates long patterns with optional secondary structure constraints. Using this new tool, named Scan2S, motifs derived from REases with specificity towards GATC- and CCGG-containing DNA sequences successfully identify REases of the same specificity. Notably, some of these sequences are not identified by standard sequence detection tools. The new motifs highlight potential specificity-determining positions that do not fully overlap for the GATC- and the CCGG-recognizing REases and are candidates for specificity re-engineering. PMID:17972284
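The idea of a regular-expression scan with an optional secondary structure constraint can be illustrated with a minimal Python sketch. This is not the Scan2S implementation: the pattern below is a simplified stand-in for the PD-(D/E)XK hallmark, the 5-30 residue spacer is an assumed range, and the function name `scan`, the H/E/C structure alphabet and the rule that the "PD" dipeptide must lie in a given state are all hypothetical choices made for illustration.

```python
import re

# Simplified stand-in for the PD-(D/E)XK hallmark: "PD", a spacer of
# 5-30 residues (assumed range, not from the paper), then (D/E)-X-K.
PD_DEXK = re.compile(r"PD(.{5,30}?)([DE].K)")


def scan(sequence, structure=None, required_state="E"):
    """Return (start, end) spans of non-overlapping pattern matches.

    If `structure` is given (a string of the same length using a
    hypothetical H/E/C alphabet), a hit is kept only when the "PD"
    dipeptide lies entirely in `required_state`.
    """
    if structure is not None and len(structure) != len(sequence):
        raise ValueError("sequence and structure must have equal length")
    hits = []
    for match in PD_DEXK.finditer(sequence):
        if structure is not None:
            # Optional secondary structure constraint on the first two residues.
            if structure[match.start():match.start() + 2] != required_state * 2:
                continue
        hits.append((match.start(), match.end()))
    return hits


if __name__ == "__main__":
    seq = "MKPDAAAAAAEVKLL"
    print(scan(seq))                     # sequence-only scan
    print(scan(seq, "CCEECCCCCCCCCCC"))  # with the structure constraint
```

Making the structure argument optional keeps the sequence-only scan and the constrained scan in one function, which loosely mirrors the "optional secondary structure constraints" described in the abstract.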
Identification of GATC- and CCGG-recognizing Type II REases and their putative specificity-determining positions using Scan2S--a novel motif scan algorithm with optional secondary structure constraints.

PubMed

Niv, Masha Y; Skrabanek, Lucy; Roberts, Richard J; Scheraga, Harold A; Weinstein, Harel

2008-05-01

Restriction endonucleases (REases) are DNA-cleaving enzymes that have become indispensable tools in molecular biology. Type II REases are highly divergent in sequence despite their common structural core, function and, in some cases, common specificities towards DNA sequences. This makes it difficult to identify and classify them functionally based on sequence, and has hampered the efforts of specificity-engineering. Here, we define novel REase sequence motifs, which extend beyond the PD-(D/E)XK hallmark, and incorporate secondary structure information. The automated search using these motifs is carried out with a newly developed fast regular expression matching algorithm that accommodates long patterns with optional secondary structure constraints. Using this new tool, named Scan2S, motifs derived from REases with specificity towards GATC- and CCGG-containing DNA sequences successfully identify REases of the same specificity. Notably, some of these sequences are not identified by standard sequence detection tools. The new motifs highlight potential specificity-determining positions that do not fully overlap for the GATC- and the CCGG-recognizing REases and are candidates for specificity re-engineering.
NoFold: RNA structure clustering without folding or alignment.

PubMed

Middleton, Sarah A; Kim, Junhyong

2014-11-01

Structures that recur across multiple different transcripts, called structure motifs, often perform a similar function, for example, recruiting a specific RNA-binding protein that then regulates translation, splicing, or subcellular localization. Identifying common motifs between coregulated transcripts may therefore yield significant insight into their binding partners and mechanism of regulation. However, as most methods for clustering structures are based on folding individual sequences or doing many pairwise alignments, this results in a tradeoff between speed and accuracy that can be problematic for large-scale data sets. Here we describe a novel method for comparing and characterizing RNA secondary structures that does not require folding or pairwise alignment of the input sequences. Our method uses the idea of constructing a distance function between two objects by their respective distances to a collection of empirical examples or models, which in our case consists of 1973 Rfam family covariance models. Using this as a basis for measuring structural similarity, we developed a clustering pipeline called NoFold to automatically identify and annotate structure motifs within large sequence data sets. We demonstrate that NoFold can simultaneously identify multiple structure motifs with an average sensitivity of 0.80 and precision of 0.98 and generally exceeds the performance of existing methods. We also perform a cross-validation analysis of the entire set of Rfam families, achieving an average sensitivity of 0.57. We apply NoFold to identify motifs enriched in dendritically localized transcripts and report 213 enriched motifs, including both known and novel structures. © 2014 Middleton and Kim; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
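The distance construction mentioned in the abstract (comparing two objects through their scores against a shared collection of reference models) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the score vectors are made-up numbers, plain Euclidean distance is an assumed choice, and the function name `empirical_distance` is hypothetical. It does not reproduce the NoFold pipeline or its scoring against covariance models.

```python
import math


def empirical_distance(scores_a, scores_b):
    """Euclidean distance between two score vectors.

    Each vector holds one object's scores against the same ordered
    collection of reference models (hypothetical values here).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both objects must be scored against the same models")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(scores_a, scores_b)))


if __name__ == "__main__":
    # Made-up scores of three sequences against four reference models.
    seq1 = [12.0, -3.0, 0.5, 7.0]
    seq2 = [11.5, -2.5, 0.0, 6.0]
    seq3 = [-4.0, 9.0, 8.0, -1.0]
    # seq1 and seq2 have similar score profiles, so they end up close together.
    print(empirical_distance(seq1, seq2))
    print(empirical_distance(seq1, seq3))
```

Because every sequence is scored against the models once, comparing any two sequences afterwards needs no folding and no pairwise alignment, only arithmetic on their score vectors.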
Identification and preliminary characterization of a protein motif related to the zinc finger.

PubMed Central

Lovering, R; Hanson, I M; Borden, K L; Martin, S; O'Reilly, N J; Evan, G I; Rahman, D; Pappin, D J; Trowsdale, J; Freemont, P S

1993-01-01

We have identified a protein motif, related to the zinc finger, which defines a newly discovered family of proteins. The motif was found in the sequence of the human RING1 gene, which is proximal to the major histocompatibility complex region on chromosome six. We propose naming this motif the "RING finger" and it is found in 27 proteins, all of which have putative DNA binding functions. We have synthesized a peptide corresponding to the RING1 motif and examined a number of properties, including metal and DNA binding. We provide evidence to support the suggestion that the RING finger motif is the DNA binding domain of this newly defined family of proteins. PMID:7681583
Genome Wide Identification, Evolutionary, and Expression Analysis of VQ Genes from Two Pyrus Species.

PubMed

Cao, Yunpeng; Meng, Dandan; Abdullah, Muhammad; Jin, Qing; Lin, Yi; Cai, Yongping

2018-04-23

The VQ motif-containing gene, a member of the plant-specific genes, is involved in the plant developmental process and various stress responses. The VQ motif-containing gene family has been studied in several plants, such as rice (Oryza sativa), maize (Zea mays), and Arabidopsis (Arabidopsis thaliana). However, no systematic study has been performed in Pyrus species, which have important economic value. In our study, we identified 41 and 28 VQ motif-containing genes in Pyrus bretschneideri and Pyrus communis, respectively. Phylogenetic trees were calculated using A. thaliana and O. sativa VQ motif-containing genes as a template, allowing us to categorize these genes into nine subfamilies. Thirty-two and eight paralogous VQ motif-containing genes were found in P. bretschneideri and P. communis, respectively, showing that the VQ motif-containing genes had a more remarkable expansion in P. bretschneideri than in P. communis. A total of 31 orthologous pairs were identified from the P. bretschneideri and P. communis VQ motif-containing genes. Additionally, among the paralogs, we found that these duplication gene pairs probably derived from segmental duplication/whole-genome duplication (WGD) events in the genomes of P. bretschneideri and P. communis, respectively. The gene expression profiles in both P. bretschneideri and P. communis fruits suggested functional redundancy for some orthologous gene pairs derived from a common ancestry, and sub-functionalization or neo-functionalization for some of them. Our study provided the first systematic evolutionary analysis of the VQ motif-containing genes in Pyrus, and highlighted the diversification and duplication of VQ motif-containing genes in both P. bretschneideri and P. communis.
Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space

PubMed Central

Karnik, Rahul; Beer, Michael A.

2015-01-01

The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs. PMID:26465884
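Scoring sequences with a full position weight matrix (PWM) and a threshold, as contrasted in the abstract with k-mer words or regular expressions, can be sketched as follows. This is a minimal illustration, not the MotifSpec algorithm: the count matrix is invented, the uniform background and pseudocount are assumptions, the threshold is a fixed hypothetical value rather than a learned one, and all function names are hypothetical.

```python
import math

BASES = "ACGT"


def log_odds_pwm(counts, background=0.25, pseudocount=1.0):
    """Turn per-position base counts into a log2-odds PWM.

    `counts` is a list of dicts (base -> count), one dict per motif position.
    A uniform background and a pseudocount of 1 are assumed.
    """
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def best_window_score(pwm, seq):
    """Highest PWM score over all windows of the sequence (forward strand only)."""
    width = len(pwm)
    if len(seq) < width:
        return float("-inf")
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def classify(pwm, seq, threshold):
    """Call a sequence positive when its best window reaches the threshold."""
    return best_window_score(pwm, seq) >= threshold


if __name__ == "__main__":
    # Invented counts for a 3-bp motif that strongly prefers "TGA".
    pwm = log_odds_pwm([{"T": 10}, {"G": 10}, {"A": 10}])
    for seq in ("CCTGACC", "CCCCCCC"):
        print(seq, round(best_window_score(pwm, seq), 2), classify(pwm, seq, 0.0))
```

Unlike a fixed word or regular expression, the PWM gives every base at every position its own weight, so partial matches receive graded scores and the threshold decides where the positive class begins.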

Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.

PubMed

Karnik, Rahul; Beer, Michael A

2015-01-01

The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs.
Site-specific identification of heparan and chondroitin sulfate glycosaminoglycans in hybrid proteoglycans.

PubMed

Noborn, Fredrik; Gomez Toledo, Alejandro; Green, Anders; Nasir, Waqas; Sihlbom, Carina; Nilsson, Jonas; Larson, Göran

2016-10-03

Heparan sulfate (HS) and chondroitin sulfate (CS) are complex polysaccharides that regulate important biological pathways in virtually all metazoan organisms. The polysaccharides often display opposite effects on cell functions with HS and CS structural motifs presenting unique binding sites for specific ligands. Still, the mechanisms by which glycan biosynthesis generates complex HS and CS polysaccharides required for the regulation of mammalian physiology remain elusive. Here we present a glycoproteomic approach that identifies and differentiates between HS and CS attachment sites and provides identity to the core proteins. Glycopeptides were prepared from perlecan, a complex proteoglycan known to be substituted with both HS and CS chains, further digested with heparinase or chondroitinase ABC to reduce the HS and CS chain lengths respectively, and thereafter analyzed by nLC-MS/MS. This protocol enabled the identification of three consensus HS sites and one hybrid site, carrying either a HS or a CS chain. Inspection of the amino acid sequence at the hybrid attachment locus indicates that certain peptide motifs may encode for the chain type selection process. This analytical approach will become useful when addressing fundamental questions in basic biology specifically in elucidating the functional roles of site-specific glycosylations of proteoglycans.
Expressed sequence tags from the plant trypanosomatid Phytomonas serpens.

PubMed

Pappas, Georgios J; Benabdellah, Karim; Zingales, Bianca; González, Antonio

2005-08-01

We have generated 2190 expressed sequence tags (ESTs) from a cDNA library of the plant trypanosomatid Phytomonas serpens. Upon processing and clustering, the set of 1893 accepted sequences was reduced to 697 clusters consisting of 452 singletons and 245 contigs. Functional categories were assigned based on BLAST searches against a database of the eukaryotic orthologous groups of proteins (KOG). Thirty-six percent of the generated sequences showed no hits against the KOG database and 39.6% presented similarity to the KOG classes corresponding to translation, ribosomal structure and biogenesis. The most populated cluster contained 45 ESTs homologous to members of the glucose transporter family. This fact can be immediately correlated to the reported Phytomonas dependence on anaerobic glycolytic ATP production due to the lack of cytochrome-mediated respiratory chain. In this context, a number of enzymes not only of the glycolytic pathway but also of the Krebs cycle were identified, as well as specific components of the respiratory chain. The data here reported, including a few hundred unique sequences and the description of tandemly repeated motifs and putative transcript stability motifs at untranslated mRNA ends, represent an initial approach to overcome the lack of information on the molecular biology of this organism.
MLL/WDR5 Complex Regulates Kif2A Localization to Ensure Chromosome Congression and Proper Spindle Assembly during Mitosis.

PubMed

Ali, Aamir; Veeranki, Sailaja Naga; Chinchole, Akash; Tyagi, Shweta

2017-06-19

Mixed-lineage leukemia (MLL), along with multisubunit (WDR5, RbBP5, ASH2L, and DPY30) complex catalyzes the trimethylation of H3K4, leading to gene activation. Here, we characterize a chromatin-independent role for MLL during mitosis. MLL and WDR5 localize to the mitotic spindle apparatus, and loss of function of MLL complex by RNAi results in defects in chromosome congression and compromised spindle formation. We report interaction of MLL complex with several kinesin and dynein motors. We further show that the MLL complex associates with Kif2A, a member of the Kinesin-13 family of microtubule depolymerase, and regulates the spindle localization of Kif2A during mitosis. We have identified a conserved WDR5 interaction (Win) motif, so far unique to the MLL family, in Kif2A. The Win motif of Kif2A engages in direct interactions with WDR5 for its spindle localization. Our findings highlight a non-canonical mitotic function of MLL complex, which may have a direct impact on chromosomal stability, frequently compromised in cancer. Copyright © 2017 Elsevier Inc. All rights reserved.
Accurate Quantification of microRNA via Single Strand Displacement Reaction on DNA Origami Motif

PubMed Central

Lou, Jingyu; Li, Weidong; Li, Sheng; Zhu, Hongxin; Yang, Lun; Zhang, Aiping; He, Lin; Li, Can

2013-01-01

DNA origami is an emerging technology that assembles hundreds of staple strands and one single-strand DNA into a certain nanopattern. It has been widely used in various fields including detection of biological molecules such as DNA, RNA and proteins. MicroRNAs (miRNAs) play important roles in post-transcriptional gene repression as well as many other biological processes such as cell growth and differentiation. Alterations of miRNAs' expression contribute to many human diseases. However, it is still a challenge to quantitatively detect miRNAs by origami technology. In this study, we developed a novel approach based on streptavidin and quantum dots binding complex (STV-QDs) labeled single strand displacement reaction on DNA origami to quantitatively detect the concentration of miRNAs. We illustrated a linear relationship between the concentration of an exemplary miRNA, miRNA-133, and the STV-QDs hybridization efficiency; the results demonstrated that it is an accurate nano-scale miRNA quantifier motif. In addition, both a symmetrical rectangular motif and an asymmetrical China-map motif were tested. With significant linearity in both motifs, our experiments suggested that a DNA origami motif with arbitrary shape can be utilized in this method. Since the DNA origami-based method we developed has the unique advantages of being simple, saving time and material, potentially testing multiple targets in one motif, and being relatively accurate for certain impure samples because hybridization is counted directly by atomic force microscopy rather than by fluorescence signal detection, it may be widely used in quantification of miRNAs. PMID:23990889
A distinct sortase SrtB anchors and processes a streptococcal adhesin AbpA with a novel structural property

PubMed Central

Liang, Xiaobo; Liu, Bing; Zhu, Fan; Scannapieco, Frank A.; Haase, Elaine M.; Matthews, Steve; Wu, Hui

2016-01-01

Surface display of proteins by sortases in Gram-positive bacteria is crucial for bacterial fitness and virulence. We found a unique gene locus encoding an amylase-binding adhesin AbpA and a sortase B in oral streptococci. AbpA possesses a new distinct C-terminal cell wall sorting signal. We demonstrated that this C-terminal motif is required for anchoring AbpA to the cell wall. In vitro and in vivo studies revealed that SrtB has dual functions, anchoring AbpA to the cell wall and processing AbpA into a ladder profile. Solution structure of AbpA determined by NMR reveals a novel structure comprising a small globular α/β domain and an extended coiled-coil helical domain. Structural and biochemical studies identified key residues that are crucial for amylase binding. Taken together, our studies document a unique sortase/adhesion substrate system in streptococci adapted to the oral environment rich in salivary amylase. PMID:27492581
A Phosphorylated Cytoplasmic Autoantigen, GW182, Associates with a Unique Population of Human mRNAs within Novel Cytoplasmic Speckles

PubMed Central

Eystathioy, Theophany; Chan, Edward K. L.; Tenenbaum, Scott A.; Keene, Jack D.; Griffith, Kevin; Fritzler, Marvin J.

2002-01-01

A novel human cellular structure has been identified that contains a unique autoimmune antigen and multiple messenger RNAs. This complex was discovered using an autoimmune serum from a patient with motor and sensory neuropathy and contains a protein of 182 kDa. The gene and cDNA encoding the protein indicated an open reading frame with glycine-tryptophan (GW) repeats and a single RNA recognition motif. Both the patient's serum and a rabbit serum raised against the recombinant GW protein costained discrete cytoplasmic speckles designated as GW bodies (GWBs) that do not overlap with the Golgi complex, endosomes, lysosomes, or peroxisomes. The mRNAs associated with GW182 represent a clustered set of transcripts that are presumed to reside within the GW complexes. We propose that the GW ribonucleoprotein complex is involved in the posttranscriptional regulation of gene expression by sequestering a specific subset of gene transcripts involved in cell growth and homeostasis. PMID:11950943
Identity and functions of CxxC-derived motifs.

PubMed

Fomenko, Dmitri E; Gladyshev, Vadim N

2003-09-30

Two cysteines separated by two other residues (the CxxC motif) are employed by many redox proteins for formation, isomerization, and reduction of disulfide bonds and for other redox functions. The place of the C-terminal cysteine in this motif may be occupied by serine (the CxxS motif), modifying the functional repertoire of redox proteins. Here we found that the CxxC motif may also give rise to a motif, in which the C-terminal cysteine is replaced with threonine (the CxxT motif). Moreover, in contrast to a view that the N-terminal cysteine in the CxxC motif always serves as a nucleophilic attacking group, this residue could also be replaced with threonine (the TxxC motif), serine (the SxxC motif), or other residues. In each of these CxxC-derived motifs, the presence of a downstream alpha-helix was strongly favored. A search for conserved CxxC-derived motif/helix patterns in four complete genomes representing bacteria, archaea, and eukaryotes identified known redox proteins and suggested possible redox functions for several additional proteins. Catalytic sites in peroxiredoxins were major representatives of the TxxC motif, whereas those in glutathione peroxidases represented the CxxT motif. Structural assessments indicated that threonines in these enzymes could stabilize catalytic thiolates, suggesting revisions to previously proposed catalytic triads. Each of the CxxC-derived motifs was also observed in natural selenium-containing proteins, in which selenocysteine was present in place of a catalytic cysteine.
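The five sequence patterns named in the abstract (CxxC, CxxS, CxxT, TxxC and SxxC, where x is any residue) are simple enough to locate with regular expressions. The sketch below is an illustration only: the function name `find_motifs` and the example sequence are hypothetical, and it looks at primary sequence alone, so it ignores the downstream alpha-helix preference that the authors used in their search.

```python
import re

# Patterns named in the abstract; "." stands for any residue (the "x").
MOTIFS = {
    "CxxC": "C..C",
    "CxxS": "C..S",
    "CxxT": "C..T",
    "TxxC": "T..C",
    "SxxC": "S..C",
}


def find_motifs(protein):
    """Return sorted (position, motif name, matched residues) tuples.

    A lookahead is used so that overlapping occurrences are all reported.
    Positions are 0-based.
    """
    hits = []
    for name, pattern in MOTIFS.items():
        for match in re.finditer(f"(?=({pattern}))", protein):
            hits.append((match.start(), name, match.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical sequence fragment, used only to exercise the function.
    for hit in find_motifs("MWCGPCKATAPS"):
        print(hit)
```

A sequence-only scan like this will report many incidental matches in a whole proteome, which is presumably why the original search combined the motif with a structural criterion.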
Automatic Network Fingerprinting through Single-Node Motifs

PubMed Central

Echtermeyer, Christoph; da Fontoura Costa, Luciano; Rodrigues, Francisco A.; Kaiser, Marcus

2011-01-01

Complex networks have been characterised by their specific connectivity patterns (network motifs), but their building blocks can also be identified and described by node-motifs—a combination of local network features. One technique to identify single node-motifs has been presented by Costa et al. (L. D. F. Costa, F. A. Rodrigues, C. C. Hilgetag, and M. Kaiser, Europhys. Lett., 87, 1, 2009). Here, we first suggest improvements to the method including how its parameters can be determined automatically. Such automatic routines make high-throughput studies of many networks feasible. Second, the new routines are validated in different network-series. Third, we provide an example of how the method can be used to analyse network time-series. In conclusion, we provide a robust method for systematically discovering and classifying characteristic nodes of a network. In contrast to classical motif analysis, our approach can identify individual components (here: nodes) that are specific to a network. Such special nodes, as hubs before, might be found to play critical roles in real-world networks. PMID:21297963
Phosphatidylinositol-4-kinase type II alpha contains an AP-3-sorting motif and a kinase domain that are both required for endosome traffic.

PubMed

Craige, Branch; Salazar, Gloria; Faundez, Victor

2008-04-01

The adaptor complex 3 (AP-3) targets membrane proteins from endosomes to lysosomes, lysosome-related organelles and synaptic vesicles. Phosphatidylinositol-4-kinase type II alpha (PI4KIIalpha) is one of several proteins possessing catalytic domains that regulate AP-3-dependent sorting. Here we present evidence that PI4KIIalpha uniquely behaves both as a membrane protein cargo as well as an enzymatic regulator of adaptor function. In fact, AP-3 and PI4KIIalpha form a complex that requires a dileucine-sorting motif present in PI4KIIalpha. Mutagenesis of either the PI4KIIalpha-sorting motif or its kinase-active site indicates that both are necessary to interact with AP-3 and properly localize PI4KIIalpha to LAMP-1-positive endosomes. Similarly, both the kinase activity and the sorting signal present in PI4KIIalpha are necessary to rescue endosomal PI4KIIalpha siRNA-induced mutant phenotypes. We propose a mechanism whereby adaptors use canonical sorting motifs to selectively recruit a regulatory enzymatic activity to restricted membrane domains.
Conservation of the PTEN catalytic motif in the bacterial undecaprenyl pyrophosphate phosphatase, BacA/UppP.

PubMed

Bickford, Justin S; Nick, Harry S

2013-12-01

Isoprenoid lipid carriers are essential in protein glycosylation and bacterial cell envelope biosynthesis. The enzymes involved in their metabolism (synthases, kinases and phosphatases) are therefore critical to cell viability. In this review, we focus on two broad groups of isoprenoid pyrophosphate phosphatases. One group, containing phosphatidic acid phosphatase motifs, includes the eukaryotic dolichyl pyrophosphate phosphatases and proposed recycling bacterial undecaprenol pyrophosphate phosphatases, PgpB, YbjB and YeiU/LpxT. The second group comprises the bacterial undecaprenol pyrophosphate phosphatase, BacA/UppP, responsible for initial formation of undecaprenyl phosphate, which we predict contains a tyrosine phosphate phosphatase motif resembling that of the tumour suppressor, phosphatase and tensin homologue (PTEN). Based on protein sequence alignments across species and 2D structure predictions, we propose catalytic and lipid recognition motifs unique to BacA/UppP enzymes. The verification of our proposed active-site residues would provide new strategies for the development of substrate-specific inhibitors which mimic both the lipid and pyrophosphate moieties, leading to the development of novel antimicrobial agents.
DNA motifs associated with aberrant CpG island methylation.

PubMed

Feltus, F Alex; Lee, Eva K; Costello, Joseph F; Plass, Christoph; Vertino, Paula M

2006-05-01

Epigenetic silencing involving the aberrant methylation of promoter region CpG islands is widely recognized as a tumor suppressor silencing mechanism in cancer. However, the molecular pathways underlying aberrant DNA methylation remain elusive. Recently we showed that, on a genome-wide level, CpG island loci differ in their intrinsic susceptibility to aberrant methylation and that this susceptibility can be predicted based on underlying sequence context. These data suggest that there are sequence/structural features that contribute to the protection from or susceptibility to aberrant methylation. Here we use motif elicitation coupled with classification techniques to identify DNA sequence motifs that selectively define methylation-prone or methylation-resistant CpG islands. Motifs common to 28 methylation-prone or 47 methylation-resistant CpG island-containing genomic fragments were determined using the MEME and MAST algorithms. The five most discriminatory motifs derived from methylation-prone sequences were found to be associated with CpG islands in general and were nonrandomly distributed throughout the genome. In contrast, the eight most discriminatory motifs derived from the methylation-resistant CpG islands were randomly distributed throughout the genome. Interestingly, this latter group tended to associate with Alu and other repetitive sequences. Used together, the frequency of occurrence of these motifs successfully discriminated methylation-prone and methylation-resistant CpG island groups with an accuracy of 87% after 10-fold cross-validation. The motifs identified here are candidate methylation-targeting or methylation-protection DNA sequences.
Computational Analyses of Synergism in Small Molecular Network Motifs

PubMed Central

Zhang, Yili; Smolen, Paul; Baxter, Douglas A.; Byrne, John H.

2014-01-01

Cellular functions and responses to stimuli are controlled by complex regulatory networks that comprise a large diversity of molecular components and their interactions. However, achieving an intuitive understanding of the dynamical properties and responses to stimuli of these networks is hampered by their large scale and complexity. To address this issue, analyses of regulatory networks often focus on reduced models that depict distinct, reoccurring connectivity patterns referred to as motifs. Previous modeling studies have begun to characterize the dynamics of small motifs, and to describe ways in which variations in parameters affect their responses to stimuli. The present study investigates how variations in pairs of parameters affect responses in a series of ten common network motifs, identifying concurrent variations that act synergistically (or antagonistically) to alter the responses of the motifs to stimuli. Synergism (or antagonism) was quantified using degrees of nonlinear blending and additive synergism. Simulations identified concurrent variations that maximized synergism, and examined the ways in which it was affected by stimulus protocols and the architecture of a motif. Only a subset of architectures exhibited synergism following paired changes in parameters. The approach was then applied to a model describing interlocked feedback loops governing the synthesis of the CREB1 and CREB2 transcription factors. The effects of motifs on synergism for this biologically realistic model were consistent with those for the abstract models of single motifs. These results have implications for the rational design of combination drug therapies with the potential for synergistic interactions. PMID:24651495
Myocilin, a Component of a Membrane-Associated Protein Complex Driven by a Homologous Q-SNARE Domain

PubMed Central

Dismuke, W. Michael; McKay, Brian S.; Stamer, W. Daniel

2012-01-01

Myocilin is a widely expressed protein with no known function; however, mutations in myocilin appear to manifest uniquely as ocular hypertension and the blinding disease glaucoma. Using the protein homology/analogy recognition engine (PHYRE) we find that the olfactomedin domain of myocilin is similar in sequence motif and structure to a six-bladed, kelch repeat motif based on the known crystal structures of such proteins. Additionally, using sequence analysis we identify a coiled-coil segment of myocilin with homology to human Q-SNARE proteins. Using COS-7 cells expressing full length human myocilin and a version lacking the C-terminal olfactomedin domain, we identified a membrane-associated protein complex containing myocilin by hydrodynamic analysis. The myocilin construct that included the coiled-coil but lacked the olfactomedin domain formed complexes similar to the full-length protein, indicating that the coiled-coil domain of myocilin is sufficient for myocilin to bind to the large detergent resistant complex. In human retina and retinal pigment epithelium, which express myocilin, we detected the protein in a large, SDS-resistant, membrane-associated complex. We characterized the hydrodynamic properties of myocilin in human tissues as either a 15s complex with an Mr=405,000–440,000 yielding a slightly elongated globular shape similar to known SNARE complexes or a dimer of 6.4s and Mr=108,000. By identifying the Q-SNARE homology within the second coil of myocilin and documenting its participation in a SNARE-like complex, we provide evidence of a SNARE domain-containing protein associated with a human disease. PMID:22463803
Expression, subcellular localization, and cis-regulatory structure of duplicated phytoene synthase genes in melon (Cucumis melo L.).

PubMed

Qin, Xiaoqiong; Coku, Ardian; Inoue, Kentaro; Tian, Li

2011-10-01

Carotenoids perform many critical functions in plants, animals, and humans. It is therefore important to understand carotenoid biosynthesis and its regulation in plants. Phytoene synthase (PSY) catalyzes the first committed and rate-limiting step in carotenoid biosynthesis. While PSY is present as a single copy gene in Arabidopsis, duplicated PSY genes have been identified in many economically important monocot and dicot crops. CmPSY1 was previously identified from melon (Cucumis melo L.), but was not functionally characterized. We isolated a second PSY gene, CmPSY2, from melon in this work. CmPSY2 possesses a unique intron/exon structure that has not been observed in other plant PSYs. Both CmPSY1 and CmPSY2 are functional in vitro, but exhibit distinct expression patterns in different melon tissues and during fruit development, suggesting differential regulation of the duplicated melon PSY genes. In vitro chloroplast import assays verified the plastidic localization of CmPSY1 and CmPSY2 despite the lack of an obvious plastid target peptide in CmPSY2. Promoter motif analysis of the duplicated melon and tomato PSY genes and the Arabidopsis PSY revealed distinctive cis-regulatory structures of melon PSYs and identified gibberellin-responsive motifs in all PSYs except for SlPSY1, which has not been reported previously. Overall, these data provide new insights into the evolutionary history of plant PSY genes and the regulation of PSY expression by developmental and environmental signals that may involve different regulatory networks.
Four signature motifs define the first class of structurally related large coiled-coil proteins in plants.

PubMed Central

Gindullis, Frank; Rose, Annkatrin; Patel, Shalaka; Meier, Iris

2002-01-01

Background Animal and yeast proteins containing long coiled-coil domains are involved in attaching other proteins to the large, solid-state components of the cell. One subgroup of long coiled-coil proteins are the nuclear lamins, which are involved in attaching chromatin to the nuclear envelope and have recently been implicated in inherited human diseases. In contrast to other eukaryotes, long coiled-coil proteins have been barely investigated in plants. Results We have searched the completed Arabidopsis genome and have identified a family of structurally related long coiled-coil proteins. Filament-like plant proteins (FPP) were identified by sequence similarity to a tomato cDNA that encodes a coiled-coil protein which interacts with the nuclear envelope-associated protein, MAF1. The FPP family is defined by four novel unique sequence motifs and by two clusters of long coiled-coil domains separated by a non-coiled-coil linker. All family members are expressed in a variety of Arabidopsis tissues. A homolog sharing the structural features was identified in the monocot rice, indicating conservation among angiosperms. Conclusion Except for myosins, this is the first characterization of a family of long coiled-coil proteins in plants. The tomato homolog of the FPP family binds in a yeast two-hybrid assay to a nuclear envelope-associated protein. This might suggest that FPP family members function in nuclear envelope biology. Because the full Arabidopsis genome does not appear to contain genes for lamins, it is of interest to investigate other long coiled-coil proteins, which might functionally replace lamins in the plant kingdom. PMID:11972898
Mining for class-specific motifs in protein sequence classification

PubMed Central

2013-01-01

Background In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features indeed contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. Results We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function, initially, harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where the similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution has resulted in a large increase in the number of discriminatory n-grams harvested. Due to the unbalanced nature of the dataset, the frequencies of the n-grams are normalized using a dampening factor, which gives more weightage to the n-grams that appear in fewer classes and vice-versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences. Our method fared well compared to an existing motif finding method known as Wordspy. We have validated our enriched set of class-specific motifs against the functionally important motifs obtained from the NLSdb, Prosite and ELM databases. We demonstrate that this method is very generic and thus can be widely applied to detect class-specific motifs in many protein sequence classification tasks. Conclusion The proposed scoring function and methodology are able to identify class-specific motifs using discriminative n-grams derived from the protein sequences. The implementation of amino acid substitution scores for similarity detection, and the dampening factor to normalize the unbalanced datasets have a significant effect on the performance of the scoring function. Our multipronged validation tests demonstrate that this method can detect class-specific motifs from a wide variety of protein sequence classes with a potential application to detecting proteome-specific motifs of different organisms. PMID:23496846
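The n-gram harvesting and dampened scoring described in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the BLOSUM62 substitution step is omitted, and the function names, the form of the dampening factor, and the threshold value are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def harvest_ngrams(seq, n_min=4, n_max=8):
    """Return the set of all contiguous n-grams of length n_min..n_max in seq."""
    return {seq[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(seq) - n + 1)}


def discriminative_ngrams(classes, threshold=0.5, n_min=4, n_max=8):
    """classes maps a class name to a list of protein sequences.

    Returns, per class, the n-grams whose dampened frequency reaches
    the threshold. Hypothetical scoring, for illustration only.
    """
    # Fraction of sequences in each class that contain each n-gram.
    freq = {}
    for name, seqs in classes.items():
        counts = defaultdict(int)
        for s in seqs:
            for gram in harvest_ngrams(s, n_min, n_max):
                counts[gram] += 1
        freq[name] = {g: c / len(seqs) for g, c in counts.items()}

    # Number of classes in which each n-gram occurs at all.
    class_count = defaultdict(int)
    for table in freq.values():
        for gram in table:
            class_count[gram] += 1

    n_classes = len(classes)
    selected = {}
    for name, table in freq.items():
        scored = {}
        for gram, f in table.items():
            # Assumed dampening factor: 1.0 for n-grams confined to a single
            # class, shrinking as the n-gram appears in more classes.
            damp = math.log(1 + n_classes / class_count[gram]) / math.log(1 + n_classes)
            score = f * damp
            if score >= threshold:
                scored[gram] = score
        selected[name] = scored
    return selected
```

Called on two toy classes with `n_min=4, n_max=4`, the function keeps only the 4-grams shared by most sequences of one class and absent from the other; mapping the survivors back onto the sequences would be the next step the abstract describes.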
Cooperative Team Networks

DTIC Science & Technology

2016-06-01

team processes, such as identifying motifs of dynamic communication exchanges which goes well beyond simple dyadic and triadic configurations; as well...new metrics and ways to formulate team processes, such as identifying motifs of dynamic communication exchanges which goes well beyond simple dyadic ...sensing, communication , information, and decision networks - Darryl Ahner (AFIT: Air Force Inst Tech) Panel Session: Mathematical Models of
Identification of potential host plant mimics of CLAVATA3/ESR (CLE)-like peptides from the plant-parasitic nematode Heterodera schachtii.

PubMed

Wang, Jianying; Replogle, Amy; Hussey, Richard; Baum, Thomas; Wang, Xiaohong; Davis, Eric L; Mitchum, Melissa G

2011-02-01

In this article, we present the cloning of two CLAVATA3/ESR (CLE)-like genes, HsCLE1 and HsCLE2, from the beet cyst nematode Heterodera schachtii, a plant-parasitic cyst nematode with a relatively broad host range that includes the model plant Arabidopsis. CLEs are small secreted peptide ligands that play important roles in plant growth and development. By secreting peptide mimics of plant CLEs, the nematode can developmentally reprogramme root cells for the formation of unique feeding sites within host roots for its own benefit. Both HsCLE1 and HsCLE2 encode small secreted polypeptides with a conserved C-terminal CLE domain sharing highest similarity to Arabidopsis CLEs 1-7. Moreover, HsCLE2 contains a 12-amino-acid CLE motif that is identical to AtCLE5 and AtCLE6. Like all other plant and nematode CLEs identified to date, HsCLEs caused wuschel-like phenotypes when overexpressed in Arabidopsis, and this activity was abolished when the proteins were expressed without the CLE motif. HsCLEs could also function in planta without a signal peptide, highlighting the unique, yet conserved function of nematode CLE variable domains in trafficking CLE peptides for secretion. In a direct comparison of HsCLE2 overexpression phenotypes with those of AtCLE5 and AtCLE6, similar shoot and root phenotypes were observed. Exogenous application of 12-amino-acid synthetic peptides corresponding to the CLE motifs of HsCLEs and AtCLE5/6 suggests that the function of this class of CLEs may be subject to complex endogenous regulation. When seedlings were grown on high concentrations of peptide (10 µM), root growth was suppressed; however, when seedlings were grown on low concentrations of peptide (0.1 µM), root growth was stimulated. Together, these findings indicate that AtCLEs1-7 may be the target peptides mimicked by HsCLEs to promote parasitism. © 2010 The Authors. Molecular Plant Pathology © 2010 BSPP and Blackwell Publishing Ltd.
Structural Characterization of Proline-rich Tyrosine Kinase 2 (PYK2) Reveals a Unique (DFG-out) Conformation and Enables Inhibitor Design

DOE Office of Scientific and Technical Information (OSTI.GOV)

Han, Seungil; Mistry, Anil; Chang, Jeanne S.

Proline-rich tyrosine kinase 2 (PYK2) is a cytoplasmic, non-receptor tyrosine kinase implicated in multiple signaling pathways. It is a negative regulator of osteogenesis and considered a viable drug target for osteoporosis treatment. The high-resolution structures of the human PYK2 kinase domain with different inhibitor complexes establish the conventional bilobal kinase architecture and show the conformational variability of the DFG loop. The basis for the lack of selectivity for the classical kinase inhibitor, PF-431396, within the FAK family is explained by our structural analyses. Importantly, the novel DFG-out conformation with two diarylurea inhibitors (BIRB796, PF-4618433) reveals a distinct subclass of non-receptor tyrosine kinases identifiable by the gatekeeper Met-502 and the unique hinge loop conformation of Leu-504. This is the first example of a leucine residue in the hinge loop that blocks the ATP binding site in the DFG-out conformation. Our structural, biophysical, and pharmacological studies suggest that the unique features of the DFG motif, including Leu-504 hinge-loop variability, can be exploited for the development of selective protein kinase inhibitors.

Analysis of the transcriptome of Panax notoginseng root uncovers putative triterpene saponin-biosynthetic genes and genetic markers

PubMed Central

2011-01-01

Background Panax notoginseng (Burk) F.H. Chen is an important medicinal plant of the Araliaceae family. Triterpene saponins are the bioactive constituents in P. notoginseng. However, available genomic information regarding this plant is limited. Moreover, details of triterpene saponin biosynthesis in the Panax species are largely unknown. Results Using the 454 pyrosequencing technology, a one-quarter GS FLX titanium run resulted in 188,185 reads with an average length of 410 bases for P. notoginseng root. These reads were processed and assembled by 454 GS De Novo Assembler software into 30,852 unique sequences. A total of 70.2% of unique sequences were annotated by Basic Local Alignment Search Tool (BLAST) similarity searches against public sequence databases. The Kyoto Encyclopedia of Genes and Genomes (KEGG) assignment discovered 41 unique sequences representing 11 genes involved in triterpene saponin backbone biosynthesis in the 454-EST dataset. In particular, the transcript encoding dammarenediol synthase (DS), which is the first committed enzyme in the biosynthetic pathway of major triterpene saponins, is highly expressed in the root of four-year-old P. notoginseng. It is worth emphasizing that the candidate cytochrome P450 (Pn02132 and Pn00158) and UDP-glycosyltransferase (Pn00082) genes most likely to be involved in hydroxylation or glycosylation of aglycones for triterpene saponin biosynthesis were discovered from 174 cytochrome P450s and 242 glycosyltransferases by phylogenetic analysis, respectively. Putative transcription factors were detected in 906 unique sequences, including Myb, homeobox, WRKY, basic helix-loop-helix (bHLH), and other family proteins. Additionally, a total of 2,772 simple sequence repeats (SSRs) were identified from 2,361 unique sequences, of which di-nucleotide motifs were the most abundant. Conclusion This study is the first to present a large-scale EST dataset for P. notoginseng root acquired by next-generation sequencing (NGS) technology. The candidate genes involved in triterpene saponin biosynthesis, including the putative CYP450s and UGTs, were obtained in this study. Additionally, the identification of SSRs provided plenty of genetic markers for molecular breeding and genetics applications in this species. These data will provide information on gene discovery, transcriptional regulation and marker-assisted selection for P. notoginseng. The dataset establishes an important foundation for the study with the purpose of ensuring adequate drug resources for this species. PMID:22369100
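As a rough illustration of how simple sequence repeats such as the di-nucleotide motifs mentioned above can be located in a sequence, the following Python sketch scans for perfect tandem repeats with a regular expression. The abstract does not state which SSR tool or thresholds were used, so the function name `find_ssrs` and the default minimum of six repeats are assumptions.

```python
import re


def find_ssrs(seq, unit_len=2, min_repeats=6):
    """Find perfect tandem repeats of a unit_len-nucleotide motif.

    Returns a list of (start, motif, repeat_count) tuples. The default
    thresholds are illustrative assumptions, not values from the study.
    """
    # A captured unit followed by at least (min_repeats - 1) more copies of it.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        # Skip units that are themselves repeats of a shorter unit
        # (for example "AA" or "ACAC"), which belong to a shorter motif class.
        is_compound = any(
            motif == motif[:k] * (len(motif) // k)
            for k in range(1, len(motif))
            if len(motif) % k == 0
        )
        if is_compound:
            continue
        hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits
```

Running the function with `unit_len` set to 2, 3, and so on, and tallying the hits per motif length, would give the kind of motif-class abundance comparison the abstract reports.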
Conserved and divergent features of the structure and function of La and La-related proteins (LARPs)

PubMed Central

Bayfield, Mark A.; Yang, Ruiqing; Maraia, Richard J.

2010-01-01

Genuine La proteins contain two RNA binding motifs, a La motif (LAM) followed by a RNA recognition motif (RRM), arranged in a unique way to bind RNA. These proteins interact with an extensive variety of cellular RNAs and exhibit activities in two broad categories: i) to promote the metabolism of nascent pol III transcripts, including precursor-tRNAs, by binding to their common, UUU-3’OH containing ends, and ii) to modulate the translation of certain mRNAs involving an unknown binding mechanism. Characterization of several La-RNA crystal structures as well as biochemical studies reveal insight into their unique two-motif domain architecture and how the LAM recognizes UUU-3’OH while the RRM binds other parts of a pre-tRNA. Recent studies of members of distinct families of conserved La-related proteins (LARPs) indicate that some of these harbor activity related to genuine La proteins, suggesting that their UUU-3’OH binding mode has been appropriated for the assembly and regulation of a specific snRNP (e.g., 7SK snRNA assembly by hLARP7/PIP7S). Analyses of other LARP family members (i.e., hLARP4, hLARP6) suggest more diverged RNA binding modes and specialization for cytoplasmic mRNA-related functions. Thus it appears that while genuine La proteins exhibit broad general involvement in both snRNA-related and mRNA-related functions, different LARP families may have evolved specialized activities in either snRNA or mRNA related functions. In this review, we summarize recent progress that has led to greater understanding of the structure and function of La proteins and their roles in tRNA processing and RNP assembly dynamics, as well as progress on the different LARPs. PMID:20138158
Conserved and divergent features of the structure and function of La and La-related proteins (LARPs).

PubMed

Bayfield, Mark A; Yang, Ruiqing; Maraia, Richard J

2010-01-01

Genuine La proteins contain two RNA binding motifs, a La motif (LAM) followed by a RNA recognition motif (RRM), arranged in a unique way to bind RNA. These proteins interact with an extensive variety of cellular RNAs and exhibit activities in two broad categories: i) to promote the metabolism of nascent pol III transcripts, including precursor-tRNAs, by binding to their common, UUU-3'OH containing ends, and ii) to modulate the translation of certain mRNAs involving an unknown binding mechanism. Characterization of several La-RNA crystal structures as well as biochemical studies reveal insight into their unique two-motif domain architecture and how the LAM recognizes UUU-3'OH while the RRM binds other parts of a pre-tRNA. Recent studies of members of distinct families of conserved La-related proteins (LARPs) indicate that some of these harbor activity related to genuine La proteins, suggesting that their UUU-3'OH binding mode has been appropriated for the assembly and regulation of a specific snRNP (e.g., 7SK snRNP assembly by hLARP7/PIP7S). Analyses of other LARP family members suggest more diverged RNA binding modes and specialization for cytoplasmic mRNA-related functions. Thus it appears that while genuine La proteins exhibit broad general involvement in both snRNA-related and mRNA-related functions, different LARP families may have evolved specialized activities in either snRNA or mRNA-related functions. In this review, we summarize recent progress that has led to greater understanding of the structure and function of La proteins and their roles in tRNA processing and RNP assembly dynamics, as well as progress on the different LARPs.
Computational and experimental analysis of short peptide motifs for enzyme inhibition.

PubMed

Fu, Jinglin; Larini, Luca; Cooper, Anthony J; Whittaker, John W; Ahmed, Azka; Dong, Junhao; Lee, Minyoung; Zhang, Ting

2017-01-01

The metabolism of living systems involves many enzymes that play key roles as catalysts and are essential to biological function. Searching for ligands with the ability to modulate enzyme activities is central to diagnosis and therapeutics. Peptides represent a promising class of potential enzyme modulators due to their large chemical diversity and well-established methods for library synthesis. Peptides and their derivatives are found to play critical roles in modulating enzymes and mediating cellular uptake, which are increasingly valuable in therapeutics. We present a methodology that uses molecular dynamics (MD) and point-variant screening to identify short peptide motifs that are critical for inhibiting β-galactosidase (β-Gal). MD was used to simulate the conformations of peptides and to suggest short motifs that were most populated in simulated conformations. The function of the simulated motifs was further validated by the experimental point-variant screening as critical segments for inhibiting the enzyme. Based on the validated motifs, we eventually identified a 7-mer short peptide for inhibiting an enzyme with low μM IC50. The advantage of our methodology is the relatively simplified simulation that is informative enough to identify the critical sequence of a peptide inhibitor, with a precision comparable to truncation and alanine scanning experiments. Our combined experimental and computational approach does not rely on a detailed understanding of mechanistic and structural details. The MD simulation suggests the populated motifs that are consistent with the results of the experimental alanine and truncation scanning. This approach appears to be applicable to both natural and artificial peptides. With more discovered short motifs in the future, they could be exploited for modulating biocatalysis, and developing new medicines.
Chiral Alkyl Halides: Underexplored Motifs in Medicine

PubMed Central

Gál, Bálint; Bucher, Cyril; Burns, Noah Z.

2016-01-01

While alkyl halides are valuable intermediates in synthetic organic chemistry, their use as bioactive motifs in drug discovery and medicinal chemistry is rare in comparison. This is likely attributable to the common misconception that these compounds are merely non-specific alkylators in biological systems. A number of chlorinated compounds in the pharmaceutical and food industries, as well as a growing number of halogenated marine natural products showing unique bioactivity, illustrate the role that chiral alkyl halides can play in drug discovery. Through a series of case studies, we demonstrate in this review that these motifs can indeed be stable under physiological conditions, and that halogenation can enhance bioactivity through both steric and electronic effects. Our hope is that, by placing such compounds in the minds of the chemical community, they may gain more traction in drug discovery and inspire more synthetic chemists to develop methods for selective halogenation. PMID:27827902
Identifying the scale-dependent motifs in atmospheric surface layer by ordinal pattern analysis

NASA Astrophysics Data System (ADS)

Li, Qinglei; Fu, Zuntao

2018-07-01

Ramp-like structures in various atmospheric surface layer time series have long been studied, but the presence of finer-scale motifs embedded within larger-scale ramp-like structures has largely been overlooked in the reported literature. Here a novel, objective and well-adapted methodology, ordinal pattern analysis, is adopted to study the finer-scale motifs in atmospheric boundary-layer (ABL) time series. The studies show that the motifs represented by different ordinal patterns exhibit clustering, and that 6 dominant motifs out of the 24 possible motifs account for about 45% of the time series at particular scales, which indicates the larger contribution of finer-scale motifs to the series. Further studies indicate that motif statistics are similar for stable and unstable conditions at larger scales, but large discrepancies are found at smaller scales, and the frequencies of motifs "1234" and/or "4321" are slightly higher under stable conditions than under unstable conditions. Under stable conditions, the occurrence frequencies of motifs "1234" and "4321" change greatly: the occurrence frequency of motif "1234" decreases from nearly 24% to 4.5% as the scale factor increases, and the occurrence frequency of motif "4321" changes nonlinearly with increasing scale. These large differences in how the dominant motifs change with scale can be taken as an indicator to quantify the flow structure changes under different stability conditions, and a motif entropy can be defined from only the 6 dominant motifs to quantify this time-scale independent property of the motifs. All these results suggest that the defined scale of the finer-scale motifs should be carefully taken into consideration in the interpretation of turbulence coherent structures.
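Ordinal pattern analysis of the kind described above can be sketched in a few lines of Python. The sketch assumes that each motif is labelled by the rank of every value in a four-point window (so "1234" is a steadily rising window and "4321" a steadily falling one) and that the scale factor acts as a sampling lag; both conventions, and the function names, are assumptions rather than details confirmed by the abstract.

```python
import math
from collections import Counter


def ordinal_pattern(window):
    """Label a window by the rank (1 = smallest) of each of its values.

    Ties are broken by time order, which is one common convention.
    """
    order = sorted(range(len(window)), key=lambda i: window[i])
    ranks = [0] * len(window)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return "".join(str(r) for r in ranks)


def motif_frequencies(series, order=4, lag=1):
    """Relative frequency of each ordinal pattern of the given order.

    lag is the spacing between sampled points and stands in for the
    scale factor (an assumption of this sketch).
    """
    span = (order - 1) * lag
    counts = Counter(
        ordinal_pattern(series[i:i + span + 1:lag])
        for i in range(len(series) - span)
    )
    total = sum(counts.values())
    return {pattern: c / total for pattern, c in counts.items()}


def motif_entropy(freqs):
    """Shannon entropy (natural log) of a pattern-frequency distribution."""
    return -sum(p * math.log(p) for p in freqs.values() if p > 0)
```

With order 4 there are 4! = 24 possible patterns, matching the 24 motifs mentioned in the abstract; repeating `motif_frequencies` over a range of `lag` values would show how the share of "1234" and "4321" changes with scale.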
Peptide-binding motifs of two common equine class I MHC molecules in Thoroughbred horses.

PubMed

Bergmann, Tobias; Lindvall, Mikaela; Moore, Erin; Moore, Eugene; Sidney, John; Miller, Donald; Tallmadge, Rebecca L; Myers, Paisley T; Malaker, Stacy A; Shabanowitz, Jeffrey; Osterrieder, Nikolaus; Peters, Bjoern; Hunt, Donald F; Antczak, Douglas F; Sette, Alessandro

2017-05-01

Quantitative peptide-binding motifs of MHC class I alleles provide a valuable tool to efficiently identify putative T cell epitopes. Detailed information on equine MHC class I alleles is still very limited, and to date, only a single equine MHC class I allele, Eqca-1*00101 (ELA-A3 haplotype), has been characterized. The present study extends the number of characterized ELA class I specificities in two additional haplotypes found commonly in the Thoroughbred breed. Accordingly, we here report quantitative binding motifs for the ELA-A2 allele Eqca-16*00101 and the ELA-A9 allele Eqca-1*00201. Utilizing analyses of endogenously bound and eluted ligands and the screening of positional scanning combinatorial libraries, detailed and quantitative peptide-binding motifs were derived for both alleles. Eqca-16*00101 preferentially binds peptides with aliphatic/hydrophobic residues in position 2 and at the C-terminus, and Eqca-1*00201 has a preference for peptides with arginine in position 2 and hydrophobic/aliphatic residues at the C-terminus. Interestingly, the Eqca-16*00101 motif resembles that of the human HLA A02-supertype, while the Eqca-1*00201 motif resembles that of the HLA B27-supertype and two macaque class I alleles. It is expected that the identified motifs will facilitate the selection of candidate epitopes for the study of immune responses in horses.
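To make the anchor preferences described above concrete, the following Python sketch scans a protein for peptides with a chosen residue at position 2 and a hydrophobic residue at the C-terminus, as reported for Eqca-1*00201. It is only a crude anchor filter, not the quantitative binding motif derived in the study; the 9-residue peptide length, the hydrophobic residue set, and the function name are assumptions.

```python
# Assumed set of hydrophobic/aliphatic residues; the study's exact
# residue preferences are quantitative and are not reproduced here.
HYDROPHOBIC = frozenset("AVLIMFWY")


def candidate_peptides(protein, length=9, p2=frozenset("R"), c_term=HYDROPHOBIC):
    """Return (start, peptide) pairs whose anchor positions fit the pattern.

    p2 is the set of residues accepted at position 2 and c_term the set
    accepted at the C-terminus. Both defaults are illustrative.
    """
    hits = []
    for i in range(len(protein) - length + 1):
        peptide = protein[i:i + length]
        if peptide[1] in p2 and peptide[-1] in c_term:
            hits.append((i, peptide))
    return hits
```

Passing `p2=HYDROPHOBIC` instead would approximate the Eqca-16*00101 preference for aliphatic/hydrophobic residues at position 2 and the C-terminus.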
DNA motif elucidation using belief propagation.

PubMed

Wong, Ka-Chun; Chan, Tak-Ming; Peng, Chengbin; Li, Yue; Zhang, Zhaolei

2013-09-01

Protein-binding microarray (PBM) is a high-throughput platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k=8∼10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagations. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness. In particular, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source codes are available at the authors' websites: e.g. http://www.cs.toronto.edu/~wkc/kmerHMM.
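The preprocessing step mentioned above, reducing probe-level intensities to a median intensity per k-mer and ranking the k-mers, can be sketched as follows. This is not the kmerHMM code: the input format (pairs of probe sequence and intensity) and the function name are assumptions, reverse complements are not merged, and the HMM training and belief propagation stages are not shown.

```python
from collections import defaultdict
from statistics import median


def kmer_median_intensities(probes, k=8):
    """Rank k-mers by the median intensity of the probes that contain them.

    probes is an iterable of (sequence, intensity) pairs, a hypothetical
    input format chosen for this sketch.
    """
    per_kmer = defaultdict(list)
    for seq, intensity in probes:
        # Count each k-mer once per probe, even if it occurs twice in it.
        for kmer in {seq[i:i + k] for i in range(len(seq) - k + 1)}:
            per_kmer[kmer].append(intensity)
    medians = {kmer: median(values) for kmer, values in per_kmer.items()}
    # Highest median first; ties are broken alphabetically for stable output.
    return sorted(medians.items(), key=lambda item: (-item[1], item[0]))
```

The top of the ranked list is the natural input for the alignment and model-training step that the abstract describes next.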
Genome-wide comparison of ferritin family from Archaea, Bacteria, Eukarya, and Viruses: its distribution, characteristic motif, and phylogenetic relationship

NASA Astrophysics Data System (ADS)

Bai, Lina; Xie, Ting; Hu, Qingqing; Deng, Changyan; Zheng, Rong; Chen, Wanping

2015-10-01

Ferritins are highly conserved proteins that are widely distributed in various species from archaea to humans. The ubiquitous characteristic of these proteins reflects the pivotal contribution of ferritins to the safe storage and timely delivery of iron to achieve iron homeostasis. This study investigated the ferritin genes in 248 genomes from various species, including viruses, archaea, bacteria, and eukarya. The distribution comparison suggests that mammals and eudicots possess abundant ferritin genes, whereas fungi contain very few ferritin genes. Archaea and bacteria show considerable numbers of ferritin genes. Generally, prokaryotes possess three types of ferritin (the typical ferritin, bacterioferritin, and DNA-binding protein from starved cell), whereas eukaryotes have various subunit types of ferritin, thereby indicating the individuation of the ferritin family during evolution. The characteristic motif analysis of ferritins suggested that all key residues specifying the unique structural motifs of ferritin are highly conserved across three domains of life. Meanwhile, the characteristic motifs were also distinguishable between ferritin groups, especially phytoferritins, which show a plant-specific motif. The phylogenetic analyses show that ferritins within the same subfamily or subunits are generally clustered together. The phylogenetic relationships among ferritin members suggest that both gene duplication and horizontal transfer contribute to the wide variety of ferritins, and their possible evolutionary scenario was also proposed. The results contribute to a better understanding of the distribution, characteristic motif, and evolutionary relationship of the ferritin family.
Genome Wide Identification, Evolutionary, and Expression Analysis of VQ Genes from Two Pyrus Species

PubMed Central

Meng, Dandan; Abdullah, Muhammad; Jin, Qing; Lin, Yi; Cai, Yongping

2018-01-01

The VQ motif-containing gene, a member of the plant-specific genes, is involved in the plant developmental process and various stress responses. The VQ motif-containing gene family has been studied in several plants, such as rice (Oryza sativa), maize (Zea mays), and Arabidopsis (Arabidopsis thaliana). However, no systematic study has been performed in Pyrus species, which have important economic value. In our study, we identified 41 and 28 VQ motif-containing genes in Pyrus bretschneideri and Pyrus communis, respectively. Phylogenetic trees were calculated using A. thaliana and O. sativa VQ motif-containing genes as a template, allowing us to categorize these genes into nine subfamilies. Thirty-two and eight paralogous of VQ motif-containing genes were found in P. bretschneideri and P. communis, respectively, showing that the VQ motif-containing genes had a more remarkable expansion in P. bretschneideri than in P. communis. A total of 31 orthologous pairs were identified from the P. bretschneideri and P. communis VQ motif-containing genes. Additionally, among the paralogs, we found that these duplication gene pairs probably derived from segmental duplication/whole-genome duplication (WGD) events in the genomes of P. bretschneideri and P. communis, respectively. The gene expression profiles in both P. bretschneideri and P. communis fruits suggested functional redundancy for some orthologous gene pairs derived from a common ancestry, and sub-functionalization or neo-functionalization for some of them. Our study provided the first systematic evolutionary analysis of the VQ motif-containing genes in Pyrus, and highlighted the diversification and duplication of VQ motif-containing genes in both P. bretschneideri and P. communis. PMID:29690608
Statistical tests to compare motif count exceptionalities

PubMed Central

Robin, Stéphane; Schbath, Sophie; Vandewalle, Vincent

2007-01-01

Background Finding over- or under-represented motifs in biological sequences is now a common task in genomics. Thanks to p-value calculation for motif counts, exceptional motifs are identified and represent candidate functional motifs. The present work addresses the related question of comparing the exceptionality of one motif in two different sequences. Just comparing the motif count p-values in each sequence is indeed not sufficient to decide if this motif is significantly more exceptional in one sequence compared to the other one. A statistical test is required. Results We develop and analyze two statistical tests, an exact binomial one and an asymptotic likelihood ratio test, to decide whether the exceptionality of a given motif is equivalent or significantly different in two sequences of interest. For that purpose, motif occurrences are modeled by Poisson processes, with a special care for overlapping motifs. Both tests can take the sequence compositions into account. As an illustration, we compare the octamer exceptionalities in the Escherichia coli K-12 backbone versus variable strain-specific loops. Conclusion The exact binomial test is particularly adapted for small counts. For large counts, we advise to use the likelihood ratio test which is asymptotic but strongly correlated with the exact binomial test and very simple to use. PMID:17346349
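A minimal Python sketch of an exact binomial comparison of this kind is given below. It relies on the standard conditional argument for two Poisson counts: given the total N1 + N2, the count N1 is binomial with success probability E1 / (E1 + E2) when the motif is equally exceptional in both sequences, where E1 and E2 are the expected counts under each sequence's background model. Whether this matches the paper's exact formulation is an assumption, and the correction for overlapping motifs is not included.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))


def compare_exceptionality(n1, n2, expected1, expected2):
    """One-sided p-value that the motif is more exceptional in sequence 1.

    n1, n2 are observed motif counts; expected1, expected2 are the counts
    expected under each sequence's background model (supplied by the
    caller; how they are computed is outside this sketch).
    """
    p0 = expected1 / (expected1 + expected2)
    return binomial_tail(n1, n1 + n2, p0)
```

For example, with equal expected counts and observed counts of 3 and 1, the p-value is P(X >= 3) for X ~ Binomial(4, 0.5). Because the sum runs over every term of the tail, this form suits small counts, which is the regime the abstract recommends the exact test for.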
Identification of novel RNA secondary structures within the hepatitis C virus genome reveals a cooperative involvement in genome packaging

PubMed Central

Stewart, H.; Bingham, R.J.; White, S. J.; Dykeman, E. C.; Zothner, C.; Tuplin, A. K.; Stockley, P. G.; Twarock, R.; Harris, M.

2016-01-01

The specific packaging of the hepatitis C virus (HCV) genome is hypothesised to be driven by Core-RNA interactions. To identify the regions of the viral genome involved in this process, we used SELEX (systematic evolution of ligands by exponential enrichment) to identify RNA aptamers which bind specifically to Core in vitro. Comparison of these aptamers to multiple HCV genomes revealed the presence of a conserved terminal loop motif within short RNA stem-loop structures. We postulated that interactions of these motifs, as well as sub-motifs which were present in HCV genomes at statistically significant levels, with the Core protein may drive virion assembly. We mutated 8 of these predicted motifs within the HCV infectious molecular clone JFH-1, thereby producing a range of mutant viruses predicted to possess altered RNA secondary structures. RNA replication and viral titre were unaltered in viruses possessing only one mutated structure. However, infectivity titres were decreased in viruses possessing a higher number of mutated regions. This work thus identified multiple novel RNA motifs which appear to contribute to genome packaging. We suggest that these structures act as cooperative packaging signals to drive specific RNA encapsidation during HCV assembly. PMID:26972799
The Emerging Field of RNA Nanotechnology

PubMed Central

Guo, Peixuan

2011-01-01

RNA can be designed and manipulated just like DNA while having different rules for base-pairing and displaying functions similar to proteins. The large variety of loops and motifs in RNA allow them to fold into numerous complicated structures. This diversity provides a platform for identifying viable building blocks for particle assemblies, substrate binding and manufacture engineering. RNA thermal stability allows production of multivalent nanostructures with defined stoichiometry. Here we review the unique qualities of RNA nanotechnology and their distinct properties inside the body. We describe techniques for constructing RNA nanoparticles from different building blocks and their applications in nanomedicine. Finally, we discuss challenges in predicting and synthesizing RNA and offer some perspectives on the yield and cost of RNA production. PMID:21102465
CTC-Endothelial Cell Interactions during Metastasis

DTIC Science & Technology

2013-04-01

endothelial cells via a variety of E-selectin ligands (ESL). These ESLs express a unique carbohydrate motif, sLex, which appears to be required for... ESL binding. The chemokine receptor CXCR4 has also been reported to support transendothelial migration of prostate cells through bone marrow
Structural Determination of Functional Domains in Early B-cell Factor (EBF) Family of Transcription Factors Reveals Similarities to Rel DNA-binding Proteins and a Novel Dimerization Motif

PubMed Central

Siponen, Marina I.; Wisniewska, Magdalena; Lehtiö, Lari; Johansson, Ida; Svensson, Linda; Raszewski, Grzegorz; Nilsson, Lennart; Sigvardsson, Mikael; Berglund, Helena

2010-01-01

The early B-cell factor (EBF) transcription factors are central regulators of development in several organs and tissues. This protein family shows low sequence similarity to other protein families, which is why structural information for the functional domains of these proteins is crucial to understanding their biochemical features. We have used a modular approach to determine the crystal structures of the structured domains in the EBF family. The DNA binding domain reveals a striking resemblance to the DNA binding domains of the Rel homology superfamily of transcription factors but contains a unique zinc binding structure, termed the zinc knuckle. Further, the EBF proteins contain an IPT/TIG domain and an atypical helix-loop-helix domain with a novel type of dimerization motif. The data presented here provide insights into unique structural features of the EBF proteins and open possibilities for detailed molecular investigations of this important transcription factor family. PMID:20592035
Galectin-3 in angiogenesis and metastasis

PubMed Central

Funasaka, Tatsuyoshi; Raz, Avraham; Nangia-Makker, Pratima

2014-01-01

Galectin-3 is a member of the family of β-galactoside-binding lectins characterized by evolutionarily conserved sequences defined by structural similarities in their carbohydrate-recognition domains. Galectin-3 is a unique, chimeric protein consisting of three distinct structural motifs: (i) a short NH2 terminal domain containing a serine phosphorylation site; (ii) a repetitive proline-rich collagen-α-like sequence cleavable by matrix metalloproteases; and (iii) a globular COOH-terminal domain containing a carbohydrate-binding motif and an NWGR anti-death motif. It is ubiquitously expressed and has diverse biological functions depending on its subcellular localization. Galectin-3 is mainly found in the cytoplasm, is also seen in the nucleus, and can be secreted via non-classical secretory pathways. In general, secreted galectin-3 mediates cell migration, cell adhesion and cell–cell interactions by binding with high affinity to galactose-containing glycoproteins on the cell surface. Cytoplasmic galectin-3 exhibits anti-apoptotic activity and regulates several signal transduction pathways, whereas nuclear galectin-3 has been associated with pre-mRNA splicing and gene expression. Its unique chimeric structure enables it to interact with a plethora of ligands and modulate diverse functions such as cell growth, adhesion, migration, invasion, angiogenesis, immune function, apoptosis and endocytosis, emphasizing its significance in the process of tumor progression. In this review, we have focused on the role of galectin-3 in tumor metastasis with special emphasis on angiogenesis. PMID:25138305
PhyloGibbs-MP: Module Prediction and Discriminative Motif-Finding by Gibbs Sampling

PubMed Central

Siddharthan, Rahul

2008-01-01

PhyloGibbs, our recent Gibbs-sampling motif-finder, takes phylogeny into account in detecting binding sites for transcription factors in DNA and assigns posterior probabilities to its predictions obtained by sampling the entire configuration space. Here, in an extension called PhyloGibbs-MP, we widen the scope of the program, addressing two major problems in computational regulatory genomics. First, PhyloGibbs-MP can localise predictions to small, undetermined regions of a large input sequence, thus effectively predicting cis-regulatory modules (CRMs) ab initio while simultaneously predicting binding sites in those modules, tasks that are usually done by two separate programs. PhyloGibbs-MP's performance at such ab initio CRM prediction is comparable with or superior to dedicated module-prediction software that uses prior knowledge of previously characterised transcription factors. Second, PhyloGibbs-MP can predict motifs that differentiate between two (or more) different groups of regulatory regions, that is, motifs that occur preferentially in one group over the others. While other “discriminative motif-finders” have been published in the literature, PhyloGibbs-MP's implementation has some unique features and flexibility. Benchmarks on synthetic and actual genomic data show that this algorithm is successful at enhancing predictions of differentiating sites and suppressing predictions of common sites, and that it matches or outperforms other discriminative motif-finders on actual genomic data. Additional enhancements include significant performance and speed improvements, the ability to use “informative priors” on known transcription factors, and the ability to output annotations in a format that can be visualised with the Generic Genome Browser. In stand-alone motif-finding, PhyloGibbs-MP remains competitive, outperforming PhyloGibbs-1.0 and other programs on benchmark data. PMID:18769735
Commensurate distances and similar motifs in genetic congruence and protein interaction networks in yeast

PubMed Central

Ye, Ping; Peyser, Brian D; Spencer, Forrest A; Bader, Joel S

2005-01-01

Background: In a genetic interaction, the phenotype of a double mutant differs from the combined phenotypes of the underlying single mutants. When the single mutants have no growth defect, but the double mutant is lethal or exhibits slow growth, the interaction is termed synthetic lethality or synthetic fitness. These genetic interactions reveal gene redundancy and compensating pathways. Recently available large-scale data sets of genetic interactions and protein interactions in Saccharomyces cerevisiae provide a unique opportunity to elucidate the topological structure of biological pathways and how genes function in these pathways. Results: We have defined congruent genes as pairs of genes with similar sets of genetic interaction partners and constructed a genetic congruence network by linking congruent genes. By comparing path lengths in three types of networks (genetic interaction, genetic congruence, and protein interaction), we discovered that high genetic congruence not only exhibits correlation with direct protein interaction linkage but also exhibits commensurate distance with the protein interaction network. However, consistent distances were not observed between genetic and protein interaction networks. We also demonstrated that congruence and protein networks are enriched with motifs that indicate network transitivity, while the genetic network has both transitive (triangle) and intransitive (square) types of motifs. These results suggest that robustness of yeast cells to gene deletions is due in part to two complementary pathways (square motif) or three complementary pathways, any two of which are required for viability (triangle motif). Conclusion: Genetic congruence is superior to genetic interaction in prediction of protein interactions and function associations. Genetically interacting pairs usually belong to parallel compensatory pathways, which can generate transitive motifs (any two of three pathways needed) or intransitive motifs (either of two pathways needed). PMID:16283923
Detection of core-periphery structure in networks based on 3-tuple motifs

NASA Astrophysics Data System (ADS)

Ma, Chuang; Xiang, Bing-Bing; Chen, Han-Shuang; Small, Michael; Zhang, Hai-Feng

2018-05-01

Detecting mesoscale structure, such as community structure, is of vital importance for analyzing complex networks. Recently, a new mesoscale structure, core-periphery (CP) structure, has been identified in many real-world systems. In this paper, we propose an effective algorithm for detecting CP structure based on a 3-tuple motif. In this algorithm, we first define a 3-tuple motif in terms of the patterns of edges as well as the properties of nodes, and then construct a motif adjacency matrix based on the 3-tuple motif. Finally, the problem is converted into finding the cluster with the smallest motif conductance. Our algorithm works well for different CP structures, including single or multiple CP structures and local or global CP structures. Results on synthetic and empirical networks validate the high performance of our method.
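To make the terms "motif adjacency matrix" and "motif conductance" concrete, here is a minimal sketch. It does not use the authors' 3-tuple motif, which the abstract does not define in detail. As a stand-in, it uses the plain undirected triangle, so entry (i, j) of the motif adjacency matrix counts the triangles that contain both i and j. The conductance is computed on the resulting weighted graph. All function names are hypothetical.

```python
def triangle_motif_adjacency(adj):
    """Motif adjacency matrix for the undirected triangle motif.

    adj: symmetric 0/1 adjacency matrix as a list of lists.
    Returns w where w[i][j] is the number of triangles containing both i and j.
    """
    n = len(adj)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and adj[i][j]:
                # Each common neighbour k of i and j closes one triangle.
                w[i][j] = sum(
                    1
                    for k in range(n)
                    if k != i and k != j and adj[i][k] and adj[j][k]
                )
    return w


def motif_conductance(w, cluster):
    """Conductance of `cluster` on the weighted graph w: cut / min(volume)."""
    n = len(w)
    inside = set(cluster)
    cut = sum(w[i][j] for i in inside for j in range(n) if j not in inside)
    vol_in = sum(w[i][j] for i in inside for j in range(n))
    vol_out = sum(w[i][j] for i in range(n) if i not in inside for j in range(n))
    denom = min(vol_in, vol_out)
    return cut / denom if denom else float("inf")


if __name__ == "__main__":
    # Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    adj = [[0] * 6 for _ in range(6)]
    for a, b in edges:
        adj[a][b] = adj[b][a] = 1
    w = triangle_motif_adjacency(adj)
    print(motif_conductance(w, [0, 1, 2]))
```

In the toy graph, the bridging edge 2-3 lies in no triangle, so cutting between the two triangles costs nothing in motif terms.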
Porous Hydrogen-Bonded Organic Frameworks.

PubMed

Han, Yi-Fei; Yuan, Ying-Xue; Wang, Hong-Bo

2017-02-13

Ordered porous solid-state architectures constructed via non-covalent supramolecular self-assembly have attracted increasing interest due to their unique advantages and potential applications. Porous metal-coordination organic frameworks (MOFs) are generated by the assembly of metal coordination centers and organic linkers. Compared to MOFs, porous hydrogen-bonded organic frameworks (HOFs) are readily purified and recovered via simple recrystallization. However, because it remains difficult to direct the self-aggregation of building motifs in a predictable manner, the rational design and preparation of porous HOFs are still challenging. Herein, we summarize recent developments in porous HOFs and attempt to gain deeper insights into the design strategies of basic building motifs.

Ni2+-binding RNA motifs with an asymmetric purine-rich internal loop and a G-A base pair.

PubMed Central

Hofmann, H P; Limmer, S; Hornung, V; Sprinzl, M

1997-01-01

RNA molecules with high affinity for immobilized Ni2+ were isolated from an RNA pool with 50 randomized positions by in vitro selection-amplification. The selected RNAs preferentially bind Ni2+ and Co2+ over other cations from first series transition metals. Conserved structure motifs, comprising about 15 nt, were identified that are likely to represent the Ni2+ binding sites. Two conserved motifs contain an asymmetric purine-rich internal loop and probably a mismatch G-A base pair. The structure of one of these motifs was studied with proton NMR spectroscopy and formation of the G-A pair at the junction of helix and internal loop was demonstrated. Using Ni2+ as a paramagnetic probe, a divalent metal ion binding site near this G-A base pair was identified. Ni2+ ions bound to this motif exert a specific stabilization effect. We propose that small asymmetric purine-rich loops that contain a G-A interaction may represent a divalent metal ion binding site in RNA. PMID:9409620
A dileucine motif is involved in plasma membrane expression and endocytosis of rat sodium taurocholate cotransporting polypeptide (Ntcp).

PubMed

Stross, Claudia; Kluge, Stefanie; Weissenberger, Katrin; Winands, Elisabeth; Häussinger, Dieter; Kubitz, Ralf

2013-11-15

The sodium taurocholate cotransporting polypeptide (Ntcp) is the major uptake transporter for bile salts into liver parenchymal cells, and PKC-mediated endocytosis was shown to regulate the number of Ntcp molecules at the plasma membrane. In this study, mechanisms of Ntcp internalization were analyzed by flow cytometry, immunofluorescence, and Western blot analyses in HepG2 cells. PKC activation induced endocytosis of Ntcp from the plasma membrane by ~30%. Endocytosis of Ntcp was clathrin dependent and was followed by lysosomal degradation. A dileucine motif located in the third intracellular loop of Ntcp was essential for endocytosis but also for processing and plasma membrane targeting, suggesting a dual function of this motif for intracellular trafficking of Ntcp. Mutation of two of five potential phosphorylation sites surrounding the dileucine motif (Thr225 and Ser226) inhibited PKC-mediated endocytosis. In conclusion, we identified a motif that is critical for Ntcp plasma membrane localization. Endocytic retrieval protects hepatocytes from elevated bile salt concentrations and is of special interest because NTCP has been identified as a receptor for the hepatitis B and D viruses.
iLIR@viral: A web resource for LIR motif-containing proteins in viruses.

PubMed

Jacomin, Anne-Claire; Samavedam, Siva; Charles, Hannah; Nezis, Ioannis P

2017-10-03

Macroautophagy/autophagy has been shown to mediate the selective lysosomal degradation of pathogenic bacteria and viruses (xenophagy), and to contribute to the activation of innate and adaptive immune responses. Autophagy can serve as an antiviral defense mechanism but also as a proviral process during infection. Atg8-family proteins play a central role in the autophagy process due to their ability to interact with components of the autophagy machinery as well as selective autophagy receptors and adaptor proteins. Such interactions are usually mediated through LC3-interacting region (LIR) motifs. So far, only one viral protein has been experimentally shown to have a functional LIR motif, leaving open a vast field for investigation. Here, we have developed the iLIR@viral database (http://ilir.uk/virus/) as a freely accessible web resource listing all the putative canonical LIR motifs identified in viral proteins. Additionally, we used a curated text-mining analysis of the literature to identify novel putative LIR motif-containing proteins (LIRCPs) in viruses. We anticipate that iLIR@viral will assist with elucidating the full complement of LIRCPs in viruses.
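A motif scan of this kind can be illustrated with a short regular-expression search. The sketch below assumes the commonly cited core LIR consensus [W/F/Y]-x-x-[L/I/V]. The abstract does not state the exact pattern the database uses, so this pattern, the function name, and the toy sequence are assumptions for illustration only.

```python
import re

# Core LIR consensus (assumed): aromatic residue, any two residues, then L/I/V.
# The lookahead makes overlapping candidates visible to finditer.
LIR_CORE = re.compile(r"(?=([WFY]..[LIV]))")


def find_lir_candidates(sequence):
    """Return (1-based start position, matched tetrapeptide) for each candidate."""
    return [(m.start() + 1, m.group(1)) for m in LIR_CORE.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real viral protein.
    for position, motif in find_lir_candidates("MSDDDWTHLSSKEFWYLLV"):
        print(position, motif)
```

A four-residue pattern matches very often by chance, so hits from such a scan are only candidates and would need further filtering and experimental validation.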
Global analyses of TetR family transcriptional regulators in mycobacteria indicates conservation across species and diversity in regulated functions.

PubMed

Balhana, Ricardo J C; Singla, Ashima; Sikder, Mahmudul Hasan; Withers, Mike; Kendall, Sharon L

2015-06-27

Mycobacteria inhabit diverse niches and display high metabolic versatility. They can colonise both humans and animals and are also able to survive in the environment. In order to succeed, response to environmental cues via transcriptional regulation is required. In this study we focused on the TetR family of transcriptional regulators (TFTRs) in mycobacteria. We used InterPro to classify the entire complement of transcriptional regulators in 10 mycobacterial species and these analyses showed that TFTRs are the most abundant family of regulators in all species. We identified those TFTRs that are conserved across all species analysed and those that are unique to the pathogens included in the analysis. We examined genomic contexts of 663 of the conserved TFTRs and observed that the majority of TFTRs are separated by 200 bp or less from divergently oriented genes. Analyses of divergent genes indicated that the TFTRs control diverse biochemical functions not limited to efflux pumps. TFTRs typically bind to palindromic motifs and we identified 11 highly significant novel motifs in the upstream regions of divergently oriented TFTRs. The C-terminal ligand binding domain from the TFTR complement in M. tuberculosis showed great diversity in amino acid sequence but with an overall architecture common to other TFTRs. This study suggests that mycobacteria depend on TFTRs for the transcriptional control of a number of metabolic functions, yet the physiological role of the majority of these regulators remains unknown.
Forkhead Box Transcription Factors of the FOXA Class Are Required for Basal Transcription of Angiotensin-Converting Enzyme 2

PubMed Central

Pedersen, Kim Brint; Chodavarapu, Harshita

2017-01-01

Angiotensin-converting enzyme 2 (ACE2) has protective effects on a wide range of morbidities associated with elevated angiotensin-II signaling. Most tissues, including pancreatic islets, express ACE2 mainly from the proximal promoter region. We previously found that hepatocyte nuclear factors 1α and 1β stimulate ACE2 expression from three highly conserved hepatocyte nuclear factor 1 binding motifs in the proximal promoter region. We hypothesized that other highly conserved motifs would also affect ACE2 expression. By systematic mutation of conserved elements, we identified five regions affecting ACE2 expression, of which two regions bound transcriptional activators. One of these is a functional FOXA binding motif. We further identified the main protein binding the FOXA motif in 832/13 insulinoma cells as well as in mouse pancreatic islets as FOXA2. PMID:29082356
Motif enrichment tool.

PubMed

Blatti, Charles; Sinha, Saurabh

2014-07-01

The Motif Enrichment Tool (MET) provides an online interface that enables users to find major transcriptional regulators of their gene sets of interest. MET searches the appropriate regulatory region around each gene and identifies which transcription factor DNA-binding specificities (motifs) are statistically overrepresented. Motif enrichment analysis is currently available for many metazoan species including human, mouse, fruit fly, planaria and flowering plants. MET also leverages high-throughput experimental data such as ChIP-seq and DNase-seq from ENCODE and ModENCODE to identify the regulatory targets of a transcription factor with greater precision. The results from MET are produced in real time and are linked to a genome browser for easy follow-up analysis. Use of the web tool is free and open to all, and there is no login requirement. ADDRESS: http://veda.cs.uiuc.edu/MET/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Identification of helix capping and β-turn motifs from NMR chemical shifts

PubMed Central

Shen, Yang; Bax, Ad

2012-01-01

We present an empirical method for identification of distinct structural motifs in proteins on the basis of experimentally determined backbone and 13Cβ chemical shifts. Elements identified include the N-terminal and C-terminal helix capping motifs and five types of β-turns: I, II, I′, II′ and VIII. Using a database of proteins of known structure, the NMR chemical shifts, together with the PDB-extracted amino acid preference of the helix capping and β-turn motifs are used as input data for training an artificial neural network algorithm, which outputs the statistical probability of finding each motif at any given position in the protein. The trained neural networks, contained in the MICS (motif identification from chemical shifts) program, also provide a confidence level for each of their predictions, and values ranging from ca 0.7–0.9 for the Matthews correlation coefficient of its predictions far exceed that attainable by sequence analysis. MICS is anticipated to be useful both in the conventional NMR structure determination process and for enhancing on-going efforts to determine protein structures solely on the basis of chemical shift information, where it can aid in identifying protein database fragments suitable for use in building such structures. PMID:22314702
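For readers unfamiliar with the Matthews correlation coefficient quoted above, a minimal worked computation from a 2x2 confusion matrix follows. The formula is the standard one. The function name and the example counts are hypothetical and are not taken from the MICS benchmarks.

```python
from math import sqrt


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)),
    ranging from -1 (total disagreement) through 0 (chance) to +1 (perfect).
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional value when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts for one motif class.
    print(matthews_corrcoef(tp=80, tn=900, fp=15, fn=20))
```

Because the coefficient uses all four cells, it stays informative when positives (for example, residues in a capping motif) are much rarer than negatives.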
Proteome-wide search for functional motifs altered in tumors: Prediction of nuclear export signals inactivated by cancer-related mutations

PubMed Central

Prieto, Gorka; Fullaondo, Asier; Rodríguez, Jose A.

2016-01-01

Large-scale sequencing projects are uncovering a growing number of missense mutations in human tumors. Understanding the phenotypic consequences of these alterations represents a formidable challenge. In silico prediction of functionally relevant amino acid motifs disrupted by cancer mutations could provide insight into the potential impact of a mutation, and guide functional tests. We have previously described Wregex, a tool for the identification of potential functional motifs, such as nuclear export signals (NESs), in proteins. Here, we present an improved version that allows motif prediction to be combined with data from large repositories, such as the Catalogue of Somatic Mutations in Cancer (COSMIC), and to be applied to a whole proteome scale. As an example, we have searched the human proteome for candidate NES motifs that could be altered by cancer-related mutations included in the COSMIC database. A subset of the candidate NESs identified was experimentally tested using an in vivo nuclear export assay. A significant proportion of the selected motifs exhibited nuclear export activity, which was abrogated by the COSMIC mutations. In addition, our search identified a cancer mutation that inactivates the NES of the human deubiquitinase USP21, and leads to the aberrant accumulation of this protein in the nucleus. PMID:27174732
Composite Structural Motifs of Binding Sites for Delineating Biological Functions of Proteins

PubMed Central

Kinjo, Akira R.; Nakamura, Haruki

2012-01-01

Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs that represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures. PMID:22347478
Triadic motifs in the dependence networks of virtual societies.

PubMed

Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

2014-06-10

In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and then reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, that open motifs are also underrepresented, and that the other closed motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.
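Counting one triadic motif in a directed network can be sketched as follows. The sketch assumes that the "unidirectional loop" is a directed 3-cycle (A to B, B to C, C to A) with no reciprocated edge among the three nodes. The label M9 is the authors' own, and the function name and example edges here are hypothetical.

```python
from itertools import combinations


def count_unidirectional_loops(edges):
    """Count node triples that form a pure directed 3-cycle.

    edges: iterable of (source, target) pairs of a directed graph.
    A triple counts only if each pair is linked in exactly one direction
    and the three links form a cycle.
    """
    edge_set = set(edges)
    nodes = sorted({node for edge in edge_set for node in edge})

    def one_way(u, v):
        # True if u -> v exists and v -> u does not.
        return (u, v) in edge_set and (v, u) not in edge_set

    count = 0
    for a, b, c in combinations(nodes, 3):
        clockwise = one_way(a, b) and one_way(b, c) and one_way(c, a)
        counter_clockwise = one_way(a, c) and one_way(c, b) and one_way(b, a)
        if clockwise or counter_clockwise:
            count += 1
    return count


if __name__ == "__main__":
    print(count_unidirectional_loops([(1, 2), (2, 3), (3, 1), (3, 4)]))
```

Deciding whether a motif is under- or overrepresented would additionally require comparing such counts with those from randomized networks, which this brute-force sketch leaves out.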
Triadic motifs in the dependence networks of virtual societies

NASA Astrophysics Data System (ADS)

Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

2014-06-01

In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and then reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, that open motifs are also underrepresented, and that the other closed motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.
Triadic motifs in the dependence networks of virtual societies

PubMed Central

Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

2014-01-01

In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and then reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, that open motifs are also underrepresented, and that the other closed motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs. PMID:24912755
SLiMSearch 2.0: biological context for short linear motifs in proteins

PubMed Central

Davey, Norman E.; Haslam, Niall J.; Shields, Denis C.

2011-01-01

Short, linear motifs (SLiMs) play a critical role in many biological processes. The SLiMSearch 2.0 (Short, Linear Motif Search) web server allows researchers to identify occurrences of a user-defined SLiM in a proteome, using conservation and protein disorder context statistics to rank occurrences. User-friendly output and visualizations of motif context allow the user to quickly gain insight into the validity of a putatively functional motif occurrence. For each motif occurrence, overlapping UniProt features and annotated SLiMs are displayed. Visualization also includes annotated multiple sequence alignments surrounding each occurrence, showing conservation and protein disorder statistics in addition to known and predicted SLiMs, protein domains and known post-translational modifications. In addition, enrichment of Gene Ontology terms and protein interaction partners is provided as an indicator of possible motif function. All web server results are available for download. Users can search motifs against the human proteome or a subset thereof defined by UniProt accession numbers or GO term. The SLiMSearch server is available at: http://bioware.ucd.ie/slimsearch2.html. PMID:21622654
INCENP Centromere and Spindle Targeting: Identification of Essential Conserved Motifs and Involvement of Heterochromatin Protein HP1

PubMed Central

Ainsztein, Alexandra M.; Kandels-Lewis, Stefanie E.; Mackay, Alastair M.; Earnshaw, William C.

1998-01-01

The inner centromere protein (INCENP) has a modular organization, with domains required for chromosomal and cytoskeletal functions concentrated near the amino and carboxyl termini, respectively. In this study we have identified an autonomous centromere- and midbody-targeting module in the amino-terminal 68 amino acids of INCENP. Within this module, we have identified two evolutionarily conserved amino acid sequence motifs: a 13–amino acid motif that is required for targeting to centromeres and transfer to the spindle, and an 11–amino acid motif that is required for transfer to the spindle by molecules that have targeted previously to the centromere. To begin to understand the mechanisms of INCENP function in mitosis, we have performed a yeast two-hybrid screen for interacting proteins. These and subsequent in vitro binding experiments identify a physical interaction between INCENP and heterochromatin protein HP1Hsα. Surprisingly, this interaction does not appear to be involved in targeting INCENP to the centromeric heterochromatin, but may instead have a role in its transfer from the chromosomes to the anaphase spindle. PMID:9864353
D-MATRIX: A web tool for constructing weight matrix of conserved DNA motifs

PubMed Central

Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

2009-01-01

Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Current genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome-level prediction, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined set of aligned motif sequences and a motif width. To retrieve known motif sequences, users can access commonly used databases such as TFD, RegulonDB, DBTBS, and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the matrix can be converted into different file formats according to user requirements. It makes it possible to identify conserved motifs in co-regulated genes or in a whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known SOS-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. Availability: http://203.190.147.116/dmatrix/ PMID:19759861
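As a rough illustration of building a weight matrix from aligned motif sequences, the sketch below computes a log-odds position weight matrix with pseudocounts against a uniform background. The abstract does not specify which statistical approach D-MATRIX uses, so the choice of log-odds scoring, the pseudocount, the background frequency, the function names, and the toy sites are all assumptions.

```python
from math import log2

ALPHABET = "ACGT"


def position_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Log-odds weight matrix from equal-length aligned DNA sites.

    Returns one dict per alignment column mapping base -> log2(freq / background).
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all aligned sites must have the same width")
    total = len(sites) + pseudocount * len(ALPHABET)
    matrix = []
    for position in range(width):
        column = [site[position].upper() for site in sites]
        matrix.append(
            {
                base: log2(((column.count(base) + pseudocount) / total) / background)
                for base in ALPHABET
            }
        )
    return matrix


def score(matrix, sequence):
    """Sum the column weights for a candidate site of the same width."""
    return sum(column[base] for column, base in zip(matrix, sequence.upper()))


if __name__ == "__main__":
    # Hypothetical aligned sites, not real SOS-box elements.
    pwm = position_weight_matrix(["ACGT", "ACGA", "ACGT", "TCGT"])
    print(score(pwm, "ACGT"), score(pwm, "TTTT"))
```

Sliding `score` along a genome and keeping windows above a threshold is one simple way such a matrix could be used to propose new sites.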
D-MATRIX: a web tool for constructing weight matrix of conserved DNA motifs.

PubMed

Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

2009-07-27

Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Currently, genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome-level prediction, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined aligned motif sequence set and motif width. For retrieval of known motif sequences, users can access commonly used databases such as TFD, RegulonDB, DBTBS, and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the result can be converted into different file formats according to user requirements. It provides the possibility of identifying conserved motifs in co-regulated genes or whole genomes. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known SOS-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. http://203.190.147.116/dmatrix/
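The weight matrix construction described in the D-MATRIX records can be illustrated with a minimal Python sketch. This is not the D-MATRIX implementation (which the abstract says is written in C-Sharp); it assumes a common textbook approach: per-column counts, pseudocount-smoothed frequencies, and log-odds scores against a uniform background. The function name, the pseudocount, the background, and the example sites are all hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def build_weight_matrices(aligned_sites, pseudocount=1.0, background=None):
    """Return (counts, frequencies, log_odds), each a list with one dict per column.

    Assumption: sites are pre-aligned and of equal width, as in the
    user-supplied aligned motif set the abstract describes.
    """
    if background is None:
        # Assumed uniform background; a real genome would need its own base composition.
        background = {base: 0.25 for base in ALPHABET}
    width = len(aligned_sites[0])
    if any(len(site) != width for site in aligned_sites):
        raise ValueError("all aligned sites must have the same width")

    counts, freqs, log_odds = [], [], []
    for col in range(width):
        column = Counter(site[col].upper() for site in aligned_sites)
        c = {base: column.get(base, 0) for base in ALPHABET}
        total = sum(c.values()) + pseudocount * len(ALPHABET)
        f = {base: (c[base] + pseudocount) / total for base in ALPHABET}
        w = {base: math.log2(f[base] / background[base]) for base in ALPHABET}
        counts.append(c)
        freqs.append(f)
        log_odds.append(w)
    return counts, freqs, log_odds

# Hypothetical aligned binding sites, for illustration only.
sites = ["TACTGTATAT", "AACTGTATAT", "TACTGTACAT", "TACTGTATAA"]
counts, freqs, log_odds = build_weight_matrices(sites)
```

The three return values correspond loosely to the "different types of weight matrix" the abstract mentions; which types D-MATRIX actually produces is not stated there.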
A motif detection and classification method for peptide sequences using genetic programming.

PubMed

Tomita, Yasuyuki; Kato, Ryuji; Okochi, Mina; Honda, Hiroyuki

2008-08-01

An exploration of common rules (property motifs) in amino acid sequences has been required for the design of novel sequences and elucidation of the interactions between molecules controlled by the structural or physical environment. In the present study, we developed a new method to search for property motifs that are common in peptide sequence data. Our method has the following two characteristics: (i) the automatic determination of the position and length of common property motifs by calculating the physicochemical similarity of amino acids, and (ii) the quick and effective exploration of motif candidates that discriminate positives from negatives through the introduction of genetic programming (GP). Our method was evaluated on two types of model data sets. First, we searched artificially derived peptide data containing intentionally buried property motifs. As a result, the expected property motifs were correctly extracted by our algorithm. Second, peptide data that interact with MHC class II molecules were analyzed as a model of biologically active peptides with buried motifs of various lengths. Using the rule obtained with our method, twice as many MHC class II binding peptides were identified as with the existing scoring matrix method. In conclusion, our GP-based motif searching approach made it possible to obtain knowledge of functional aspects of the peptides without any prior knowledge.
Recurring sequence-structure motifs in (βα)8-barrel proteins and experimental optimization of a chimeric protein designed based on such motifs.

PubMed

Wang, Jichao; Zhang, Tongchuan; Liu, Ruicun; Song, Meilin; Wang, Juncheng; Hong, Jiong; Chen, Quan; Liu, Haiyan

2017-02-01

An interesting way of generating novel artificial proteins is to combine sequence motifs from natural proteins, mimicking the evolutionary path suggested by natural proteins comprising recurring motifs. We analyzed the βα and αβ modules of TIM barrel proteins by structure alignment-based sequence clustering. A number of preferred motifs were identified. A chimeric TIM was designed by using recurring elements as mutually compatible interfaces. The foldability of the designed TIM protein was then significantly improved by six rounds of directed evolution. The melting temperature has been improved by more than 20°C. A variety of characteristics suggested that the resulting protein is well-folded. Our analysis provided a library of peptide motifs that is potentially useful for different protein engineering studies. The protein engineering strategy of using recurring motifs as interfaces to connect partial natural proteins may be applied to other protein folds.
Further delineation of nonhomologous-based recombination and evidence for subtelomeric segmental duplications in 1p36 rearrangements.

PubMed

D'Angelo, Carla S; Gajecka, Marzena; Kim, Chong A; Gentles, Andrew J; Glotzbach, Caron D; Shaffer, Lisa G; Koiffmann, Célia P

2009-06-01

The mechanisms involved in the formation of subtelomeric rearrangements are now beginning to be elucidated. Breakpoint sequencing analysis of 1p36 rearrangements has made important contributions to this line of inquiry. Despite the unique architecture of segmental duplications inherent to human subtelomeres, no common mechanism has been identified thus far and different nonexclusive recombination-repair mechanisms seem to predominate. In order to gain further insights into the mechanisms of chromosome breakage, repair, and stabilization mediating subtelomeric rearrangements in humans, we investigated the constitutional rearrangements of 1p36. Cloning of the breakpoint junctions in a complex rearrangement and three non-reciprocal translocations revealed similarities at the junctions, such as microhomology of up to three nucleotides, along with no significant sequence identity in close proximity to the breakpoint regions. All the breakpoints appeared to be unique and their occurrence was limited to non-repetitive, unique DNA sequences. Several recombination- or cleavage-associated motifs that may promote non-homologous recombination were observed in close proximity to the junctions. We conclude that NHEJ is likely the mechanism of DNA repair that generates these rearrangements. Additionally, two apparently pure terminal deletions were also investigated, and the refinement of the breakpoint regions identified two distinct genomic intervals ~25-kb apart, each containing a series of 1p36 specific segmental duplications with 90-98% identity. Segmental duplications can serve as substrates for ectopic homologous recombination or stimulate genomic rearrangements.
Conservation of the Human Integrin-Type Beta-Propeller Domain in Bacteria

PubMed Central

Chouhan, Bhanupratap; Denesyuk, Alexander; Heino, Jyrki; Johnson, Mark S.; Denessiouk, Konstantin

2011-01-01

Integrins are heterodimeric cell-surface receptors with key functions in cell-cell and cell-matrix adhesion. Integrin α and β subunits are present throughout the metazoans, but it is unclear whether the subunits predate the origin of multicellular organisms. Several component domains have been detected in bacteria, one of which, a specific 7-bladed β-propeller domain, is a unique feature of the integrin α subunits. Here, we describe a structure-derived motif, which incorporates key features of each blade from the X-ray structures of human αIIbβ3 and αVβ3, includes elements of the FG-GAP/Cage and Ca2+-binding motifs, and is specific only for the metazoan integrin domains. Separately, we searched all available sequences from bacteria and unicellular eukaryotic organisms for metazoan integrin-type β-propeller domains, requiring seven repeats, corresponding to the seven blades of the β-propeller domain, with the newly found structure-derived motif present in every repeat. As a result, among 47 available genomes of unicellular eukaryotes we could not find a single instance of seven repeats with the motif. Several sequences contained three repeats, a predicted transmembrane segment, and a short cytoplasmic motif associated with some integrins, but otherwise differ from the metazoan integrin α subunits. Among the available bacterial sequences, we found five examples containing seven sequential metazoan integrin-specific motifs within the seven repeats. The motifs differ in having one Ca2+-binding site per repeat, whereas metazoan integrins have three or four sites. The bacterial sequences are more conserved in terms of motif conservation and loop length, suggesting that the structure is more regular and compact than the example structures from human integrins. Although the bacterial examples are not full-length integrins, the full-length metazoan-type 7-bladed β-propeller domains are present, and sometimes two tandem copies are found. PMID:22022374
BEAM web server: a tool for structural RNA motif discovery.

PubMed

Pietrosanto, Marco; Adinolfi, Marta; Casula, Riccardo; Ausiello, Gabriele; Ferrè, Fabrizio; Helmer-Citterich, Manuela

2018-03-15

RNA structural motif finding is a relevant problem that becomes computationally hard when working on high-throughput data (e.g. eCLIP, PAR-CLIP), often represented by thousands of RNA molecules. Currently, the BEAM server is the only web tool capable of handling tens of thousands of RNAs as input with a motif discovery procedure that is only limited by the current secondary structure prediction accuracies. The recently developed method BEAM (BEAr Motifs finder) can analyze tens of thousands of RNA molecules and identify RNA secondary structure motifs associated with a measure of their statistical significance. BEAM is extremely fast thanks to the BEAR encoding that transforms each RNA secondary structure into a string of characters. BEAM also exploits the evolutionary knowledge contained in a substitution matrix of secondary structure elements, extracted from the RFAM database of families of homologous RNAs. The BEAM web server has been designed to streamline data pre-processing by automatically handling folding and encoding of RNA sequences, giving users a choice of the preferred folding program. The server provides an intuitive and informative results page with the list of secondary structure motifs identified, the logo of each motif, its significance, graphic representation and information about its position in the RNA molecules sharing it. The web server is freely available at http://beam.uniroma2.it/ and it is implemented in NodeJS and Python with all major browsers supported. Contact: marco.pietrosanto@uniroma2.it. Supplementary data are available at Bioinformatics online.
Unique CD44 intronic SNP is associated with tumor grade in breast cancer: a case control study and in silico analysis.

PubMed

Esmaeili, Rezvan; Abdoli, Nasrin; Yadegari, Fatemeh; Neishaboury, Mohamadreza; Farahmand, Leila; Kaviani, Ahmad; Majidzadeh-A, Keivan

2018-01-01

CD44, encoded by a single gene, is a cell surface transmembrane glycoprotein. Exon 2 is one of the exons important for binding of the CD44 protein to hyaluronan. Experimental evidence shows that the hyaluronan-CD44 interaction intensifies the proliferation, migration, and invasion of breast cancer cells. Therefore, the current study aimed at investigating the association between specific polymorphisms in exon 2 of CD44 and its flanking region and predisposition to breast cancer. In the current study, 175 Iranian female patients with breast cancer and 175 age-matched healthy controls were recruited in the biobank, Breast Cancer Research Center, Tehran, Iran. Single nucleotide polymorphisms of CD44 exon 2 and its flanking region were analyzed via polymerase chain reaction and gene sequencing techniques. Associations between the observed variation and breast cancer risk and clinico-pathological characteristics were studied. Subsequently, bioinformatics analysis was conducted to predict potential exonic splicing enhancer (ESE) motifs changed as the result of a mutation. A unique polymorphism of the gene encoding CD44 was identified 14 nucleotides upstream of exon 2 (A37692→G) by the sequencing method. The A > G polymorphism exhibited a significant association with higher grades of breast cancer, although no significant relation was found between this polymorphism and breast cancer risk. Finally, computational analysis revealed that the intronic mutation generated a new consensus-binding motif for the splicing factor, SC35, within intron 1. The current study results indicated that the A > G polymorphism was associated with breast cancer development; in addition, in silico analysis with ESE finder prediction software showed that the change created a new SC35 binding site.
Secretome weaponries of Cochliobolus lunatus interacting with potato leaf at different temperature regimes reveal a CL[xxxx]LHM - motif

PubMed Central

2014-01-01

Background: The plant and animal pathogenic fungus Cochliobolus lunatus causes great economic damage worldwide every year. C. lunatus displays increased temperature-dependent virulence toward a wide range of hosts. Nonetheless, this phenomenon is poorly understood due to a lack of insight into the coordinated secretome weaponries produced by C. lunatus under heat-stress conditions on putative hosts. To understand the mechanism better, we dissected the secretome of C. lunatus interacting with potato (Solanum tuberosum L.) leaf at different temperature regimes. Results: C. lunatus produced melanized colonizing hyphae in and on potato leaf, finely modulated the ambient pH as a function of temperature and secreted a diverse set of proteins. Using two-dimensional gel electrophoresis (2-D) and mass spectrometry (MS) technology, we observed discrete secretomes at 20°C, 28°C and 38°C. A total of 21 differentially expressed peptide spots and 10 unique peptide spots (that did not align on the gels) matched with 28 unique protein models predicted from C. lunatus m118 v.2 genome peptides. Furthermore, C. lunatus secreted peptides via classical and non-classical pathways related to virulence, proteolysis, nucleic acid metabolism, carbohydrate metabolism, heat stress, signal trafficking and some with unidentified catalytic domains. Conclusions: We have identified a set of 5 soluble candidate effectors of unknown function from C. lunatus secretome weaponries against potato crop at different temperature regimes. Our findings demonstrate that C. lunatus has a signature secretome repertoire that mediates thermo-pathogenicity and shares a leucine-rich “CL[xxxx]LHM” motif. Considering the rapidly evolving temperature-dependent virulence and host diversity of C. lunatus, these data will be useful for designing new protection strategies. PMID:24650331
Secretome weaponries of Cochliobolus lunatus interacting with potato leaf at different temperature regimes reveal a CL[xxxx]LHM - motif.

PubMed

Louis, Bengyella; Waikhom, Sayanika Devi; Roy, Pranab; Bhardwaj, Pardeep Kumar; Singh, Mohendro Wakambam; Goyari, Sailendra; Sharma, Chandradev K; Talukdar, Narayan Chandra

2014-03-20

The plant and animal pathogenic fungus Cochliobolus lunatus causes great economic damage worldwide every year. C. lunatus displays increased temperature-dependent virulence toward a wide range of hosts. Nonetheless, this phenomenon is poorly understood due to a lack of insight into the coordinated secretome weaponries produced by C. lunatus under heat-stress conditions on putative hosts. To understand the mechanism better, we dissected the secretome of C. lunatus interacting with potato (Solanum tuberosum L.) leaf at different temperature regimes. C. lunatus produced melanized colonizing hyphae in and on potato leaf, finely modulated the ambient pH as a function of temperature and secreted a diverse set of proteins. Using two-dimensional gel electrophoresis (2-D) and mass spectrometry (MS) technology, we observed discrete secretomes at 20°C, 28°C and 38°C. A total of 21 differentially expressed peptide spots and 10 unique peptide spots (that did not align on the gels) matched with 28 unique protein models predicted from C. lunatus m118 v.2 genome peptides. Furthermore, C. lunatus secreted peptides via classical and non-classical pathways related to virulence, proteolysis, nucleic acid metabolism, carbohydrate metabolism, heat stress, signal trafficking and some with unidentified catalytic domains. We have identified a set of 5 soluble candidate effectors of unknown function from C. lunatus secretome weaponries against potato crop at different temperature regimes. Our findings demonstrate that C. lunatus has a signature secretome repertoire that mediates thermo-pathogenicity and shares a leucine-rich "CL[xxxx]LHM" motif. Considering the rapidly evolving temperature-dependent virulence and host diversity of C. lunatus, these data will be useful for designing new protection strategies.
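Scanning protein sequences for a pattern such as "CL[xxxx]LHM" can be sketched with a regular expression. The sketch below assumes that "[xxxx]" stands for exactly four residues of any type, which is one reading of the notation and is not confirmed by the abstract. The function name and the example sequence are hypothetical.

```python
import re

# Assumption: "[xxxx]" means exactly four arbitrary residues.
# The lookahead lets overlapping occurrences be reported as well.
CL_LHM = re.compile(r"(?=(CL[A-Z]{4}LHM))")

def find_cl_lhm(protein):
    """Return (0-based start, matched text) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in CL_LHM.finditer(protein.upper())]

# Hypothetical sequence, for illustration only.
hits = find_cl_lhm("MKTCLAAGGLHMPPQ")
```

Here `hits` holds one occurrence starting at index 3; a sequence with only three residues between "CL" and "LHM" yields no hit.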
Roles of conserved proline and glycosyltransferase motifs of EmbC in biosynthesis of lipoarabinomannan.

PubMed

Berg, Stefan; Starbuck, James; Torrelles, Jordi B; Vissa, Varalakshmi D; Crick, Dean C; Chatterjee, Delphi; Brennan, Patrick J

2005-02-18

D-Arabinans, composed of D-arabinofuranose (D-Araf), dominate the structure of mycobacterial cell walls in two settings, as part of lipoarabinomannan (LAM) and arabinogalactan, each with markedly different structures and functions. Little is known of the complexity of their biosynthesis. beta-D-Arabinofuranosyl-1-monophosphoryldecaprenol is the only known sugar donor. EmbA, EmbB, and EmbC, products of the paralogous genes embA, embB, and embC, the sites of resistance to the anti-tuberculosis drug ethambutol (EMB), are the only known implicated enzymes. EmbA and -B apparently contribute to the synthesis of arabinogalactan, whereas EmbC is reserved for the synthesis of LAM. The Emb proteins show no overall similarity to any known proteins beyond Mycobacterium and related genera. However, functional motifs, equivalent to a proline-rich motif of several bacterial polysaccharide co-polymerases and a superfamily of glycosyltransferases, were found. Site-directed mutagenesis in glycosyltransferase superfamily C resulted in complete ablation of LAM synthesis. Point mutations in three amino acids of the proline motif of EmbC resulted in marked reduction of LAM-arabinan synthesis and accumulation of an unknown intermediate and of the known precursor lipomannan. Yet the pattern of the differently linked D-Araf units observed in wild type LAM-arabinan was largely retained in the proline motif mutants. The results allow for the presentation of a unique model of arabinan synthesis.
ProMotE: an efficient algorithm for counting independent motifs in uncertain network topologies.

PubMed

Ren, Yuanfang; Sarkar, Aisharjya; Kahveci, Tamer

2018-06-26

Identifying motifs in biological networks is essential in uncovering key functions served by these networks. Finding non-overlapping motif instances is however a computationally challenging task. The fact that biological interactions are uncertain events further complicates the problem, as it makes the existence of an embedding of a given motif an uncertain event as well. In this paper, we develop a novel method, ProMotE (Probabilistic Motif Embedding), to count non-overlapping embeddings of a given motif in probabilistic networks. We utilize a polynomial model to capture the uncertainty. We develop three strategies to scale our algorithm to large networks. Our experiments demonstrate that our method scales to large networks in practical time with high accuracy where existing methods fail. Moreover, our experiments on cancer and degenerative disease networks show that our method helps in uncovering key functional characteristics of biological networks.
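To make the idea of an uncertain motif embedding concrete, here is a small Python sketch. It is not the ProMotE algorithm. It assumes independent edge probabilities and uses a triangle as the motif, so that the probability an embedding exists is the product of its three edge probabilities and the expected count is the sum over all node triples. It counts all embeddings, including overlapping ones, whereas ProMotE targets non-overlapping embeddings. All names and values are hypothetical.

```python
from itertools import combinations

def expected_triangles(edge_prob):
    """Expected number of triangles in a probabilistic graph.

    edge_prob maps frozenset({u, v}) to the probability that edge u-v exists.
    Assumption: edges exist independently of each other.
    """
    nodes = sorted({node for edge in edge_prob for node in edge})
    total = 0.0
    for a, b, c in combinations(nodes, 3):
        p = 1.0
        for u, v in ((a, b), (a, c), (b, c)):
            # A missing edge has probability 0, so the triple contributes nothing.
            p *= edge_prob.get(frozenset((u, v)), 0.0)
        total += p
    return total

# Hypothetical interaction network with edge confidences.
network = {
    frozenset(("A", "B")): 0.5,
    frozenset(("A", "C")): 0.5,
    frozenset(("B", "C")): 1.0,
    frozenset(("C", "D")): 0.9,
    frozenset(("B", "D")): 0.5,
}
expected = expected_triangles(network)
```

In this example only the triples A-B-C (0.25) and B-C-D (0.45) can form triangles, so the expectation is 0.70. Brute-force enumeration like this does not scale, which is the gap the abstract says ProMotE addresses.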
Tripartite motif-containing 29 (TRIM29) is a novel marker for lymph node metastasis in gastric cancer.

PubMed

Kosaka, Yoshimasa; Inoue, Hiroshi; Ohmachi, Takahiro; Yokoe, Takeshi; Matsumoto, Toshifumi; Mimori, Koshi; Tanaka, Fumiaki; Watanabe, Masahiko; Mori, Masaki

2007-09-01

Tripartite motif-containing 29 (TRIM29) belongs to the TRIM protein family, which has unique structural characteristics, including multiple zinc finger motifs and a leucine zipper motif. TRIM29, also known as ataxia telangiectasia group D complementing gene, possesses radiosensitivity suppressor functions. Although TRIM29 has been reported to be underexpressed in prostate and breast cancer, its expression in gastrointestinal cancer has not been studied. By use of real-time reverse transcriptase-polymerase chain reaction, we analyzed TRIM29 mRNA expression status with respect to various clinicopathological parameters in 124 patients with gastric cancer. An immunohistochemical study was also conducted. The expression of TRIM29 was far higher in gastric cancer tumor tissue. Increased TRIM29 mRNA expression was markedly associated with such parameters as histological grade, large tumor size, extent of tumor invasion, and lymph node metastasis. In the TRIM29 high-expression group, it was an independent predictor for lymph node metastasis. Furthermore, patients with high TRIM29 mRNA expression showed a far poorer survival rate than those with low TRIM29 mRNA expression. TRIM29 expression may serve as a good marker of lymph node metastasis in gastric cancer.
Molecular modeling of the elastomeric properties of repeating units and building blocks of resilin, a disordered elastic protein.

PubMed

Khandaker, Md Shahriar K; Dudek, Daniel M; Beers, Eric P; Dillard, David A; Bevan, David R

2016-08-01

The mechanisms responsible for the properties of disordered elastomeric proteins are not well known. To better understand the relationship between elastomeric behavior and amino acid sequence, we investigated resilin, a disordered rubber-like protein found in specialized regions of the cuticle of insects. Resilin of Drosophila melanogaster contains Gly-rich repetitive motifs composed of the amino acid sequence PSSSYGAPGGGNGGR, which confer elastic properties on resilin. The repetitive motifs of insect resilin can be divided into smaller, partially conserved building blocks: PSS, SYGAP, GGGN and GGR. Using molecular dynamics (MD) simulations, we studied the relative roles of SYGAP, and its less common variants SYSAP and TYGAP, in the elastomeric properties of resilin. Results showed that SYGAP adopts a bent structure that is one-half to one-third the end-to-end length of the other motifs having an equal number of amino acids but containing SYSAP or TYGAP substituted for SYGAP. The bent structure of SYGAP forms due to the conformational freedom of glycine, and hydrogen bonding within the motif apparently plays a role in maintaining this conformation. These structural features of SYGAP result in higher extensibility compared to other motifs, which may contribute to elastic properties at the macroscopic level. Overall, the results are consistent with a role for the SYGAP building block in the elastomeric properties of these disordered proteins. What we learned from simulating the repetitive motifs of resilin may be applicable to the biology and mechanics of other elastomeric biomaterials, and may provide a deeper understanding of their unique properties.
Unfolding Kinetics of the Human Telomere i-Motif Under a 10 pN Force Imposed by the α-Hemolysin Nanopore Identify Transient Folded-State Lifetimes at Physiological pH.

PubMed

Ding, Yun; Fleming, Aaron M; He, Lidong; Burrows, Cynthia J

2015-07-22

Cytosine (C)-rich DNA can adopt i-motif folds under acidic conditions, with the human telomere i-motif providing a well-studied example. The dimensions of this i-motif are appropriate for capture in the nanocavity of the α-hemolysin (α-HL) protein pore under an electrophoretic force. Interrogation of the current vs time (i-t) traces when the i-motif interacts with α-HL identified characteristic signals that were pH dependent. These features were evaluated from pH 5.0 to 7.2, a region surrounding the transition pH of the i-motif (6.1). When the i-motif without polynucleotide tails was studied at pH 5.0, the folded structure entered the nanocavity of α-HL from either the top or bottom face to yield characteristic current patterns. Addition of a 5' 25-mer poly-2'-deoxyadenosine tail allowed capture of the i-motif from the unfolded terminus, and this was used to analyze the pH dependency of unfolding. At pH values below the transition point, only folded strands were observed, and when the pH was increased above the transition pH, the number of folded events decreased, while the unfolded events increased. At pH 6.8 and 7.2, 4% and 2% of the strands were still folded, respectively. The lifetimes for the folded states at pH 6.8 and 7.2 were 21 and 9 ms, respectively, at 160 mV electrophoretic force. These lifetimes are sufficiently long to affect enzymes operating on DNA. Furthermore, these transient lifetimes are readily obtained using the α-HL nanopore, a feature that is not easily achievable by other methods.
The mosaic mutants of cucumber: A system to produce mitochondrial knock-downs

USDA-ARS's Scientific Manuscript database

The mitochondrial (mt) DNA of cucumber has several unique attributes, including paternal transmission and large size due in part to the accumulation of repetitive DNAs. Recombination among these repetitive motifs generates structural rearrangements in the mt DNAs. When the highly inbred line ‘B’ of ...
Generation and analysis of expressed sequence tags from a cDNA library of the fruiting body of Ganoderma lucidum

PubMed Central

2010-01-01

Background: Little genomic or transcriptomic information on Ganoderma lucidum (Lingzhi) is known. This study aims to discover the transcripts involved in secondary metabolite biosynthesis and developmental regulation of G. lucidum using an expressed sequence tag (EST) library. Methods: A cDNA library was constructed from the G. lucidum fruiting body. Its high-quality ESTs were assembled into unique sequences with contigs and singletons. The unique sequences were annotated according to sequence similarities to genes or proteins available in public databases. The detection of simple sequence repeats (SSRs) was performed by online analysis. Results: A total of 1,023 clones were randomly selected from the G. lucidum library and sequenced, yielding 879 high-quality ESTs. These ESTs showed similarities to a diverse range of genes. The sequences encoding squalene epoxidase (SE) and farnesyl-diphosphate synthase (FPS) were identified in this EST collection. Several candidate genes, such as hydrophobin, MOB2, profilin and PHO84, were detected for the first time in G. lucidum. Thirteen (13) potential SSR-motif microsatellite loci were also identified. Conclusion: The present study demonstrates a successful application of EST analysis in the discovery of transcripts involved in the secondary metabolite biosynthesis and the developmental regulation of G. lucidum. PMID:20230644
Distribution of circular proteins in plants: large-scale mapping of cyclotides in the Violaceae.

PubMed

Burman, Robert; Yeshak, Mariamawit Y; Larsson, Sonny; Craik, David J; Rosengren, K Johan; Göransson, Ulf

2015-01-01

During the last decade there has been increasing interest in small circular proteins found in plants of the violet family (Violaceae). These so-called cyclotides consist of a circular chain of approximately 30 amino acids, including six cysteines forming three disulfide bonds, arranged in a cyclic cystine knot (CCK) motif. In this study we map the occurrence and distribution of cyclotides throughout the Violaceae. Plant material was obtained from herbarium sheets containing samples up to 200 years of age. Even the oldest specimens contained cyclotides in the preserved leaves, with no degradation products observable, confirming their place as one of the most stable proteins in nature. Over 200 samples covering 17 of the 23-31 genera in Violaceae were analyzed, and cyclotides were positively identified in 150 species. Each species contained a unique set of between one and 25 cyclotides, with many exclusive to individual plant species. We estimate the number of different cyclotides in the Violaceae to be 5000-25,000, and propose that cyclotides are ubiquitous among all Violaceae species. Twelve new cyclotides from six phylogenetically dispersed genera were sequenced. Furthermore, the first glycosylated derivatives of cyclotides were identified and characterized, further increasing the diversity and complexity of this unique protein family.
Informative priors based on transcription factor structural class improve de novo motif discovery.

PubMed

Narlikar, Leelavati; Gordân, Raluca; Ohler, Uwe; Hartemink, Alexander J

2006-07-15

An important problem in molecular biology is to identify the locations at which a transcription factor (TF) binds to DNA, given a set of DNA sequences believed to be bound by that TF. In previous work, we showed that information in the DNA sequence of a binding site is sufficient to predict the structural class of the TF that binds it. In particular, this suggests that we can predict which locations in any DNA sequence are more likely to be bound by certain classes of TFs than others. Here, we argue that traditional methods for de novo motif finding can be significantly improved by adopting an informative prior probability that a TF binding site occurs at each sequence location. To demonstrate the utility of such an approach, we present priority, a powerful new de novo motif finding algorithm. Using data from TRANSFAC, we train three classifiers to recognize binding sites of basic leucine zipper, forkhead, and basic helix loop helix TFs. These classifiers are used to equip priority with three class-specific priors, in addition to a default prior to handle TFs of other classes. We apply priority and a number of popular motif finding programs to sets of yeast intergenic regions that are reported by ChIP-chip to be bound by particular TFs. priority identifies motifs the other methods fail to identify, and correctly predicts the structural class of the TF recognizing the identified binding sites. Supplementary material and code can be found at http://www.cs.duke.edu/~amink/.
A comprehensive characterization of simple sequence repeats in pepper genomes provides valuable resources for marker development in Capsicum.

PubMed

Cheng, Jiaowen; Zhao, Zicheng; Li, Bo; Qin, Cheng; Wu, Zhiming; Trejo-Saavedra, Diana L; Luo, Xirong; Cui, Junjie; Rivera-Bustamante, Rafael F; Li, Shuaicheng; Hu, Kailin

2016-01-07

The sequences of the full set of pepper genomes including nuclear, mitochondrial and chloroplast are now available for use. However, the overall distribution of simple sequence repeats (SSRs) in these genomes and their practical implications for molecular marker development in Capsicum have not yet been described. Here, an average of 868,047.50, 45.50 and 30.00 SSR loci were identified in the nuclear, mitochondrial and chloroplast genomes of pepper, respectively. Subsequently, systematic comparisons of various species, genome types, motif lengths, repeat numbers and classified types were executed and discussed. In addition, a local database composed of 113,500 in silico unique SSR primer pairs was built using a homemade bioinformatics workflow. As a pilot study, 65 polymorphic markers were validated among a wide collection of 21 Capsicum genotypes with allele number and polymorphic information content value per marker ranging from 2 to 6 and 0.05 to 0.64, respectively. Finally, a comparison of the clustering results with those of a previous study indicated the usability of the newly developed SSR markers. In summary, this first report on the comprehensive characterization of SSR motifs in pepper genomes and the very large set of SSR primer pairs will benefit various genetic studies in Capsicum.
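Identifying SSR loci by motif length and repeat number can be sketched in Python as follows. This is not the authors' workflow, whose tools and thresholds are not given in the abstract. The minimum repeat numbers per motif length are assumed values chosen for illustration, and the function name and test sequence are hypothetical.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem repeats.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (0-based start, repeat unit, repeat count) tuples."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A unit followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are repeats of a shorter unit (e.g. "ATAT" is "AT" twice).
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits)

# Hypothetical sequence: (AT)7 and (AAG)5 embedded in flanking bases.
ssrs = find_ssrs("GGC" + "AT" * 7 + "CCT" + "AAG" * 5 + "T")
```

The call above finds the dinucleotide repeat at index 3 and the trinucleotide repeat at index 20. Different threshold choices change the number of loci reported, so counts are only comparable between studies that use the same settings.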
A comprehensive characterization of simple sequence repeats in pepper genomes provides valuable resources for marker development in Capsicum

PubMed Central

Cheng, Jiaowen; Zhao, Zicheng; Li, Bo; Qin, Cheng; Wu, Zhiming; Trejo-Saavedra, Diana L.; Luo, Xirong; Cui, Junjie; Rivera-Bustamante, Rafael F.; Li, Shuaicheng; Hu, Kailin

2016-01-01

The sequences of the full set of pepper genomes including nuclear, mitochondrial and chloroplast are now available for use. However, the overall distribution of simple sequence repeats (SSRs) in these genomes and their practical implications for molecular marker development in Capsicum have not yet been described. Here, an average of 868,047.50, 45.50 and 30.00 SSR loci were identified in the nuclear, mitochondrial and chloroplast genomes of pepper, respectively. Subsequently, systematic comparisons of various species, genome types, motif lengths, repeat numbers and classified types were executed and discussed. In addition, a local database composed of 113,500 in silico unique SSR primer pairs was built using a homemade bioinformatics workflow. As a pilot study, 65 polymorphic markers were validated among a wide collection of 21 Capsicum genotypes with allele number and polymorphic information content value per marker ranging from 2 to 6 and 0.05 to 0.64, respectively. Finally, a comparison of the clustering results with those of a previous study indicated the usability of the newly developed SSR markers. In summary, this first report on the comprehensive characterization of SSR motifs in pepper genomes and the very large set of SSR primer pairs will benefit various genetic studies in Capsicum. PMID:26739748
QuadBase2: web server for multiplexed guanine quadruplex mining and visualization

PubMed Central

Dhapola, Parashar; Chowdhury, Shantanu

2016-01-01

DNA guanine quadruplexes or G4s are non-canonical DNA secondary structures which affect genomic processes like replication, transcription and recombination. G4s are computationally identified by specific nucleotide motifs which are also called putative G4 (PG4) motifs. Despite the general relevance of these structures, there is currently no tool available that can allow batch queries and genome-wide analysis of these motifs in a user-friendly interface. QuadBase2 (quadbase.igib.res.in) presents a completely reinvented web server version of the previously published QuadBase database. QuadBase2 enables users to mine PG4 motifs in up to 178 eukaryotes through the EuQuad module. This module interfaces with the Ensembl Compara database to allow users to mine PG4 motifs in the orthologues of genes of interest across eukaryotes. PG4 motifs can be mined across genes and their promoter sequences in 1719 prokaryotes through the ProQuad module. This module includes a feature that allows genome-wide mining of PG4 motifs and their visualization as circular histograms. TetraplexFinder, the module for mining PG4 motifs in user-provided sequences, is now capable of handling up to 20 MB of data. QuadBase2 is a comprehensive PG4 motif mining tool that further expands the configurations and algorithms for mining PG4 motifs in a user-friendly way. PMID:27185890
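A PG4 motif search of the kind described can be sketched with a regular expression. The pattern below, four runs of at least three guanines separated by loops of 1 to 7 bases, is a commonly used definition and is an assumption here; the abstract does not state which stem and loop settings QuadBase2 applies. The name `find_pg4` is hypothetical, and only the given strand is scanned.

```python
import re

# Assumed PG4 definition: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (given strand only).
PG4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def find_pg4(seq):
    """Return (start, end, matched_sequence) for non-overlapping PG4 hits."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group(0)) for m in PG4_PATTERN.finditer(seq)]


telomeric = "TTAGGGTTAGGGTTAGGGTTAGGGTT"   # four human telomeric repeats
hits = find_pg4(telomeric)
```

For the telomeric example the single hit spans positions 3 to 24 (end exclusive). Scanning the reverse complement, or equivalently searching for C-runs, would be needed to cover both strands.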
Cancer-related marketing centrality motifs acting as pivot units in the human signaling network and mediating cross-talk between biological pathways.

PubMed

Li, Wan; Chen, Lina; Li, Xia; Jia, Xu; Feng, Chenchen; Zhang, Liangcai; He, Weiming; Lv, Junjie; He, Yuehan; Li, Weiguo; Qu, Xiaoli; Zhou, Yanyan; Shi, Yuchen

2013-12-01

Network motifs in central positions are considered to not only have more in-coming and out-going connections but are also localized in an area where more paths reach the networks. These central motifs have been extensively investigated to determine their consistent functions or associations with specific function categories. However, their functional potentials in the maintenance of cross-talk between different functional communities are unclear. In this paper, we constructed an integrated human signaling network from the Pathway Interaction Database. We identified 39 essential cancer-related motifs in central roles, which we called cancer-related marketing centrality motifs, using combined centrality indices on the system level. Our results demonstrated that these cancer-related marketing centrality motifs were pivotal units in the signaling network, and could mediate cross-talk between 61 biological pathways (25 could be mediated by one motif on average), most of which were cancer-related pathways. Further analysis showed that molecules of most marketing centrality motifs were in the same or adjacent subcellular localizations, such as the motif containing PI3K, PDK1 and AKT1 in the plasma membrane, to mediate signal transduction between 32 cancer-related pathways. Finally, we analyzed the pivotal roles of cancer genes in these marketing centrality motifs in the pathogenesis of cancers, and found that non-cancer genes were potential cancer-related genes.
A novel approach to identifying regulatory motifs in distantly related genomes

PubMed Central

Van Hellemont, Ruth; Monsieurs, Pieter; Thijs, Gert; De Moor, Bart; Van de Peer, Yves; Marchal, Kathleen

2005-01-01

Although proven successful in the identification of regulatory motifs, phylogenetic footprinting methods still show some shortcomings. To assess these difficulties, most apparent when applying phylogenetic footprinting to distantly related organisms, we developed a two-step procedure that combines the advantages of sequence alignment and motif detection approaches. The results on well-studied benchmark datasets indicate that the presented method outperforms other methods when the sequences become either too long or too heterogeneous in size. PMID:16420672
The N-terminal cysteine pair of yeast sulfhydryl oxidase Erv1p is essential for in vivo activity and interacts with the primary redox centre.

PubMed

Hofhaus, Götz; Lee, Jeung-Eun; Tews, Ivo; Rosenberg, Beate; Lisowsky, Thomas

2003-04-01

Yeast Erv1p is a ubiquitous FAD-dependent sulfhydryl oxidase, located in the intermembrane space of mitochondria. The dimeric enzyme is essential for survival of the cell. Besides the redox-active CXXC motif close to the FAD, Erv1p harbours two additional cysteine pairs. Site-directed mutagenesis has identified all three cysteine pairs as essential for normal function. The C-terminal cysteine pair is of structural importance as it contributes to the correct arrangement of the FAD-binding fold. Variations in dimer formation and unique colour changes of mutant proteins argue in favour of an interaction between the N-terminal cysteine pair with the redox centre of the partner monomer.
Origin and diversification of leucine-rich repeat receptor-like protein kinase (LRR-RLK) genes in plants.

PubMed

Liu, Ping-Li; Du, Liang; Huang, Yuan; Gao, Shu-Min; Yu, Meng

2017-02-07

Leucine-rich repeat receptor-like protein kinases (LRR-RLKs) are the largest group of receptor-like kinases in plants and play crucial roles in development and stress responses. The evolutionary relationships among LRR-RLK genes have been investigated in flowering plants; however, no comprehensive studies have been performed for these genes in more ancestral groups. The subfamily classification of LRR-RLK genes in plants, the evolutionary history and driving force for the evolution of each LRR-RLK subfamily remain to be understood. We identified 119 LRR-RLK genes in the Physcomitrella patens moss genome, 67 LRR-RLK genes in the Selaginella moellendorffii lycophyte genome, and no LRR-RLK genes in five green algae genomes. Furthermore, these LRR-RLK sequences, along with previously reported LRR-RLK sequences from Arabidopsis thaliana and Oryza sativa, were subjected to evolutionary analyses. Phylogenetic analyses revealed that plant LRR-RLKs belong to 19 subfamilies, eighteen of which were established in early land plants, and one of which evolved in flowering plants. More importantly, we found that the basic structures of LRR-RLK genes for most subfamilies are established in early land plants and conserved within subfamilies and across different plant lineages, but divergent among subfamilies. In addition, most members of the same subfamily had common protein motif compositions, whereas members of different subfamilies showed variations in protein motif compositions. The unique gene structure and protein motif compositions of each subfamily differentiate the subfamily classifications and, more importantly, provide evidence for functional divergence among LRR-RLK subfamilies. Maximum likelihood analyses showed that some sites within four subfamilies were under positive selection. Much of the diversity of plant LRR-RLK genes was established in early land plants. Positive selection contributed to the evolution of a few LRR-RLK subfamilies.

Nitrogen transporter and assimilation genes exhibit developmental stage-selective expression in maize (Zea mays L.) associated with distinct cis-acting promoter motifs.

PubMed

Liseron-Monfils, Christophe; Bi, Yong-Mei; Downs, Gregory S; Wu, Wenqing; Signorelli, Tara; Lu, Guangwen; Chen, Xi; Bondo, Eddie; Zhu, Tong; Lukens, Lewis N; Colasanti, Joseph; Rothstein, Steven J; Raizada, Manish N

2013-10-01

Nitrogen is considered the most limiting nutrient for maize (Zea mays L.), but there is limited understanding of the regulation of nitrogen-related genes during maize development. An Affymetrix 82K maize array was used to analyze the expression of ≤ 46 unique nitrogen uptake and assimilation probes in 50 maize tissues from seedling emergence to 31 d after pollination. Four nitrogen-related expression clusters were identified in roots and shoots corresponding to, or overlapping, juvenile, adult, and reproductive phases of development. Quantitative real time PCR data was consistent with the existence of these distinct expression clusters. Promoters corresponding to each cluster were screened for over-represented cis-acting elements. The 8-bp distal motif of the Arabidopsis 43-bp nitrogen response element (NRE) was over-represented in nitrogen-related maize gene promoters. This conserved motif, referred to here as NRE43-d8, was previously shown to be critical for nitrate-activated transcription of nitrate reductase (NIA1) and nitrite reductase (NIR1) by the NIN-LIKE PROTEIN 6 (NLP6) in Arabidopsis. Here, NRE43-d8 was over-represented in the promoters of maize nitrate and ammonium transporter genes, specifically those that showed peak expression during early-stage vegetative development. This result predicts an expansion of the NRE-NLP6 regulon and suggests that it may have a developmental component in maize. We also report leaf expression of putative orthologs of nitrite transporters (NiTR1), a transporter not previously reported in maize. We conclude by discussing how each of the four transcriptional modules may be responsible for the different nitrogen uptake and assimilation requirements of leaves and roots at different stages of maize development.
Adaptive Evolution of Eel Fluorescent Proteins from Fatty Acid Binding Proteins Produces Bright Fluorescence in the Marine Environment.

PubMed

Gruber, David F; Gaffney, Jean P; Mehr, Shaadi; DeSalle, Rob; Sparks, John S; Platisa, Jelena; Pieribone, Vincent A

2015-01-01

We report the identification and characterization of two new members of a family of bilirubin-inducible fluorescent proteins (FPs) from marine chlopsid eels and demonstrate a key region of the sequence that serves as an evolutionary switch from non-fluorescent to fluorescent fatty acid-binding proteins (FABPs). Using transcriptomic analysis of two species of brightly fluorescent Kaupichthys eels (Kaupichthys hyoproroides and Kaupichthys n. sp.), two new FPs were identified, cloned and characterized (Chlopsid FP I and Chlopsid FP II). We then performed phylogenetic analysis on 210 FABPs, spanning 16 vertebrate orders, and including 163 vertebrate taxa. We show that the fluorescent FPs diverged as a protein family and are the sister group to brain FABPs. Our results indicate that the evolution of this family involved at least three gene duplication events. We show that fluorescent FABPs possess a unique, conserved tripeptide Gly-Pro-Pro sequence motif, which is not found in non-fluorescent fatty acid binding proteins. This motif arose from a duplication event of the FABP brain isoforms and was under strong purifying selection, leading to the classification of this new FP family. Residues adjacent to the motif are under strong positive selection, suggesting a further refinement of the eel protein's fluorescent properties. We present a phylogenetic reconstruction of this emerging FP family and describe additional fluorescent FABP members from groups of distantly related eels. The elucidation of this class of fish FPs with diverse properties provides new templates for the development of protein-based fluorescent tools. The evolutionary adaptation from fatty acid-binding proteins to fluorescent fatty acid-binding proteins raises intrigue as to the functional role of bright green fluorescence in this cryptic genus of reclusive eels that inhabit a blue, nearly monochromatic, marine environment.
Local complexity predicts global synchronization of hierarchically networked oscillators

NASA Astrophysics Data System (ADS)

Xu, Jin; Park, Dong-Ho; Jo, Junghyo

2017-07-01

We study the global synchronization of hierarchically-organized Stuart-Landau oscillators, where each subsystem consists of three oscillators with activity-dependent couplings. We considered all possible coupling signs between the three oscillators, and found that they can generate different numbers of phase attractors depending on the network motif. Here, the subsystems are coupled through mean activities of total oscillators. Under weak inter-subsystem couplings, we demonstrate that the synchronization between subsystems is highly correlated with the number of attractors in uncoupled subsystems. Among the network motifs, perfect anti-symmetric ones are unique to generate both single and multiple attractors depending on the activities of oscillators. The flexible local complexity can make global synchronization controllable.
Reversible Redox Activity by Ion-pH Dually Modulated Duplex Formation of i-Motif DNA with Complementary G-DNA.

PubMed

Chang, Soyoung; Kilic, Tugba; Lee, Chang Kee; Avci, Huseyin; Bae, Hojae; Oskui, Shirin Mesbah; Jung, Sung Mi; Shin, Su Ryon; Kim, Seon Jeong

2018-04-08

The unique biological features of supramolecular DNA have led to an increasing interest in biomedical applications such as biosensors. We have developed i-motif and G-rich DNA conjugated single-walled carbon nanotube hybrid materials, which show reversible conformational switching upon external stimuli such as pH (5 and 8) and the presence of ions (Li⁺ and K⁺). We observed reversible electrochemical redox activity upon external stimuli in a quick and robust manner. Given the ease and the robustness of this method, we believe that pH- and ion-driven reversible DNA structure transformations will be utilized for future applications for developing novel biosensors.
DNA nanotechnology based on i-motif structures.

PubMed

Dong, Yuanchen; Yang, Zhongqiang; Liu, Dongsheng

2014-06-17

CONSPECTUS: Most biological processes happen at the nanometer scale, and understanding the energy transformations and material transportation mechanisms within living organisms has proved challenging. To better understand the secrets of life, researchers have investigated artificial molecular motors and devices over the past decade because such systems can mimic certain biological processes. DNA nanotechnology based on i-motif structures is one system that has played an important role in these investigations. In this Account, we summarize recent advances in functional DNA nanotechnology based on i-motif structures. The i-motif is a DNA quadruplex that occurs as four stretches of cytosine repeat sequences form C·CH(+) base pairs, and their stabilization requires slightly acidic conditions. This unique property has produced the first DNA molecular motor driven by pH changes. The motor is reliable, and studies show that it is capable of millisecond running speeds, comparable to the speed of natural protein motors. With careful design, the output of these types of motors was combined to bend micrometer-sized cantilevers. Using established DNA nanostructure assembly and functionalization methods, researchers can easily integrate the motor within other DNA assembled structures and functional units, producing DNA molecular devices with new functions such as suprahydrophobic/suprahydrophilic smart surfaces that switch, intelligent nanopores triggered by pH changes, molecular logic gates, and DNA nanosprings. Recently, researchers have produced motors driven by light and electricity, which have allowed DNA motors to be integrated within silicon-based nanodevices. Moreover, some devices based on i-motif structures have proven useful for investigating processes within living cells. The pH-responsiveness of the i-motif structure also provides a way to control the stepwise assembly of DNA nanostructures. 
In addition, because of the stability of the i-motif, this structure can serve as the stem of one-dimensional nanowires, and a four-strand stem can provide a new basis for three-dimensional DNA structures such as pillars. By sacrificing some accuracy in assembly, we used these properties to prepare the first fast-responding pure DNA supramolecular hydrogel. This hydrogel does not swell and cannot encapsulate small molecules. These unique properties could lead to new developments in smart materials based on DNA assembly and support important applications in fields such as tissue engineering. We expect that DNA nanotechnology will continue to develop rapidly. At a fundamental level, further studies should lead to greater understanding of the energy transformation and material transportation mechanisms at the nanometer scale. In terms of applications, we expect that many of these elegant molecular devices will soon be used in vivo. These further studies could demonstrate the power of DNA nanotechnology in biology, material science, chemistry, and physics.
Conserved Tryptophan Motifs in the Large Tegument Protein pUL36 Are Required for Efficient Secondary Envelopment of Herpes Simplex Virus Capsids

PubMed Central

Ivanova, Lyudmila; Buch, Anna; Döhner, Katinka; Pohlmann, Anja; Binz, Anne; Prank, Ute; Sandbaumhüter, Malte

2016-01-01

ABSTRACT Herpes simplex virus (HSV) replicates in the skin and mucous membranes, and initiates lytic or latent infections in sensory neurons. Assembly of progeny virions depends on the essential large tegument protein pUL36 of 3,164 amino acid residues that links the capsids to the tegument proteins pUL37 and VP16. Of the 32 tryptophans of HSV-1-pUL36, the tryptophan-acidic motifs 1766WD1767 and 1862WE1863 are conserved in all HSV-1 and HSV-2 isolates. Here, we characterized the role of these motifs in the HSV life cycle since the rare tryptophans often have unique roles in protein function due to their large hydrophobic surface. The infectivity of the mutants HSV-1(17+)Lox-pUL36-WD/AA-WE/AA and HSV-1(17+)Lox-CheVP26-pUL36-WD/AA-WE/AA, in which the capsid has been tagged with the fluorescent protein Cherry, was significantly reduced. Quantitative electron microscopy shows that there were a larger number of cytosolic capsids and fewer enveloped virions compared to their respective parental strains, indicating a severe impairment in secondary capsid envelopment. The capsids of the mutant viruses accumulated in the perinuclear region around the microtubule-organizing center and were not dispersed to the cell periphery but still acquired the inner tegument proteins pUL36 and pUL37. Furthermore, cytoplasmic capsids colocalized with tegument protein VP16 and, to some extent, with tegument protein VP22 but not with the envelope glycoprotein gD. These results indicate that the unique conserved tryptophan-acidic motifs in the central region of pUL36 are required for efficient targeting of progeny capsids to the membranes of secondary capsid envelopment and for efficient virion assembly. IMPORTANCE Herpesvirus infections give rise to severe animal and human diseases, especially in young, immunocompromised, and elderly individuals. 
The structural hallmark of herpesvirus virions is the tegument, which contains evolutionarily conserved proteins that are essential for several stages of the herpesvirus life cycle. Here we characterized two conserved tryptophan-acidic motifs in the central region of the large tegument protein pUL36 of herpes simplex virus. When we mutated these motifs, secondary envelopment of cytosolic capsids and the production of infectious particles were severely impaired. Our data suggest that pUL36 and its homologs in other herpesviruses, and in particular such tryptophan-acidic motifs, could provide attractive targets for the development of novel drugs to prevent herpesvirus assembly and spread. PMID:27009950
A Dbf4p BRCA1 C-Terminal-Like Domain Required for the Response to Replication Fork Arrest in Budding Yeast

PubMed Central

Gabrielse, Carrie; Miller, Charles T.; McConnell, Kristopher H.; DeWard, Aaron; Fox, Catherine A.; Weinreich, Michael

2006-01-01

Dbf4p is an essential regulatory subunit of the Cdc7p kinase required for the initiation of DNA replication. Cdc7p and Dbf4p orthologs have also been shown to function in the response to DNA damage. A previous Dbf4p multiple sequence alignment identified a conserved ∼40-residue N-terminal region with similarity to the BRCA1 C-terminal (BRCT) motif called “motif N.” BRCT motifs encode ∼100-amino-acid domains involved in the DNA damage response. We have identified an expanded and conserved ∼100-residue N-terminal region of Dbf4p that includes motif N but is capable of encoding a single BRCT-like domain. Dbf4p orthologs diverge from the BRCT motif at the C terminus but may encode a similar secondary structure in this region. We have therefore called this the BRCT and DBF4 similarity (BRDF) motif. The principal role of this Dbf4p motif was in the response to replication fork (RF) arrest; however, it was not required for cell cycle progression, activation of Cdc7p kinase activity, or interaction with the origin recognition complex (ORC) postulated to recruit Cdc7p–Dbf4p to origins. Rad53p likely directly phosphorylated Dbf4p in response to RF arrest and Dbf4p was required for Rad53p abundance. Rad53p and Dbf4p therefore cooperated to coordinate a robust cellular response to RF arrest. PMID:16547092
Structure of Rhodococcus equi virulence-associated protein B (VapB) reveals an eight-stranded antiparallel β-barrel consisting of two Greek-key motifs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Geerds, Christina; Wohlmann, Jens; Haas, Albert

The structure of VapB, a member of the Vap protein family that is involved in virulence of the bacterial pathogen R. equi, was determined by SAD phasing and reveals an eight-stranded antiparallel β-barrel similar to avidin, suggestive of a binding function. Made up of two Greek-key motifs, the topology of VapB is unusual or even unique. Members of the virulence-associated protein (Vap) family from the pathogen Rhodococcus equi regulate virulence in an unknown manner. They do not share recognizable sequence homology with any protein of known structure. VapB and VapA are normally associated with isolates from pigs and horses, respectively. To contribute to a molecular understanding of Vap function, the crystal structure of a protease-resistant VapB fragment was determined at 1.4 Å resolution. The structure was solved by SAD phasing employing the anomalous signal of one endogenous S atom and two bound Co ions with low occupancy. VapB is an eight-stranded antiparallel β-barrel with a single helix. Structural similarity to avidins suggests a potential binding function. Unlike other eight- or ten-stranded β-barrels found in avidins, bacterial outer membrane proteins, fatty-acid-binding proteins and lysozyme inhibitors, Vaps do not have a next-neighbour arrangement but consist of two Greek-key motifs with strand order 41238567, suggesting an unusual or even unique topology.
Structure of Radical-Induced Cell Death1 Hub Domain Reveals a Common αα-Scaffold for Disorder in Transcriptional Networks.

PubMed

Bugge, Katrine; Staby, Lasse; Kemplen, Katherine R; O'Shea, Charlotte; Bendsen, Sidsel K; Jensen, Mikael K; Olsen, Johan G; Skriver, Karen; Kragelund, Birthe B

2018-05-01

Communication within cells relies on a few protein nodes called hubs, which organize vast interactomes with many partners. Frequently, hub proteins are intrinsically disordered conferring multi-specificity and dynamic communication. Conversely, folded hub proteins may organize networks using disordered partners. In this work, the structure of the RST domain, a unique folded hub, is solved by nuclear magnetic resonance spectroscopy and small-angle X-ray scattering, and its complex with a region of the transcription factor DREB2A is provided through data-driven HADDOCK modeling and mutagenesis analysis. The RST fold is unique, but similar structures are identified in the PAH (paired amphipathic helix), TAFH (TATA-box-associated factor homology), and NCBD (nuclear coactivator binding domain) domains. We designate them as a group the αα hubs, as they share an αα-hairpin super-secondary motif, which serves as an organizing platform for malleable helices of varying topology. This allows for partner adaptation, exclusion, and selection. Our findings provide valuable insights into structural features enabling signaling fidelity. Copyright © 2018 Elsevier Ltd. All rights reserved.
PISMA: A Visual Representation of Motif Distribution in DNA Sequences.

PubMed

Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina

2017-01-01

Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypothesis, PISMA aims at identifying motifs on DNA sequences, counting and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code-like, as a gene-map-like, and as a transcript scheme. We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf.
Discovering Motifs in Biological Sequences Using the Micron Automata Processor.

PubMed

Roy, Indranil; Aluru, Srinivas

2016-01-01

Finding approximately conserved sequences, called motifs, across multiple DNA or protein sequences is an important problem in computational biology. In this paper, we consider the (l, d) motif search problem of identifying one or more motifs of length l present in at least q of the n given sequences, with each occurrence differing from the motif in at most d substitutions. The problem is known to be NP-complete, and the largest solved instance reported to date is (26, 11). We propose a novel algorithm for the (l, d) motif search problem using streaming execution over a large set of non-deterministic finite automata (NFA). This solution is designed to take advantage of the Micron Automata Processor, a new technology close to deployment that can simultaneously execute multiple NFA in parallel. We demonstrate the capability for solving much larger instances of the (l, d) motif search problem using the resources available within a single automata processor board, by estimating run-times for problem instances (39, 18) and (40, 17). The paper serves as a useful guide to solving problems using this new accelerator technology.
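The (l, d) motif search problem as stated above can be made concrete with a brute-force sketch. It enumerates every candidate of length l over the DNA alphabet and keeps those occurring, with at most d substitutions, in at least q sequences. This is exponential in l and is only meant to pin down the problem definition; it is unrelated to the NFA-based algorithm the paper proposes. Function names and the toy sequences are assumptions.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs(motif, seq, d):
    """True if some window of seq is within d substitutions of motif."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))


def ld_motifs(seqs, l, d, q):
    """All length-l motifs present (up to d substitutions) in at least q sequences."""
    found = []
    for candidate in product("ACGT", repeat=l):   # 4**l candidates
        motif = "".join(candidate)
        if sum(occurs(motif, s, d) for s in seqs) >= q:
            found.append(motif)
    return found


# Toy input: ACGTA planted exactly once and twice with one substitution.
seqs = ["TTACGTATT", "GGACCTAGG", "CCAGGTACC"]
motifs = ld_motifs(seqs, l=5, d=1, q=3)
```

Here the planted motif ACGTA is recovered for (l, d) = (5, 1) with q = 3, while no 5-mer is shared exactly (d = 0) by all three sequences. The 4**l enumeration is what makes instances such as (26, 11) hard and motivates specialised hardware.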
PISMA: A Visual Representation of Motif Distribution in DNA Sequences

PubMed Central

Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina

2017-01-01

Background: Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypothesis, PISMA aims at identifying motifs on DNA sequences, counting and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code–like, as a gene-map–like, and as a transcript scheme. Results: We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. Availability and Implementation: PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf. PMID:28469418
Identification of a novel mitotic phosphorylation motif associated with protein localization to the mitotic apparatus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, Feng; Camp, David G.; Gritsenko, Marina A.

2007-11-16

The chromosomal passenger complex (CPC) is a critical regulator of chromosome, cytoskeleton and membrane dynamics during mitosis. Here, we identified phosphopeptides and phosphoprotein complexes recognized by a phosphorylation specific antibody that labels the CPC using liquid chromatography coupled to mass spectrometry. A mitotic phosphorylation motif (PX{G/T/S}{L/M}[pS]P or WGL[pS]P) was identified in 11 proteins including Fzr/Cdh1 and RIC-8, two proteins with potential links to the CPC. Phosphoprotein complexes contained known CPC components INCENP, Aurora-B and TD-60, as well as SMAD2, 14-3-3 proteins, PP2A, and Cdk1, a likely kinase for this motif. Protein sequence analysis identified phosphorylation motifs in additional proteins including SMAD2, Plk3 and INCENP. Mitotic SMAD2 and Plk3 phosphorylation was confirmed using phosphorylation specific antibodies, and in the case of Plk3, phosphorylation correlates with its localization to the mitotic apparatus. A mutagenesis approach was used to show INCENP phosphorylation is required for midbody localization. These results provide evidence for a shared phosphorylation event that regulates localization of critical proteins during mitosis.
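The reported motif, PX{G/T/S}{L/M}[pS]P or WGL[pS]P, can be turned into a sequence scan for candidate sites. The sketch below assumes that X stands for any residue, that braces list alternatives, and that the phosphoserine appears as a plain S in an unmodified protein sequence. The function name and the test peptide are invented for illustration and do not correspond to any protein named in the abstract.

```python
import re

# Lookahead so that overlapping candidates are all reported.
# Assumed reading of the motif: P-x-[G/T/S]-[L/M]-S-P or W-G-L-S-P.
MITOTIC_MOTIF = re.compile(r"(?=(P[A-Z][GTS][LM]SP|WGLSP))")


def candidate_phosphosites(protein):
    """Return (serine_position_1_based, matched_motif) for each candidate site."""
    sites = []
    for m in MITOTIC_MOTIF.finditer(protein.upper()):
        hit = m.group(1)
        # In both motif forms the serine is the second-to-last residue.
        serine_index = m.start() + len(hit) - 2
        sites.append((serine_index + 1, hit))
    return sites


toy_peptide = "MKPAGLSPQRWGLSPEE"   # made-up sequence, not a real protein
sites = candidate_phosphosites(toy_peptide)
```

On the toy peptide the scan flags Ser7 (in PAGLSP) and Ser14 (in WGLSP). A sequence match only nominates a site; whether it is phosphorylated in mitosis still requires the kind of experimental confirmation the abstract describes.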
Redemptive Journey: The Storytelling Motif in Andersen's "The Snow Queen."

ERIC Educational Resources Information Center

Misheff, Sue

1989-01-01

Discusses how Hans Christian Andersen's "The Snow Queen" uses the motif of storytelling to describe the journey taken by the heroine Gerda. Identifies a story as that which is alive and active and which causes catharsis for those who participate in it. (MG)
iFORM: Incorporating Find Occurrence of Regulatory Motifs.

PubMed

Ren, Chao; Chen, Hebing; Yang, Bite; Liu, Feng; Ouyang, Zhangyi; Bo, Xiaochen; Shu, Wenjie

2016-01-01

Accurately identifying the binding sites of transcription factors (TFs) is crucial to understanding the mechanisms of transcriptional regulation and human disease. We present incorporating Find Occurrence of Regulatory Motifs (iFORM), an easy-to-use and efficient tool for scanning DNA sequences with TF motifs described as position weight matrices (PWMs). Both performance assessment with a receiver operating characteristic (ROC) curve and a correlation-based approach demonstrated that iFORM achieves higher accuracy and sensitivity by integrating five classical motif discovery programs using Fisher's combined probability test. We have used iFORM to provide accurate results on a variety of data in the ENCODE Project and the NIH Roadmap Epigenomics Project, and the tool has demonstrated its utility in further elucidating individual roles of functional elements. Both the source and binary codes for iFORM can be freely accessed at https://github.com/wenjiegroup/iFORM. The identified TF binding sites across human cell and tissue types using iFORM have been deposited in the Gene Expression Omnibus under the accession ID GSE53962.
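Fisher's combined probability test, which the abstract names as the way iFORM integrates its five component programs, combines k independent p-values into the statistic X² = −2 Σ ln pᵢ, which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The sketch below implements only that textbook formula with the standard library; how iFORM obtains per-program p-values is not described here, so the input values are placeholders.

```python
import math


def fisher_combined(pvalues):
    """Return (statistic, combined_p) for independent p-values in (0, 1]."""
    k = len(pvalues)
    statistic = -2.0 * sum(math.log(p) for p in pvalues)
    half = statistic / 2.0
    # For an even number of degrees of freedom (2k) the chi-square survival
    # function has the closed form exp(-x/2) * sum_{i<k} (x/2)**i / i!.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Placeholder p-values standing in for five motif scanners at one site.
stat, combined_p = fisher_combined([0.04, 0.10, 0.02, 0.30, 0.07])
```

The test assumes the individual p-values are independent; scanners that share a scoring model may violate this, in which case the combined p-value is too optimistic. A p-value of exactly 0 is not handled by this sketch.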
Identification of the WBSCR9 gene, encoding a novel transcriptional regulator, in the Williams-Beuren syndrome deletion at 7q11.23.

PubMed

Peoples, R J; Cisco, M J; Kaplan, P; Francke, U

1998-01-01

We have identified a novel gene (WBSCR9) within the common Williams-Beuren syndrome (WBS) deletion by interspecies sequence conservation. The WBSCR9 gene encodes a roughly 7-kb transcript with an open reading frame of 1483 amino acids and a predicted protein product size of 170.8 kDa. WBSCR9 is comprised of at least 20 exons extending over 60 kb. The transcript is expressed ubiquitously throughout development and is subject to alternative splicing. Functional motifs identified by sequence homology searches include a bromodomain; a PHD, or C4HC3, finger; several putative nuclear localization signals; four nuclear receptor binding motifs; a polyglutamate stretch and two PEST sequences. Bromodomains, PHD motifs and nuclear receptor binding motifs are cardinal features of proteins that are involved in chromatin remodeling and modulation of transcription. Haploinsufficiency for WBSCR9 gene products may contribute to the complex phenotype of WBS by interacting with tissue-specific regulatory factors during development.
Curvulamine, a new antibacterial alkaloid incorporating two undescribed units from a Curvularia species.

PubMed

Han, Wen Bo; Lu, Yan Hua; Zhang, Ai Hua; Zhang, Gao Fei; Mei, Ya Ning; Jiang, Nan; Lei, Xinxiang; Song, Yong Chun; Ng, Seik Weng; Tan, Ren Xiang

2014-10-17

The white croaker (Argyrosomus argentatus) derived Curvularia sp. IFB-Z10 produces curvulamine as a skeletally unprecedented alkaloid incorporating two undescribed extender units. Curvulamine is more selectively antibacterial than tinidazole and biosynthetically unique in the new extenders formed through a decarboxylative condensation between an oligoketide motif and alanine.
Insights on genome size evolution from a miniature inverted repeat transposon driving a satellite DNA.

PubMed

Scalvenzi, Thibault; Pollet, Nicolas

2014-12-01

The genome size in eukaryotes does not correlate well with the number of genes they contain. We can observe this so-called C-value paradox in amphibian species. By analyzing an amphibian genome we asked how repetitive DNA can impact genome size and architecture. We describe here our discovery of a Tc1/mariner miniature inverted-repeat transposon family present in Xenopus frogs. These transposons named miDNA4 are unique since they contain a satellite DNA motif. We found that miDNA4 measured 331 bp, contained 25 bp long inverted terminal repeat sequences and a sequence motif of 119 bp present as a unique copy or as an array of 2-47 copies. We characterized the structure, dynamics, impact and evolution of the miDNA4 family and its satellite DNA in Xenopus frog genomes. This led us to propose a model for the evolution of these two repeated sequences and how they can synergize to increase genome size. Copyright © 2014 Elsevier Inc. All rights reserved.
Transcriptome analysis in tardigrade species reveals specific molecular pathways for stress adaptations.

PubMed

Förster, Frank; Beisser, Daniela; Grohme, Markus A; Liang, Chunguang; Mali, Brahim; Siegl, Alexander Matthias; Engelmann, Julia C; Shkumatov, Alexander V; Schokraie, Elham; Müller, Tobias; Schnölzer, Martina; Schill, Ralph O; Frohme, Marcus; Dandekar, Thomas

2012-01-01

Tardigrades have unique stress-adaptations that allow them to survive extremes of cold, heat, radiation and vacuum. To study this, encoded protein clusters and pathways from an ongoing transcriptome study on the tardigrade Milnesium tardigradum were analyzed using bioinformatics tools and compared to expressed sequence tags (ESTs) from Hypsibius dujardini, revealing major pathways involved in resistance against extreme environmental conditions. ESTs are available on the Tardigrade Workbench along with software and databank updates. Our analysis reveals that RNA stability motifs for M. tardigradum are different from typical motifs known from higher animals. M. tardigradum and H. dujardini protein clusters and conserved domains imply metabolic storage pathways for glycogen, glycolipids and specific secondary metabolism as well as stress response pathways (including heat shock proteins, bmh2, and specific repair pathways). Redox-, DNA-, stress- and protein protection pathways complement specific repair capabilities to achieve the strong robustness of M. tardigradum. These pathways are partly conserved in other animals and their manipulation could boost stress adaptation even in human cells. However, it is the unique combination of resistance and repair pathways that makes tardigrades, and M. tardigradum in particular, so highly stress resistant.
A proximity-based graph clustering method for the identification and application of transcription factor clusters.

PubMed

Spadafore, Maxwell; Najarian, Kayvan; Boyle, Alan P

2017-11-29

Transcription factors (TFs) form a complex regulatory network within the cell that is crucial to cell functioning and human health. While methods to establish where a TF binds to DNA are well established, these methods provide no information describing how TFs interact with one another when they do bind. TFs tend to bind the genome in clusters, and current methods to identify these clusters are either limited in scope, unable to detect relationships beyond motif similarity, or not applied to TF-TF interactions. Here, we present a proximity-based graph clustering approach to identify TF clusters using either ChIP-seq or motif search data. We use TF co-occurrence to construct a filtered, normalized adjacency matrix and use the Markov Clustering Algorithm to partition the graph while maintaining TF-cluster and cluster-cluster interactions. We then apply our graph structure beyond clustering, using it to increase the accuracy of motif-based TFBS searching for an example TF. We show that our method produces small, manageable clusters that encapsulate many known, experimentally validated transcription factor interactions and that our method is capable of capturing interactions that motif similarity methods might miss. Our graph structure is able to significantly increase the accuracy of motif TFBS searching, demonstrating that the TF-TF connections within the graph correlate with biological TF-TF interactions. The interactions identified by our method correspond to biological reality and allow for fast exploration of TF clustering and regulatory dynamics.
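The Markov Clustering Algorithm named in this abstract alternates two matrix operations on a column-stochastic matrix: expansion (matrix squaring, which spreads flow along paths) and inflation (element-wise powering followed by column renormalization, which strengthens strong flows). The pure-Python sketch below illustrates that generic procedure on a hypothetical toy graph. The self-loops, inflation value of 2.0, and cluster read-out rule are common conventions assumed here, not details taken from the paper, whose filtering and normalization of the co-occurrence matrix are not reproduced.

```python
def normalize_columns(matrix):
    """Scale every column so that it sums to 1 (columns of zeros stay zero)."""
    n = len(matrix)
    sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    return [[matrix[r][c] / sums[c] if sums[c] else 0.0 for c in range(n)]
            for r in range(n)]


def expand(matrix):
    """Expansion step: square the matrix."""
    n = len(matrix)
    return [[sum(matrix[r][k] * matrix[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def inflate(matrix, power):
    """Inflation step: raise entries to a power, then renormalize columns."""
    return normalize_columns([[value ** power for value in row] for row in matrix])


def markov_cluster(adjacency, inflation=2.0, max_iterations=50, tolerance=1e-9):
    """Cluster an undirected graph given as a square adjacency matrix."""
    n = len(adjacency)
    # Add self-loops (a common convention) and make the matrix column-stochastic.
    matrix = normalize_columns(
        [[float(adjacency[r][c]) + (1.0 if r == c else 0.0) for c in range(n)]
         for r in range(n)])
    for _ in range(max_iterations):
        updated = inflate(expand(matrix), inflation)
        change = max(abs(updated[r][c] - matrix[r][c])
                     for r in range(n) for c in range(n))
        matrix = updated
        if change < tolerance:
            break
    # Read-out: the nonzero entries of each nonempty row form one cluster.
    clusters = set()
    for r in range(n):
        members = frozenset(c for c in range(n) if matrix[r][c] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(cluster) for cluster in clusters)


if __name__ == "__main__":
    # Hypothetical co-occurrence graph: two disconnected triangles of TFs.
    toy = [
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ]
    print(markov_cluster(toy))
```
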

Feature extraction using gray-level co-occurrence matrix of wavelet coefficients and texture matching for batik motif recognition

NASA Astrophysics Data System (ADS)

Suciati, Nanik; Herumurti, Darlis; Wijaya, Arya Yudhi

2017-02-01

Batik is one of Indonesia's traditional cloths. Each motif, or pattern, drawn on a piece of batik fabric has a specific name and philosophy. Although batik cloths are widely used in everyday life, only a few people understand their motifs and philosophy. This research is intended to develop a batik motif recognition system that can automatically identify the motif in a batik image. First, a batik image is decomposed into sub-images using a wavelet transform. Six texture descriptors, i.e. max probability, correlation, contrast, uniformity, homogeneity and entropy, are extracted from the gray-level co-occurrence matrix of each sub-image. The texture features are then matched to the template features using the Canberra distance. The experiment is performed on the Batik Dataset, consisting of 1088 batik images grouped into seven motifs. The best recognition rate, 92.1%, is achieved using a feature extraction process with 5-level wavelet decomposition and a 4-directional gray-level co-occurrence matrix.
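The matching step described above can be sketched in a few lines. The Canberra distance between two feature vectors is the sum of |a_i - b_i| / (|a_i| + |b_i|), with terms whose denominator is zero counted as zero. The template names and feature values below are hypothetical placeholders, and the wavelet decomposition and co-occurrence feature extraction are assumed to have been done already.

```python
def canberra_distance(a, b):
    """Canberra distance; terms with a zero denominator contribute nothing."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    total = 0.0
    for x, y in zip(a, b):
        denominator = abs(x) + abs(y)
        if denominator:
            total += abs(x - y) / denominator
    return total


def nearest_template(features, templates):
    """Return the name of the template with the smallest Canberra distance."""
    return min(templates, key=lambda name: canberra_distance(features, templates[name]))


if __name__ == "__main__":
    # Hypothetical texture-feature templates for two motif classes.
    templates = {
        "motif_A": [0.20, 0.80, 1.50],
        "motif_B": [0.60, 0.10, 3.00],
    }
    print(nearest_template([0.25, 0.70, 1.40], templates))
```

Because each term is normalized by the magnitude of the two values, the Canberra distance weights small-valued features as heavily as large-valued ones, which may be one reason it suits texture descriptors on very different scales.
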
Mapping and analysis of Caenorhabditis elegans transcription factor sequence specificities

PubMed Central

Narasimhan, Kamesh; Lambert, Samuel A; Yang, Ally WH; Riddell, Jeremy; Mnaimneh, Sanie; Zheng, Hong; Albu, Mihai; Najafabadi, Hamed S; Reece-Hoyes, John S; Fuxman Bass, Juan I; Walhout, Albertha JM; Weirauch, Matthew T; Hughes, Timothy R

2015-01-01

Caenorhabditis elegans is a powerful model for studying gene regulation, as it has a compact genome and a wealth of genomic tools. However, identification of regulatory elements has been limited, as DNA-binding motifs are known for only 71 of the estimated 763 sequence-specific transcription factors (TFs). To address this problem, we performed protein binding microarray experiments on representatives of canonical TF families in C. elegans, obtaining motifs for 129 TFs. Additionally, we predict motifs for many TFs that have DNA-binding domains similar to those already characterized, increasing coverage of binding specificities to 292 C. elegans TFs (∼40%). These data highlight the diversification of binding motifs for the nuclear hormone receptor and C2H2 zinc finger families and reveal unexpected diversity of motifs for T-box and DM families. Motif enrichment in promoters of functionally related genes is consistent with known biology and also identifies putative regulatory roles for unstudied TFs. DOI: http://dx.doi.org/10.7554/eLife.06967.001 PMID:25905672
Sequence analyses of fimbriae subunit FimA proteins on Actinomyces naeslundii genospecies 1 and 2 and Actinomyces odontolyticus with variant carbohydrate binding specificities

PubMed Central

Drobni, Mirva; Hallberg, Kristina; Öhman, Ulla; Birve, Anna; Persson, Karina; Johansson, Ingegerd; Strömberg, Nicklas

2006-01-01

Background: Actinomyces naeslundii genospecies 1 and 2 express type-2 fimbriae (FimA subunit polymers) with variant Galβ binding specificities and Actinomyces odontolyticus a sialic acid specificity to colonize different oral surfaces. However, the fimbrial nature of the sialic acid binding property and sequence information about FimA proteins from multiple strains are lacking. Results: Here we have sequenced fimA genes from strains of A. naeslundii genospecies 1 (n = 4) and genospecies 2 (n = 4), both of which harboured variant Galβ-dependent hemagglutination (HA) types, and from A. odontolyticus PK984 with a sialic acid-dependent HA pattern. Three unique subtypes of FimA proteins with 63.8–66.4% sequence identity were present in strains of A. naeslundii genospecies 1 and 2 and A. odontolyticus. The generally high FimA sequence identity (>97.2%) within a genospecies revealed species specific sequences or segments that coincided with binding specificity. All three FimA protein variants contained a signal peptide, pilin motif, E box, proline-rich segment and an LPXTG sorting motif among other conserved segments for secretion, assembly and sorting of fimbrial proteins. The highly conserved pilin, E box and LPXTG motifs are present in fimbriae proteins from other Gram-positive bacteria. Moreover, only strains of genospecies 1 were agglutinated with type-2 fimbriae antisera derived from A. naeslundii genospecies 1 strain 12104, emphasizing that the overall folding of FimA may generate different functionalities. Western blot analyses with FimA antisera revealed monomers and oligomers of FimA in whole cell protein extracts and a purified recombinant FimA preparation, indicating a sortase-independent oligomerization of FimA. Conclusion: The genus Actinomyces involves a diversity of unique FimA proteins with conserved pilin, E box and LPXTG motifs, depending on subspecies and associated binding specificity. In addition, a sortase-independent oligomerization of FimA subunit proteins in solution was indicated. PMID:16686953
Automatic annotation of protein motif function with Gene Ontology terms.

PubMed

Lu, Xinghua; Zhai, Chengxiang; Gopalakrishnan, Vanathi; Buchanan, Bruce G

2004-09-02

Conserved protein sequence motifs are short stretches of amino acid sequence patterns that potentially encode the function of proteins. Several sequence pattern searching algorithms and programs exist for identifying candidate protein motifs at the whole genome level. However, a much needed and important task is to determine the functions of the newly identified protein motifs. The Gene Ontology (GO) project is an endeavor to annotate the function of genes or protein sequences with terms from a dynamic, controlled vocabulary and these annotations serve well as a knowledge base. This paper presents methods to mine the GO knowledge base and use the association between the GO terms assigned to a sequence and the motifs matched by the same sequence as evidence for predicting the functions of novel protein motifs automatically. The task of assigning GO terms to protein motifs is viewed as both a binary classification and information retrieval problem, where PROSITE motifs are used as samples for model training and functional prediction. The mutual information of a motif and a GO term association is found to be a very useful feature. We take advantage of the known motifs to train a logistic regression classifier, which allows us to combine mutual information with other frequency-based features and obtain a probability of correct association. The trained logistic regression model has intuitively meaningful and logically plausible parameter values, and performs very well empirically according to our evaluation criteria. In this research, different methods for automatic annotation of protein motifs have been investigated. Empirical results demonstrated that the methods have great potential for detecting and augmenting information about the functions of newly discovered candidate protein motifs.
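To make the mutual-information feature concrete, the sketch below computes the standard mutual information (in bits) between two binary indicators over a set of sequences: whether a sequence matches a given motif, and whether it is annotated with a given GO term. The abstract does not give the authors' exact formulation, so this is the textbook definition under that assumption, and the counts in the example are invented.

```python
import math


def mutual_information(n11, n10, n01, n00):
    """Mutual information (bits) from a 2x2 contingency table.

    n11: sequences matching the motif and annotated with the GO term
    n10: matching the motif, not annotated with the term
    n01: not matching the motif, annotated with the term
    n00: neither
    """
    total = n11 + n10 + n01 + n00
    if total == 0:
        raise ValueError("contingency table is empty")

    joint = {
        (1, 1): n11 / total, (1, 0): n10 / total,
        (0, 1): n01 / total, (0, 0): n00 / total,
    }
    p_motif = {1: (n11 + n10) / total, 0: (n01 + n00) / total}
    p_term = {1: (n11 + n01) / total, 0: (n10 + n00) / total}

    mi = 0.0
    for (m, g), p in joint.items():
        if p > 0:  # zero cells contribute nothing (limit of p*log p is 0)
            mi += p * math.log2(p / (p_motif[m] * p_term[g]))
    return mi


if __name__ == "__main__":
    # Invented counts: the motif and the GO term co-occur more than by chance.
    print(round(mutual_information(40, 10, 5, 945), 4))
```

A value near zero means motif matches carry no information about the annotation, while the maximum of 1 bit is reached when the two indicators coincide perfectly and each is true for half of the sequences.
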
Simultaneously learning DNA motif along with its position and sequence rank preferences through expectation maximization algorithm.

PubMed

Zhang, ZhiZhuo; Chang, Cheng Wei; Hugo, Willy; Cheung, Edwin; Sung, Wing-Kin

2013-03-01

Although de novo motifs can be discovered through mining over-represented sequence patterns, this approach misses some real motifs and generates many false positives. To improve accuracy, one solution is to consider some additional binding features (i.e., position preference and sequence rank preference). This information is usually required from the user. This article presents a de novo motif discovery algorithm called SEME (sampling with expectation maximization for motif elicitation), which uses a pure probabilistic mixture model to model the motif's binding features and uses expectation maximization (EM) algorithms to simultaneously learn the sequence motif, position, and sequence rank preferences without asking for any prior knowledge from the user. SEME is both efficient and accurate thanks to two important techniques: the variable motif length extension and importance sampling. Using 75 large-scale synthetic datasets, 32 metazoan compendium benchmark datasets, and 164 chromatin immunoprecipitation sequencing (ChIP-Seq) libraries, we demonstrated the superior performance of SEME over existing programs in finding transcription factor (TF) binding sites. SEME is further applied to a more difficult problem of finding the co-regulated TF (coTF) motifs in 15 ChIP-Seq libraries. It identified significantly more correct coTF motifs and, at the same time, predicted coTF motifs with better matching to the known motifs. Finally, we show that the learned position and sequence rank preferences of each coTF reveal potential interaction mechanisms between the primary TF and the coTF within these sites. Some of these findings were further validated by the ChIP-Seq experiments of the coTFs. The application is available online.
Optimized deep-targeted proteotranscriptomic profiling reveals unexplored Conus toxin diversity and novel cysteine frameworks

PubMed Central

Lavergne, Vincent; Harliwong, Ivon; Jones, Alun; Miller, David; Taft, Ryan J.; Alewood, Paul F.

2015-01-01

Cone snails are predatory marine gastropods characterized by a sophisticated venom apparatus responsible for the biosynthesis and delivery of complex mixtures of cysteine-rich toxin peptides. These conotoxins fold into small highly structured frameworks, allowing them to potently and selectively interact with heterologous ion channels and receptors. Approximately 2,000 toxins from an estimated number of >70,000 bioactive peptides have been identified in the genus Conus to date. Here, we describe a high-resolution interrogation of the transcriptomes (available at www.ddbj.nig.ac.jp) and proteomes of the diverse compartments of the Conus episcopatus venom apparatus. Using biochemical and bioinformatic tools, we found the highest number of conopeptides yet discovered in a single Conus specimen, with 3,305 novel precursor toxin sequences classified into 9 known superfamilies (A, I1, I2, M, O1, O2, S, T, Z), and identified 16 new superfamilies showing unique signal peptide signatures. We were also able to depict the largest population of venom peptides containing the pharmacologically active C-C-CC-C-C inhibitor cystine knot and CC-C-C motifs (168 and 44 toxins, respectively), as well as 208 new conotoxins displaying odd numbers of cysteine residues derived from known conotoxin motifs. Importantly, six novel cysteine-rich frameworks were revealed which may have novel pharmacology. Finally, analyses of codon usage bias and RNA-editing processes of the conotoxin transcripts demonstrate a specific conservation of the cysteine skeleton at the nucleic acid level and provide new insights about the origin of sequence hypervariability in mature toxin regions. PMID:26150494
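The framework notation used in this abstract (for example C-C-CC-C-C) can be derived mechanically from a peptide sequence: runs of adjacent cysteines are written together, and runs separated by other residues are joined with hyphens. The sketch below illustrates that convention; the function names and example peptides are made up for demonstration and are not sequences reported by the study.

```python
import re


def cysteine_framework(mature_peptide):
    """Return the cysteine framework string, e.g. 'C-C-CC-C-C'."""
    runs = re.findall(r"C+", mature_peptide.upper())
    return "-".join(runs)


def has_odd_cysteine_count(mature_peptide):
    """True if the peptide contains an odd number of cysteine residues."""
    return mature_peptide.upper().count("C") % 2 == 1


if __name__ == "__main__":
    # Made-up peptides chosen to show the two frameworks named in the abstract.
    for peptide in ("GCKSPGSSCSPTSYNCCRSCNPYTKRCY", "CCNPACGRHYSC"):
        print(peptide, cysteine_framework(peptide), has_odd_cysteine_count(peptide))
```
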
Host adaptation of Chlamydia pecorum towards low virulence evident in co-evolution of the ompA, incA, and ORF663 Loci.

PubMed

Mohamad, Khalil Yousef; Kaltenboeck, Bernhard; Rahman, Kh Shamsur; Magnino, Simone; Sachse, Konrad; Rodolakis, Annie

2014-01-01

Chlamydia (C.) pecorum, an obligate intracellular bacterium, may cause severe diseases in ruminants, swine and koalas, although asymptomatic infections are the norm. Recently, we identified genetic polymorphisms in the ompA, incA and ORF663 genes that potentially differentiate between high-virulence C. pecorum isolates from diseased animals and low-virulence isolates from asymptomatic animals. Here, we expand these findings by including additional ruminant, swine, and koala strains. Coding tandem repeats (CTRs) at the incA locus encoded a variable number of repeats of APA or AGA amino acid motifs. Addition of any non-APA/AGA repeat motif, such as APEVPA, APAVPA, APE, or APAPE, associated with low virulence (P < 10^-4), as did a high number of amino acids in all incA CTRs (P = 0.0028). In ORF663, high numbers of 15-mer CTRs correlated with low virulence (P = 0.0001). Correction for ompA phylogram position in ORF663 and incA abolished the correlation between genetic changes and virulence, demonstrating co-evolution of ompA, incA, and ORF663 towards low virulence. Pairwise divergence of ompA, incA, and ORF663 among isolates from healthy animals was significantly higher than among strains isolated from diseased animals (P ≤ 10^-5), confirming the longer evolutionary path traversed by low-virulence strains. All three markers combined identified 43 unique strains and 4 pairs of identical strains among all 57 isolates tested, demonstrating the suitability of these markers for epidemiological investigations.
Multiple Dileucine-like Motifs Direct VGLUT1 Trafficking

PubMed Central

Foss, Sarah M.; Li, Haiyan; Santos, Magda S.; Edwards, Robert H.

2013-01-01

The vesicular glutamate transporters (VGLUTs) package glutamate into synaptic vesicles, and the two principal isoforms VGLUT1 and VGLUT2 have been suggested to influence the properties of release. To understand how a VGLUT isoform might influence transmitter release, we have studied their trafficking and previously identified a dileucine-like endocytic motif in the C terminus of VGLUT1. Disruption of this motif impairs the activity-dependent recycling of VGLUT1, but does not eliminate its endocytosis. We now report the identification of two additional dileucine-like motifs in the N terminus of VGLUT1 that are not well conserved in the other isoforms. In the absence of all three motifs, rat VGLUT1 shows limited accumulation at synaptic sites and no longer responds to stimulation. In addition, shRNA-mediated knockdown of clathrin adaptor proteins AP-1 and AP-2 shows that the C-terminal motif acts largely via AP-2, whereas the N-terminal motifs use AP-1. Without the C-terminal motif, knockdown of AP-1 reduces the proportion of VGLUT1 that responds to stimulation. VGLUT1 thus contains multiple sorting signals that engage distinct trafficking mechanisms. In contrast to VGLUT1, the trafficking of VGLUT2 depends almost entirely on the conserved C-terminal dileucine-like motif: without this motif, a substantial fraction of VGLUT2 redistributes to the plasma membrane and the transporter's synaptic localization is disrupted. Consistent with these differences in trafficking signals, wild-type VGLUT1 and VGLUT2 differ in their response to stimulation. PMID:23804088
Multiple dileucine-like motifs direct VGLUT1 trafficking.

PubMed

Foss, Sarah M; Li, Haiyan; Santos, Magda S; Edwards, Robert H; Voglmaier, Susan M

2013-06-26

The vesicular glutamate transporters (VGLUTs) package glutamate into synaptic vesicles, and the two principal isoforms VGLUT1 and VGLUT2 have been suggested to influence the properties of release. To understand how a VGLUT isoform might influence transmitter release, we have studied their trafficking and previously identified a dileucine-like endocytic motif in the C terminus of VGLUT1. Disruption of this motif impairs the activity-dependent recycling of VGLUT1, but does not eliminate its endocytosis. We now report the identification of two additional dileucine-like motifs in the N terminus of VGLUT1 that are not well conserved in the other isoforms. In the absence of all three motifs, rat VGLUT1 shows limited accumulation at synaptic sites and no longer responds to stimulation. In addition, shRNA-mediated knockdown of clathrin adaptor proteins AP-1 and AP-2 shows that the C-terminal motif acts largely via AP-2, whereas the N-terminal motifs use AP-1. Without the C-terminal motif, knockdown of AP-1 reduces the proportion of VGLUT1 that responds to stimulation. VGLUT1 thus contains multiple sorting signals that engage distinct trafficking mechanisms. In contrast to VGLUT1, the trafficking of VGLUT2 depends almost entirely on the conserved C-terminal dileucine-like motif: without this motif, a substantial fraction of VGLUT2 redistributes to the plasma membrane and the transporter's synaptic localization is disrupted. Consistent with these differences in trafficking signals, wild-type VGLUT1 and VGLUT2 differ in their response to stimulation.
Comprehensive analysis of single molecule sequencing-derived complete genome and whole transcriptome of Hyposidra talaca nuclear polyhedrosis virus.

PubMed

Nguyen, Thong T; Suryamohan, Kushal; Kuriakose, Boney; Janakiraman, Vasantharajan; Reichelt, Mike; Chaudhuri, Subhra; Guillory, Joseph; Divakaran, Neethu; Rabins, P E; Goel, Ridhi; Deka, Bhabesh; Sarkar, Suman; Ekka, Preety; Tsai, Yu-Chih; Vargas, Derek; Santhosh, Sam; Mohan, Sangeetha; Chin, Chen-Shan; Korlach, Jonas; Thomas, George; Babu, Azariah; Seshagiri, Somasekar

2018-06-12

We sequenced the Hyposidra talaca NPV (HytaNPV) double stranded circular DNA genome using PacBio single molecule sequencing technology. We found that the HytaNPV genome is 139,089 bp long with a GC content of 39.6%. It encodes 141 open reading frames (ORFs) including the 37 baculovirus core genes, 25 genes conserved among lepidopteran baculoviruses, 72 genes known in baculovirus, and 7 genes unique to the HytaNPV genome. It is a group II alphabaculovirus that codes for the F protein and lacks the gp64 gene found in group I alphabaculoviruses. Using RNA-seq, we confirmed the expression of the ORFs identified in the HytaNPV genome. Phylogenetic analysis showed HytaNPV to be closest to BusuNPV, SujuNPV and EcobNPV that infect other tea pests, Buzura suppressaria, Sucra jujuba, and Ectropis oblique, respectively. We identified repeat elements and a conserved non-coding baculovirus element in the genome. Analysis of the putative promoter sequences identified motifs consistent with the temporal expression of the genes observed in the RNA-seq data.
Function of Apollo (SNM1B) at telomere highlighted by a splice variant identified in a patient with Hoyeraal–Hreidarsson syndrome

PubMed Central

Touzot, Fabien; Callebaut, Isabelle; Soulier, Jean; Gaillard, Laetitia; Azerrad, Chantal; Durandy, Anne; Fischer, Alain; de Villartay, Jean-Pierre; Revy, Patrick

2010-01-01

Telomeres, the protein–DNA complexes at the ends of linear chromosomes, are protected and regulated by the shelterin molecules, the telomerase complex, and other accessory factors, among which is Apollo, a DNA repair factor of the β-lactamase/β-CASP family. Impaired telomere protection in humans causes dyskeratosis congenita and Hoyeraal–Hreidarsson (HH) syndrome, characterized by premature aging, bone marrow failure, and immunodeficiency. We identified a unique Apollo splice variant (designated Apollo-Δ) in fibroblasts from a patient with HH syndrome. Apollo-Δ generates a dominant negative form of Apollo lacking the telomeric repeat-binding factor homology (TRFH)-binding motif (TBM) required for interaction with the shelterin TRF2 at telomeres. Apollo-Δ hampers the proper replication of telomeres, leading to major telomeric dysfunction and cellular senescence, but maintains its DNA interstrand cross-link repair function in the whole genome. These results identify Apollo as a crucial actor in telomere maintenance in vivo, independent of its function as a general DNA repair factor. PMID:20479256
Function of Apollo (SNM1B) at telomere highlighted by a splice variant identified in a patient with Hoyeraal-Hreidarsson syndrome.

PubMed

Touzot, Fabien; Callebaut, Isabelle; Soulier, Jean; Gaillard, Laetitia; Azerrad, Chantal; Durandy, Anne; Fischer, Alain; de Villartay, Jean-Pierre; Revy, Patrick

2010-06-01

Telomeres, the protein-DNA complexes at the ends of linear chromosomes, are protected and regulated by the shelterin molecules, the telomerase complex, and other accessory factors, among which is Apollo, a DNA repair factor of the beta-lactamase/beta-CASP family. Impaired telomere protection in humans causes dyskeratosis congenita and Hoyeraal-Hreidarsson (HH) syndrome, characterized by premature aging, bone marrow failure, and immunodeficiency. We identified a unique Apollo splice variant (designated Apollo-Delta) in fibroblasts from a patient with HH syndrome. Apollo-Delta generates a dominant negative form of Apollo lacking the telomeric repeat-binding factor homology (TRFH)-binding motif (TBM) required for interaction with the shelterin TRF2 at telomeres. Apollo-Delta hampers the proper replication of telomeres, leading to major telomeric dysfunction and cellular senescence, but maintains its DNA interstrand cross-link repair function in the whole genome. These results identify Apollo as a crucial actor in telomere maintenance in vivo, independent of its function as a general DNA repair factor.
Cloning, characterisation and comparative analysis of a starch synthase IV gene in wheat: functional and evolutionary implications

PubMed Central

Leterrier, Marina; Holappa, Lynn D; Broglie, Karen E; Beckles, Diane M

2008-01-01

Background: Starch is of great importance to humans as a food and biomaterial, and the amount and structure of starch made in plants is determined in part by starch synthase (SS) activity. Five SS isoforms, SSI, II, III, IV and Granule Bound SSI, have been identified, each with a unique catalytic role in starch synthesis. The basic mode of action of SSs is known; however, our knowledge of several aspects of SS enzymology at the structural and mechanistic level is incomplete. To gain a better understanding of the differences in SS sequences that underscore their specificity, the previously uncharacterised SSIVb from wheat was cloned and extensive bioinformatics analyses of this and other SS sequences were done. Results: The wheat SSIV cDNA is most similar to rice SSIVb with which it shows synteny and shares a similar exon-intron arrangement. The wheat SSIVb gene was preferentially expressed in leaf and was not regulated by a circadian clock. Phylogenetic analysis showed that in plants, SSIV is closely related to SSIII, while SSI, SSII and Granule Bound SSI clustered together and distinctions between the two groups can be made at the genetic level and included chromosomal location and intron conservation. Further, identified differences at the amino acid level in their glycosyltransferase domains, predicted secondary structures, global conformations and conserved residues might be indicative of intragroup functional associations. Conclusion: Based on bioinformatics analysis of the catalytic region of 36 SSs and 3 glycogen synthases (GSs), it is suggested that the valine residue in the highly conserved K-X-G-G-L motif in SSIII and SSIV may be a determining feature of primer specificity of these SSs as compared to GBSSI, SSI and SSII. In GBSSI, the Ile485 residue may partially explain that enzyme's unique catalytic features. The flexible 380s Loop in the starch catalytic domain may be important in defining the specificity of action for each different SS and the G-X-G in motif VI could define SSIV and SSIII action particularly. PMID:18826586
A conserved human DJ1-subfamily motif (DJSM) is critical for anti-oxidative and deglycase activities of Plasmodium falciparum DJ1.

PubMed

Nair, Divya N; Prasad, Rajesh; Singhal, Neha; Bhattacharjee, Manish; Sudhakar, Renu; Singh, Pushpa; Thanumalayan, Subramonian; Kiran, Uday; Sharma, Yogendra; Sijwali, Puran Singh

2018-06-01

Plasmodium falciparum DJ1 (PfDJ1) belongs to the DJ-1/ThiJ/PfpI superfamily whose members are present in all the kingdoms of life and exhibit diverse cellular functions and biochemical activities. The common feature of the superfamily is the class I glutamine amidotransferase domain with a conserved redox-active cysteine residue, which mediates various activities of the superfamily members, including anti-oxidative activity in PfDJ1 and human DJ1 (hDJ1). As the superfamily members represent diverse functional classes, to investigate if there is any sequence feature unique to hDJ1-like proteins, sequences of the representative proteins of different functional classes were compared and analysed. A novel motif unique to PfDJ1 and several other hDJ1-like proteins, with the consensus sequence of TSXGPX5FXLX5L, was identified that we designated as the hDJ1-subfamily motif (DJSM). Several mutations that have been associated with Parkinson's disease are also present in DJSM, suggesting its functional importance in hDJ1-like proteins. Mutations of the conserved residues of DJSM of PfDJ1 did not significantly affect overall secondary structure, but caused both a significant loss (S151A and P154A) and gain (L168A) of anti-oxidative activity. We also report that PfDJ1 has deglycase activity, which was significantly decreased in its mutants of the catalytic cysteine (C106A) and DJSM (S151A and P154A). Episomal expression of the catalytic cysteine (C106A) or DJSM (P154A) mutant decreased growth rates of parasites as compared to that of wild type parasites or parasites expressing wild type PfDJ1. S151 appears to properly position the nucleophilic elbow containing C106 and P154 forms a hydrogen bond with C106, which could be a reason for the loss of activities of PfDJ1 upon their mutations. Taken together, DJSM delineates PfDJ1 and other hDJ1-subfamily proteins from the remaining superfamily, and is critical for anti-oxidative and deglycase activities of PfDJ1. 
Copyright © 2018 Elsevier B.V. All rights reserved.
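A consensus written in the compact form used above (TSXGPX5FXLX5L) can be converted into a regular expression for scanning other sequences. The sketch below assumes that X means any single residue and that a number after a letter is a repeat count (so X5 is five arbitrary residues); the helper name and the test sequence are hypothetical. A regex hit only flags a candidate; it does not establish that a protein belongs to the hDJ1 subfamily.

```python
import re


def consensus_to_regex(consensus):
    """Translate e.g. 'TSXGPX5FXLX5L' into the regex 'TS.GP.{5}F.L.{5}L'."""
    parts = []
    for residue, count in re.findall(r"([A-Z])(\d*)", consensus.upper()):
        symbol = "." if residue == "X" else residue  # X = any residue (assumed)
        parts.append(symbol + ("{%s}" % count if count else ""))
    return "".join(parts)


DJSM_PATTERN = re.compile(consensus_to_regex("TSXGPX5FXLX5L"))


if __name__ == "__main__":
    # Made-up sequence with a DJSM-like stretch starting at 0-based index 2.
    sequence = "MKTSAGPAAAAAFALAAAAALGG"
    match = DJSM_PATTERN.search(sequence)
    print(DJSM_PATTERN.pattern, match.start() if match else None)
```
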
A New Scheme to Characterize and Identify Protein Ubiquitination Sites.

PubMed

Nguyen, Van-Nui; Huang, Kai-Yao; Huang, Chien-Hsun; Lai, K Robert; Lee, Tzong-Yi

2017-01-01

Protein ubiquitination, involving the conjugation of ubiquitin to lysine residues, serves as an important modulator of many cellular functions in eukaryotes. Recent advancements in proteomic technology have stimulated increasing interest in identifying ubiquitination sites. However, most computational tools for predicting ubiquitination sites are focused on small-scale data. With an increasing number of experimentally verified ubiquitination sites, we were motivated to design a predictive model for identifying lysine ubiquitination sites in large-scale proteome datasets. This work assessed not only single features, such as amino acid composition (AAC), amino acid pair composition (AAPC) and evolutionary information, but also the effectiveness of incorporating two or more features into a hybrid approach to model construction. A support vector machine (SVM) was applied to generate the prediction models for ubiquitination site identification. Evaluation by five-fold cross-validation showed that the SVM models learned from the combination of hybrid features delivered a better prediction performance. Additionally, a motif discovery tool, MDDLogo, was adopted to characterize the potential substrate motifs of ubiquitination sites. The SVM models integrating the MDDLogo-identified substrate motifs yielded an average accuracy of 68.70 percent. Furthermore, the independent testing result showed that the MDDLogo-clustered SVM models could provide a promising accuracy (78.50 percent) and perform better than other prediction tools. Two cases demonstrated the effective prediction of ubiquitination sites with corresponding substrate motifs.
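To make the amino acid composition (AAC) feature concrete, here is a minimal sketch of how a sequence window around each lysine might be turned into a 20-dimensional composition vector. The window half-width (`flank=6`), the `-` padding character, and the function names are assumptions chosen for illustration; the abstract does not state the settings the authors used.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def lysine_windows(sequence, flank=6):
    """Yield (1-based position, window) for every lysine in the sequence.

    Windows are 2*flank+1 long; '-' pads positions past either terminus.
    The flank size is a hypothetical default, not taken from the paper.
    """
    seq = sequence.upper()
    padded = "-" * flank + seq + "-" * flank
    for i, residue in enumerate(seq):
        if residue == "K":
            yield i + 1, padded[i:i + 2 * flank + 1]

def aac(window):
    """Amino acid composition: fraction of each standard residue in the window."""
    residues = [r for r in window if r in AMINO_ACIDS]
    counts = Counter(residues)
    total = len(residues) or 1  # avoid division by zero on all-padding windows
    return [counts[a] / total for a in AMINO_ACIDS]

for position, window in lysine_windows("AAKAA", flank=2):
    print(position, window, aac(window))
```

Vectors of this kind would then be fed to a classifier such as an SVM and scored by cross-validation, which is outside the scope of this sketch.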
A Novel Protein Interaction between Nucleotide Binding Domain of Hsp70 and p53 Motif

PubMed Central

Elengoe, Asita; Naser, Mohammed Abu; Hamdan, Salehhuddin

2015-01-01

Currently, protein interaction of Homo sapiens nucleotide binding domain (NBD) of heat shock 70 kDa protein (PDB: 1HJO) with p53 motif remains to be elucidated. The NBD-p53 motif complex enhances the p53 stabilization, thereby increasing the tumor suppression activity in cancer treatment. Therefore, we identified the interaction between NBD and p53 using STRING version 9.1 program. Then, we modeled the three-dimensional structure of p53 motif through homology modeling and determined the binding affinity and stability of NBD-p53 motif complex structure via molecular docking and dynamics (MD) simulation. Human DNA binding domain of p53 motif (SCMGGMNR) retrieved from UniProt (UniProtKB: P04637) was docked with the NBD protein, using the Autodock version 4.2 program. The binding energy and intermolecular energy for the NBD-p53 motif complex were −0.44 Kcal/mol and −9.90 Kcal/mol, respectively. Moreover, RMSD, RMSF, hydrogen bonds, salt bridge, and secondary structure analyses revealed that the NBD protein had a strong bond with p53 motif and the protein-ligand complex was stable. Thus, the current data would be highly encouraging for designing Hsp70 structure based drug in cancer therapy. PMID:26098630
Comparing Multi-Step IMAC and Multi-Step TiO2 Methods for Phosphopeptide Enrichment

PubMed Central

Yue, Xiaoshan; Schunter, Alissa; Hummon, Amanda B.

2016-01-01

Phosphopeptide enrichment from complicated peptide mixtures is an essential step for mass spectrometry-based phosphoproteomic studies to reduce sample complexity and ionization suppression effects. Typical methods for enriching phosphopeptides include immobilized metal affinity chromatography (IMAC) or titanium dioxide (TiO2) beads, which have selective affinity and can interact with phosphopeptides. In this study, the IMAC enrichment method was compared with the TiO2 enrichment method, using a multi-step enrichment strategy from whole cell lysate, to evaluate their abilities to enrich for different types of phosphopeptides. The peptide-to-beads ratios were optimized for both IMAC and TiO2 beads. Both IMAC and TiO2 enrichments were performed for three rounds to enable the maximum extraction of phosphopeptides from the whole cell lysates. The phosphopeptides that are unique to IMAC enrichment, unique to TiO2 enrichment, and identified with both IMAC and TiO2 enrichment were analyzed for their characteristics. Both IMAC and TiO2 enriched similar amounts of phosphopeptides with comparable enrichment efficiency. However, phosphopeptides that are unique to IMAC enrichment showed a higher percentage of multi-phosphopeptides, as well as a higher percentage of longer, basic, and hydrophilic phosphopeptides. Also, the IMAC and TiO2 procedures clearly enriched phosphopeptides with different motifs. Finally, further enriching with two rounds of TiO2 from the supernatant after IMAC enrichment, or further enriching with two rounds of IMAC from the supernatant after TiO2 enrichment, does not fully recover the phosphopeptides that are not identified with the corresponding multi-step enrichment. PMID:26237447
Evolutionary analysis of FAM83H in vertebrates.

PubMed

Huang, Wushuang; Yang, Mei; Wang, Changning; Song, Yaling

2017-01-01

Amelogenesis imperfecta is a group of disorders causing abnormalities in enamel formation in various phenotypes. Many mutations in the FAM83H gene have been identified to result in autosomal dominant hypocalcified amelogenesis imperfecta in different populations. However, the structure and function of FAM83H and its pathological mechanism have yet to be further explored. Evolutionary analysis is an alternative for revealing residues or motifs that are important for protein function. In the present study, we chose 50 vertebrate species in public databases representative of approximately 230 million years of evolution, including 1 amphibian, 2 fishes, 7 sauropsids and 40 mammals, and we performed evolutionary analysis on the FAM83H protein. By sequence alignment, conserved residues and motifs were indicated, and the loss of important residues and motifs in five special species (Malayan pangolin, platypus, minke whale, nine-banded armadillo and aardvark) was discovered. A phylogenetic time tree showed the divergence process of FAM83H. Positive selection sites in the C-terminus suggested that the C-terminus of FAM83H played certain adaptive roles during evolution. The results confirmed some important motifs reported in previous findings and identified some new highly conserved residues and motifs that need further investigation. The results suggest that the C-terminus of FAM83H contains key conserved regions critical to enamel formation and calcification.
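The idea of reading conserved residues off a multiple sequence alignment can be sketched as a per-column frequency check. The following is a simplified illustration, not the authors' pipeline: the function name, the `threshold` parameter, the rule that gaps count against conservation, and the toy alignment are all assumptions.

```python
from collections import Counter

def conserved_columns(aligned, threshold=1.0):
    """Return (1-based column, residue) pairs for conserved alignment columns.

    A column counts as conserved when its most common non-gap residue
    occurs in at least `threshold` of all sequences (gaps lower the score).
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")
    result = []
    for col in range(length):
        column = [s[col] for s in aligned]
        counts = Counter(r for r in column if r != "-")
        if not counts:
            continue  # column consists only of gaps
        residue, n = counts.most_common(1)[0]
        if n / len(column) >= threshold:
            result.append((col + 1, residue))
    return result

# Toy alignment for demonstration; not real FAM83H sequences.
toy_alignment = ["MKTL-A", "MRTLSA", "MKTLSA"]
print(conserved_columns(toy_alignment))
```

Lowering `threshold` relaxes the definition from strictly invariant columns to mostly conserved ones.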
Identification of sequence-structure RNA binding motifs for SELEX-derived aptamers.

PubMed

Hoinka, Jan; Zotenko, Elena; Friedman, Adam; Sauna, Zuben E; Przytycka, Teresa M

2012-06-15

Systematic Evolution of Ligands by EXponential Enrichment (SELEX) represents a state-of-the-art technology to isolate single-stranded (ribo)nucleic acid fragments, named aptamers, which bind to a molecule (or molecules) of interest via specific structural regions induced by their sequence-dependent fold. This powerful method has applications in designing protein inhibitors, molecular detection systems, therapeutic drugs and antibody replacement among others. However, full understanding and consequently optimal utilization of the process has lagged behind its wide application due to the lack of dedicated computational approaches. At the same time, the combination of SELEX with novel sequencing technologies is beginning to provide the data that will allow the examination of a variety of properties of the selection process. To close this gap, we developed Aptamotif, a computational method for the identification of sequence-structure motifs in SELEX-derived aptamers. To increase the chances of identifying functional motifs, Aptamotif uses an ensemble-based approach. We validated the method using two published aptamer datasets containing experimentally determined motifs of increasing complexity. We were able to recreate the authors' findings to a high degree, thus proving the capability of our approach to identify binding motifs in SELEX data. Additionally, using our new experimental dataset, we illustrate the application of Aptamotif to elucidate several properties of the selection process.

Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.

PubMed

Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C

2018-01-10

Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status, and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each gene of 42 TS genes were selected from the COSMIC database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and the methyltransferase enzyme DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies indicated that mutation was likely to occur when DNMT1 binds to this consensus DNA motif and methylates it (at C residues of CpG islands).
Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing cancer cells.
Transient α-helices in the disordered RPEL motifs of the serum response factor coactivator MKL1

NASA Astrophysics Data System (ADS)

Mizuguchi, Mineyuki; Fuju, Takahiro; Obita, Takayuki; Ishikawa, Mitsuru; Tsuda, Masaaki; Tabuchi, Akiko

2014-06-01

The megakaryoblastic leukemia 1 (MKL1) protein functions as a transcriptional coactivator of the serum response factor. MKL1 has three RPEL motifs (RPEL1, RPEL2, and RPEL3) in its N-terminal region. MKL1 binds to monomeric G-actin through RPEL motifs, and the dissociation of MKL1 from G-actin promotes the translocation of MKL1 to the nucleus. Although structural data are available for RPEL motifs of MKL1 in complex with G-actin, the structural characteristics of RPEL motifs in the free state have been poorly defined. Here we characterized the structures of free RPEL motifs using NMR and CD spectroscopy. NMR and CD measurements showed that free RPEL motifs are largely unstructured in solution. However, NMR analysis identified transient α-helices in the regions where helices α1 and α2 are induced upon binding to G-actin. Proline mutagenesis showed that the transient α-helices are locally formed without helix-helix interactions. The helix content is higher in the order of RPEL1, RPEL2, and RPEL3. The amount of preformed structure may correlate with the binding affinity between the intrinsically disordered protein and its target molecule.
Characterization of a unique motif in LIM mineralization protein-1 that interacts with jun activation-domain-binding protein 1.

PubMed

Sangadala, Sreedhara; Yoshioka, Katsuhito; Enyo, Yoshio; Liu, Yunshan; Titus, Louisa; Boden, Scott D

2014-01-01

Development and repair of the skeletal system and other organs are highly dependent on precise regulation of the bone morphogenetic protein (BMP) pathway. The use of BMPs clinically to induce bone formation has been limited in part by the requirement of much higher doses of recombinant proteins in primates than were needed in cell culture or rodents. Therefore, increasing cellular responsiveness to BMPs has become our focus. We determined that an osteogenic LIM mineralization protein, LMP-1 interacts with Smurf1 (Smad ubiquitin regulatory factor 1) and prevents ubiquitination of Smads resulting in potentiation of BMP activity. In the region of LMP-1 responsible for bone formation, there is a motif that directly interacts with the Smurf1 WW2 domain and thus effectively competes for binding with Smad1 and Smad5, key signaling proteins in the BMP pathway. Here we show that the same region also contains a motif that interacts with Jun activation-domain-binding protein 1 (Jab1) which targets a common Smad, Smad4, shared by both the BMP and transforming growth factor-β (TGF-β) pathways, for proteasomal degradation. Jab1 was first identified as a coactivator of the transcription factor c-Jun. Jab1 binds to Smad4, Smad5, and Smad7, key intracellular signaling molecules of the TGF-β superfamily, and causes ubiquitination and/or degradation of these Smads. We confirmed a direct interaction of Jab1 with LMP-1 using recombinantly expressed wild-type and mutant proteins in slot-blot-binding assays. We hypothesized that LMP-1 binding to Jab1 prevents the binding and subsequent degradation of these Smads causing increased accumulation of osteogenic Smads in cells. We identified a sequence motif in LMP-1 that was predicted to interact with Jab1 based on the MAME/MAST sequence analysis of several cellular signaling molecules that are known to interact with Jab-1. 
We further mutated the potential key interacting residues in LMP-1 and showed loss of binding to Jab1 in binding assays in vitro. The activities of various wild-type and mutant LMP-1 proteins were evaluated using a BMP-responsive luciferase reporter and alkaline phosphatase assay in mouse myoblastic cells that were differentiated toward the osteoblastic phenotype. Finally, to strengthen physiological relevance of LMP-1 and Jab1 interaction, we showed that overexpression of LMP-1 caused nuclear accumulation of Smad4 upon BMP treatment which is reflective of increased Smad signaling in cells.
Overexpression of TRIM25 in Lung Cancer Regulates Tumor Cell Progression.

PubMed

Qin, Ying; Cui, He; Zhang, Hua

2016-10-01

Lung cancer is one of the most common causes of cancer-related deaths worldwide. Although great efforts and progress have been made in the study of lung cancer in recent decades, the mechanism of lung cancer formation remains elusive. To establish effective therapeutic methods, new targets implicated in lung cancer processes have to be identified. Tripartite motif-containing 25 has been associated with ovarian and breast cancer and is thought to promote cell growth by targeting the cell cycle. However, whether tripartite motif-containing 25 has a function in lung cancer development remains unknown. In this study, we found that tripartite motif-containing 25 was overexpressed in human lung cancer tissues. Expression of tripartite motif-containing 25 in lung cancer cells is important for cell proliferation and migration. Knockdown of tripartite motif-containing 25 markedly reduced proliferation of lung cancer cells both in vitro and in vivo and reduced migration of lung cancer cells in vitro. Meanwhile, tripartite motif-containing 25 silencing also increased sensitivity to doxorubicin: knockdown of tripartite motif-containing 25 significantly increased doxorubicin-induced death and apoptosis of lung cancer cells. We also observed that tripartite motif-containing 25 formed a complex with p53 and mouse double minute 2 homolog (MDM2) in both human lung cancer tissues and lung cancer cells, and that tripartite motif-containing 25 silencing increased the expression of p53. These results provide evidence that tripartite motif-containing 25 contributes to the pathogenesis of lung cancer, probably by promoting proliferation and migration of lung cancer cells. Therefore, targeting tripartite motif-containing 25 may provide a potential therapeutic intervention for lung cancer.
Isosteric And Non-Isosteric Base Pairs In RNA Motifs: Molecular Dynamics And Bioinformatics Study Of The Sarcin-Ricin Internal Loop

PubMed Central

Havrila, Marek; Réblová, Kamila; Zirbel, Craig L.; Leontis, Neocles B.; Šponer, Jiří

2013-01-01

The Sarcin-Ricin RNA motif (SR motif) is one of the most prominent recurrent RNA building blocks that occurs in many different RNA contexts and folds autonomously, i.e., in a context-independent manner. In this study, we combined bioinformatics analysis with explicit-solvent molecular dynamics (MD) simulations to better understand the relation between the RNA sequence and the evolutionary patterns of SR motif. SHAPE probing experiment was also performed to confirm fidelity of MD simulations. We identified 57 instances of the SR motif in a non-redundant subset of the RNA X-ray structure database and analyzed their basepairing, base-phosphate, and backbone-backbone interactions. We extracted sequences aligned to these instances from large ribosomal RNA alignments to determine frequency of occurrence for different sequence variants. We then used a simple scoring scheme based on isostericity to suggest 10 sequence variants with highly variable expected degree of compatibility with the SR motif 3D structure. We carried out MD simulations of SR motifs with these base substitutions. Non isosteric base substitutions led to unstable structures, but so did isosteric substitutions which were unable to make key base-phosphate interactions. MD technique explains why some potentially isosteric SR motifs are not realized during evolution. We also found that inability to form stable cWW geometry is an important factor in case of the first base pair of the flexible region of the SR motif. Comparison of structural, bioinformatics, SHAPE probing and MD simulation data reveals that explicit solvent MD simulations neatly reflect viability of different sequence variants of the SR motif. Thus, MD simulations can efficiently complement bioinformatics tools in studies of conservation patterns of RNA motifs and provide atomistic insight into the role of their different signature interactions. PMID:24144333
LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms.

PubMed

Yang, Peng; Wu, Min; Guo, Jing; Kwoh, Chee Keong; Przytycka, Teresa M; Zheng, Jie

2014-02-17

As a fundamental genomic element, meiotic recombination hotspot plays important roles in life sciences. Thus uncovering its regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators for meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignore the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Recently, an algorithm called "LDsplit" has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit on SNPs inside an established 7-mer motif bound by PRDM9 we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm of MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs. 
It is the first computational method that utilizes the genetic variation of recombination hotspots among individuals, opening a new avenue for motif finding. Tested on an established motif and simulated datasets, LDsplit shows promise to discover novel DNA motifs for meiotic recombination hotspots.
Analysis of the linker region joining the adenylation and carrier protein domains of the modular nonribosomal peptide synthetases.

PubMed

Miller, Bradley R; Sundlov, Jesse A; Drake, Eric J; Makin, Thomas A; Gulick, Andrew M

2014-10-01

Nonribosomal peptide synthetases (NRPSs) are multimodular proteins capable of producing important peptide natural products. Using an assembly line process, the amino acid substrate and peptide intermediates are passed between the active sites of different catalytic domains of the NRPS while bound covalently to a peptidyl carrier protein (PCP) domain. Examination of the linker sequences that join the NRPS adenylation and PCP domains identified several conserved proline residues that are not found in standalone adenylation domains. We examined the roles of these proline residues and neighboring conserved sequences through mutagenesis and biochemical analysis of the reaction catalyzed by the adenylation domain and the fully reconstituted NRPS pathway. In particular, we identified a conserved LPxP motif at the start of the adenylation-PCP linker. The LPxP motif interacts with a region on the adenylation domain to stabilize a critical catalytic lysine residue belonging to the A10 motif that immediately precedes the linker. Further, this interaction with the C-terminal subdomain of the adenylation domain may coordinate movement of the PCP with the conformational change of the adenylation domain. Through this work, we extend the conserved A10 motif of the adenylation domain and identify residues that enable proper adenylation domain function.
Complexity of the 5' Untranslated Region of EIF4A3, a Critical Factor for Craniofacial and Neural Development.

PubMed

Hsia, Gabriella S P; Musso, Camila M; Alvizi, Lucas; Brito, Luciano A; Kobayashi, Gerson S; Pavanello, Rita C M; Zatz, Mayana; Gardham, Alice; Wakeling, Emma; Zechi-Ceide, Roseli M; Bertola, Debora; Passos-Bueno, Maria Rita

2018-01-01

Repeats in coding and non-coding regions have increasingly been associated with many human genetic disorders, such as Richieri-Costa-Pereira syndrome (RCPS). RCPS, mostly characterized by midline cleft mandible, Robin sequence and limb defects, is an autosomal-recessive acrofacial dysostosis mainly reported in Brazilian patients. This disorder is caused by decreased levels of EIF4A3, mostly due to an increased number of repeats at the EIF4A3 5'UTR. EIF4A3 5'UTR alleles are CG-rich and vary in size and organization of three types of motifs. An exclusive allelic pattern was identified among affected individuals, in which the CGCA-motif is the most prevalent, herein referred to as the "disease-associated CGCA-20nt motif." The origin of the pathogenic alleles containing the disease-associated motif, as well as the functional effects of the 5'UTR motifs on EIF4A3 expression, to date, are entirely unknown. Here, we characterized 43 different EIF4A3 5'UTR alleles in a cohort of 380 unaffected individuals. We identified eight heterozygous unaffected individuals harboring the disease-associated CGCA-20nt motif, and our haplotype analyses indicate that there is more than one haplotype associated with RCPS. The combined analysis of number, motif organization and haplotypic diversity, as well as the observation of two apparently distinct haplotypes associated with the disease-associated CGCA-20nt motif, suggest that the RCPS alleles might have arisen from independent unequal crossing-over events between ancient alleles at least twice. Moreover, we have shown that the number and sequence of motifs in the 5'UTR region are associated with EIF4A3 repression, which is not mediated by CpG methylation. In conclusion, this study has shown that the large number of repeats in EIF4A3 does not represent a dynamic mutation and RCPS can arise in any population harboring alleles with the CGCA-20nt motif.
We also provided further evidence that EIF4A3 5'UTR is a regulatory region and the size and sequence type of the repeats at 5'UTR may contribute to clinical variability in RCPS.
Enrichment of Circular Code Motifs in the Genes of the Yeast Saccharomyces cerevisiae.

PubMed

Michel, Christian J; Ngoune, Viviane Nguefack; Poch, Olivier; Ripp, Raymond; Thompson, Julie D

2017-12-03

A set X of 20 trinucleotides has been found to have the highest average occurrence in the reading frame, compared to the two shifted frames, of genes of bacteria, archaea, eukaryotes, plasmids and viruses. This set X has an interesting mathematical property, since X is a maximal C3 self-complementary trinucleotide circular code. Furthermore, any motif obtained from this circular code X has the capacity to retrieve, maintain and synchronize the original (reading) frame. Since 1996, the theory of circular codes in genes has mainly been developed by analysing the properties of the 20 trinucleotides of X, using combinatorics and statistical approaches. For the first time, we test this theory by analysing the X motifs, i.e., motifs from the circular code X, in the complete genome of the yeast Saccharomyces cerevisiae. Several properties of X motifs are identified by basic statistics (at the frequency level), and evaluated by comparison to R motifs, i.e., random motifs generated from 30 different random codes R. We first show that the frequency of X motifs is significantly greater than that of R motifs in the genome of S. cerevisiae. We then verify that no significant difference is observed between the frequencies of X and R motifs in the non-coding regions of S. cerevisiae, but that the occurrence number of X motifs is significantly higher than R motifs in the genes (protein-coding regions). This property is true for all cardinalities of X motifs (from 4 to 20) and for all 16 chromosomes. We further investigate the distribution of X motifs in the three frames of S. cerevisiae genes and show that they occur more frequently in the reading frame, regardless of their cardinality or their length. Finally, the ratio of X genes, i.e., genes with at least one X motif, to non-X genes, in the set of verified genes is significantly different to that observed in the set of putative or dubious genes with no experimental evidence.
These results, taken together, represent the first evidence for a significant enrichment of X motifs in the genes of an extant organism. They raise two hypotheses: the X motifs may be evolutionary relics of the primitive codes used for translation, or they may continue to play a functional role in the complex processes of genome decoding and protein synthesis.
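As a rough illustration of what scanning for motifs built from a trinucleotide code involves, the sketch below finds maximal runs of consecutive in-code trinucleotides in a chosen frame and checks whether a code is self-complementary. It is a simplification under stated assumptions: the 4-trinucleotide `toy_code` is a hypothetical stand-in and is NOT the 20-trinucleotide code X, a motif is defined here simply by a minimum run length (`min_codons`) rather than by the cardinality criterion mentioned in the abstract, and all function names are invented for this example.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def is_self_complementary(code):
    """True if the reverse complement of every trinucleotide is also in the code."""
    return all(reverse_complement(t) in code for t in code)

def code_motifs(seq, code, frame=0, min_codons=2):
    """Return 0-based half-open (start, end) spans of maximal runs of
    consecutive in-code trinucleotides, reading `seq` in the given frame."""
    seq = seq.upper()
    spans = []
    run_start = None
    pos = frame
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in code:
            if run_start is None:
                run_start = pos
        else:
            if run_start is not None and (pos - run_start) // 3 >= min_codons:
                spans.append((run_start, pos))
            run_start = None
        pos += 3
    # Close a run that extends to the end of the sequence.
    if run_start is not None and (pos - run_start) // 3 >= min_codons:
        spans.append((run_start, pos))
    return spans

# Hypothetical toy code for demonstration only; not the real code X.
toy_code = {"AAC", "GTT", "GAG", "CTC"}
toy_seq = "AACGTTGAGTTTCTCAAC"
print(is_self_complementary(toy_code))
print(code_motifs(toy_seq, toy_code, frame=0))
print(code_motifs(toy_seq, toy_code, frame=1))
```

Comparing the spans found in frame 0 with those in the two shifted frames mirrors, in miniature, the frame comparison described in the abstract.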
Identification of three novel B-cell epitopes of VMH protein from Vibrio mimicus by screening a phage display peptide library.

PubMed

Xiao, Ning; Cao, Ji; Zhou, Hao; Ding, Shu-Quan; Kong, Ling-Yan; Li, Jin-Nian

2016-12-01

Vibrio mimicus is the causative agent of ascites disease in fish. The heat-labile hemolytic toxin designated VMH is an immunoprotective antigen of V. mimicus. However, its epitopes have not been well characterized. Here, a commercially available phage-displayed 12-mer peptide library was used to screen epitopes of VMH protein using polyclonal rabbit anti-rVMH protein antibodies, and then five positive phage clones were identified by sandwich and competitive ELISA. Sequence analysis showed that the motif of DPTLL displayed on phage clone 15 and the consensus motif of SLDDDST displayed on the clone 4/11 corresponded to the residues 134-138 and 238-244 of VMH protein, respectively, and the synthetic motif peptides could also be recognized by anti-rVMH-HD antibody in peptide-ELISA. Thus, both motifs DPTLL and SLDDDST were identified as minimal linear B-cell epitopes of VMH protein. Although no similarity was found between VMH protein and the consensus motif of ADGLVPR displayed on the clone 2/6, the synthetic peptide ADGLVPR could absorb anti-rVMH-HD antibody and inhibit the antibody binding to rVMH protein in enhanced chemiluminescence Western blotting, whereas irrelevant control peptide did not affect the antibody binding with rVMH. These results revealed that the peptide ADGLVPR was a mimotope of VMH protein. Taken together, three novel B-cell epitopes of VMH protein were identified, which provide a foundation for developing epitope-based vaccine against V. mimicus infection in fish.
Identification of Human Lineage-Specific Transcriptional Coregulators Enabled by a Glossary of Binding Modules and Tunable Genomic Backgrounds.

PubMed

Mariani, Luca; Weinand, Kathryn; Vedenko, Anastasia; Barrera, Luis A; Bulyk, Martha L

2017-09-27

Transcription factors (TFs) control cellular processes by binding specific DNA motifs to modulate gene expression. Motif enrichment analysis of regulatory regions can identify direct and indirect TF binding sites. Here, we created a glossary of 108 non-redundant TF-8mer "modules" of shared specificity for 671 metazoan TFs from publicly available and new universal protein binding microarray data. Analysis of 239 ENCODE TF chromatin immunoprecipitation sequencing datasets and associated RNA sequencing profiles suggests that the 8mer modules are more precise than position weight matrices in identifying indirect binding motifs and their associated tethering TFs. We also developed GENRE (genomically equivalent negative regions), a tunable tool for construction of matched genomic background sequences for analysis of regulatory regions. GENRE outperformed four state-of-the-art approaches to background sequence construction. We used our TF-8mer glossary and GENRE in the analysis of the indirect binding motifs for the co-occurrence of tethering factors, suggesting novel TF-TF interactions. We anticipate that these tools will aid in elucidating tissue-specific gene-regulatory programs. Copyright © 2017 Elsevier Inc. All rights reserved.
Transcription factor ThWRKY4 binds to a novel WLS motif and a RAV1A element in addition to the W-box to regulate gene expression.

PubMed

Xu, Hongyun; Shi, Xinxin; Wang, Zhibo; Gao, Caiqiu; Wang, Chao; Wang, Yucheng

2017-08-01

WRKY transcription factors play important roles in many biological processes, and mainly bind to the W-box element to regulate gene expression. Previously, we characterized a WRKY gene from Tamarix hispida, ThWRKY4, in response to abiotic stress, and showed that it bound to the W-box motif. However, whether ThWRKY4 could bind to other motifs remains unknown. In this study, we employed a Transcription Factor-Centered Yeast one Hybrid (TF-Centered Y1H) screen to study the motifs recognized by ThWRKY4. In addition to the W-box core cis-element (termed W-box), we identified that ThWRKY4 could bind to two other motifs: the RAV1A element (CAACA) and a novel motif with sequence of GTCTA (W-box like sequence, WLS). The distributions of these motifs were screened in the promoter regions of genes regulated by some WRKYs. The results showed that the W-box, RAV1A, and WLS motifs were all present in high numbers, suggesting that they play key roles in gene expression mediated by WRKYs. Furthermore, five WRKY proteins from different WRKY subfamilies in Arabidopsis thaliana were selected and confirmed to bind to the RAV1A and WLS motifs, indicating that they are recognized commonly by WRKYs. These findings will help to further reveal the functions of WRKY proteins. Copyright © 2017 Elsevier B.V. All rights reserved.
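The promoter screen described in this record can be illustrated with a minimal sketch that counts the three motifs on both strands of a sequence. The RAV1A (CAACA) and WLS (GTCTA) sequences come from the abstract; the W-box pattern TTGAC[CT] is an assumption taken from the general WRKY literature, and the function names are hypothetical rather than the authors' tooling.

```python
import re

# RAV1A and WLS are quoted in the abstract. The W-box consensus below is an
# assumption (commonly cited as TTGAC[CT]); the record does not spell it out.
MOTIFS = {
    "W-box": "TTGAC[CT]",
    "RAV1A": "CAACA",
    "WLS": "GTCTA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_motifs(promoter: str) -> dict:
    """Count motif hits (overlaps included) on both strands of a promoter."""
    promoter = promoter.upper()
    strands = (promoter, reverse_complement(promoter))
    counts = {}
    for name, pattern in MOTIFS.items():
        # A lookahead makes the scan report overlapping occurrences too.
        regex = re.compile(f"(?=({pattern}))")
        counts[name] = sum(len(regex.findall(strand)) for strand in strands)
    return counts


if __name__ == "__main__":
    # Hypothetical 20-nt sequence, not taken from any real promoter.
    print(count_motifs("AAGTCTAAACAACATTGACC"))
```

Scanning both strands matters because none of the three motifs is palindromic, so a site on the reverse strand would otherwise be missed.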
The glycine-rich motif of Pyrococcus abyssi DNA polymerase D is critical for protein stability.

PubMed

Castrec, Benoît; Laurent, Sébastien; Henneke, Ghislaine; Flament, Didier; Raffin, Jean-Paul

2010-03-05

A glycine-rich motif described as being involved in human polymerase delta proliferating cell nuclear antigen (PCNA) binding has also been identified in all euryarchaeal DNA polymerase D (Pol D) family members. We redefined the motif as the (G)-PYF box. In the present study, Pol D (G)-PYF box motif mutants from Pyrococcus abyssi were generated to investigate its role in functional interactions with the cognate PCNA. We demonstrated that this motif is not essential for interactions between PabPol D (P. abyssi Pol D) and PCNA, using surface plasmon resonance and primer extension studies. Interestingly, the (G)-PYF box is located in a hydrophobic region close to the active site. The (G)-PYF box mutants exhibited altered DNA binding properties. In addition, the thermal stability of all mutants was reduced compared to that of wild type, and this effect could be attributed to increased exposure of the hydrophobic region. These studies suggest that the (G)-PYF box motif mediates intersubunit interactions and that it may be crucial for the thermostability of PabPol D. (c) 2010 Elsevier Ltd. All rights reserved.
Gene cloning and characterization of a novel esterase from activated sludge metagenome

PubMed Central

2009-01-01

A metagenomic library was prepared using pCC2FOS vector containing about 3.0 Gbp of community DNA from the microbial assemblage of activated sludge. Screening of a part of the un-amplified library resulted in the finding of 1 unique lipolytic clone capable of hydrolyzing tributyrin, in which an esterase gene was identified. This esterase/lipase gene consists of 834 bp and encodes a polypeptide (designated EstAS) of 277 amino acid residues with a molecular mass of 31 kDa. Sequence analysis indicated that it showed 33% and 31% amino acid identity to esterase/lipase from Gemmata obscuriglobus UQM 2246 (ZP_02733109) and Yarrowia lipolytica CLIB122 (XP_504639), respectively; and several conserved regions were identified, including the putative active site, HSMGG, a catalytic triad (Ser92, His125 and Asp216) and a LHYFRG conserved motif. The EstAS was overexpressed, purified and shown to hydrolyse p-nitrophenyl (NP) esters of fatty acids with short chain lengths (≤ C8). This EstAS had optimal temperature and pH at 35°C and 9.0, respectively, by hydrolysis of p-NP hexanoate. It also exhibited the same level of stability over wide temperature and pH ranges and in the presence of metal ions or detergents. The high level of stability of esterase EstAS, together with its unique substrate specificities, makes it highly useful for biotechnological applications. PMID:20028524
Progress Report for DOE DE-FG03-98ER20317 ''Regulation of the floral homeotic gene AGAMOUS'' Current and Final Funding Period: September 1, 2002, to December 31, 2002

DOE Office of Scientific and Technical Information (OSTI.GOV)

Weigel, D.

2003-03-11

OAK-B135 Results obtained during this funding period: (1) Phylogenetic footprinting of AG regulatory sequences: Sequences necessary and sufficient for AGAMOUS (AG) expression in the center of Arabidopsis flowers are located in the second intron, which is about 3 kb in size. This intron contains binding sites for two transcription factors, LEAFY (LFY) and WUSCHEL (WUS), which are direct activators of AG. We used the new method of phylogenetic shadowing to identify new regulatory elements. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. (2) Repression of AG by MADS box genes: A candidate for repressing AG in the shoot apical meristem has been the MADS box gene FUL, since it is expressed in the shoot apical meristem and since an activated version (FUL:VP16) leads to ectopic AG expression in the shoot apical meristem. However, there is no ectopic AG expression in ful single mutants. We therefore started to generate VP16 fusions of several other MADS box genes expressed in the shoot apical meristem, to determine which of these might be candidates for FUL redundant genes. We found that AGL6:VP16 has a phenotype similar to that of FUL:VP16, suggesting that AGL6 and FUL interact. We are now testing this hypothesis. (3) Two candidate AG regulators, WOW and ULA: Because the phylogenetic footprinting project has identified several new candidate regulatory motifs, of which at least one (the CCAATCA motif) has rather strong effects, we had decided to put the analysis of WOW and ULA on hold, and to focus on using the newly identified motifs as tools. We conducted a yeast one-hybrid screen with two of the conserved motifs, and identified several classes of transcription factors that can interact with them. One of these is encoded by the PAN gene, previously known to be expressed in a domain that overlaps the AG domain, but not known before to regulate AG. (4) New genetic modifiers of AG: This part of the project was concluded in the previous funding period.
Analysis of the interactome of the Ser/Thr Protein Phosphatase type 1 in Plasmodium falciparum.

PubMed

Hollin, Thomas; De Witte, Caroline; Lenne, Astrid; Pierrot, Christine; Khalife, Jamal

2016-03-17

Protein Phosphatase 1 (PP1) is an enzyme essential to cell viability in the malaria parasite Plasmodium falciparum (Pf). The activity of PP1 is regulated by the binding of regulatory subunits, of which there are up to 200 in humans, but only 3 have so far been reported for the parasite. To better understand the P. falciparum PP1 (PfPP1) regulatory network, we here report the use of three strategies to characterize the PfPP1 interactome: co-affinity purified proteins identified by mass spectrometry, yeast two-hybrid (Y2H) screening and in silico analysis of the P. falciparum predicted proteome. Co-affinity purification followed by MS analysis identified 6 PfPP1 interacting proteins (Pips), of which 3 contained the RVxF consensus binding motif, 2 contained a Fxx[RK]x[RK] motif, also shown to be a PP1 binding motif, and one contained both binding motifs. The Y2H screens identified 134 proteins, of which 30 present the RVxF binding motif and 20 have the Fxx[RK]x[RK] binding motif. The in silico screen of the Pf predicted proteome using a consensus RVxF motif as template revealed the presence of 55 potential Pips. As further demonstration, 35 candidate proteins were validated as PfPP1 interacting proteins in an ELISA-based assay. To the best of our knowledge, this is the first study of the PfPP1 interactome. The data reveal several conserved PP1 interacting proteins as well as a high number of interactors specific to PfPP1. Their analysis indicates a high diversity of biological functions for PP1 in Plasmodium. Based on the present data and on an earlier study of the Pf interactome, a potential implication of Pips in protein folding/proteolysis, transcription and pathogenicity networks is proposed. The present work provides a starting point for further studies on the structural basis of these interactions and their functions in P. falciparum.
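An in silico motif screen of the kind mentioned in this record can be sketched with regular expressions. The snippet below is only an illustration: it treats RVxF as the literal pattern R-V-x-F (an assumption; published consensus definitions are more degenerate), maps "x" to any residue, and uses a made-up peptide instead of a real P. falciparum protein.

```python
import re

# "x" in the abstract's notation becomes "." (any residue). The literal RV.F
# form is a simplifying assumption, not the template used by the authors.
RVXF = re.compile(r"(?=(RV.F))")
FXX_RK = re.compile(r"(?=(F..[RK].[RK]))")


def find_pp1_motifs(protein: str) -> dict:
    """Return 1-based start positions and matched text for both motif types."""
    protein = protein.upper()
    return {
        "RVxF": [(m.start() + 1, m.group(1)) for m in RVXF.finditer(protein)],
        "Fxx[RK]x[RK]": [
            (m.start() + 1, m.group(1)) for m in FXX_RK.finditer(protein)
        ],
    }


if __name__ == "__main__":
    # Hypothetical sequence chosen so that each motif occurs once.
    print(find_pp1_motifs("MKRVSFAAFDLRGKQ"))
```

A sequence hit of this kind is only a candidate; the record itself relies on co-affinity purification, Y2H and ELISA to validate interactors.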
Physicochemically Tunable Polyfunctionalized RNA Square Architecture with Fluorogenic and Ribozymatic Properties

PubMed Central

2015-01-01

Recent advances in RNA nanotechnology allow the rational design of various nanoarchitectures. Previous methods utilized conserved angles from natural RNA motifs to form geometries with specific sizes. However, the feasibility of producing RNA architecture with variable sizes using native motifs featuring fixed sizes and angles is limited. It would be advantageous to display RNA nanoparticles of diverse shape and size derived from a given primary sequence. Here, we report an approach to construct RNA nanoparticles with tunable size and stability. Multifunctional RNA squares with a 90° angle were constructed by tuning the 60° angle of the three-way junction (3WJ) motif from the packaging RNA (pRNA) of the bacteriophage phi29 DNA packaging motor. The physicochemical properties and size of the RNA square were also easily tuned by modulating the “core” strand and adjusting the length of the sides of the square via predictable design. Squares of 5, 10, and 20 nm were constructed, each showing diverse thermodynamic and chemical stabilities. Four “arms” extending from the corners of the square were used to incorporate siRNA, ribozyme, and fluorogenic RNA motifs. Unique intramolecular contact using the pre-existing intricacy of the 3WJ avoids relatively weaker intermolecular interactions via kissing loops or sticky ends. Utilizing the 3WJ motif, we have employed a modular design technique to construct variable-size RNA squares with controllable properties and functionalities for diverse and versatile applications with engineering, pharmaceutical, and medical potential. This technique for simple design to finely tune physicochemical properties adds a new angle to RNA nanotechnology. PMID:24971772
Expression patterns of TEL genes in Poaceae suggest a conserved association with cell differentiation.

PubMed

Paquet, Nicolas; Bernadet, Marie; Morin, Halima; Traas, Jan; Dron, Michel; Charon, Celine

2005-06-01

Poaceae species present a conserved distichous phyllotaxy (leaf position along the stem) and share common properties with respect to leaf initiation. The goal of this work was to determine if these common traits imply common genes. Therefore, homologues of the maize TERMINAL EAR1 gene in Poaceae were studied. This gene encodes an RNA-binding motif (RRM) protein that is suggested to regulate leaf initiation. Using degenerate primers, one unique tel (terminal ear1-like) gene from seven Poaceae members, covering almost all the phylogenetic tree of the family, was identified by PCR. These genes present a very high degree of similarity, a highly conserved exon-intron structure, and the three RRMs and TEL characteristic motifs. The evolution of tel sequences in Poaceae strongly correlates with the known phylogenetic tree of this family. RT-PCR gene expression analyses show conserved tel expression in the shoot apex in all species, suggesting functional orthology between these genes. In addition, in situ hybridization experiments with specific antisense probes show tel transcript accumulation in all differentiating cells of the leaf, from the recruitment of leaf founder cells to leaf margin cells. Tel expression is not restricted to initiating leaves as it is also found in pro-vascular tissues, root meristems, and immature inflorescences. Therefore, these results suggest that TEL is not only associated with leaf initiation but more generally with cell differentiation in Poaceae.
HIV-1 adaptation to antigen processing results in population-level immune evasion and affects subtype diversification.

PubMed

Tenzer, Stefan; Crawford, Hayley; Pymm, Phillip; Gifford, Robert; Sreenu, Vattipally B; Weimershaus, Mirjana; de Oliveira, Tulio; Burgevin, Anne; Gerstoft, Jan; Akkad, Nadja; Lunn, Daniel; Fugger, Lars; Bell, John; Schild, Hansjörg; van Endert, Peter; Iversen, Astrid K N

2014-04-24

The recent HIV-1 vaccine failures highlight the need to better understand virus-host interactions. One key question is why CD8(+) T cell responses to two HIV-Gag regions are uniquely associated with delayed disease progression only in patients expressing a few rare HLA class I variants when these regions encode epitopes presented by ~30 more common HLA variants. By combining epitope processing and computational analyses of the two HIV subtypes responsible for ~60% of worldwide infections, we identified a hitherto unrecognized adaptation to the antigen-processing machinery through substitutions at subtype-specific motifs. Multiple HLA variants presenting epitopes situated next to a given subtype-specific motif drive selection at this subtype-specific position, and epitope abundances correlate inversely with the HLA frequency distribution in affected populations. This adaptation reflects the sum of intrapatient adaptations, is predictable, facilitates viral subtype diversification, and increases global HIV diversity. Because low epitope abundance is associated with infrequent and weak T cell responses, this most likely results in both population-level immune evasion and inadequate responses in most people vaccinated with natural HIV-1 sequence constructs. Our results suggest that artificial sequence modifications at subtype-specific positions in vitro could refocus and reverse the poor immunogenicity of HIV proteins. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

Identification of E-cadherin signature motifs functioning as cleavage sites for Helicobacter pylori HtrA

NASA Astrophysics Data System (ADS)

Schmidt, Thomas P.; Perna, Anna M.; Fugmann, Tim; Böhm, Manja; Hiss, Jan; Haller, Sarah; Götz, Camilla; Tegtmeyer, Nicole; Hoy, Benjamin; Rau, Tilman T.; Neri, Dario; Backert, Steffen; Schneider, Gisbert; Wessler, Silja

2016-03-01

The cell adhesion protein and tumour suppressor E-cadherin exhibits important functions in the prevention of gastric cancer. As a class-I carcinogen, Helicobacter pylori (H. pylori) has developed a unique strategy to interfere with E-cadherin functions. In previous studies, we have demonstrated that H. pylori secretes the protease high temperature requirement A (HtrA) which cleaves off the E-cadherin ectodomain (NTF) on epithelial cells. This opens cell-to-cell junctions, allowing bacterial transmigration across the polarised epithelium. Here, we investigated the molecular mechanism of the HtrA-E-cadherin interaction and identified E-cadherin cleavage sites for HtrA. Mass-spectrometry-based proteomics and Edman degradation revealed three signature motifs containing the [VITA]-[VITA]-x-x-D-[DN] sequence pattern, which were preferentially cleaved by HtrA. Based on these sites, we developed a substrate-derived peptide inhibitor that selectively bound and inhibited HtrA, thereby blocking transmigration of H. pylori. The discovery of HtrA-targeted signature sites might further explain why we detected a stable 90 kDa NTF fragment during H. pylori infection, but also additional E-cadherin fragments ranging from 105 kDa to 48 kDa in in vitro cleavage experiments. In conclusion, HtrA targets E-cadherin signature sites that are accessible in in vitro reactions, but might be partially masked on epithelial cells through functional homophilic E-cadherin interactions.
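The sequence pattern [VITA]-[VITA]-x-x-D-[DN] quoted above is written in a PROSITE-like notation, and a minimal sketch can show how such a pattern might be turned into a regular expression for scanning a protein. The helper names and the test peptide are hypothetical, and the converter handles only the three constructs that appear in this pattern.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a regex string.

    Supported here: '-' separates positions, 'x' means any residue, and
    '[...]' lists the residues allowed at a position.
    """
    parts = []
    for element in pattern.split("-"):
        parts.append("." if element.lower() == "x" else element)
    return "".join(parts)


SIGNATURE = re.compile(prosite_to_regex("[VITA]-[VITA]-x-x-D-[DN]"))


def signature_sites(protein: str) -> list:
    """Return (1-based start, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in SIGNATURE.finditer(protein.upper())]


if __name__ == "__main__":
    # Made-up peptide, not an E-cadherin sequence.
    print(signature_sites("GGVAPEDNKK"))
```

The scan reports where the pattern occurs; it says nothing about which peptide bond is cleaved or whether the site is accessible on intact cells, both of which the record addresses experimentally.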
A Conserved GPG-Motif in the HIV-1 Nef Core Is Required for Principal Nef-Activities

PubMed Central

Martínez-Bonet, Marta; Palladino, Claudia; Briz, Veronica; Rudolph, Jochen M.; Fackler, Oliver T.; Relloso, Miguel; Muñoz-Fernandez, Maria Angeles; Madrid, Ricardo

2015-01-01

To identify new determinants required for Nef activity, we performed a functional alanine scanning analysis along a discrete but highly conserved region at the core of HIV-1 Nef. We identified the GPG-motif, located at the 121–137 region of HIV-1 NL4.3 Nef, as a novel protein signature strictly required for the p56Lck dependent Nef-induced CD4-downregulation in T-cells. Since the Nef-GPG motif was dispensable for CD4-downregulation in HeLa-CD4 cells, Nef/AP-1 interaction and Nef-dependent effects on Tf-R trafficking, the observed effects on CD4 downregulation cannot be attributed to structural constraints or to alterations in general protein trafficking. In addition, we found that the GPG-motif was also required for Nef-dependent inhibition of ring actin re-organization upon TCR triggering and MHCI downregulation, suggesting that the GPG-motif could actively cooperate with the Nef PxxP motif for these HIV-1 Nef-related effects. Finally, we observed that the Nef-GPG motif was required for optimal infectivity of those viruses produced in T-cells. According to these findings, we propose the conserved GPG-motif in HIV-1 Nef as a functional region required for HIV-1 infectivity and therefore of potential interest for interfering with Nef activity during HIV-1 infection. PMID:26700863
A naturally occurring, noncanonical GTP aptamer made of simple tandem repeats

PubMed Central

Curtis, Edward A; Liu, David R

2014-01-01

Recently, we used in vitro selection to identify a new class of naturally occurring GTP aptamer called the G motif. Here we report the discovery and characterization of a second class of naturally occurring GTP aptamer, the “CA motif.” The primary sequence of this aptamer is unusual in that it consists entirely of tandem repeats of CA-rich motifs as short as three nucleotides. Several active variants of the CA motif aptamer lack the ability to form consecutive Watson-Crick base pairs in any register, while others consist of repeats containing only cytidine and adenosine residues, indicating that noncanonical interactions play important roles in its structure. The circular dichroism spectrum of the CA motif aptamer is distinct from that of A-form RNA and other major classes of nucleic acid structures. Bioinformatic searches indicate that the CA motif is absent from most archaeal and bacterial genomes, but occurs in at least 70 percent of approximately 400 eukaryotic genomes examined. These searches also uncovered several phylogenetically conserved examples of the CA motif in rodent (mouse and rat) genomes. Together, these results reveal the existence of a second class of naturally occurring GTP aptamer whose sequence requirements, like that of the G motif, are not consistent with those of a canonical secondary structure. They also indicate a new and unexpected potential biochemical activity of certain naturally occurring tandem repeats. PMID:24824832
Blind prediction of noncanonical RNA structure at atomic accuracy.

PubMed

Watkins, Andrew M; Geniesse, Caleb; Kladwang, Wipapat; Zakrevsky, Paul; Jaeger, Luc; Das, Rhiju

2018-05-01

Prediction of RNA structure from nucleotide sequence remains an unsolved grand challenge of biochemistry and requires distinct concepts from protein structure prediction. Despite extensive algorithmic development in recent years, modeling of noncanonical base pairs of new RNA structural motifs has not been achieved in blind challenges. We report a stepwise Monte Carlo (SWM) method with a unique add-and-delete move set that enables predictions of noncanonical base pairs of complex RNA structures. A benchmark of 82 diverse motifs establishes the method's general ability to recover noncanonical pairs ab initio, including multistrand motifs that have been refractory to prior approaches. In a blind challenge, SWM models predicted nucleotide-resolution chemical mapping and compensatory mutagenesis experiments for three in vitro selected tetraloop/receptors with previously unsolved structures (C7.2, C7.10, and R1). As a final test, SWM blindly and correctly predicted all noncanonical pairs of a Zika virus double pseudoknot during a recent community-wide RNA-Puzzle. Stepwise structure formation, as encoded in the SWM method, enables modeling of noncanonical RNA structure in a variety of previously intractable problems.
Induction of lateral lumens through disruption of a monoleucine-based basolateral-sorting motif in betacellulin

PubMed Central

Singh, Bhuminder; Bogatcheva, Galina; Starchenko, Alina; Sinnaeve, Justine; Lapierre, Lynne A.; Williams, Janice A.; Goldenring, James R.; Coffey, Robert J.

2015-01-01

Directed delivery of EGF receptor (EGFR) ligands to the apical or basolateral surface is a crucial regulatory step in the initiation of EGFR signaling in polarized epithelial cells. Herein, we show that the EGFR ligand betacellulin (BTC) is preferentially sorted to the basolateral surface of polarized MDCK cells. By using sequential truncations and site-directed mutagenesis within the BTC cytoplasmic domain, combined with selective cell-surface biotinylation and immunofluorescence, we have uncovered a monoleucine-based basolateral-sorting motif (EExxxL, specifically 156EEMETL161). Disruption of this sorting motif led to equivalent apical and basolateral localization of BTC. Unlike other EGFR ligands, BTC mistrafficking induced formation of lateral lumens in polarized MDCK cells, and this process was significantly attenuated by inhibition of EGFR. Additionally, expression of a cancer-associated somatic BTC mutation (E156K) led to BTC mistrafficking and induced lateral lumens in MDCK cells. Overexpression of BTC, especially mistrafficking forms, increased the growth of MDCK cells. These results uncover a unique role for BTC mistrafficking in promoting epithelial reorganization. PMID:26272915
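As a small worked illustration of why the E156K substitution disrupts the EExxxL motif at the sequence level, the sketch below applies the mutation to the six-residue segment 156EEMETL161 quoted in the abstract and re-tests the pattern. The function names are hypothetical, and the check is purely a string match, not a prediction of trafficking behaviour.

```python
import re

# EExxxL from the abstract, with "x" written as "." (any residue).
SORTING_MOTIF = re.compile(r"EE...L")


def has_sorting_motif(segment: str) -> bool:
    """True if the segment contains an EExxxL match."""
    return SORTING_MOTIF.search(segment.upper()) is not None


def substitute(segment: str, first_position: int, mutation: str) -> str:
    """Apply a point substitution such as 'E156K'.

    first_position is the residue number of segment[0] in the full protein.
    """
    wild, pos, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = pos - first_position
    if segment[index] != wild:
        raise ValueError(
            f"expected {wild} at position {pos}, found {segment[index]}"
        )
    return segment[:index] + new + segment[index + 1:]


if __name__ == "__main__":
    wild_type = "EEMETL"  # residues 156-161, as quoted in the abstract
    mutant = substitute(wild_type, 156, "E156K")
    print(wild_type, has_sorting_motif(wild_type))
    print(mutant, has_sorting_motif(mutant))
```

Replacing the first glutamate removes the EE pair that the pattern requires, so the mutant segment no longer matches.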
Structural basis of RNA folding and recognition in an AMP-RNA aptamer complex.

PubMed

Jiang, F; Kumar, R A; Jones, R A; Patel, D J

1996-07-11

The catalytic properties of RNA and its well known role in gene expression and regulation are the consequence of its unique solution structures. Identification of the structural determinants of ligand recognition by RNA molecules is of fundamental importance for understanding the biological functions of RNA, as well as for the rational design of RNA sequences with specific catalytic activities. Towards this latter end, Szostak et al. used in vitro selection techniques to isolate RNA sequences ('aptamers') containing a high-affinity binding site for ATP, the universal currency of cellular energy, and then used this motif to engineer ribozymes with polynucleotide kinase activity. Here we present the solution structure, as determined by multidimensional NMR spectroscopy and molecular dynamics calculations, of both uniformly and specifically 13C-, 15N-labelled 40-mer RNA containing the ATP-binding motif complexed with AMP. The aptamer adopts an L-shaped structure with two nearly orthogonal stems, each capped proximally by a G x G mismatch pair, binding the AMP ligand at their junction in a GNRA-like motif.
Structural Basis of PP2A Inhibition by Small t Antigen

PubMed Central

Cho, Uhn Soo; Morrone, Seamus; Sablina, Anna A; Arroyo, Jason D; Hahn, William C; Xu, Wenqing

2007-01-01

The SV40 small t antigen (ST) is a potent oncoprotein that perturbs the function of protein phosphatase 2A (PP2A). ST directly interacts with the PP2A scaffolding A subunit and alters PP2A activity by displacing regulatory B subunits from the A subunit. We have determined the crystal structure of full-length ST in complex with PP2A A subunit at 3.1 Å resolution. ST consists of an N-terminal J domain and a C-terminal unique domain that contains two zinc-binding motifs. Both the J domain and second zinc-binding motif interact with the intra-HEAT-repeat loops of HEAT repeats 3–7 of the A subunit, which overlaps with the binding site of the PP2A B56 subunit. Intriguingly, the first zinc-binding motif is in a position that may allow it to directly interact with and inhibit the phosphatase activity of the PP2A catalytic C subunit. These observations provide a structural basis for understanding the oncogenic functions of ST. PMID:17608567
Synthesis and Reactivity of Alkyl-1,1,1-trisphosphonate Esters

PubMed Central

Smits, Jacqueline P.; Wiemer, David F.

2011-01-01

The α–trisphosphonic acid esters provide a unique spatial arrangement of three phosphonate groups, and may represent an attractive motif for inhibitors of enzymes that utilize di- or triphosphate substrates. To advance studies of this unique functionality, a general route to alkyl derivatives of the parent system (R = H) has been developed. A set of new α-alkyl-1,1,1-trisphosphonate esters has been prepared through phosphinylation and subsequent oxidation of tetraethyl alkylbisphosphonates, and the reactivity of these new compounds has been studied in representative reactions that afford additional examples of this functionality. PMID:21916407
Identification of sequence–structure RNA binding motifs for SELEX-derived aptamers

PubMed Central

Hoinka, Jan; Zotenko, Elena; Friedman, Adam; Sauna, Zuben E.; Przytycka, Teresa M.

2012-01-01

Motivation: Systematic Evolution of Ligands by EXponential Enrichment (SELEX) represents a state-of-the-art technology to isolate single-stranded (ribo)nucleic acid fragments, named aptamers, which bind to a molecule (or molecules) of interest via specific structural regions induced by their sequence-dependent fold. This powerful method has applications in designing protein inhibitors, molecular detection systems, therapeutic drugs and antibody replacement among others. However, full understanding and consequently optimal utilization of the process have lagged behind its wide application due to the lack of dedicated computational approaches. At the same time, the combination of SELEX with novel sequencing technologies is beginning to provide the data that will allow the examination of a variety of properties of the selection process. Results: To close this gap, we developed Aptamotif, a computational method for the identification of sequence–structure motifs in SELEX-derived aptamers. To increase the chances of identifying functional motifs, Aptamotif uses an ensemble-based approach. We validated the method using two published aptamer datasets containing experimentally determined motifs of increasing complexity. We were able to recreate the authors' findings to a high degree, thus proving the capability of our approach to identify binding motifs in SELEX data. Additionally, using our new experimental dataset, we illustrate the application of Aptamotif to elucidate several properties of the selection process. Contact: przytyck@ncbi.nlm.nih.gov, Zuben.Sauna@fda.hhs.gov PMID:22689764
Dipeptide frequency/bias analysis identifies conserved sites of nonrandomness shared by cysteine-rich motifs.

PubMed

Campion, S R; Ameen, A S; Lai, L; King, J M; Munzenmaier, T N

2001-08-15

This report describes the application of a simple computational tool, AAPAIR.TAB, for the systematic analysis of the cysteine-rich EGF, Sushi, and Laminin motif/sequence families at the two-amino acid level. Automated dipeptide frequency/bias analysis detects preferences in the distribution of amino acids in established protein families, by determining which "ordered dipeptides" occur most frequently in comprehensive motif-specific sequence data sets. Graphic display of the dipeptide frequency/bias data revealed family-specific preferences for certain dipeptides, but more importantly detected a shared preference for employment of the ordered dipeptides Gly-Tyr (GY) and Gly-Phe (GF) in all three protein families. The dipeptide Asn-Gly (NG) also exhibited high-frequency and bias in the EGF and Sushi motif families, whereas Asn-Thr (NT) was distinguished in the Laminin family. Evaluation of the distribution of dipeptides identified by frequency/bias analysis subsequently revealed the highly restricted localization of the G(F/Y) and N(G/T) sequence elements at two separate sites of extreme conservation in the consensus sequence of all three sequence families. The similar employment of the high-frequency/bias dipeptides in three distinct protein sequence families was further correlated with the concurrence of these shared molecular determinants at similar positions within the distinctive scaffolds of three structurally divergent, but similarly employed, motif modules.
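The idea of ordered-dipeptide frequency/bias analysis can be sketched in a few lines. This is not AAPAIR.TAB: the record does not define how bias is computed, so the snippet assumes one common choice, the ratio of the observed dipeptide count to the count expected if residues were independent, and the toy sequences are invented.

```python
from collections import Counter


def dipeptide_bias(sequences):
    """Return {dipeptide: (observed_count, observed / expected)}.

    Expected counts assume independent residues drawn from the pooled
    single-residue frequencies. This null model is an assumption made for
    illustration, not the definition used by AAPAIR.TAB.
    """
    residues = Counter()
    pairs = Counter()
    for seq in sequences:
        seq = seq.upper()
        residues.update(seq)
        # Ordered, overlapping dipeptides: positions (i, i + 1).
        pairs.update(seq[i:i + 2] for i in range(len(seq) - 1))

    total_residues = sum(residues.values())
    total_pairs = sum(pairs.values())
    result = {}
    for pair, observed in pairs.items():
        p_first = residues[pair[0]] / total_residues
        p_second = residues[pair[1]] / total_residues
        expected = p_first * p_second * total_pairs
        result[pair] = (observed, observed / expected)
    return result


if __name__ == "__main__":
    # Two invented four-residue sequences that share the dipeptide GY.
    table = dipeptide_bias(["CGYC", "AGYA"])
    for pair, (count, ratio) in sorted(table.items()):
        print(pair, count, round(ratio, 2))
```

Because the dipeptides are ordered, GY and YG are counted separately, which matches the abstract's emphasis on "ordered dipeptides".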
Two-level tunneling systems in amorphous alumina

NASA Astrophysics Data System (ADS)

Lebedeva, Irina V.; Paz, Alejandro P.; Tokatly, Ilya V.; Rubio, Angel

2014-03-01

Decades of research on thermal properties of amorphous solids at temperatures below 1 K suggest that their anomalous behaviour can be related to quantum mechanical tunneling of atoms between two nearly equivalent states that can be described as a two-level system (TLS). This theory is also supported by recent studies on microwave spectroscopy of superconducting qubits. However, the microscopic nature of the TLS remains unknown. To identify structural motifs for TLSs in amorphous alumina we have performed extensive classical molecular dynamics simulations. Several bistable motifs with only one or two atoms jumping by a considerable distance ~ 0.5 Å were found at T=25 K. Accounting for the surrounding environment relaxation was shown to be important up to distances ~ 7 Å. The energy asymmetry and barrier for the detected motifs lay in the ranges 0.5 - 2 meV and 4 - 15 meV, respectively, while their density was about 1 motif per 10 000 atoms. Tuning of motif asymmetry by strain was demonstrated with the coupling coefficient below 1 eV. The tunnel splitting for the symmetrized motifs was estimated to be on the order of 0.1 meV. The discovered motifs are in good agreement with the available experimental data. The financial support from the Marie Curie Fellowship PIIF-GA-2012-326435 (RespSpatDisp) is gratefully acknowledged.
Multiple activities of the plant pathogen type III effector proteins WtsE and AvrE require WxxxE motifs.

PubMed

Ham, Jong Hyun; Majerczak, Doris R; Nomura, Kinya; Mecey, Christy; Uribe, Francisco; He, Sheng-Yang; Mackey, David; Coplin, David L

2009-06-01

The broadly conserved AvrE-family of type III effectors from gram-negative plant-pathogenic bacteria includes important virulence factors, yet little is known about the mechanisms by which these effectors function inside plant cells to promote disease. We have identified two conserved motifs in AvrE-family effectors: a WxxxE motif and a putative C-terminal endoplasmic reticulum membrane retention/retrieval signal (ERMRS). The WxxxE and ERMRS motifs are both required for the virulence activities of WtsE and AvrE, which are major virulence factors of the corn pathogen Pantoea stewartii subsp. stewartii and the tomato or Arabidopsis pathogen Pseudomonas syringae pv. tomato, respectively. The WxxxE and the predicted ERMRS motifs are also required for other biological activities of WtsE, including elicitation of the hypersensitive response in nonhost plants and suppression of defense responses in Arabidopsis. A family of type III effectors from mammalian bacterial pathogens requires WxxxE and subcellular targeting motifs for virulence functions that involve their ability to mimic activated G-proteins. The conservation of related motifs and their necessity for the function of type III effectors from plant pathogens indicate that disturbing host pathways by mimicking activated host G-proteins may be a virulence mechanism employed by plant pathogens as well.
Evaluation of DNA Binding Drugs as Inhibitors of ESX, an ETS Domain Transcription Factor Associated With Breast Cancer: Effects of ESX/DNA Complex Disruption

DTIC Science & Technology

2000-08-01

4). Sequence recognition of all four DNA bases is achieved by positioning an N-methylimidazole opposite guanine or N-methylpyrrole opposite...unique sequences of DNA based upon selective binding motifs to all four DNA bases, although relatively little is known about the ability of these agents to
Oligomers and Polymers Based on Pentacene Building Blocks

PubMed Central

Lehnherr, Dan; Tykwinski, Rik R.

2010-01-01

Functionalized pentacene derivatives continue to provide unique materials for organic semiconductor applications. Although oligomers and polymers based on pentacene building blocks remain quite rare, recent synthetic achievements have provided a number of examples with varied structural motifs. This review highlights recent work in this area and, when possible, contrasts the properties of defined-length pentacene oligomers to those of mono- and polymeric systems.
Transcriptome Analysis in Tardigrade Species Reveals Specific Molecular Pathways for Stress Adaptations

PubMed Central

Förster, Frank; Beisser, Daniela; Grohme, Markus A.; Liang, Chunguang; Mali, Brahim; Siegl, Alexander Matthias; Engelmann, Julia C.; Shkumatov, Alexander V.; Schokraie, Elham; Müller, Tobias; Schnölzer, Martina; Schill, Ralph O.; Frohme, Marcus; Dandekar, Thomas

2012-01-01

Tardigrades have unique stress adaptations that allow them to survive extremes of cold, heat, radiation and vacuum. To study this, encoded protein clusters and pathways from an ongoing transcriptome study on the tardigrade Milnesium tardigradum were analyzed using bioinformatics tools and compared to expressed sequence tags (ESTs) from Hypsibius dujardini, revealing major pathways involved in resistance against extreme environmental conditions. ESTs are available on the Tardigrade Workbench along with software and databank updates. Our analysis reveals that RNA stability motifs for M. tardigradum are different from typical motifs known from higher animals. M. tardigradum and H. dujardini protein clusters and conserved domains imply metabolic storage pathways for glycogen, glycolipids and specific secondary metabolism as well as stress response pathways (including heat shock proteins, bmh2, and specific repair pathways). Redox-, DNA-, stress- and protein protection pathways complement specific repair capabilities to achieve the strong robustness of M. tardigradum. These pathways are partly conserved in other animals and their manipulation could boost stress adaptation even in human cells. However, it is the unique combination of resistance and repair pathways that makes tardigrades, and M. tardigradum in particular, so highly stress resistant. PMID:22563243
Differential pleiotropy and HOX functional organization.

PubMed

Sivanantharajah, Lovesha; Percival-Smith, Anthony

2015-02-01

Key studies led to the idea that transcription factors are composed of defined modular protein motifs or domains, each with a separable, unique function. During evolution, the recombination of these modular domains could give rise to transcription factors with new properties, as has been shown using recombinant molecules. This archetypic, modular view of transcription factor organization is based on the analyses of a few transcription factors such as GAL4, which may represent extreme exemplars rather than an archetype or the norm. Recent work with a set of Homeotic selector (HOX) proteins has revealed differential pleiotropy: the observation that highly conserved HOX protein motifs and domains make small, additive, tissue-specific contributions to HOX activity. Many of these differentially pleiotropic HOX motifs may represent plastic sequence elements called short linear motifs (SLiMs). The coupling of differential pleiotropy with SLiMs suggests that protein sequence changes in HOX transcription factors may have had a greater impact on morphological diversity during evolution than previously believed. Furthermore, differential pleiotropy may be the genetic consequence of an ensemble nature of HOX transcription factor allostery, where HOX proteins exist as an ensemble of states with the capacity to integrate an extensive array of developmental information. Given a new structural model for HOX functional domain organization, the properties of the archetypic TF may require reassessment. Copyright © 2014 Elsevier Inc. All rights reserved.
The Transcriptional Complex Between the BCL2 i-Motif and hnRNP LL Is a Molecular Switch for Control of Gene Expression That Can Be Modulated by Small Molecules

PubMed Central

2015-01-01

In a companion paper (DOI: 10.1021/ja410934b) we demonstrate that the C-rich strand of the cis-regulatory element in the BCL2 promoter is highly dynamic in nature and can form either an i-motif or a flexible hairpin. Under physiological conditions these two secondary DNA structures are found in an equilibrium mixture, which can be shifted by the addition of small molecules that trap out either the i-motif (IMC-48) or the flexible hairpin (IMC-76). In cellular experiments we demonstrate that the addition of these molecules has opposite effects on BCL2 gene expression and furthermore that these effects are antagonistic. In this contribution we have identified a transcription factor that recognizes and binds to the BCL2 i-motif to activate transcription. The molecular basis for the recognition of the i-motif by hnRNP LL is determined, and we demonstrate that the protein unfolds the i-motif structure to form a stable single-stranded complex. In subsequent experiments we show that IMC-48 and IMC-76 have opposite, antagonistic effects on the formation of the hnRNP LL–i-motif complex as well as on the transcription factor occupancy at the BCL2 promoter. For the first time we propose that the i-motif acts as a molecular switch that controls gene expression and that small molecules that target the dynamic equilibrium of the i-motif and the flexible hairpin can differentially modulate gene expression. PMID:24559432
The elastase-PK101 structure: Mechanism of an ultrasensitive activity-based probe revealed

DOE PAGES

Lechtenberg, Bernhard C.; Robinson, Howard R.; Kasperkiewicz, Paulina; ...

2015-01-22

Human neutrophil elastase (HNE) plays a central role in neutrophil host defense, but its broad specificity makes HNE a difficult target for both inhibitor and probe development. Recently, we identified the unnatural amino acid containing activity-based probe PK101, which exhibits astounding sensitivity and selectivity for HNE, yet completely lacks mechanistic explanation for its unique characteristics. Here, we present the crystal structure of the HNE-PK101 complex which not only reveals the basis for PK101 ultrasensitivity but also uncovers so far unrecognized HNE features. Strikingly, the Nle(O-Bzl) function in the P4 position of PK101 reveals and leverages an “exo-pocket” on HNE as a critical factor for selectivity. Furthermore, the PK101 P3 position harbors a methionine dioxide function, which mimics a post-translationally oxidized methionine residue and forms a critical hydrogen bond to the backbone amide of Gly219 of HNE. Gly219 resides in a Gly–Gly motif that is unique to HNE, yet compulsory for this interaction. Consequently, this feature enables HNE to accommodate substrates that have undergone methionine oxidation, which constitutes a hallmark post-translational modification of neutrophil signaling.
Identification, characterization, and expression analysis of calmodulin and calmodulin-like genes in grapevine (Vitis vinifera) reveal likely roles in stress responses.

PubMed

Vandelle, Elodie; Vannozzi, Alessandro; Wong, Darren; Danzi, Davide; Digby, Anne-Marie; Dal Santo, Silvia; Astegno, Alessandra

2018-06-04

Calcium (Ca2+) is a ubiquitous key second messenger in plants, where it modulates many developmental and adaptive processes in response to various stimuli. Several proteins containing Ca2+-binding domains have been identified in plants, including calmodulin (CaM) and calmodulin-like (CML) proteins, which play critical roles in translating Ca2+ signals into proper cellular responses. In this work, a genome-wide analysis conducted in Vitis vinifera identified three CaM- and 62 CML-encoding genes. We assigned gene family nomenclature, analyzed gene structure, chromosomal location and gene duplication, as well as protein motif organization. The phylogenetic clustering revealed a total of eight subgroups, including one unique clade of VviCaMs distinct from VviCMLs. VviCaMs were found to contain four EF-hand motifs whereas VviCML proteins have one to five. Most grapevine CML genes were intronless, while VviCaMs were intron-rich. All the genes were well spread among the 19 grapevine chromosomes and displayed a high level of duplication. The expression profiling of VviCaM/VviCML genes revealed a broad expression pattern across all grape organs and tissues at various developmental stages, and a significant modulation in biotic stress-related responses. Our results highlight the complexity of the CaM/CML protein family in grapevine as well, supporting the versatile role of its different members in modulating cellular responses to various stimuli, in particular to biotic stresses. This work lays the foundation for further functional and structural studies on specific grapevine CaMs/CMLs in order to better understand the role of Ca2+-binding proteins in grapevine and to explore their potential for further biotechnological applications. Copyright © 2018 Elsevier Masson SAS. All rights reserved.
Profiling of 3696 Nuclear Receptor-Coregulator Interactions: A Resource for Biological and Clinical Discovery.

PubMed

Broekema, Marjoleine F; Hollman, Danielle A A; Koppen, Arjen; van den Ham, Henk-Jan; Melchers, Diana; Pijnenburg, Dirk; Ruijtenbeek, Rob; van Mil, Saskia W C; Houtman, René; Kalkhoven, Eric

2018-06-01

Nuclear receptors (NRs) are ligand-inducible transcription factors that play critical roles in metazoan development, reproduction, and physiology and therefore are implicated in a broad range of pathologies. The transcriptional activity of NRs critically depends on their interaction(s) with transcriptional coregulator proteins, including coactivators and corepressors. Short leucine-rich peptide motifs in these proteins (LxxLL in coactivators and LxxxIxxxL in corepressors) are essential and sufficient for NR binding. With 350 different coregulator proteins identified to date and with many coregulators containing multiple interaction motifs, an enormous combinatorial potential is present for selective NR-mediated gene regulation. However, NR-coregulator interactions have often been determined experimentally on a one-to-one basis across diverse experimental conditions. In addition, NR-coregulator interactions are difficult to predict because the molecular determinants that govern specificity are not well established. Therefore, many biologically and clinically relevant NR-coregulator interactions may remain to be discovered. Here, we present a comprehensive overview of 3696 NR-coregulator interactions by systematically characterizing the binding of 24 nuclear receptors with 154 coregulator peptides. We identified unique ligand-dependent NR-coregulator interaction profiles for each NR, confirming many well-established NR-coregulator interactions. Hierarchical clustering based on the NR-coregulator interaction profiles largely recapitulates the classification of NR subfamilies based on the primary amino acid sequences of the ligand-binding domains, indicating that amino acid sequence is an important, although not the only, molecular determinant in directing and fine-tuning NR-coregulator interactions. This NR-coregulator peptide interactome provides an open data resource for future biological and clinical discovery as well as NR-based drug design.
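
The two short leucine-rich motifs named in this abstract (LxxLL and LxxxIxxxL, with x standing for any residue) map directly onto regular expressions. The Python sketch below is only an illustration: the function name and the example peptides are hypothetical, and plain pattern matching says nothing about actual binding.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
COACTIVATOR = re.compile(r"(?=(L..LL))")       # LxxLL
COREPRESSOR = re.compile(r"(?=(L...I...L))")   # LxxxIxxxL

def find_motifs(pattern, protein):
    """Return (0-based start, matched text) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(protein.upper())]

# Hypothetical peptides, not real coregulator sequences.
coactivator_hits = find_motifs(COACTIVATOR, "SKHKILHRLLQEGS")
corepressor_hits = find_motifs(COREPRESSOR, "LAAAIAAAL")

# Size of the screen quoted in the abstract: 24 receptors x 154 peptides.
n_pairs = 24 * 154
```

The product `24 * 154` reproduces the 3696 receptor-peptide pairs reported in the title.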

Self-complementary circular codes in coding theory.

PubMed

Fimmel, Elena; Michel, Christian J; Starman, Martin; Strüngmann, Lutz

2018-04-01

Self-complementary circular codes are involved in pairing genetic processes. A maximal [Formula: see text] self-complementary circular code X of trinucleotides was identified in genes of bacteria, archaea, eukaryotes, plasmids and viruses (Michel in Life 7(20):1-16 2017, J Theor Biol 380:156-177, 2015; Arquès and Michel in J Theor Biol 182:45-58 1996). In this paper, self-complementary circular codes are investigated using the graph theory approach recently formulated in Fimmel et al. (Philos Trans R Soc A 374:20150058, 2016). A directed graph [Formula: see text] associated with any code X mirrors the properties of the code. In the present paper, we demonstrate a necessary condition for the self-complementarity of an arbitrary code X in terms of the graph theory. The same condition has been proven to be sufficient for codes which are circular and of large size [Formula: see text] trinucleotides, in particular for maximal circular codes ([Formula: see text] trinucleotides). For codes of small-size [Formula: see text] trinucleotides, some very rare counterexamples have been constructed. Furthermore, the length and the structure of the longest paths in the graphs associated with the self-complementary circular codes are investigated. It has been proven that the longest paths in such graphs determine the reading frame for the self-complementary circular codes. By applying this result, the reading frame in any arbitrary sequence of trinucleotides is retrieved after at most 15 nucleotides, i.e., 5 consecutive trinucleotides, from the circular code X identified in genes. Thus, an X motif of a length of at least 15 nucleotides in an arbitrary sequence of trinucleotides (not necessarily all of them belonging to X) uniquely defines the reading (correct) frame, an important criterion for analyzing the X motifs in genes in the future.
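
Two properties discussed in this abstract can be checked mechanically for a small set of trinucleotides. The Python sketch below rests on two assumptions that should be verified against the cited papers: that the graph of a code has edges b1 -> b2b3 and b1b2 -> b3 for each trinucleotide b1b2b3 with circularity corresponding to the absence of directed cycles (our reading of Fimmel et al. 2016), and that the 20 trinucleotides listed as `X` are the code reported by Arquès and Michel. All function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(word):
    """Reverse complement of a DNA word."""
    return word.translate(COMPLEMENT)[::-1]

def is_self_complementary(code):
    """True if the reverse complement of every trinucleotide is in the code."""
    return all(reverse_complement(w) in code for w in code)

def code_graph(code):
    """Edges b1 -> b2b3 and b1b2 -> b3 for each trinucleotide b1b2b3."""
    edges = {}
    for w in code:
        edges.setdefault(w[0], set()).add(w[1:])
        edges.setdefault(w[:2], set()).add(w[2])
    return edges

def is_acyclic(edges):
    """Depth-first search; returns False as soon as a directed cycle is found."""
    WHITE, GREY, BLACK = 0, 1, 2
    state = {}

    def visit(node):
        state[node] = GREY
        for nxt in edges.get(node, ()):
            s = state.get(nxt, WHITE)
            if s == GREY:
                return False
            if s == WHITE and not visit(nxt):
                return False
        state[node] = BLACK
        return True

    return all(visit(n) for n in list(edges) if state.get(n, WHITE) == WHITE)

# Assumed listing of the code X; verify against the cited papers.
X = {"AAC", "AAT", "ACC", "ATC", "ATT", "CAG", "CTC", "CTG", "GAA", "GAC",
     "GAG", "GAT", "GCC", "GGC", "GGT", "GTA", "GTC", "GTT", "TAC", "TTC"}

x_self_complementary = is_self_complementary(X)
x_acyclic = is_acyclic(code_graph(X))
```

A one-word code such as `{"AAA"}` fails both checks: its reverse complement TTT is missing, and its graph contains the cycle A -> AA -> A.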
Identification and analysis of pig chimeric mRNAs using RNA sequencing data

PubMed Central

2012-01-01

Background: Gene fusion is ubiquitous over the course of evolution. It is expected to increase the diversity and complexity of transcriptomes and proteomes through chimeric sequence segments or altered regulation. However, chimeric mRNAs in pigs remain poorly characterized. Here we identified some chimeric mRNAs in pigs and analyzed their expression across individuals and breeds using RNA-sequencing data. Results: The present study identified 669 putative chimeric mRNAs in pigs, of which 251 chimeric candidates were detected in a set of RNA-sequencing data. Of the candidates, 618 had clear trans-splicing sites, 537 of which obeyed the canonical GU-AG splice rule. Only two putative pig chimera variants whose fusion junction overlapped with that of a known human chimeric mRNA were found. A set of unique chimeric events showed moderate variance in expression across individuals and breeds, and non-significant variance between sexes. Furthermore, the genomic region of the 5′ partner gene shares a similar DNA sequence with that of the 3′ partner gene for 458 putative chimeric mRNAs. Of those shared DNA sequences, 81 significantly matched known DNA-binding motifs in the JASPAR CORE database. Four DNA motifs shared in parental genomic regions had significant similarity with known human CTCF binding sites. Conclusions: The present study provided detailed information on some pig chimeric mRNAs. We proposed a model in which trans-acting factors, such as CTCF, induce the spatial organisation of parental genes into the same transcription factory so that the parental genes are coordinately transcribed to give rise to chimeric mRNAs. PMID:22925561
The Regulatory Factor ZFHX3 Modifies Circadian Function in SCN via an AT Motif-Driven Axis

PubMed Central

Parsons, Michael J.; Brancaccio, Marco; Sethi, Siddharth; Maywood, Elizabeth S.; Satija, Rahul; Edwards, Jessica K.; Jagannath, Aarti; Couch, Yvonne; Finelli, Mattéa J.; Smyllie, Nicola J.; Esapa, Christopher; Butler, Rachel; Barnard, Alun R.; Chesham, Johanna E.; Saito, Shoko; Joynson, Greg; Wells, Sara; Foster, Russell G.; Oliver, Peter L.; Simon, Michelle M.; Mallon, Ann-Marie; Hastings, Michael H.; Nolan, Patrick M.

2015-01-01

Summary: We identified a dominant missense mutation in the SCN transcription factor Zfhx3, termed short circuit (Zfhx3Sci), which accelerates circadian locomotor rhythms in mice. ZFHX3 regulates transcription via direct interaction with predicted AT motifs in target genes. The mutant protein has a decreased ability to activate consensus AT motifs in vitro. Using RNA sequencing, we found minimal effects on core clock genes in Zfhx3Sci/+ SCN, whereas the expression of neuropeptides critical for SCN intercellular signaling was significantly disturbed. Moreover, mutant ZFHX3 had a decreased ability to activate AT motifs in the promoters of these neuropeptide genes. Lentiviral transduction of SCN slices showed that the ZFHX3-mediated activation of AT motifs is circadian, with decreased amplitude and robustness of these oscillations in Zfhx3Sci/+ SCN slices. In conclusion, by cloning Zfhx3Sci, we have uncovered a circadian transcriptional axis that determines the period and robustness of behavioral and SCN molecular rhythms. PMID:26232227
Nucleophosmin integrates within the nucleolus via multi-modal interactions with proteins displaying R-rich linear motifs and rRNA.

PubMed

Mitrea, Diana M; Cika, Jaclyn A; Guy, Clifford S; Ban, David; Banerjee, Priya R; Stanley, Christopher B; Nourse, Amanda; Deniz, Ashok A; Kriwacki, Richard W

2016-02-02

The nucleolus is a membrane-less organelle formed through liquid-liquid phase separation of its components from the surrounding nucleoplasm. Here, we show that nucleophosmin (NPM1) integrates within the nucleolus via a multi-modal mechanism involving multivalent interactions with proteins containing arginine-rich linear motifs (R-motifs) and ribosomal RNA (rRNA). Importantly, these R-motifs are found in canonical nucleolar localization signals. Based on a novel combination of biophysical approaches, we propose a model for the molecular organization within liquid-like droplets formed by the N-terminal domain of NPM1 and R-motif peptides, thus providing insights into the structural organization of the nucleolus. We identify multivalency of acidic tracts and folded nucleic acid binding domains, mediated by N-terminal domain oligomerization, as structural features required for phase separation of NPM1 with other nucleolar components in vitro and for localization within mammalian nucleoli. We propose that one mechanism of nucleolar localization involves phase separation of proteins within the nucleolus.
Searching RNA motifs and their intermolecular contacts with constraint networks.

PubMed

Thébault, P; de Givry, S; Schiex, T; Gaspin, C

2006-09-01

Searching for RNA gene occurrences in genomic sequences is a task whose importance has been renewed by the recent discovery of numerous functional RNAs, often interacting with other ligands. Even though several programs exist for RNA motif search, none can represent and solve the problem of searching for occurrences of RNA motifs in interaction with other molecules. We present a constraint network formulation of this problem. RNAs are represented as structured motifs that can occur on more than one sequence and which are related to one another by possible hybridization. The implemented tool MilPat is used to search for several sRNA families in genomic sequences. Results show that MilPat allows efficient searching for interacting motifs in large genomic sequences and offers a simple and extensible framework to solve such problems. New and known sRNAs are identified as H/ACA candidates in Methanocaldococcus jannaschii. http://carlit.toulouse.inra.fr/MilPaT/MilPat.pl.
Promzea: a pipeline for discovery of co-regulatory motifs in maize and other plant species and its application to the anthocyanin and phlobaphene biosynthetic pathways and the Maize Development Atlas.

PubMed

Liseron-Monfils, Christophe; Lewis, Tim; Ashlock, Daniel; McNicholas, Paul D; Fauteux, François; Strömvik, Martina; Raizada, Manish N

2013-03-15

The discovery of genetic networks and cis-acting DNA motifs underlying their regulation is a major objective of transcriptome studies. The recent release of the maize genome (Zea mays L.) has facilitated in silico searches for regulatory motifs. Several algorithms exist to predict cis-acting elements, but none have been adapted for maize. A benchmark data set was used to evaluate the accuracy of three motif discovery programs: BioProspector, Weeder and MEME. Analysis showed that each motif discovery tool had limited accuracy and appeared to retrieve a distinct set of motifs. Therefore, using the benchmark, statistical filters were optimized to reduce the false discovery ratio, and then remaining motifs from all programs were combined to improve motif prediction. These principles were integrated into a user-friendly pipeline for motif discovery in maize called Promzea, available at http://www.promzea.org and on the Discovery Environment of the iPlant Collaborative website. Promzea was subsequently expanded to include rice and Arabidopsis. Within Promzea, a user enters cDNA sequences or gene IDs; corresponding upstream sequences are retrieved from the maize genome. Predicted motifs are filtered, combined and ranked. Promzea searches the chosen plant genome for genes containing each candidate motif, providing the user with the gene list and corresponding gene annotations. Promzea was validated in silico using a benchmark data set: the Promzea pipeline showed a 22% increase in nucleotide sensitivity compared to the best standalone program tool, Weeder, with equivalent nucleotide specificity. Promzea was also validated by its ability to retrieve the experimentally defined binding sites of transcription factors that regulate the maize anthocyanin and phlobaphene biosynthetic pathways. 
Promzea predicted additional promoter motifs, and genome-wide motif searches by Promzea identified 127 non-anthocyanin/phlobaphene genes that each contained all five predicted promoter motifs in their promoters, perhaps uncovering a broader co-regulated gene network. Promzea was also tested against tissue-specific microarray data from maize. An online tool customized for promoter motif discovery in plants, called Promzea, has been generated. Promzea was validated in silico by its ability to retrieve benchmark motifs and experimentally defined motifs and was tested using tissue-specific microarray data. Promzea predicted broader networks of gene regulation associated with the historic anthocyanin and phlobaphene biosynthetic pathways. Promzea is a new bioinformatics tool for understanding transcriptional gene regulation in maize and has been expanded to include rice and Arabidopsis.
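
The genome-wide step described here, selecting genes whose promoters contain every predicted motif, reduces to a simple filter. The Python sketch below uses exact substring matching, which is an assumption made for brevity; Promzea's actual motif representation and scoring are not described in the abstract, and the gene names, promoter sequences, and motifs are hypothetical.

```python
def genes_with_all_motifs(promoters, motifs):
    """Return the sorted names of genes whose promoter contains every motif.

    promoters: dict mapping gene name -> promoter sequence
    motifs:    iterable of exact motif strings (assumed uppercase)
    """
    motifs = list(motifs)
    return sorted(
        gene
        for gene, seq in promoters.items()
        if all(m in seq.upper() for m in motifs)
    )

# Hypothetical toy data, not real maize promoters or Promzea output.
promoters = {
    "geneA": "ttACGTGGCcaCCAACCaa",
    "geneB": "ACGTGGCtttt",
    "geneC": "ggCCAACCgg",
}
motifs = ["ACGTGGC", "CCAACC"]
shared = genes_with_all_motifs(promoters, motifs)
```

Only `geneA` carries both toy motifs, so it is the sole member of `shared`; a realistic pipeline would replace the substring test with a scored motif model.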
Signature motif-guided identification of receptors for peptide hormones essential for root meristem growth.

PubMed

Song, Wen; Liu, Li; Wang, Jizong; Wu, Zhen; Zhang, Heqiao; Tang, Jiao; Lin, Guangzhong; Wang, Yichuan; Wen, Xing; Li, Wenyang; Han, Zhifu; Guo, Hongwei; Chai, Jijie

2016-06-01

Peptide-mediated cell-to-cell signaling has crucial roles in coordination and definition of cellular functions in plants. Peptide-receptor matching is important for understanding the mechanisms underlying peptide-mediated signaling. Here we report the structure-guided identification of root meristem growth factor (RGF) receptors important for plant development. An assay based on a signature ligand recognition motif (Arg-x-Arg) conserved in a subfamily of leucine-rich repeat receptor kinases (LRR-RKs) identified the functionally uncharacterized LRR-RK At4g26540 as a receptor of RGF1 (RGFR1). We further solved the crystal structure of RGF1 in complex with the LRR domain of RGFR1 at a resolution of 2.6 Å, which reveals that the Arg-x-Gly-Gly (RxGG) motif is responsible for specific recognition of the sulfate group of RGF1 by RGFR1. Based on the RxGG motif, we identified four additional RGFRs. Participation of the five RGFRs in RGF-induced signaling is supported by biochemical and genetic data. We also offer evidence showing that SERKs function as co-receptors for RGFs. Taken together, our study identifies RGF receptors and co-receptors that can link RGF signals with their downstream components and provides a proof of principle for structure-based matching of LRR-RKs with their peptide ligands.
Establishing glucose- and ABA-regulated transcription networks in Arabidopsis by microarray analysis and promoter classification using a Relevance Vector Machine.

PubMed

Li, Yunhai; Lee, Kee Khoon; Walsh, Sean; Smith, Caroline; Hadingham, Sophie; Sorefan, Karim; Cawley, Gavin; Bevan, Michael W

2006-03-01

Establishing transcriptional regulatory networks by analysis of gene expression data and promoter sequences shows great promise. We developed a novel promoter classification method using a Relevance Vector Machine (RVM) and Bayesian statistical principles to identify discriminatory features in the promoter sequences of genes that can correctly classify transcriptional responses. The method was applied to microarray data obtained from Arabidopsis seedlings treated with glucose or abscisic acid (ABA). Of those genes showing >2.5-fold changes in expression level, approximately 70% were correctly predicted as being up- or down-regulated (under 10-fold cross-validation), based on the presence or absence of a small set of discriminative promoter motifs. Many of these motifs have known regulatory functions in sugar- and ABA-mediated gene expression. One promoter motif that was not known to be involved in glucose-responsive gene expression was identified as the strongest classifier of glucose-up-regulated gene expression. We show it confers glucose-responsive gene expression in conjunction with another promoter motif, thus validating the classification method. We were able to establish a detailed model of glucose and ABA transcriptional regulatory networks and their interactions, which will help us to understand the mechanisms linking metabolism with growth in Arabidopsis. This study shows that machine learning strategies coupled to Bayesian statistical methods hold significant promise for identifying functionally significant promoter sequences.
Mapping the low palmitate fap1 mutation and validation of its effects in soybean oil and agronomic traits in three soybean populations.

PubMed

Cardinal, Andrea J; Whetten, Rebecca; Wang, Sanbao; Auclair, Jérôme; Hyten, David; Cregan, Perry; Bachlava, Eleni; Gillman, Jason; Ramirez, Martha; Dewey, Ralph; Upchurch, Greg; Miranda, Lilian; Burton, Joseph W

2014-01-01

The fap1 mutation is caused by a G174A change in GmKASIIIA that disrupts donor splice site recognition and creates a GATCTG motif that enhanced its expression. Soybean oil with reduced palmitic acid content is desirable to reduce the health risks associated with consumption of this fatty acid. The objectives of this study were to identify the genomic location of the reduced palmitate fap1 mutation, determine its molecular basis, estimate the amount of phenotypic variation in fatty acid composition explained by this locus, determine if there are epistatic interactions between the fap1 and fap nc loci, and determine if the fap1 mutation has pleiotropic effects on seed yield, oil and protein content in three soybean populations. This study detected two major QTL for 16:0 content, located on chromosome 5 (GmFATB1a, fap nc) and on chromosome 9 near BARCSOYSSR_09_1707, that explained, with their interaction, 66-94% of the variation in 16:0 content in the three populations. Sequencing results of a putative candidate gene, GmKASIIIA, revealed a single unique polymorphism in the germplasm line C1726, which was predicted to disrupt the donor splice site recognition between exon one and intron one and produce a truncated KASIIIA protein. This G to A change also created the GATCTG motif that enhanced gene expression of the mutated GmKASIIIA gene. Lines homozygous for the GmKASIIIA mutation (fap1) had a significant reduction in 16:0, 18:0, and oil content, and an increase in unsaturated fatty acid content. There were significant epistatic interactions between GmKASIIIA (fap1) and fap nc for 16:0 and oil contents, and seed yield in two populations. In conclusion, the fap1 phenotype is caused by a single unique SNP in the GmKASIIIA gene.
TrawlerWeb: an online de novo motif discovery tool for next-generation sequencing datasets.

PubMed

Dang, Louis T; Tondl, Markus; Chiu, Man Ho H; Revote, Jerico; Paten, Benedict; Tano, Vincent; Tokolyi, Alex; Besse, Florence; Quaife-Ryan, Greg; Cumming, Helen; Drvodelic, Mark J; Eichenlaub, Michael P; Hallab, Jeannette C; Stolper, Julian S; Rossello, Fernando J; Bogoyevitch, Marie A; Jans, David A; Nim, Hieu T; Porrello, Enzo R; Hudson, James E; Ramialison, Mirana

2018-04-05

A strong focus of the post-genomic era is mining of the non-coding regulatory genome in order to unravel the function of regulatory elements that coordinate gene expression (Nat 489:57-74, 2012; Nat 507:462-70, 2014; Nat 507:455-61, 2014; Nat 518:317-30, 2015). Whole-genome approaches based on next-generation sequencing (NGS) have provided insight into the genomic location of regulatory elements throughout different cell types, organs and organisms. These technologies are now widespread and commonly used in laboratories from various fields of research. This highlights the need for fast and user-friendly software tools dedicated to extracting cis-regulatory information contained in these regulatory regions; for instance transcription factor binding site (TFBS) composition. Ideally, such tools should not require prior programming knowledge to ensure they are accessible for all users. We present TrawlerWeb, a web-based version of the Trawler_standalone tool (Nat Methods 4:563-5, 2007; Nat Protoc 5:323-34, 2010), to allow for the identification of enriched motifs in DNA sequences obtained from next-generation sequencing experiments in order to predict their TFBS composition. TrawlerWeb is designed for online queries with standard options common to web-based motif discovery tools. In addition, TrawlerWeb provides three unique new features: 1) TrawlerWeb allows the input of BED files directly generated from NGS experiments, 2) it automatically generates an input-matched biologically relevant background, and 3) it displays resulting conservation scores for each instance of the motif found in the input sequences, which assists the researcher in prioritising the motifs to validate experimentally. Finally, to date, this web-based version of Trawler_standalone remains the fastest online de novo motif discovery tool compared to other popular web-based software, while generating predictions with high accuracy. 
TrawlerWeb provides users with a fast, simple and easy-to-use web interface for de novo motif discovery. This will assist in rapidly analysing NGS datasets that are now being routinely generated. TrawlerWeb is freely available and accessible at: http://trawler.erc.monash.edu.au .
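As a rough illustration of the BED input mentioned above, the sketch below reads the first three BED columns (chromosome, 0-based start, exclusive end) into intervals. The names `Interval` and `parse_bed` and the sample coordinates are hypothetical; this is not TrawlerWeb's own code, only a minimal reading of the standard BED layout.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive (BED convention)


def parse_bed(text: str) -> List[Interval]:
    """Parse the first three columns of BED-formatted text."""
    intervals = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and optional track/browser headers.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if start < 0 or end < start:
            raise ValueError(f"invalid interval: {line!r}")
        intervals.append(Interval(chrom, start, end))
    return intervals


# Made-up example input: one header line and two peaks.
sample = "track name=peaks\nchr1\t100\t250\tpeak1\nchr2\t5\t20\n"
peaks = parse_bed(sample)
lengths = [iv.end - iv.start for iv in peaks]
```

Because the end coordinate is exclusive, the length of each region is simply `end - start`.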
DIVERSITY in binding, regulation, and evolution revealed from high-throughput ChIP.

PubMed

Mitra, Sneha; Biswas, Anushua; Narlikar, Leelavati

2018-04-01

Genome-wide in vivo protein-DNA interactions are routinely mapped using high-throughput chromatin immunoprecipitation (ChIP). ChIP-reported regions are typically investigated for enriched sequence-motifs, which are likely to model the DNA-binding specificity of the profiled protein and/or of co-occurring proteins. However, simple enrichment analyses can miss insights into the binding-activity of the protein. Note that ChIP reports regions making direct contact with the protein as well as those binding through intermediaries. For example, consider a ChIP experiment targeting protein X, which binds DNA at its cognate sites, but simultaneously interacts with four other proteins. Each of these proteins also binds to its own specific cognate sites along distant parts of the genome, a scenario consistent with the current view of transcriptional hubs and chromatin loops. Since ChIP will pull down all X-associated regions, the final reported data will be a union of five distinct sets of regions, each containing binding sites of one of the five proteins, respectively. Characterizing all five different motifs and the corresponding sets is important to interpret the ChIP experiment and ultimately, the role of X in regulation. We present DIVERSITY, which attempts exactly this: it partitions the data so that each partition can be characterized with its own de novo motif. DIVERSITY uses a Bayesian approach to identify the optimal number of motifs and the associated partitions, which together explain the entire dataset. This is in contrast to standard motif finders, which report motifs individually enriched in the data, but do not necessarily explain all reported regions. We show that the different motifs and associated regions identified by DIVERSITY give insights into the various complexes that may be forming along the chromatin, something that has so far not been attempted from ChIP data.
Webserver at http://diversity.ncl.res.in/; standalone (Mac OS X/Linux) from https://github.com/NarlikarLab/DIVERSITY/releases/tag/v1.0.0.
SVM2Motif—Reconstructing Overlapping DNA Sequence Motifs by Mimicking an SVM Predictor

PubMed Central

Vidovic, Marina M. -C.; Görnitz, Nico; Müller, Klaus-Robert; Rätsch, Gunnar; Kloft, Marius

2015-01-01

Identifying discriminative motifs underlying the functionality and evolution of organisms is a major challenge in computational biology. Machine learning approaches such as support vector machines (SVMs) achieve state-of-the-art performance in genomic discrimination tasks, but because of their black-box character, the motifs underlying their decision functions are largely unknown. As a remedy, positional oligomer importance matrices (POIMs) allow us to visualize the significance of position-specific subsequences. Although POIMs are a major step towards explaining trained SVM models, their size grows exponentially with the length of the motif, which makes manual inspection feasible only for comparably small motif sizes, typically k ≤ 5. In this work, we extend the work on positional oligomer importance matrices by presenting a new machine-learning methodology, entitled motifPOIM, to extract the truly relevant motifs, regardless of their length and complexity, underlying the predictions of a trained SVM model. Our framework thereby considers the motifs as free parameters in a probabilistic model, a task which can be phrased as a non-convex optimization problem. The exponential dependence of the POIM size on the oligomer length poses a major numerical challenge, which we address by an efficient optimization framework that allows us to find possibly overlapping motifs consisting of up to hundreds of nucleotides. We demonstrate the efficacy of our approach on a synthetic data set as well as a real-world human splice site data set. PMID:26690911
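To make the exponential growth of POIM size concrete, the sketch below counts matrix entries under the assumption that a POIM holds one row per possible k-mer over a four-letter DNA alphabet and one column per start position at which a full k-mer fits. The function name `poim_entries` and the sequence length of 100 are illustrative choices, not taken from the abstract.

```python
def poim_entries(k: int, seq_len: int, alphabet_size: int = 4) -> int:
    """Entries in a POIM: one row per k-mer, one column per start position."""
    if k < 1 or seq_len < k:
        raise ValueError("need 1 <= k <= seq_len")
    positions = seq_len - k + 1  # start positions where a full k-mer fits
    return alphabet_size ** k * positions


# Entry counts for a hypothetical sequence window of 100 nucleotides.
sizes = {k: poim_entries(k, seq_len=100) for k in (2, 5, 10)}
```

Under these assumptions the count rises from 1,584 entries at k = 2 to 98,304 at k = 5 and roughly 95 million at k = 10, which is why manual inspection stops being practical for longer oligomers.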
The crystal structure of the regulatory domain of the human sodium-driven chloride/bicarbonate exchanger.

PubMed

Alvadia, Carolina M; Sommer, Theis; Bjerregaard-Andersen, Kaare; Damkier, Helle Hasager; Montrasio, Michele; Aalkjaer, Christian; Morth, J Preben

2017-09-21

The sodium-driven chloride/bicarbonate exchanger (NDCBE) is essential for maintaining homeostatic pH in neurons. The crystal structure at 2.8 Å resolution of the regulatory N-terminal domain of human NDCBE represents the first crystal structure of an electroneutral sodium-bicarbonate cotransporter. The crystal structure forms a dimeric interface equivalent to that observed for the cytoplasmic domain of Band 3, and thus establishes that the consensus motif VTVLP is the key minimal dimerization motif. The VTVLP motif is highly conserved and likely to be the physiologically relevant interface for all other members of the SLC4 family. A novel conserved Zn2+-binding motif present in the N-terminal domain of NDCBE is identified and characterized in vitro. Cellular studies confirm the Zn2+-dependent transport of two electroneutral bicarbonate transporters, NCBE and NBCn1. The Zn2+ site is mapped to a cluster of histidines close to the conserved ETARWLKFEE motif and likely plays a role in the regulation of this important motif. The combined structural and bioinformatics analysis provides a model that predicts with additional confidence the physiologically relevant interface between the cytoplasmic domain and the transmembrane domain.
Correlated Mutation in the Evolution of Catalysis in Uracil DNA Glycosylase Superfamily

NASA Astrophysics Data System (ADS)

Xia, Bo; Liu, Yinling; Guevara, Jose; Li, Jing; Jilich, Celeste; Yang, Ye; Wang, Liangjiang; Dominy, Brian N.; Cao, Weiguo

2017-04-01

Enzymes in the uracil DNA glycosylase (UDG) superfamily are essential for the removal of uracil. Family 4 UDGa is a robust uracil DNA glycosylase that only acts on double-stranded and single-stranded uracil-containing DNA. Based on mutational, kinetic and modeling analyses, a catalytic mechanism involving leaving group stabilization by H155 in motif 2 and water coordination by N89 in motif 3 is proposed. Mutual Information analysis identifies a complex correlated mutation network including a strong correlation in the EG doublet in motif 1 of family 4 UDGa and in the QD doublet in motif 1 of family 1 UNG. Conversion of the EG doublet in family 4 Thermus thermophilus UDGa to the QD doublet increases the catalytic efficiency by over one hundred-fold and seventeen-fold over the E41Q and G42D single mutations, respectively, rectifying the strong correlation in the doublet. Molecular dynamics simulations suggest that the correlated mutations in the doublet in motif 1 position the catalytic H155 in motif 2 to stabilize the leaving uracilate anion. The integrated approach has important implications in studying enzyme evolution and protein structure and function.
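The Mutual Information analysis mentioned above can be illustrated with a minimal sketch that computes the mutual information, in bits, between two columns of a multiple sequence alignment. The function name, the toy columns, and the use of plain frequency estimates without any correction are assumptions made for illustration; the abstract does not specify the exact procedure the authors used.

```python
import math
from collections import Counter
from typing import Sequence


def mutual_information(col_a: Sequence[str], col_b: Sequence[str]) -> float:
    """Mutual information (bits) between two alignment columns."""
    if len(col_a) != len(col_b) or len(col_a) == 0:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written in terms of counts
        mi += (n_ab / n) * math.log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Toy columns (made up): E always pairs with G, and Q always pairs with D.
coupled = mutual_information(list("EEEEQQQQ"), list("GGGGDDDD"))
# Toy columns (made up): each residue in column one sees G and D equally often.
independent = mutual_information(list("EEEEQQQQ"), list("GDGDGDGD"))
```

In the first toy case the two positions vary together perfectly and the score is 1 bit; in the second the columns carry no information about each other and the score is 0.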
Cross-reactions vs co-sensitization evaluated by in silico motifs and in vitro IgE microarray testing.

PubMed

Pfiffner, P; Stadler, B M; Rasi, C; Scala, E; Mari, A

2012-02-01

Using an in silico allergen clustering method, we have recently shown that allergen extracts are highly cross-reactive. Here we used serological data from a multi-array IgE test based on recombinant or highly purified natural allergens to evaluate whether co-reactions are true cross-reactions or co-sensitizations by allergens with the same motifs. The serum database consisted of 3142 samples, each tested against 103 highly purified natural or recombinant allergens. Cross-reactivity was predicted by an iterative motif-finding algorithm through sequence motifs identified in 2708 known allergens. Allergen proteins containing the same motifs cross-reacted as predicted. However, proteins with identical motifs revealed a hierarchy in the degree of cross-reaction: the more frequently an allergen was positive in the allergic population, the less frequently it cross-reacted, and vice versa. Co-sensitization was analyzed by splitting the dataset into patient groups that were most likely sensitized through geographical occurrence of allergens. Interestingly, most co-reactions are cross-reactions but not co-sensitizations. The observed hierarchy of cross-reactivity may play an important role for the future management of allergic diseases.
Comparative genomics of pyridoxal 5′-phosphate-dependent transcription factor regulons in Bacteria

PubMed Central

Suvorova, Inna A.

2016-01-01

The MocR-subfamily transcription factors (MocR-TFs) characterized by the GntR-family DNA-binding domain and aminotransferase-like sensory domain are broadly distributed among certain lineages of Bacteria. Characterized MocR-TFs bind pyridoxal 5′-phosphate (PLP) and control transcription of genes involved in PLP, gamma aminobutyric acid (GABA) and taurine metabolism via binding specific DNA operator sites. To identify putative target genes and DNA binding motifs of MocR-TFs, we performed comparative genomics analysis of over 250 bacterial genomes. The reconstructed regulons for 825 MocR-TFs comprise structural genes from over 200 protein families involved in diverse biological processes. Using the genome context and metabolic subsystem analysis we tentatively assigned functional roles for 38 out of 86 orthologous groups of studied regulators. Most of these MocR-TF regulons are involved in PLP metabolism, as well as utilization of GABA, taurine and ectoine. The remaining studied MocR-TF regulators presumably control genes encoding enzymes involved in reduction/oxidation processes, various transporters and PLP-dependent enzymes, for example aminotransferases. Predicted DNA binding motifs of MocR-TFs are generally similar in each orthologous group and are characterized by two to four repeated sequences. Identified motifs were classified according to their structures. Motifs with direct and/or inverted repeat symmetry constitute the majority of inferred DNA motifs, suggesting preferable TF dimerization in head-to-tail or head-to-head configuration. The obtained genomic collection of in silico reconstructed MocR-TF motifs and regulons in Bacteria provides a basis for future experimental characterization of molecular mechanisms for various regulators in this family. PMID:28348826
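The distinction between direct and inverted repeat symmetry described above can be sketched as a simple check on two half-sites read from the same strand: a direct repeat repeats the half-site as is, while an inverted repeat pairs it with its reverse complement. The function names and the example half-sites are hypothetical and do not come from the study.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(site: str) -> str:
    """Reverse complement of a DNA string over A, C, G, T."""
    return site.upper().translate(_COMPLEMENT)[::-1]


def repeat_type(half_a: str, half_b: str) -> str:
    """Classify two equal-length half-sites taken from the same strand."""
    a, b = half_a.upper(), half_b.upper()
    if len(a) != len(b):
        raise ValueError("half-sites must have equal length")
    if a == b:
        return "direct"      # same orientation
    if b == reverse_complement(a):
        return "inverted"    # opposite orientation
    return "none"


# Made-up half-sites for illustration only.
direct_example = repeat_type("TGGAC", "TGGAC")
inverted_example = repeat_type("TGGAC", "GTCCA")
```

A half-site that equals its own reverse complement satisfies both conditions; this sketch reports it as "direct", which is an arbitrary tie-breaking choice. Real motifs are degenerate, so a practical version would allow mismatches rather than require exact identity.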
Molecular analysis of the human SLC13A4 sulfate transporter gene promoter

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jefferis, J.; Rakoczy, J.; School of Biomedical Sciences, University of Queensland, St. Lucia, Queensland

2013-03-29

Highlights: ► Basal promoter activity of SLC13A4 −57 to −192 nt upstream of transcription initiation site. ► Human SLC13A4 5′-flanking region has conserved motifs with other placental species. ► Putative NFY, SP1 and KLF7 motifs in SLC13A4 5′-flanking region enhance transcription. -- Abstract: The human solute linked carrier (SLC) 13A4 gene is primarily expressed in the placenta where it is proposed to mediate the transport of nutrient sulfate from mother to fetus. The molecular mechanisms involved in the regulation of SLC13A4 expression remain unknown. To investigate the regulation of SLC13A4 gene expression, we analysed the transcriptional activity of the human SLC13A4 5′-flanking region in the JEG-3 placental cell line using luciferase reporter assays. Basal transcriptional activity was identified in the region −57 to −192 nucleotides upstream of the SLC13A4 transcription initiation site. Mutational analysis of the minimal promoter region identified Nuclear factor Y (NFY), Specificity protein 1 (SP1) and Krüppel like factor 7 (KLF7) motifs which conferred positive transcriptional activity, as well as Zinc finger protein of the cerebellum 2 (ZIC2) and helix–loop–helix protein 1 (HEN1) motifs that repressed transcription. The conservation of the NFY, SP1, KLF7, ZIC2 and HEN1 motifs in the SLC13A4 promoter of placental species, but not of non-placental species, suggests a potential role for these putative transcriptional factor binding motifs in the physiological control of SLC13A4 mRNA expression.
Crystal structure of the tyrosine kinase domain of the hepatocyte growth factor receptor c-Met and its complex with the microbial alkaloid K-252a.

PubMed

Schiering, Nikolaus; Knapp, Stefan; Marconi, Marina; Flocco, Maria M; Cui, Jean; Perego, Rita; Rusconi, Luisa; Cristiani, Cinzia

2003-10-28

The protooncogene c-met codes for the hepatocyte growth factor receptor tyrosine kinase. Binding of its ligand, hepatocyte growth factor/scatter factor, stimulates receptor autophosphorylation, which leads to pleiotropic downstream signaling events in epithelial cells, including cell growth, motility, and invasion. These events are mediated by interaction of cytoplasmic effectors, generally through Src homology 2 (SH2) domains, with two phosphotyrosine-containing sequence motifs in the unique C-terminal tail of c-Met (supersite). There is a strong link between aberrant c-Met activity and oncogenesis, which makes this kinase an important cancer drug target. The furanosylated indolocarbazole K-252a belongs to a family of microbial alkaloids that also includes staurosporine. It was recently shown to be a potent inhibitor of c-Met. Here we report the crystal structures of an unphosphorylated c-Met kinase domain harboring a human cancer mutation and its complex with K-252a at 1.8-Å resolution. The structure follows the well established architecture of protein kinases. It adopts a unique, inhibitory conformation of the activation loop, a catalytically noncompetent orientation of helix αC, and reveals the complete C-terminal docking site. The first SH2-binding motif (1349YVHV) adopts an extended conformation, whereas the second motif (1356YVNV), a binding site for Grb2-SH2, folds as a type II β-turn. The intermediate portion of the supersite (1353NATY) assumes a type I β-turn conformation as in an Shc-phosphotyrosine binding domain peptide complex. K-252a is bound in the adenosine pocket with an analogous binding mode to those observed in previously reported structures of protein kinases in complex with staurosporine.
Human cytomegalovirus tegument protein pp150 acts as a cyclin A2-CDK-dependent sensor of the host cell cycle and differentiation state.

PubMed

Bogdanow, Boris; Weisbach, Henry; von Einem, Jens; Straschewski, Sarah; Voigt, Sebastian; Winkler, Michael; Hagemeier, Christian; Wiebusch, Lüder

2013-10-22

Upon cell entry, herpesviruses deliver a multitude of premade virion proteins to their hosts. The interplay between these incoming proteins and cell-specific regulatory factors dictates the outcome of infections at the cellular level. Here, we report a unique type of virion-host cell interaction that is essential for the cell cycle and differentiation state-dependent onset of human cytomegalovirus (HCMV) lytic gene expression. The major tegument 150-kDa phosphoprotein (pp150) of HCMV binds to cyclin A2 via a functional RXL/Cy motif resulting in its cyclin A2-dependent phosphorylation. Alanine substitution of the RXL/Cy motif prevents this interaction and allows the virus to fully escape the cyclin-dependent kinase (CDK)-mediated block of immediate early (IE) gene expression in S/G2 phase that normally restricts the onset of the HCMV replication cycle to G0/G1. Furthermore, the cyclin A2-CDK-pp150 axis is also involved in the establishment of HCMV quiescence in NTera2 cells, showing the importance of this molecular switch for differentiation state-dependent regulation of IE gene expression. Consistent with the known nucleocapsid-binding function of pp150, its RXL/Cy-dependent phosphorylation affects gene expression of the parental virion only, suggesting a cis-acting, virus particle-associated mechanism of control. The pp150 homologs of other primate and mammalian CMVs lack an RXL/Cy motif and accordingly even the nearest relative of HCMV, chimpanzee CMV, starts its lytic cycle in a cell cycle-independent manner. Thus, HCMV has evolved a molecular sensor for cyclin A2-CDK activity to restrict its IE gene expression program as a unique level of self-limitation and adaptation to its human host.
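As a small illustration of how candidate RXL-type sites might be located in a protein sequence, the sketch below scans for the bare pattern arginine, any residue, leucine. This is a deliberately minimal reading of the motif name; functional Cy motifs depend on additional sequence context that the pattern ignores. The function name and the toy sequence are hypothetical and are not the pp150 sequence.

```python
import re
from typing import List, Tuple

# A lookahead is used so that overlapping matches are all reported.
_RXL = re.compile(r"(?=(R[A-Z]L))")


def find_rxl(protein: str) -> List[Tuple[int, str]]:
    """Return (1-based position, tripeptide) for every R-x-L match."""
    return [(m.start() + 1, m.group(1)) for m in _RXL.finditer(protein.upper())]


# Toy sequence, made up for illustration.
toy = "MARSLRRLFAA"
hits = find_rxl(toy)
```

For the toy sequence the scan reports RSL at position 3 and RRL at position 6; the arginine at position 7 is followed by L and F, so it does not start a match.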
Comparative Analysis of the 15.5kD Box C/D snoRNP Core Protein in the Primitive Eukaryote Giardia lamblia Reveals Unique Structural and Functional Features

DOE Office of Scientific and Technical Information (OSTI.GOV)

Biswas, Shyamasri; Buhrman, Greg; Gagnon, Keith

2012-07-11

Box C/D ribonucleoproteins (RNP) guide the 2'-O-methylation of targeted nucleotides in archaeal and eukaryotic rRNAs. The archaeal L7Ae and eukaryotic 15.5kD box C/D RNP core protein homologues initiate RNP assembly by recognizing kink-turn (K-turn) motifs. The crystal structure of the 15.5kD core protein from the primitive eukaryote Giardia lamblia is described here to a resolution of 1.8 Å. The Giardia 15.5kD protein exhibits the typical α-β-α sandwich fold exhibited by both archaeal L7Ae and eukaryotic 15.5kD proteins. Characteristic of eukaryotic homologues, the Giardia 15.5kD protein binds the K-turn motif but not the variant K-loop motif. The highly conserved residues of loop 9, critical for RNA binding, also exhibit conformations similar to those of the human 15.5kD protein when bound to the K-turn motif. However, comparative sequence analysis indicated a distinct evolutionary position between Archaea and Eukarya. Indeed, assessment of the Giardia 15.5kD protein in denaturing experiments demonstrated an intermediate stability in protein structure when compared with that of the eukaryotic mouse 15.5kD and archaeal Methanocaldococcus jannaschii L7Ae proteins. Most notable was the ability of the Giardia 15.5kD protein to assemble in vitro a catalytically active chimeric box C/D RNP utilizing the archaeal M. jannaschii Nop56/58 and fibrillarin core proteins. In contrast, a catalytically competent chimeric RNP could not be assembled using the mouse 15.5kD protein. Collectively, these analyses suggest that the G. lamblia 15.5kD protein occupies a unique position in the evolution of this box C/D RNP core protein retaining structural and functional features characteristic of both archaeal L7Ae and higher eukaryotic 15.5kD homologues.

Consensus Prediction of Charged Single Alpha-Helices with CSAHserver.

PubMed

Dudola, Dániel; Tóth, Gábor; Nyitray, László; Gáspári, Zoltán

2017-01-01

Charged single alpha-helices (CSAHs) constitute a rare structural motif. A CSAH is characterized by a high density of regularly alternating residues with positively and negatively charged side chains. Such segments exhibit unique structural properties; however, there are only a handful of proteins where their existence is experimentally verified. Therefore, establishing a pipeline that is capable of predicting the presence of CSAH segments with a low false positive rate is of considerable importance. Here we describe a consensus-based approach that relies on two conceptually different CSAH detection methods and a final filter based on the estimated helix-forming capabilities of the segments. This pipeline was shown to be capable of identifying previously uncharacterized CSAH segments that could be verified experimentally. The method is available as a web server at http://csahserver.itk.ppke.hu and also as a downloadable standalone program suitable to scan larger sequence collections.
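The consensus idea described above (two independent detectors followed by a final filter) can be sketched generically as follows. Everything here is an assumption for illustration: segments are represented as residue ranges, the consensus is taken as the overlap between the two detectors' calls, and `helix_score` stands in for whatever helix-propensity estimate is used. None of this reflects the actual internals of CSAHserver.

```python
from typing import Callable, List, Tuple

Segment = Tuple[int, int]  # (start, end) residue indices, end exclusive


def intersect(a: List[Segment], b: List[Segment]) -> List[Segment]:
    """Regions covered by one segment from each list (lists assumed non-overlapping)."""
    out = []
    for s1, e1 in a:
        for s2, e2 in b:
            start, end = max(s1, s2), min(e1, e2)
            if start < end:
                out.append((start, end))
    return sorted(out)


def consensus_segments(
    method_a: List[Segment],
    method_b: List[Segment],
    helix_score: Callable[[Segment], float],
    min_score: float,
    min_len: int = 1,
) -> List[Segment]:
    """Keep regions called by both methods that also pass the final filter."""
    return [
        seg
        for seg in intersect(method_a, method_b)
        if seg[1] - seg[0] >= min_len and helix_score(seg) >= min_score
    ]


# Hypothetical detector outputs and a hypothetical per-segment score table.
calls_a = [(10, 60), (100, 130)]
calls_b = [(20, 70), (110, 125), (200, 230)]
scores = {(20, 60): 0.9, (110, 125): 0.3}
kept = consensus_segments(calls_a, calls_b, lambda s: scores[s], min_score=0.5, min_len=20)
```

Requiring agreement between two conceptually different detectors and then applying a filter trades sensitivity for a lower false positive rate, which is the stated goal of the pipeline.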
Ultrasoft microgels displaying emergent platelet-like behaviours

NASA Astrophysics Data System (ADS)

Brown, Ashley C.; Stabenfeldt, Sarah E.; Ahn, Byungwook; Hannan, Riley T.; Dhada, Kabir S.; Herman, Emily S.; Stefanelli, Victoria; Guzzetta, Nina; Alexeev, Alexander; Lam, Wilbur A.; Lyon, L. Andrew; Barker, Thomas H.

2014-12-01

Efforts to create platelet-like structures for the augmentation of haemostasis have focused solely on recapitulating aspects of platelet adhesion; more complex platelet behaviours such as clot contraction are assumed to be inaccessible to synthetic systems. Here, we report the creation of fully synthetic platelet-like particles (PLPs) that augment clotting in vitro under physiological flow conditions and achieve wound-triggered haemostasis and decreased bleeding times in vivo in a traumatic injury model. PLPs were synthesized by combining highly deformable microgel particles with molecular-recognition motifs identified through directed evolution. In vitro and in silico analyses demonstrate that PLPs actively collapse fibrin networks, an emergent behaviour that mimics in vivo clot contraction. Mechanistically, clot collapse is intimately linked to the unique deformability and affinity of PLPs for fibrin fibres, as evidenced by dissipative particle dynamics simulations. Our findings should inform the future design of a broader class of dynamic, biosynthetic composite materials.
The human TREM gene cluster at 6p21.1 encodes both activating and inhibitory single IgV domain receptors and includes NKp44.

PubMed

Allcock, Richard J N; Barrow, Alexander D; Forbes, Simon; Beck, Stephan; Trowsdale, John

2003-02-01

We have characterized a cluster of single immunoglobulin variable (IgV) domain receptors centromeric of the major histocompatibility complex (MHC) on human chromosome 6. In addition to triggering receptor expressed on myeloid cells (TREM)-1 and TREM2, the cluster contains NKp44, a triggering receptor whose expression is limited to NK cells. We identified three new related genes and two gene fragments within a cluster of approximately 200 kb. Two of the three new genes lack charged residues in their transmembrane domain tails. Further, one of the genes contains two potential immunotyrosine inhibitory motifs in its cytoplasmic tail, suggesting that it delivers inhibitory signals. The human and mouse TREM clusters appear to have diverged such that there are unique sequences in each species. Finally, each gene in the TREM cluster was expressed in a different range of cell types.
Evolution of Drosophila ribosomal protein gene core promoters.

PubMed

Ma, Xiaotu; Zhang, Kangyu; Li, Xiaoman

2009-03-01

The coordinated expression of ribosomal protein genes (RPGs) has been well documented in many species. Previous analyses of RPG promoters focus only on Fungi and mammals. Recognizing this gap and using a comparative genomics approach, we utilize a motif-finding algorithm that incorporates cross-species conservation to identify several significant motifs in Drosophila RPG promoters. As a result, significant differences of the enriched motifs in RPG promoter are found among Drosophila, Fungi, and mammals, demonstrating the evolutionary dynamics of the ribosomal gene regulatory network. We also report a motif present in similar numbers of RPGs among Drosophila species which does not appear to be conserved at the individual RPG gene level. A module-wise stabilizing selection theory is proposed to explain this observation. Overall, our results provide significant insight into the fast-evolving nature of transcriptional regulation in the RPG module.
Evolution of Drosophila ribosomal protein gene core promoters

PubMed Central

Ma, Xiaotu; Zhang, Kangyu; Li, Xiaoman

2011-01-01

The coordinated expression of ribosomal protein genes (RPGs) has been well documented in many species. Previous analyses of RPG promoters focus only on Fungi and mammals. Recognizing this gap and using a comparative genomics approach, we utilize a motif-finding algorithm that incorporates cross-species conservation to identify several significant motifs in Drosophila RPG promoters. As a result, significant differences of the enriched motifs in RPG promoter are found among Drosophila, Fungi, and mammals, demonstrating the evolutionary dynamics of the ribosomal gene regulatory network. We also report a motif present in similar numbers of RPGs among Drosophila species which does not appear to be conserved at the individual RPG gene level. A module-wise stabilizing selection theory is proposed to explain this observation. Overall, our results provide significant insight into the fast-evolving nature of transcriptional regulation in the RPG module. PMID:19059316
Mechanism for CARMIL Protein Inhibition of Heterodimeric Actin-capping Protein

PubMed Central

Kim, Taekyung; Ravilious, Geoffrey E.; Sept, David; Cooper, John A.

2012-01-01

Capping protein (CP) controls the polymerization of actin filaments by capping their barbed ends. In lamellipodia, CP dissociates from the actin cytoskeleton rapidly, suggesting the possible existence of an uncapping factor, for which the protein CARMIL (capping protein, Arp2/3 and myosin-I linker) is a candidate. CARMIL binds to CP via two motifs. One, the CP interaction (CPI) motif, is found in a number of unrelated proteins; the other motif is unique to CARMILs, the CARMIL-specific interaction motif. A 115-aa fragment of CARMIL with both motifs, termed the CP-binding region (CBR), binds to CP with high affinity, inhibits capping, and causes uncapping. We wanted to understand the structural basis for this function. We used a collection of mutants affecting the actin-binding surface of CP to test the possibility of a steric-blocking model, which remained open because a region of CBR was not resolved in the CBR/CP co-crystal structure. The CP actin-binding mutants bound CBR normally. In addition, a CBR mutant with all residues of the unresolved region changed showed nearly normal binding to CP. Having ruled out a steric-blocking model, we tested an allosteric model with molecular dynamics. We found that CBR binding induces changes in the conformation of the actin-binding surface of CP. In addition, ∼30-aa truncations on the actin-binding surface of CP decreased the affinity of CBR for CP. Thus, CARMIL promotes uncapping by binding to a freely accessible site on CP bound to a filament barbed end and inducing a change in the conformation of the actin-binding surface of CP. PMID:22411988
Molecular characterization of a new monopartite dsRNA mycovirus from mycorrhizal Thelephora terrestris (Ehrh.) and its detection in soil oribatid mites (Acari: Oribatida)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Petrzik, Karel; Sarkisova, Tatiana; Starý, Josef

2016-02-15

A novel dsRNA virus was identified in the mycorrhizal fungus Thelephora terrestris (Ehrh.) and sequenced. This virus, named Thelephora terrestris virus 1 (TtV1), contains two open reading frames (ORFs) in different frames but with the possibility that ORF2 could be translated as a fusion polyprotein after ribosomal -1 frameshifting. Picornavirus 2A-like motif, nudix hydrolase, phytoreovirus S7, and RdRp domains were found in a unique arrangement on the polyprotein. A new genus named Phlegivirus and containing TtV1, PgLV1, RfV1 and LeV is therefore proposed. Twenty species of oribatid mites were identified in soil material in the vicinity of T. terrestris. TtV1 was detected in large amounts in Steganacarus (Tropacarus) carinatus (C.L. Koch, 1841) and in much smaller amounts in Nothrus silvestris (Nicolet). This is the first description of mycovirus presence in oribatid mites. - Highlights: • A novel dsRNA virus was identified in the mycorrhizal fungus Thelephora terrestris. • A new virus genus Phlegivirus is proposed. • The mycovirus was detected in oribatid mites for the first time.
Identification of the divergent calmodulin binding motif in yeast Ssb1/Hsp75 protein and in other HSP70 family members.

PubMed

Heinen, R C; Diniz-Mendes, L; Silva, J T; Paschoalin, V M F

2006-11-01

Yeast soluble proteins were fractionated by calmodulin-agarose affinity chromatography and the Ca2+/calmodulin-binding proteins were analyzed by SDS-PAGE. One prominent protein of 66 kDa was excised from the gel, digested with trypsin and the masses of the resultant fragments were determined by MALDI/MS. Twenty-one of 38 monoisotopic peptide masses obtained after tryptic digestion were matched to the heat shock protein Ssb1/Hsp75, covering 37% of its sequence. Computational analysis of the primary structure of Ssb1/Hsp75 identified a unique potential amphipathic alpha-helix in its N-terminal ATPase domain with features of target regions for Ca2+/calmodulin binding. This region, which shares 89% similarity with the experimentally determined calmodulin-binding domain from mouse Hsc70, is conserved in nearly half of the 113 members of the HSP70 family investigated, from yeast to plants and animals. Based on the sequence of this region, phylogenetic analysis grouped the HSP70s in three distinct branches. Two of them comprise the non-calmodulin-binding Hsp70s BiP/GRP78, a subfamily of eukaryotic HSP70 localized in the endoplasmic reticulum, and DnaK, a subfamily of prokaryotic HSP70. A third heterogeneous group is formed by eukaryotic cytosolic HSP70s containing the new calmodulin-binding motif and other cytosolic HSP70s whose sequences do not conform to this conserved motif, indicating that not all eukaryotic cytosolic Hsp70s are targets for calmodulin regulation. Furthermore, the calmodulin-binding domain found in eukaryotic HSP70s is also the target for binding of Bag-1, an enhancer of ADP/ATP exchange activity of Hsp70s. A model in which calmodulin displaces Bag-1 and modulates Ssb1/Hsp75 chaperone activity is discussed.
Trend Motif: A Graph Mining Approach for Analysis of Dynamic Complex Networks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, R; McCallen, S; Almaas, E

2007-05-28

Complex networks have been used successfully in scientific disciplines ranging from sociology to microbiology to describe systems of interacting units. Until recently, studies of complex networks have mainly focused on their network topology. However, in many real world applications, the edges and vertices have associated attributes that are frequently represented as vertex or edge weights. Furthermore, these weights are often not static, instead changing with time and forming a time series. Hence, to fully understand the dynamics of the complex network, we have to consider both network topology and related time series data. In this work, we propose a motif mining approach to identify trend motifs for such purposes. Simply stated, a trend motif describes a recurring subgraph where each of its vertices or edges displays similar dynamics over a user-defined period. Given this, each trend motif occurrence can help reveal significant events in a complex system; frequent trend motifs may aid in uncovering dynamic rules of change for the system, and the distribution of trend motifs may characterize the global dynamics of the system. Here, we have developed efficient mining algorithms to extract trend motifs. Our experimental validation using three disparate empirical datasets, ranging from the stock market, world trade, to a protein interaction network, has demonstrated the efficiency and effectiveness of our approach.
Chaotic Motifs in Gene Regulatory Networks

PubMed Central

Zhang, Zhaoyang; Ye, Weiming; Qian, Yu; Zheng, Zhigang; Huang, Xuhui; Hu, Gang

2012-01-01

Chaos should occur often in gene regulatory networks (GRNs) which have been widely described by nonlinear coupled ordinary differential equations, if their dimensions are no less than 3. It is therefore puzzling that chaos has never been reported in GRNs in nature and is also extremely rare in models of GRNs. On the other hand, the topic of motifs has attracted great attention in studying biological networks, and network motifs are suggested to be elementary building blocks that carry out some key functions in the network. In this paper, chaotic motifs (subnetworks with chaos) in GRNs are systematically investigated. The conclusion is that: (i) chaos can only appear through competitions between different oscillatory modes with rivaling intensities. Conditions required for chaotic GRNs are found to be very strict, which make chaotic GRNs extremely rare. (ii) Chaotic motifs are explored as the simplest few-node structures capable of producing chaos, and serve as the intrinsic source of chaos of random few-node GRNs. Several optimal motifs causing chaos with atypically high probability are figured out. (iii) Moreover, we discovered that a number of special oscillators can never produce chaos. These structures bring some advantages on rhythmic functions and may help us understand the robustness of diverse biological rhythms. (iv) The methods of dominant phase-advanced driving (DPAD) and DPAD time fraction are proposed to quantitatively identify chaotic motifs and to explain the origin of chaotic behaviors in GRNs. PMID:22792171
Overexpression of TRIM44 is related to invasive potential and malignant outcomes in esophageal squamous cell carcinoma.

PubMed

Kawaguchi, Tsutomu; Komatsu, Shuhei; Ichikawa, Daisuke; Hirajima, Shoji; Nishimura, Yukihisa; Konishi, Hirotaka; Shiozaki, Atsushi; Fujiwara, Hitoshi; Okamoto, Kazuma; Tsuda, Hitoshi; Otsuji, Eigo

2017-06-01

Recent studies have shown that some members of the tripartite motif-containing protein family function as important regulators for carcinogenesis. In this study, we investigated whether tripartite motif-containing protein 44 acts as a cancer-promoting gene through its overexpression in esophageal squamous cell carcinoma. We analyzed esophageal squamous cell carcinoma cell lines to evaluate malignant potential and also analyzed 68 primary tumors to evaluate clinical relevance of tripartite motif-containing protein 44 protein in esophageal squamous cell carcinoma patients. Expression of the tripartite motif-containing protein 44 protein was detected in esophageal squamous cell carcinoma cell lines (8/14 cell lines; 57%) and primary tumor samples of esophageal squamous cell carcinoma (39/68 cases; 57%). Knockdown of tripartite motif-containing protein 44 expression in esophageal squamous cell carcinoma cells using several specific small interfering RNAs inhibited cell migration and invasion, but not cell proliferation. Immunohistochemical analysis demonstrated that the overexpression of the tripartite motif-containing protein 44 protein in the tumor infiltrated region was associated with the status of lymph node metastasis ( p = 0.049), and the overall survival rates were significantly worse among patients with tripartite motif-containing protein 44-overexpressing tumors than those with non-expressing tumors ( p = 0.029). Moreover, multivariate Cox regression model identified that overexpression of the tripartite motif-containing protein 44 protein was an independent worse prognostic factor (hazard ratio = 2.815; p = 0.041), as well as lymphatic invasion (hazard ratio = 2.735; p = 0.037). These results suggest that tripartite motif-containing protein 44 protein could play a crucial role in tumor invasion through its overexpression and highlight its usefulness as a predictor and potential therapeutic target in esophageal squamous cell carcinoma.
Functional structural motifs for protein-ligand, protein-protein, and protein-nucleic acid interactions and their connection to supersecondary structures.

PubMed

Kinjo, Akira R; Nakamura, Haruki

2013-01-01

Protein functions are mediated by interactions between proteins and other molecules. One useful approach to analyze protein functions is to compare and classify the structures of interaction interfaces of proteins. Here, we describe the procedures for compiling a database of interface structures and efficiently comparing the interface structures. To do so requires a good understanding of the data structures of the Protein Data Bank (PDB). Therefore, we also provide a detailed account of the PDB exchange dictionary necessary for extracting data that are relevant for analyzing interaction interfaces and secondary structures. We identify recurring structural motifs by classifying similar interface structures, and we define a coarse-grained representation of supersecondary structures (SSS) which represents a sequence of two or three secondary structure elements including their relative orientations as a string of four to seven letters. By examining the correspondence between structural motifs and SSS strings, we show that no SSS string has a particularly high propensity to be found in interaction interfaces in general, indicating any SSS can be used as a binding interface. When individual structural motifs are examined, there are some SSS strings that have high propensity for particular groups of structural motifs. In addition, it is shown that while the SSS strings found in particular structural motifs for nonpolymer and protein interfaces are as abundant as in other structural motifs that belong to the same subunit, structural motifs for nucleic acid interfaces exhibit somewhat stronger preference for SSS strings. In regard to protein folds, many motif-specific SSS strings were found across many folds, suggesting that SSS may be a useful description to investigate the universality of ligand binding modes.
The Verrucomicrobia LexA-Binding Motif: Insights into the Evolutionary Dynamics of the SOS Response.

PubMed

Erill, Ivan; Campoy, Susana; Kılıç, Sefa; Barbé, Jordi

2016-01-01

The SOS response is the primary bacterial mechanism to address DNA damage, coordinating multiple cellular processes that include DNA repair, cell division, and translesion synthesis. In contrast to other regulatory systems, the composition of the SOS genetic network and the binding motif of its transcriptional repressor, LexA, have been shown to vary greatly across bacterial clades, making it an ideal system to study the co-evolution of transcription factors and their regulons. Leveraging comparative genomics approaches and prior knowledge on the core SOS regulon, here we define the binding motif of the Verrucomicrobia, a recently described phylum of emerging interest due to its association with eukaryotic hosts. Site directed mutagenesis of the Verrucomicrobium spinosum recA promoter confirms that LexA binds a 14 bp palindromic motif with consensus sequence TGTTC-N4-GAACA. Computational analyses suggest that recognition of this novel motif is determined primarily by changes in base-contacting residues of the third alpha helix of the LexA helix-turn-helix DNA binding motif. In conjunction with comparative genomics analysis of the LexA regulon in the Verrucomicrobia phylum, electrophoretic shift assays reveal that LexA binds to operators in the promoter region of DNA repair genes and a mutagenesis cassette in this organism, and identify previously unreported components of the SOS response. The identification of tandem LexA-binding sites generating instances of other LexA-binding motifs in the lexA gene promoter of Verrucomicrobia species leads us to postulate a novel mechanism for LexA-binding motif evolution. This model, based on gene duplication, successfully addresses outstanding questions in the intricate co-evolution of the LexA protein, its binding motif and the regulatory network it controls.
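The consensus reported above, TGTTC-N4-GAACA, can be scanned for with a simple regular expression. The sketch below is illustrative only: the promoter string is made up (it is not the V. spinosum recA promoter), the helper names are assumptions, and exact consensus matching ignores the degeneracy that real binding sites show.

```python
import re

# Lookahead so that overlapping candidate sites would also be reported.
LEXA_SITE = re.compile(r"(?=(TGTTC[ACGT]{4}GAACA))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq):
    """Return (0-based start, matched site) for each exact consensus match."""
    return [(m.start(), m.group(1)) for m in LEXA_SITE.finditer(seq.upper())]


promoter = "ccatTGTTCatatGAACAggt"  # hypothetical sequence for illustration
sites = find_sites(promoter)
print(sites)
```

The two 5 bp arms are reverse complements of each other (`reverse_complement("TGTTC")` gives `"GAACA"`), which is what makes the 14 bp site palindromic regardless of the spacer.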
ACGT-containing abscisic acid response element (ABRE) and coupling element 3 (CE3) are functionally equivalent.

PubMed

Hobo, T; Asada, M; Kowyama, Y; Hattori, T

1999-09-01

ACGT-containing ABA response elements (ABREs) have been functionally identified in the promoters of various genes. In addition, single copies of ABRE have been found to require a cis-acting, coupling element to achieve ABA induction. A coupling element 3 (CE3) sequence, originally identified as such in the barley HVA1 promoter, is found approximately 30 bp downstream of motif A (ACGT-containing ABRE) in the promoter of the Osem gene. The relationship between these two elements was further defined by linker-scan analyses of a 55 bp fragment of the Osem promoter, which is sufficient for ABA-responsiveness and VP1 activation. The analyses revealed that both motif A and the CE3 sequence were required not only for ABA-responsiveness but also for VP1 activation. Since the sequences of motif A and CE3 were found to be similar, motif-exchange experiments were carried out. The experiments demonstrated that motif A and CE3 were interchangeable with each other with respect to both ABA and VP1 regulation. In addition, both sequences were shown to be recognized by a VP1-interacting, ABA-responsive bZIP factor TRAB1. These results indicate that ACGT-containing ABREs and CE3 are functionally equivalent cis-acting elements. Furthermore, TRAB1 was shown to bind two other non-ACGT ABREs. Based on these results, all these ABREs including CE3 are proposed to be categorized into a single class of cis-acting elements.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Fukumoto, Yasunori, E-mail: fukumoto@faculty.chiba-u.jp; Ikeuchi, Masayoshi; Nakayama, Yuji

ATR-dependent DNA damage checkpoint is the major DNA damage checkpoint against UV irradiation and DNA replication stress. The Rad17–RFC and Rad9–Rad1–Hus1 (9–1–1) complexes interact with each other to contribute to ATR signaling; however, the precise regulatory mechanism of the interaction has not been established. Here, we identified a conserved sequence motif, KYxxL, in the AAA+ domain of Rad17 protein, and demonstrated that this motif is essential for the interaction with the 9–1–1 complex. We also show that UV-induced Rad17 phosphorylation is increased in the Rad17 KYxxL mutants. These data indicate that the interaction with the 9–1–1 complex is not required for Rad17 protein to be an efficient substrate for the UV-induced phosphorylation. Our data also raise the possibility that the 9–1–1 complex plays a negative regulatory role in the Rad17 phosphorylation. We also show that the nucleotide-binding activity of Rad17 is required for its nuclear localization. - Highlights: • We have identified a conserved KYxxL motif in Rad17 protein. • The KYxxL motif is crucial for the interaction with the 9–1–1 complex. • The KYxxL motif is dispensable or inhibitory for UV-induced Rad17 phosphorylation. • Nucleotide binding of Rad17 is required for its nuclear localization.
A tryptophan-rich motif in the human parainfluenza virus type 2 V protein is critical for the blockade of toll-like receptor 7 (TLR7)- and TLR9-dependent signaling.

PubMed

Kitagawa, Yoshinori; Yamaguchi, Mayu; Zhou, Min; Komatsu, Takayuki; Nishio, Machiko; Sugiyama, Tsuyoshi; Takeuchi, Kenji; Itoh, Masae; Gotoh, Bin

2011-05-01

Plasmacytoid dendritic cells (pDCs) do not produce alpha interferon (IFN-α) unless viruses cause a systemic infection or overcome the first-line defense provided by conventional DCs and macrophages. We show here that even paramyxoviruses, whose infections are restricted to the respiratory tract, have a V protein able to prevent Toll-like receptor 7 (TLR7)- and TLR9-dependent IFN-α induction specific to pDCs. Mutational analysis of human parainfluenza virus type 2 demonstrates that the second Trp residue of the Trp-rich motif (Trp-X(3)-Trp-X(9)-Trp) in the C-terminal domain unique to V, a determinant for IRF7 binding, is critical for the blockade of TLR7/9-dependent signaling.
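The spacing pattern Trp-X(3)-Trp-X(9)-Trp translates directly into a regular expression over one-letter amino acid codes. The sketch below uses a made-up peptide (not the actual V protein sequence) and hypothetical helper names; it shows how the three Trp positions follow from the match start, and how substituting the second Trp removes the match at the sequence level. It says nothing about function, which the abstract established experimentally.

```python
import re

# W, any 3 residues, W, any 9 residues, W: a 15-residue window.
TRP_MOTIF = re.compile(r"W.{3}W.{9}W")


def trp_positions(protein):
    """Return 0-based indices of the three Trp residues of the first match, or None."""
    m = TRP_MOTIF.search(protein)
    if m is None:
        return None
    start = m.start()
    return (start, start + 4, start + 14)


wild_type = "AAWGHEWRPCGHTCSEWKK"  # hypothetical peptide for illustration
positions = trp_positions(wild_type)

# Replace the second Trp with Ala, mimicking a point mutant.
second = positions[1]
mutant = wild_type[:second] + "A" + wild_type[second + 1:]

print(positions, trp_positions(mutant))
```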
Subtle Changes in Motif Positioning Cause Tissue-Specific Effects on Robustness of an Enhancer's Activity

PubMed Central

Erceg, Jelena; Saunders, Timothy E.; Girardot, Charles; Devos, Damien P.; Hufnagel, Lars; Furlong, Eileen E. M.

2014-01-01

Deciphering the specific contribution of individual motifs within cis-regulatory modules (CRMs) is crucial to understanding how gene expression is regulated and how this process is affected by sequence variation. But despite vast improvements in the ability to identify where transcription factors (TFs) bind throughout the genome, we are limited in our ability to relate information on motif occupancy to function from sequence alone. Here, we engineered 63 synthetic CRMs to systematically assess the relationship between variation in the content and spacing of motifs within CRMs to CRM activity during development using Drosophila transgenic embryos. In over half the cases, very simple elements containing only one or two types of TF binding motifs were capable of driving specific spatio-temporal patterns during development. Different motif organizations provide different degrees of robustness to enhancer activity, ranging from binary on-off responses to more subtle effects including embryo-to-embryo and within-embryo variation. By quantifying the effects of subtle changes in motif organization, we were able to model biophysical rules that explain CRM behavior and may contribute to the spatial positioning of CRM activity in vivo. For the same enhancer, the effects of small differences in motif positions varied in developmentally related tissues, suggesting that gene expression may be more susceptible to sequence variation in one tissue compared to another. This result has important implications for human eQTL studies in which many associated mutations are found in cis-regulatory regions, though the mechanism for how they affect tissue-specific gene expression is often not understood. PMID:24391522
Helix–hairpin–helix motifs confer salt resistance and processivity on chimeric DNA polymerases

PubMed Central

Pavlov, Andrey R.; Belova, Galina I.; Kozyavkin, Sergei A.; Slesarev, Alexei I.

2002-01-01

Helix–hairpin–helix (HhH) is a widespread motif involved in sequence-nonspecific DNA binding. The majority of HhH motifs function as DNA-binding modules with typical occurrence of one HhH motif or one or two (HhH)2 domains in proteins. We recently identified 24 HhH motifs in DNA topoisomerase V (Topo V). Although these motifs are dispensable for the topoisomerase activity of Topo V, their removal narrows the salt concentration range for topoisomerase activity tenfold. Here, we demonstrate the utility of Topo V's HhH motifs for modulating DNA-binding properties of the Stoffel fragment of Taq DNA polymerase and Pfu DNA polymerase. Different HhH cassettes fused with either the NH2 terminus or the COOH terminus of DNA polymerases broaden the salt concentration range of the polymerase activity significantly (up to 0.5 M NaCl or 1.8 M potassium glutamate). We found that anions play a major role in the inhibition of DNA polymerase activity. The resistance of initial extension rates and the processivity of chimeric polymerases to salts depend on the structure of added HhH motifs. Regardless of the type of the construct, the thermal stability of chimeric Taq polymerases increases under the optimal ionic conditions, as compared with that of Taq DNA polymerase or its Stoffel fragment. Our approach to raise the salt tolerance, processivity, and thermostability of Taq and Pfu DNA polymerases may be applied to all pol1- and polB-type polymerases, as well as to other DNA processing enzymes. PMID:12368475
A ΩXaV motif in the Rift Valley fever virus NSs protein is essential for degrading p62, forming nuclear filaments and virulence

PubMed Central

Cyr, Normand; de la Fuente, Cynthia; Lecoq, Lauriane; Guendel, Irene; Chabot, Philippe R.; Kehn-Hall, Kylene; Omichinski, James G.

2015-01-01

Rift Valley fever virus (RVFV) is a single-stranded RNA virus capable of inducing fatal hemorrhagic fever in humans. A key component of RVFV virulence is its ability to form nuclear filaments through interactions between the viral nonstructural protein NSs and the host general transcription factor TFIIH. Here, we identify an interaction between a ΩXaV motif in NSs and the p62 subunit of TFIIH. This motif in NSs is similar to ΩXaV motifs found in nucleotide excision repair (NER) factors and transcription factors known to interact with p62. Structural and biophysical studies demonstrate that NSs binds to p62 in a similar manner as these other factors. Functional studies in RVFV-infected cells show that the ΩXaV motif is required for both nuclear filament formation and degradation of p62. Consistent with the fact that the RVFV can be distinguished from other Bunyaviridae-family viruses due to its ability to form nuclear filaments in infected cells, the motif is absent in the NSs proteins of other Bunyaviridae-family viruses. Taken together, our studies demonstrate that p62 binding to NSs through the ΩXaV motif is essential for degrading p62, forming nuclear filaments and enhancing RVFV virulence. In addition, these results show how the RVFV incorporates a simple motif into the NSs protein that enables it to functionally mimic host cell proteins that bind the p62 subunit of TFIIH. PMID:25918396
A ΩXaV motif in the Rift Valley fever virus NSs protein is essential for degrading p62, forming nuclear filaments and virulence.

PubMed

Cyr, Normand; de la Fuente, Cynthia; Lecoq, Lauriane; Guendel, Irene; Chabot, Philippe R; Kehn-Hall, Kylene; Omichinski, James G

2015-05-12

Rift Valley fever virus (RVFV) is a single-stranded RNA virus capable of inducing fatal hemorrhagic fever in humans. A key component of RVFV virulence is its ability to form nuclear filaments through interactions between the viral nonstructural protein NSs and the host general transcription factor TFIIH. Here, we identify an interaction between a ΩXaV motif in NSs and the p62 subunit of TFIIH. This motif in NSs is similar to ΩXaV motifs found in nucleotide excision repair (NER) factors and transcription factors known to interact with p62. Structural and biophysical studies demonstrate that NSs binds to p62 in a similar manner as these other factors. Functional studies in RVFV-infected cells show that the ΩXaV motif is required for both nuclear filament formation and degradation of p62. Consistent with the fact that the RVFV can be distinguished from other Bunyaviridae-family viruses due to its ability to form nuclear filaments in infected cells, the motif is absent in the NSs proteins of other Bunyaviridae-family viruses. Taken together, our studies demonstrate that p62 binding to NSs through the ΩXaV motif is essential for degrading p62, forming nuclear filaments and enhancing RVFV virulence. In addition, these results show how the RVFV incorporates a simple motif into the NSs protein that enables it to functionally mimic host cell proteins that bind the p62 subunit of TFIIH.

Identification of 15 candidate structured noncoding RNA motifs in fungi by comparative genomics.

PubMed

Li, Sanshu; Breaker, Ronald R

2017-10-13

With the development of rapid and inexpensive DNA sequencing, the genome sequences of more than 100 fungal species have been made available. This dataset provides an excellent resource for comparative genomics analyses, which can be used to discover genetic elements, including noncoding RNAs (ncRNAs). Bioinformatics tools similar to those used to uncover novel ncRNAs in bacteria, likewise, should be useful for searching fungal genomic sequences, and the relative ease of genetic experiments with some model fungal species could facilitate experimental validation studies. We have adapted a bioinformatics pipeline for discovering bacterial ncRNAs to systematically analyze many fungal genomes. This comparative genomics pipeline integrates information on conserved RNA sequence and structural features with alternative splicing information to reveal fungal RNA motifs that are candidate regulatory domains, or that might have other possible functions. A total of 15 prominent classes of structured ncRNA candidates were identified, including variant HDV self-cleaving ribozyme representatives, atypical snoRNA candidates, and possible structured antisense RNA motifs. Candidate regulatory motifs were also found associated with genes for ribosomal proteins, S-adenosylmethionine decarboxylase (SDC), amidase, and HexA protein involved in Woronin body formation. We experimentally confirm that the variant HDV ribozymes undergo rapid self-cleavage, and we demonstrate that the SDC RNA motif reduces the expression of SAM decarboxylase by translational repression. Furthermore, we provide evidence that several other motifs discovered in this study are likely to be functional ncRNA elements. Systematic screening of fungal genomes using a computational discovery pipeline has revealed the existence of a variety of novel structured ncRNAs. Genome contexts and similarities to known ncRNA motifs provide strong evidence for the biological and biochemical functions of some newly found ncRNA motifs. Although initial examinations of several motifs provide evidence for their likely functions, other motifs will require more in-depth analysis to reveal their functions.
Motif types, motif locations and base composition patterns around the RNA polyadenylation site in microorganisms, plants and animals

PubMed Central

2014-01-01

Background The polyadenylation of RNA is critical for gene functioning, but the conserved sequence motifs (often called signal or signature motifs), motif locations and abundances, and base composition patterns around mRNA polyadenylation [poly(A)] sites are still uncharacterized in most species. The evolutionary tendency for poly(A) site selection is still largely unknown. Results We analyzed the poly(A) site regions of 31 species or phyla. Different groups of species showed different poly(A) signal motifs: UUACUU at the poly(A) site in the parasite Trypanosoma cruzi; UGUAAC (approximately 13 bases upstream of the site) in the alga Chlamydomonas reinhardtii; UGUUUG (or UGUUUGUU) at mainly the fourth base downstream of the poly(A) site in the parasite Blastocystis hominis; and AAUAAA at approximately 16 bases and approximately 19 bases upstream of the poly(A) site in animals and plants, respectively. Polyadenylation signal motifs are usually several hundred times more abundant around poly(A) sites than in whole genomes. These predominant motifs usually had very specific locations, whether upstream of, at, or downstream of poly(A) sites, depending on the species or phylum. The poly(A) site was usually an adenosine (A) in all analyzed species except for B. hominis, and there was weak A predominance in C. reinhardtii. Fungi, animals, plants, and the protist Phytophthora infestans shared a general base abundance pattern (or base composition pattern) of “U-rich—A-rich—U-rich—Poly(A) site—U-rich regions”, or U-A-U-A-U for short, with some variation for each kingdom or subkingdom. Conclusion This study identified the poly(A) signal motifs, motif locations, and base composition patterns around mRNA poly(A) sites in protists, fungi, plants, and animals and provided insight into poly(A) site evolution. PMID:25052519
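The positional claim above (AAUAAA roughly 16 bases upstream of the poly(A) site in animals) suggests a simple measurement: for a known site, locate each copy of the signal in an upstream window and report its offset. The sketch below is not the study's pipeline. The transcript is made up, the 40-base window is arbitrary, and measuring the offset from the first base of the signal to the site is an assumption, since the abstract does not state the convention used.

```python
def signal_offsets(rna, site, signal="AAUAAA", window=40):
    """Return distances (in bases) from each upstream signal start to the poly(A) site.

    `site` is the 0-based index of the cleavage/poly(A) position in `rna`.
    Only signals lying entirely upstream of the site, within `window` bases, count.
    """
    start = max(0, site - window)
    region = rna[start:site]
    offsets = []
    pos = region.find(signal)
    while pos != -1:
        offsets.append(site - (start + pos))
        pos = region.find(signal, pos + 1)
    return offsets


# Hypothetical transcript end: signal at index 4, poly(A) site (an A) at index 20.
transcript = "GCGC" + "AAUAAA" + "CUGCUGCUGC" + "A" + "GUGU"
offsets = signal_offsets(transcript, site=20)
print(offsets)
```

Collecting these offsets over many annotated sites and plotting their histogram would show where the signal peaks for a given species.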
Systems analysis of cis-regulatory motifs in C4 photosynthesis genes using maize and rice leaf transcriptomic data during a process of de-etiolation

PubMed Central

Xu, Jiajia; Bräutigam, Andrea; Weber, Andreas P. M.; Zhu, Xin-Guang

2016-01-01

Identification of potential cis-regulatory motifs controlling the development of C4 photosynthesis is a major focus of current research. In this study, we used time-series RNA-seq data collected from etiolated maize and rice leaf tissues sampled during a de-etiolation process to systematically characterize the expression patterns of C4-related genes and to further identify potential cis elements in five different genomic regions (i.e. promoter, 5′UTR, 3′UTR, intron, and coding sequence) of C4 orthologous genes. The results demonstrate that although most of the C4 genes show similar expression patterns, a number of them, including chloroplast dicarboxylate transporter 1, aspartate aminotransferase, and triose phosphate transporter, show shifted expression patterns compared with their C3 counterparts. A number of conserved short DNA motifs between maize C4 genes and their rice orthologous genes were identified not only in the promoter, 5′UTR, 3′UTR, and coding sequences, but also in the introns of core C4 genes. We also identified cis-regulatory motifs that exist in maize C4 genes and also in genes showing similar expression patterns as maize C4 genes but that do not exist in rice C3 orthologs, suggesting a possible recruitment of pre-existing cis-elements from genes unrelated to C4 photosynthesis into C4 photosynthesis genes during C4 evolution. PMID:27436282
Nucleophosmin integrates within the nucleolus via multi-modal interactions with proteins displaying R-rich linear motifs and rRNA

PubMed Central

Mitrea, Diana M; Cika, Jaclyn A; Guy, Clifford S; Ban, David; Banerjee, Priya R; Stanley, Christopher B; Nourse, Amanda; Deniz, Ashok A; Kriwacki, Richard W

2016-01-01

The nucleolus is a membrane-less organelle formed through liquid-liquid phase separation of its components from the surrounding nucleoplasm. Here, we show that nucleophosmin (NPM1) integrates within the nucleolus via a multi-modal mechanism involving multivalent interactions with proteins containing arginine-rich linear motifs (R-motifs) and ribosomal RNA (rRNA). Importantly, these R-motifs are found in canonical nucleolar localization signals. Based on a novel combination of biophysical approaches, we propose a model for the molecular organization within liquid-like droplets formed by the N-terminal domain of NPM1 and R-motif peptides, thus providing insights into the structural organization of the nucleolus. We identify multivalency of acidic tracts and folded nucleic acid binding domains, mediated by N-terminal domain oligomerization, as structural features required for phase separation of NPM1 with other nucleolar components in vitro and for localization within mammalian nucleoli. We propose that one mechanism of nucleolar localization involves phase separation of proteins within the nucleolus. DOI: http://dx.doi.org/10.7554/eLife.13571.001 PMID:26836305
Composition-dependent stability of the medium-range order responsible for metallic glass formation

DOE PAGES

Zhang, Feng; Ji, Min; Fang, Xiao-Wei; ...

2014-09-18

The competition between the characteristic medium-range order corresponding to amorphous alloys and that in ordered crystalline phases is central to phase selection and morphology evolution under various processing conditions. We examine the stability of a model glass system, Cu–Zr, by comparing the energetics of various medium-range structural motifs over a wide range of compositions using first-principles calculations. Furthermore, we focus specifically on motifs that represent possible building blocks for competing glassy and crystalline phases, and we employ a genetic algorithm to efficiently identify the energetically favored decorations of each motif for specific compositions. These results show that a Bergman-type motif with crystallization-resisting icosahedral symmetry is energetically most favorable in the composition range 0.63 < xCu < 0.68, and is the underlying motif for one of the three optimal glass-forming ranges observed experimentally for this binary system (Li et al., 2008). This work establishes an energy-based methodology to evaluate specific medium-range structural motifs which compete with stable crystalline nuclei in deeply undercooled liquids.
Relationships Among Obesity, Type 2 Diabetes, and Plasma Cytokines in African American Women.

PubMed

Denis, Gerald V; Sebastiani, Paola; Andrieu, Guillaume; Tran, Anna H; Strissel, Katherine J; Lombardi, Frank L; Palmer, Julie R

2017-11-01

The principal objective of this investigation was to identify novel cytokine associations with BMI and type 2 diabetes (T2D). Cytokines were profiled from African American women with obesity who donated plasma to the Komen Tissue Bank. Multiplex bead arrays of analytes were used to quantify 88 cytokines and chemokines in association with clinical diagnoses of metabolic health. Regression models were generated after elimination of outliers. Among women with obesity, T2D was associated with breast adipocyte hypertrophy and with six plasma analytes, including four chemokines (chemokine [C-C motif] ligand 2, chemokine [C-C motif] ligand 16, chemokine [C-X-C motif] ligand 1, and chemokine [C-X-C motif] ligand 16) and two growth factors (interleukin 2 and epidermal growth factor). In addition, three analytes were associated with obesity independently of diabetes: interleukin 4, soluble CD40 ligand, and chemokine (C-C motif) ligand 3. Profiling of inflammatory cytokines combined with measures of BMI may produce a more personalized risk assessment for obesity-associated disease in African American women. © 2017 The Obesity Society.
Conservation of the glycoprotein B homologs of the Kaposi’s sarcoma-associated herpesvirus (KSHV/HHV8) and Old World primate rhadinoviruses of chimpanzees and macaques

PubMed Central

Bruce, A. Gregory; Horst, Jeremy A.; Rose, Timothy M.

2016-01-01

The envelope-associated glycoprotein B (gB) is highly conserved within the Herpesviridae and plays a critical role in viral entry. We analyzed the evolutionary conservation of sequence and structural motifs within the Kaposi’s sarcoma-associated herpesvirus (KSHV) gB and homologs of Old World primate rhadinoviruses belonging to the distinct RV1 and RV2 rhadinovirus lineages. In addition to gB homologs of rhadinoviruses infecting the pig-tailed and rhesus macaques, we cloned and sequenced gB homologs of RV1 and RV2 rhadinoviruses infecting chimpanzees. A structural model of the KSHV gB was determined, and functional motifs and sequence variants were mapped to the model structure. Conserved domains and motifs were identified, including an “RGD” motif that plays a critical role in KSHV binding and entry through the cellular integrin αVβ3. The RGD motif was only detected in RV1 rhadinoviruses suggesting an important difference in cell tropism between the two rhadinovirus lineages. PMID:27070755
[Personal motif in art].

PubMed

Gerevich, József

2015-01-01

One of the basic questions of the psychology of art is whether a personal motif is to be found behind works of art and, if so, how openly or indirectly it appears in the work itself. Analysis of examples and documents from the fine arts and literature allows us to conclude that the personal motif, which the viewer can identify through symbols, at times easily and at others with more difficulty, gives an emotional plus to the artistic product. The personal motif may be found in traumatic experiences, in communication with the model or with other emotionally important persons (mourning, disappointment, revenge, hatred, rivalry, revolt etc.), in self-searching, or in self-analysis. The emotions are expressed in artistic activity either directly or indirectly. The intention nourished by the artist's identity (Kunstwollen) may stand in the way of spontaneous self-expression, channelling it into hidden paths. Under the influence of certain circumstances, the artist may arouse in the viewer, consciously or unconsciously, an illusionary, misleading image of himself. An examination of the personal motif is one of the important research areas of art therapy.
A nonself sugar mimic of the HIV glycan shield shows enhanced antigenicity

DOE Office of Scientific and Technical Information (OSTI.GOV)

Doores, Katie J.; Fulton, Zara; Hong, Vu

2011-08-24

Antibody 2G12 uniquely neutralizes a broad range of HIV-1 isolates by binding the high-mannose glycans on the HIV-1 surface glycoprotein, gp120. Antigens that resemble these natural epitopes of 2G12 would be highly desirable components for an HIV-1 vaccine. However, host-produced (self)-carbohydrate motifs have been unsuccessful so far at eliciting 2G12-like antibodies that cross-react with gp120. Based on the surprising observation that 2G12 binds nonproteinaceous monosaccharide D-fructose with higher affinity than D-mannose, we show here that a designed set of nonself, synthetic monosaccharides are potent antigens. When introduced to the terminus of the D1 arm of protein glycans recognized by 2G12, their antigenicity is significantly enhanced. Logical variation of these unnatural sugars pinpointed key modifications, and the molecular basis of this increased antigenicity was elucidated using high-resolution crystallographic analyses. Virus-like particle protein conjugates containing such nonself glycans are bound more tightly by 2G12. As immunogens they elicit higher titers of antibodies than those immunogenic conjugates containing the self D1 glycan motif. These antibodies generated from nonself immunogens also cross-react with this self motif, which is found in the glycan shield, when it is presented in a range of different conjugates and glycans. However, these antibodies did not bind this glycan motif when present on gp120.
The Thiamin Pyrophosphate-Motif

NASA Technical Reports Server (NTRS)

Dominiak, P.; Ciszak, E.

2003-01-01

Using databases, the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits and two catalytic centers. Each catalytic center (PP:PYR) is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core (PP:PYR)2 within these enzymes. Analysis of the structural elements of this catalytic core reveals a novel definition of the common amino acid sequences, which are GXΦX(4)(G)ΦXXGQ and GDGX(25-30)NN in the PP-domain, and EX(4)(G)ΦXXGΦ in the PYR-domain, where Φ corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.
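Of the sequence patterns quoted above, the PP-domain pattern GDG followed by 25 to 30 residues and then NN is the least ambiguous to express as a regular expression, so the sketch below handles only that one. The test peptide is invented, the function name is hypothetical, and the treatment of the other two patterns (in particular the parenthesized G) is deliberately left out because the abstract does not define it.

```python
import re

# G-D-G, then a variable spacer of 25 to 30 residues, then N-N.
GDG_NN = re.compile(r"GDG(.{25,30})NN")


def find_gdg_nn(protein):
    """Return (0-based start, spacer length) of the first GDG...NN match, or None."""
    m = GDG_NN.search(protein)
    if m is None:
        return None
    return (m.start(), len(m.group(1)))


candidate = "MK" + "GDG" + "A" * 27 + "NN" + "L"  # hypothetical peptide
too_short = "MK" + "GDG" + "A" * 10 + "NN" + "L"  # spacer outside the 25-30 range

print(find_gdg_nn(candidate), find_gdg_nn(too_short))
```

A pattern this short will match many unrelated proteins by chance, so in practice it would be one filter among several rather than an annotation on its own.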
DOE Office of Scientific and Technical Information (OSTI.GOV)

Shaw, Debra J.; Morse, Robert; Todd, Adrian G.

The Ewing Sarcoma (EWS) protein is a ubiquitously expressed RNA processing factor that localises predominantly to the nucleus. However, the mechanism through which EWS enters the nucleus remains unclear, with differing reports identifying three separate import signals within the EWS protein. Here we have utilized a panel of truncated EWS proteins to clarify the reported nuclear localisation signals. We describe three C-terminal domains that are important for efficient EWS nuclear localization: (1) the third RGG-motif; (2) the last 10 amino acids (known as the PY-import motif); and (3) the zinc-finger motif. Although these three domains are involved in nuclear import, they are not independently capable of driving the efficient import of a GFP-moiety. However, collectively they form a complex tripartite signal that efficiently drives GFP-import into the nucleus. This study helps clarify the EWS import signal, and the identification of the involvement of both the RGG- and zinc-finger motifs has wide reaching implications.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Hamaji, Takashi; Lopez, David; Pellegrini, Matteo

Upon fertilization Chlamydomonas reinhardtii zygotes undergo a program of differentiation into a diploid zygospore that is accompanied by transcription of hundreds of zygote-specific genes. We identified a distinct sequence motif we term a zygotic response element (ZYRE) that is highly enriched in promoter regions of C. reinhardtii early zygotic genes. A luciferase reporter assay was used to show that native ZYRE motifs within the promoter of zygotic gene ZYS3 or intron of zygotic gene DMT4 are necessary for zygotic induction. A synthetic luciferase reporter with a minimal promoter was used to show that ZYRE motifs introduced upstream are sufficient to confer zygotic upregulation, and that ZYRE-controlled zygotic transcription is dependent on the homeodomain transcription factor GSP1. Furthermore, we predict that ZYRE motifs will correspond to binding sites for the homeodomain proteins GSP1-GSM1 that heterodimerize and activate zygotic gene expression in early zygotes.
The proteome and phosphoproteome of maize pollen uncovers fertility candidate proteins.

PubMed

Chao, Qing; Gao, Zhi-Fang; Wang, Yue-Feng; Li, Zhe; Huang, Xia-He; Wang, Ying-Chun; Mei, Ying-Chang; Zhao, Biligen-Gaowa; Li, Liang; Jiang, Yu-Bo; Wang, Bai-Chen

2016-06-01

Maize is unique since it is both monoecious and diclinous (separate male and female flowers on the same plant). We investigated the proteome and phosphoproteome of maize pollen containing modified proteins and here we provide a comprehensive pollen proteome and phosphoproteome which contain 100,990 peptides from 6750 proteins and 5292 phosphorylated sites corresponding to 2257 maize phosphoproteins, respectively. Interestingly, among the total 27 overrepresented phosphosite motifs we identified here, 11 were novel motifs, which suggested different modification mechanisms in plants compared to those of animals. Enrichment analysis of pollen phosphoproteins showed that pathways including DNA synthesis/chromatin structure, regulation of RNA transcription, protein modification, cell organization, signal transduction, cell cycle, vesicle transport, transport of ions and metabolisms, which were involved in pollen development, the following germination and pollen tube growth, were regulated by phosphorylation. In this study, we also found 430 kinases and 105 phosphatases in the maize pollen phosphoproteome, among which calcium dependent protein kinases (CDPKs), leucine rich repeat kinase, SNF1 related protein kinases and MAPK family proteins were heavily enriched and further analyzed. From our research, we also uncovered hundreds of male sterility-associated proteins and phosphoproteins that might influence maize productivity and serve as targets for hybrid maize seed production. Finally, a putative complex signaling pathway involving CDPKs, MAPKs, ubiquitin ligases and multiple fertility proteins was constructed. Overall, our data provide new insight for further investigation of protein phosphorylation status in mature maize pollen and construction of maize male sterile mutants in the future.
The Drosophila Juvenile Hormone Receptor Candidates Methoprene-tolerant (MET) and Germ Cell-expressed (GCE) Utilize a Conserved LIXXL Motif to Bind the FTZ-F1 Nuclear Receptor*

PubMed Central

Bernardo, Travis J.; Dubrovsky, Edward B.

2012-01-01

Juvenile hormone (JH) has been implicated in many developmental processes in holometabolous insects, but its mechanism of signaling remains controversial. We previously found that in Drosophila Schneider 2 cells, the nuclear receptor FTZ-F1 is required for activation of the E75A gene by JH. Here, we utilized insect two-hybrid assays to show that FTZ-F1 interacts with two JH receptor candidates, the bHLH-PAS paralogs MET and GCE, in a JH-dependent manner. These interactions are severely reduced when helix 12 of the FTZ-F1 activation function 2 (AF2) is removed, implicating AF2 as an interacting site. Through homology modeling, we found that MET and GCE possess a C-terminal α-helix featuring a conserved motif LIXXL that represents a novel nuclear receptor (NR) box. Docking simulations supported by two-hybrid experiments revealed that FTZ-F1·MET and FTZ-F1·GCE heterodimer formation involves a typical NR box-AF2 interaction but does not require the canonical charge clamp residues of FTZ-F1 and relies primarily on hydrophobic contacts, including a unique interaction with helix 4. Moreover, we identified paralog-specific features, including a secondary interaction site found only in MET. Our findings suggest that a novel NR box enables MET and GCE to interact JH-dependently with the AF2 of FTZ-F1. PMID:22249180
A comprehensive analysis of Helicobacter pylori plasticity zones reveals that they are integrating conjugative elements with intermediate integration specificity.

PubMed

Fischer, Wolfgang; Breithaupt, Ute; Kern, Beate; Smith, Stella I; Spicher, Carolin; Haas, Rainer

2014-04-27

The human gastric pathogen Helicobacter pylori is a paradigm for chronic bacterial infections. Its persistence in the stomach mucosa is facilitated by several mechanisms of immune evasion and immune modulation, but also by an unusual genetic variability which might account for the capability to adapt to changing environmental conditions during long-term colonization. This variability is reflected by the fact that almost each infected individual is colonized by a genetically unique strain. Strain-specific genes are dispersed throughout the genome, but clusters of genes organized as genomic islands may also collectively be present or absent. We have comparatively analysed such clusters, which are commonly termed plasticity zones, in a high number of H. pylori strains of varying geographical origin. We show that these regions contain fixed gene sets, rather than being true regions of genome plasticity, but two different types and several subtypes with partly diverging gene content can be distinguished. Their genetic diversity is incongruent with variations in the rest of the genome, suggesting that they are subject to horizontal gene transfer within H. pylori populations. We identified 40 distinct integration sites in 45 genome sequences, with a conserved heptanucleotide motif that seems to be the minimal requirement for integration. The significant number of possible integration sites, together with the requirement for a short conserved integration motif and the high level of gene conservation, indicates that these elements are best described as integrating conjugative elements (ICEs) with an intermediate integration site specificity.
Search for global-minimum geometries of medium-sized germanium clusters. II. Motif-based low-lying clusters Ge21-Ge29

NASA Astrophysics Data System (ADS)

Yoo, S.; Zeng, X. C.

2006-05-01

We performed a constrained search for the geometries of low-lying neutral germanium clusters GeN in the size range of 21⩽N⩽29. The basin-hopping global optimization method is employed for the search. The potential-energy surface is computed based on the plane-wave pseudopotential density functional theory. A new series of low-lying clusters is found on the basis of several generic structural motifs identified previously for silicon clusters [S. Yoo and X. C. Zeng, J. Chem. Phys. 124, 054304 (2006)] as well as for smaller-sized germanium clusters [S. Bulusu et al., J. Chem. Phys. 122, 164305 (2005)]. Among the generic motifs examined, we found that two motifs stand out in producing most low-lying clusters, namely, the six/nine motif, a puckered-hexagonal-ring Ge6 unit attached to a tricapped trigonal prism Ge9, and the six/ten motif, a puckered-hexagonal-ring Ge6 unit attached to a bicapped antiprism Ge10. The low-lying clusters obtained are all prolate in shape and their energies are appreciably lower than the near-spherical low-energy clusters. This result is consistent with the ion-mobility measurement in that medium-sized germanium clusters detected are all prolate in shape until the size N ~ 65.
Genome-Wide Identification of Mitogen-Activated Protein Kinase Gene Family across Fungal Lineage Shows Presence of Novel and Diverse Activation Loop Motifs

PubMed Central

Mohanta, Tapan Kumar; Mohanta, Nibedita; Parida, Pratap; Panda, Sujogya Kumar; Ponpandian, Lakshmi Narayanan; Bae, Hanhong

2016-01-01

The mitogen-activated protein kinase (MAPK) is characterized by the presence of the T-E-Y, T-D-Y, and T-G-Y motifs in its activation loop region and plays a significant role in regulating diverse cellular responses in eukaryotic organisms. Availability of large-scale genome data in the fungal kingdom encouraged us to identify and analyse the fungal MAPK gene family consisting of 173 fungal species. The analysis of the MAPK gene family resulted in the discovery of several novel activation loop motifs (T-T-Y, T-I-Y, T-N-Y, T-H-Y, T-S-Y, K-G-Y, T-Q-Y, S-E-Y and S-D-Y) in fungal MAPKs. The phylogenetic analysis suggests that fungal MAPKs are non-polymorphic, had evolved from their common ancestors around 1500 million years ago, and are distantly related to plant MAPKs. We are the first to report the presence of nine novel activation loop motifs in fungal MAPKs. The specificity of the activation loop motif plays a significant role in controlling different growth and stress related pathways in fungi. Hence, the presence of these nine novel activation loop motifs in fungi is of special interest. PMID:26918378
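The motif lists in this abstract lend themselves to a simple lookup. The sketch below is only an illustration and not the authors' pipeline: it labels an activation loop tripeptide as one of the three canonical motifs or one of the nine novel fungal motifs named above. The function name and the label strings are hypothetical.

```python
# Motif sets copied from the abstract, written without hyphens.
CANONICAL = {"TEY", "TDY", "TGY"}
NOVEL_FUNGAL = {"TTY", "TIY", "TNY", "THY", "TSY", "KGY", "TQY", "SEY", "SDY"}

def classify_activation_loop(motif):
    """Label a tripeptide such as 'T-E-Y' or 'TEY' (case-insensitive)."""
    tri = motif.replace("-", "").upper()
    if tri in CANONICAL:
        return "canonical"
    if tri in NOVEL_FUNGAL:
        return "novel (fungal)"
    return "unlisted"

labels = [classify_activation_loop(m) for m in ("T-E-Y", "s-d-y", "AAA")]
```

Locating the tripeptide within a full MAPK sequence requires an alignment of the activation loop region, which this lookup deliberately does not attempt.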
Bacterial RecA Protein Promotes Adenoviral Recombination during In Vitro Infection

PubMed Central

Lee, Jeong Yoon; Lee, Ji Sun; Materne, Emma C.; Rajala, Rahul; Ismail, Ashrafali M.; Seto, Donald; Dyer, David W.

2018-01-01

ABSTRACT Adenovirus infections in humans are common and sometimes lethal. Adenovirus-derived vectors are also commonly chosen for gene therapy in human clinical trials. We have shown in previous work that homologous recombination between adenoviral genomes of human adenovirus species D (HAdV-D), the largest and fastest growing HAdV species, is responsible for the rapid evolution of this species. Because adenovirus infection initiates in mucosal epithelia, particularly at the gastrointestinal, respiratory, genitourinary, and ocular surfaces, we sought to determine a possible role for mucosal microbiota in adenovirus genome diversity. By analysis of known recombination hot spots across 38 human adenovirus genomes in species D (HAdV-D), we identified nucleotide sequence motifs similar to bacterial Chi sequences, which facilitate homologous recombination in the presence of bacterial Rec enzymes. These motifs, referred to here as ChiAD, were identified immediately 5′ to the sequence encoding penton base hypervariable loop 2, which expresses the arginine-glycine-aspartate moiety critical to adenoviral cellular entry. Coinfection with two HAdV-Ds in the presence of an Escherichia coli lysate increased recombination; this was blocked in a RecA mutant strain, E. coli DH5α, or upon RecA depletion. Recombination increased in the presence of E. coli lysate despite a general reduction in viral replication. RecA colocalized with viral DNA in HAdV-D-infected cell nuclei and was shown to bind specifically to ChiAD sequences. These results indicate that adenoviruses may repurpose bacterial recombination machinery, a sharing of evolutionary mechanisms across a diverse microbiota, and unique example of viral commensalism. IMPORTANCE Adenoviruses are common human mucosal pathogens of the gastrointestinal, respiratory, and genitourinary tracts and ocular surface. Here, we report finding Chi-like sequences in adenovirus recombination hot spots. 
Adenovirus coinfection in the presence of bacterial RecA protein facilitated homologous recombination between viruses. Genetic recombination led to evolution of an important external feature on the adenoviral capsid, namely, the penton base protein hypervariable loop 2, which contains the arginine-glycine-aspartic acid motif critical to viral internalization. We speculate that free Rec proteins present in gastrointestinal secretions upon bacterial cell death facilitate the evolution of human adenoviruses through homologous recombination, an example of viral commensalism and the complexity of virus-host interactions, including regional microbiota. PMID:29925671
Selection of peptides binding to metallic borides by screening M13 phage display libraries.

PubMed

Ploss, Martin; Facey, Sandra J; Bruhn, Carina; Zemel, Limor; Hofmann, Kathrin; Stark, Robert W; Albert, Barbara; Hauer, Bernhard

2014-02-10

Metal borides are a class of inorganic solids that is much less known and investigated than for example metal oxides or intermetallics. At the same time it is a highly versatile and interesting class of compounds in terms of physical and chemical properties, like semiconductivity, ferromagnetism, or catalytic activity. This makes these substances attractive for the generation of new materials. Very little is known about the interaction between organic materials and borides. To generate nanostructured and composite materials which consist of metal borides and organic modifiers it is necessary to develop new synthetic strategies. Phage peptide display libraries are commonly used to select peptides that bind specifically to metals, metal oxides, and semiconductors. Further, these binding peptides can serve as templates to control the nucleation and growth of inorganic nanoparticles. Additionally, the combination of two different binding motifs into a single bifunctional phage could be useful for the generation of new composite materials. In this study, we have identified a unique set of sequences that bind to amorphous and crystalline nickel boride (Ni3B) nanoparticles, from a random peptide library using the phage display technique. Using this technique, strong binders were identified that are selective for nickel boride. Sequence analysis of the peptides revealed that the sequences exhibit similar, yet subtly different patterns of amino acid usage. Although a predominant binding motif was not observed, certain charged amino acids emerged as essential in specific binding to both substrates. The 7-mer peptide sequence LGFREKE, isolated on amorphous Ni3B, emerged as the best binder for both substrates. Fluorescence microscopy and atomic force microscopy confirmed the specific binding affinity of LGFREKE expressing phage to amorphous and crystalline Ni3B nanoparticles. 
This study is, to our knowledge, the first to identify peptides that bind specifically to amorphous and to crystalline Ni3B nanoparticles. We think that the identified strong binding sequences described here could potentially serve for the utilisation of M13 phage as a viable alternative to other methods to create tailor-made boride composite materials or new catalytic surfaces by a biologically driven nano-assembly synthesis and structuring.
Identification of the regulatory autophosphorylation site of autophosphorylation-dependent protein kinase (auto-kinase). Evidence that auto-kinase belongs to a member of the p21-activated kinase family.

PubMed

Yu, J S; Chen, W J; Ni, M H; Chan, W H; Yang, S D

1998-08-15

Autophosphorylation-dependent protein kinase (auto-kinase) was identified from pig brain and liver on the basis of its unique autophosphorylation/activation property [Yang, Fong, Yu and Liu (1987) J. Biol. Chem. 262, 7034-7040; Yang, Chang and Soderling (1987) J. Biol. Chem. 262, 9421-9427]. Its substrate consensus sequence motif was determined as being -R-X-(X)-S*/T*-X3-S/T-. To characterize auto-kinase further, we partly sequenced the kinase purified from pig liver. The N-terminal sequence (VDGGAKTSDKQKKKAXMTDE) and two internal peptide sequences (EKLRTIV and LQNPEK/ILTP/FI) of auto-kinase were obtained. These sequences identify auto-kinase as a C-terminal catalytic fragment of p21-activated protein kinase 2 (PAK2 or gamma-PAK) lacking its N-terminal regulatory region. Auto-kinase can be recognized by an antibody raised against the C-terminal peptide of human PAK2 by immunoblotting. Furthermore the autophosphorylation site sequence of auto-kinase was successfully predicted on the basis of its substrate consensus sequence motif and the known PAK2 sequence, and was further demonstrated to be RST(P)MVGTPYWMAPEVVTR by phosphoamino acid analysis, manual Edman degradation and phosphopeptide mapping via the help of phosphorylation site analysis of a synthetic peptide corresponding to the sequence of PAK2 from residues 396 to 418. During the activation process, auto-kinase autophosphorylates mainly on a single threonine residue Thr402 (according to the sequence numbering of human PAK2). In addition, a phospho-specific antibody against a synthetic phosphopeptide containing this identified sequence was generated and shown to be able to differentially recognize the activated auto-kinase autophosphorylated at Thr402 but not the non-phosphorylated/inactive auto-kinase. 
Immunoblot analysis with this phospho-specific antibody further revealed that the change in phosphorylation level of Thr402 of auto-kinase was well correlated with the activity change of the kinase during both autophosphorylation/activation and protein phosphatase-mediated dephosphorylation/inactivation processes. Taken together, our results identify Thr402 as the regulatory autophosphorylation site of auto-kinase, which is a C-terminal catalytic fragment of PAK2.

Identification of the regulatory autophosphorylation site of autophosphorylation-dependent protein kinase (auto-kinase). Evidence that auto-kinase belongs to a member of the p21-activated kinase family.

PubMed Central

Yu, J S; Chen, W J; Ni, M H; Chan, W H; Yang, S D

1998-01-01

Autophosphorylation-dependent protein kinase (auto-kinase) was identified from pig brain and liver on the basis of its unique autophosphorylation/activation property [Yang, Fong, Yu and Liu (1987) J. Biol. Chem. 262, 7034-7040; Yang, Chang and Soderling (1987) J. Biol. Chem. 262, 9421-9427]. Its substrate consensus sequence motif was determined as being -R-X-(X)-S*/T*-X3-S/T-. To characterize auto-kinase further, we partly sequenced the kinase purified from pig liver. The N-terminal sequence (VDGGAKTSDKQKKKAXMTDE) and two internal peptide sequences (EKLRTIV and LQNPEK/ILTP/FI) of auto-kinase were obtained. These sequences identify auto-kinase as a C-terminal catalytic fragment of p21-activated protein kinase 2 (PAK2 or gamma-PAK) lacking its N-terminal regulatory region. Auto-kinase can be recognized by an antibody raised against the C-terminal peptide of human PAK2 by immunoblotting. Furthermore the autophosphorylation site sequence of auto-kinase was successfully predicted on the basis of its substrate consensus sequence motif and the known PAK2 sequence, and was further demonstrated to be RST(P)MVGTPYWMAPEVVTR by phosphoamino acid analysis, manual Edman degradation and phosphopeptide mapping via the help of phosphorylation site analysis of a synthetic peptide corresponding to the sequence of PAK2 from residues 396 to 418. During the activation process, auto-kinase autophosphorylates mainly on a single threonine residue Thr402 (according to the sequence numbering of human PAK2). In addition, a phospho-specific antibody against a synthetic phosphopeptide containing this identified sequence was generated and shown to be able to differentially recognize the activated auto-kinase autophosphorylated at Thr402 but not the non-phosphorylated/inactive auto-kinase. 
Immunoblot analysis with this phospho-specific antibody further revealed that the change in phosphorylation level of Thr402 of auto-kinase was well correlated with the activity change of the kinase during both autophosphorylation/activation and protein phosphatase-mediated dephosphorylation/inactivation processes. Taken together, our results identify Thr402 as the regulatory autophosphorylation site of auto-kinase, which is a C-terminal catalytic fragment of PAK2. PMID:9693111
Ligand binding by repeat proteins: natural and designed

PubMed Central

Grove, Tijana Z; Cortajarena, Aitziber L; Regan, Lynne

2012-01-01

Repeat proteins contain tandem arrays of small structural motifs. As a consequence of this architecture, they adopt non-globular, extended structures that present large, highly specific surfaces for ligand binding. Here we discuss recent advances toward understanding the functional role of this unique modular architecture. We showcase specific examples of natural repeat proteins interacting with diverse ligands and also present examples of designed repeat protein–ligand interactions. PMID:18602006
The HDAC complex and cytoskeleton.

PubMed

Kovacs, Jeffery J; Hubbert, Charlotte; Yao, Tso-Pang

2004-01-01

HDAC6 is a cytoplasmic deacetylase that dynamically associates with the microtubule and actin cytoskeletons. HDAC6 regulates growth factor-induced chemotaxis by its unique deacetylase activity towards microtubules or other substrates. Here we describe a non-catalytic structural domain that is essential for HDAC6 function and places HDAC6 as a critical mediator linking the acetylation and ubiquitination network. This evolutionarily conserved motif, termed the BUZ domain, has features of a zinc finger and binds both mono- and polyubiquitinated proteins. Furthermore, the BUZ domain promotes HDAC6 mono-ubiquitination. These results establish the BUZ domain, in addition to the UIM and CUE domains, as a novel motif that both binds ubiquitin and mediates mono-ubiquitination. Importantly, the BUZ domain is essential for HDAC6 to promote chemotaxis, indicating that communication with the ubiquitin network is critical for proper HDAC6 function. The unique presence of the UIM and CUE domains in proteins involved in endocytic trafficking suggests that HDAC6 might also regulate vesicle transport and protein degradation. Indeed, we have found that HDAC6 is actively transported and concentrated in vesicular compartments. We propose that an integration of reversible acetylation and ubiquitination by HDAC6 may be a novel component in regulating the cytoskeleton, vesicle transport and protein degradation.
Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns.

PubMed

Gruel, Jérémy; LeBorgne, Michel; LeMeur, Nolwenn; Théret, Nathalie

2011-09-12

Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks.
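To make the idea of counting shared motifs in pairs of genes concrete, here is a minimal sketch. It is not the authors' algorithm: SSM are defined in the abstract by length and similarity, whereas this simplification counts only exactly shared subsequences of a fixed length `k`. The function names, the default `k=6`, and the toy sequences are all assumptions.

```python
def kmers(seq, k):
    """Set of all length-k subsequences of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shared_motif_count(promoter_a, promoter_b, k=6):
    """Number of distinct k-mers present in both promoter sequences."""
    return len(kmers(promoter_a, k) & kmers(promoter_b, k))

# Toy sequences, not real promoters.
count = shared_motif_count("ACGTACGT", "TTACGTAA", k=4)
```

A pair score like this becomes meaningful only against a background distribution, which is where the abstract's notion of an "exceptional number" of SSM would enter.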
Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns

PubMed Central

2011-01-01

Background: Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Results: Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Conclusions: Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks. PMID:21910886
Magnesium-binding architectures in RNA crystal structures: validation, binding preferences, classification and motif detection

PubMed Central

Zheng, Heping; Shabalin, Ivan G.; Handing, Katarzyna B.; Bujnicki, Janusz M.; Minor, Wladek

2015-01-01

The ubiquitous presence of magnesium ions in RNA has long been recognized as a key factor governing RNA folding, and is crucial for many diverse functions of RNA molecules. In this work, Mg2+-binding architectures in RNA were systematically studied using a database of RNA crystal structures from the Protein Data Bank (PDB). Due to the abundance of poorly modeled or incorrectly identified Mg2+ ions, the set of all sites was comprehensively validated and filtered to identify a benchmark dataset of 15 334 ‘reliable’ RNA-bound Mg2+ sites. The normalized frequencies by which specific RNA atoms coordinate Mg2+ were derived for both the inner and outer coordination spheres. A hierarchical classification system of Mg2+ sites in RNA structures was designed and applied to the benchmark dataset, yielding a set of 41 types of inner-sphere and 95 types of outer-sphere coordinating patterns. This classification system has also been applied to describe six previously reported Mg2+-binding motifs and detect them in new RNA structures. Investigation of the most populous site types resulted in the identification of seven novel Mg2+-binding motifs, and all RNA structures in the PDB were screened for the presence of these motifs. PMID:25800744
Missing link in the evolution of Hox clusters.

PubMed

Ogishima, Soichi; Tanaka, Hiroshi

2007-01-31

The Hox cluster has key roles in regulating the patterning of the antero-posterior axis in a metazoan embryo. It consists of the anterior, central and posterior genes; the central genes have been identified only in bilaterians, but not in cnidarians, and are responsible for achieving morphological complexity in bilaterian development. However, their evolutionary history has not been revealed, that is, there has been a "missing link". Here we show the evolutionary history of Hox clusters of 18 bilaterians and 2 cnidarians by using a new method, "motif-based reconstruction", examining the gain/loss processes of evolutionarily conserved sequences, "motifs", outside the homeodomain. We successfully identified the missing link in the evolution of Hox clusters between the cnidarian-bilaterian ancestor and the bilaterians as the ancestor of the central genes, which we call the proto-central gene. Searching for a gene corresponding to the proto-central gene, we found that one of the acoela Hox genes has the same motif repertory as that of the proto-central gene. This interesting finding suggests that the acoela Hox cluster corresponds with the missing link in the evolution of the Hox cluster between the cnidarian-bilaterian ancestor and the bilaterians. Our findings suggested that motif gains/diversifications led to the explosive diversity of the bilaterian body plan.
Structural details (kinks and non-α conformations) in transmembrane helices are intrahelically determined and can be predicted by sequence pattern descriptors

PubMed Central

Rigoutsos, Isidore; Riek, Peter; Graham, Robert M.; Novotny, Jiri

2003-01-01

One of the promising methods of protein structure prediction involves the use of amino acid sequence-derived patterns. Here we report on the creation of non-degenerate motif descriptors derived through data mining of training sets of residues taken from the transmembrane-spanning segments of polytopic proteins. These residues correspond to short regions in which there is a deviation from the regular α-helical character (i.e. π-helices, 3(10)-helices and kinks). A ‘search engine’ derived from these motif descriptors correctly identifies, and discriminates amongst instances of the above ‘non-canonical’ helical motifs contained in the SwissProt/TrEMBL database of protein primary structures. Our results suggest that deviations from α-helicity are encoded locally in sequence patterns only about 7–9 residues long and can be determined in silico directly from the amino acid sequence. Delineation of such variations in helical habit is critical to understanding the complex structure–function relationships of polytopic proteins and for drug discovery. The success of our current methodology foretells development of similar prediction tools capable of identifying other structural motifs from sequence alone. The method described here has been implemented and is available on the World Wide Web at http://cbcsrv.watson.ibm.com/Ttkw.html. PMID:12888523
Structural details (kinks and non-alpha conformations) in transmembrane helices are intrahelically determined and can be predicted by sequence pattern descriptors.

PubMed

Rigoutsos, Isidore; Riek, Peter; Graham, Robert M; Novotny, Jiri

2003-08-01

One of the promising methods of protein structure prediction involves the use of amino acid sequence-derived patterns. Here we report on the creation of non-degenerate motif descriptors derived through data mining of training sets of residues taken from the transmembrane-spanning segments of polytopic proteins. These residues correspond to short regions in which there is a deviation from the regular alpha-helical character (i.e. pi-helices, 3(10)-helices and kinks). A 'search engine' derived from these motif descriptors correctly identifies, and discriminates amongst instances of the above 'non-canonical' helical motifs contained in the SwissProt/TrEMBL database of protein primary structures. Our results suggest that deviations from alpha-helicity are encoded locally in sequence patterns only about 7-9 residues long and can be determined in silico directly from the amino acid sequence. Delineation of such variations in helical habit is critical to understanding the complex structure-function relationships of polytopic proteins and for drug discovery. The success of our current methodology foretells development of similar prediction tools capable of identifying other structural motifs from sequence alone. The method described here has been implemented and is available on the World Wide Web at http://cbcsrv.watson.ibm.com/Ttkw.html.
NF-Y Binding Site Architecture Defines a C-Fos Targeted Promoter Class

PubMed Central

Haubrock, Martin; Hartmann, Fabian; Wingender, Edgar

2016-01-01

ChIP-seq experiments detect the chromatin occupancy of known transcription factors in a genome-wide fashion. The comparisons of several species-specific ChIP-seq libraries done for different transcription factors have revealed a complex combinatorial and context-specific co-localization behavior for the identified binding regions. In this study we have investigated human derived ChIP-seq data to identify common cis-regulatory principles for the human transcription factor c-Fos. We found that in four different cell lines, c-Fos targeted proximal and distal genomic intervals show prevalences for either AP-1 motifs or CCAAT boxes as known binding motifs for the transcription factor NF-Y, and thereby act in a mutually exclusive manner. For proximal regions of co-localized c-Fos and NF-YB binding, we gathered evidence that a characteristic configuration of repeating CCAAT motifs may be responsible for attracting c-Fos, probably provided by a nearby AP-1 bound enhancer. Our results suggest a novel regulatory function of NF-Y in gene-proximal regions. Specific CCAAT dimer repeats bound by the transcription factor NF-Y define this novel cis-regulatory module. Based on this behavior we propose a new enhancer promoter interaction model based on AP-1 motif defined enhancers which interact with CCAAT-box characterized promoter regions. PMID:27517874
Role of sequence encoded κB DNA geometry in gene regulation by Dorsal

PubMed Central

Mrinal, Nirotpal; Tomar, Archana; Nagaraju, Javaregowda

2011-01-01

Many proteins of the Rel family can act as both transcriptional activators and repressors. However, the mechanism that distinguishes the ‘activator/repressor’ functions of Rel proteins such as Dorsal (the Drosophila homologue of mammalian NFκB) is not understood. Using genomic, biophysical and biochemical approaches, we demonstrate that the underlying principle of this functional specificity lies in the ‘sequence-encoded structure’ of the κB-DNA. We show that Dorsal-binding motifs exist in distinct activator and repressor conformations. Molecular dynamics of DNA-Dorsal complexes revealed that repressor κB-motifs typically have an A-tract and a flexible conformation that facilitates interaction with co-repressors. The deformable structure of repressor motifs is due to changes in the hydrogen bonding in the A:T pair in the ‘A-tract’ core. The sixth nucleotide in the nonameric κB-motif, ‘A’ (A6) in the repressor motifs and ‘T’ (T6) in the activator motifs, is critical to confer this functional specificity, as an A6 → T6 mutation transformed the flexible repressor conformation into a rigid activator conformation. These results highlight that ‘sequence encoded κB DNA-geometry’ regulates gene expression by exerting an allosteric effect on binding of Rel proteins, which in turn regulates interaction with co-regulators. Further, we identified and characterized putative repressor motifs in Dl-target genes, which can potentially aid in functional annotation of the Dorsal gene regulatory network. PMID:21890896
IndeCut evaluates performance of network motif discovery algorithms.

PubMed

Ansariola, Mitra; Megraw, Molly; Koslicki, David

2018-05-01

Genomic networks represent a complex map of molecular interactions which are descriptive of the biological processes occurring in living cells. Identifying the small over-represented circuitry patterns in these networks helps generate hypotheses about the functional basis of such complex processes. Network motif discovery is a systematic way of achieving this goal. However, a reliable network motif discovery outcome requires generating random background networks which are the result of a uniform and independent graph sampling method. To date, there has been no method to numerically evaluate whether any network motif discovery algorithm performs as intended on realistically sized datasets, so it was not possible to assess the validity of resulting network motifs. In this work, we present IndeCut, the first method to date that characterizes network motif finding algorithm performance in terms of uniform sampling on realistically sized networks. We demonstrate that it is critical to use IndeCut prior to running any network motif finder for two reasons. First, IndeCut indicates the number of samples needed for a tool to produce an outcome that is both reproducible and accurate. Second, IndeCut allows users to choose the tool that generates samples in the most independent fashion for their network of interest among many available options. The open source software package is available at https://github.com/megrawlab/IndeCut. Contact: megrawm@science.oregonstate.edu or david.koslicki@math.oregonstate.edu. Supplementary data are available at Bioinformatics online.
Searching for statistically significant regulatory modules.

PubMed

Bailey, Timothy L; Noble, William Stafford

2003-10-01

The regulatory machinery controlling gene expression is complex, frequently requiring multiple, simultaneous DNA-protein interactions. The rate at which a gene is transcribed may depend upon the presence or absence of a collection of transcription factors bound to the DNA near the gene. Locating transcription factor binding sites in genomic DNA is difficult because the individual sites are small and tend to occur frequently by chance. True binding sites may be identified by their tendency to occur in clusters, sometimes known as regulatory modules. We describe an algorithm for detecting occurrences of regulatory modules in genomic DNA. The algorithm, called mcast, takes as input a DNA database and a collection of binding site motifs that are known to operate in concert. mcast uses a motif-based hidden Markov model with several novel features. The model incorporates motif-specific p-values, thereby allowing scores from motifs of different widths and specificities to be compared directly. The p-value scoring also allows mcast to only accept motif occurrences with significance below a user-specified threshold, while still assigning better scores to motif occurrences with lower p-values. mcast can search long DNA sequences, modeling length distributions between motifs within a regulatory module, but ignoring length distributions between modules. The algorithm produces a list of predicted regulatory modules, ranked by E-value. We validate the algorithm using simulated data as well as real data sets from fruitfly and human. http://meme.sdsc.edu/MCAST/paper
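
The clustering idea in the abstract above (accept only motif occurrences below a p-value threshold, then group nearby occurrences into candidate modules) can be illustrated with a minimal sketch. This is not the mcast hidden Markov model: it is a simplified greedy grouping, and the names `Hit`, `cluster_hits`, `cluster_score`, `p_threshold` and `max_gap`, as well as the toy hits, are assumptions made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Hit:
    motif: str      # motif identifier (hypothetical labels)
    start: int      # 0-based start coordinate of the occurrence
    end: int        # end coordinate of the occurrence
    p_value: float  # motif-specific p-value of the occurrence


def cluster_hits(hits, p_threshold=1e-3, max_gap=50):
    """Keep hits with p-value below the threshold and greedily group
    hits whose start lies within max_gap of the previous kept hit's end."""
    kept = sorted((h for h in hits if h.p_value < p_threshold),
                  key=lambda h: h.start)
    clusters = []
    for hit in kept:
        if clusters and hit.start - clusters[-1][-1].end <= max_gap:
            clusters[-1].append(hit)
        else:
            clusters.append([hit])
    return clusters


def cluster_score(cluster):
    """Sum of -log10(p): lower p-values contribute larger scores."""
    return sum(-math.log10(h.p_value) for h in cluster)


# Toy input, invented for this sketch.
hits = [
    Hit("A", 10, 18, 1e-5),
    Hit("B", 30, 40, 1e-4),
    Hit("A", 300, 308, 1e-2),  # rejected: p-value above the threshold
    Hit("C", 500, 510, 1e-6),
]
clusters = cluster_hits(hits)
ranked = sorted(clusters, key=cluster_score, reverse=True)
for c in ranked:
    print([h.motif for h in c], round(cluster_score(c), 2))
```

Because every hit is scored on the common p-value scale, motifs of different widths can be combined in one cluster score, which is the property the abstract emphasizes. Ranking by E-value, as mcast does, would need a statistical model that this sketch does not attempt.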
Quantitatively and Kinetically Identifying Binding Motifs of Amelogenin Proteins to Mineral Crystals Through Biochemical and Spectroscopic Assays

PubMed Central

Zhu, Li; Hwang, Peter; Witkowska, H. Ewa; Liu, Haichuan; Li, Wu

2014-01-01

Tooth enamel is the hardest tissue in vertebrate animals. Consisting of millions of carbonated hydroxyapatite crystals, this highly mineralized tissue develops from a protein matrix in which amelogenin is the predominant component. The enamel matrix proteins are eventually and completely degraded and removed by proteinases to form mineral-enriched tooth enamel. Identification of the apatite-binding motifs in amelogenin is critical for understanding the amelogenin–crystal interactions and amelogenin–proteinases interactions during tooth enamel biomineralization. A stepwise strategy is introduced to kinetically and quantitatively identify the crystal-binding motifs in amelogenin, including a peptide screening assay, a competitive adsorption assay, and a kinetic-binding assay using amelogenin and gene-engineered amelogenin mutants. A modified enzyme-linked immunosorbent assay on crystal surfaces is also applied to compare binding amounts of amelogenin and its mutants on different planes of apatite crystals. We describe the detailed protocols for these assays and provide the considerations for these experiments in this chapter. PMID:24188774
Growth factor pleiotropy is controlled by a receptor Tyr/Ser motif that acts as a binary switch

PubMed Central

Guthridge, Mark A; Powell, Jason A; Barry, Emma F; Stomski, Frank C; McClure, Barbara J; Ramshaw, Hayley; Felquer, Fernando A; Dottore, Mara; Thomas, Daniel T; To, Bik; Begley, C Glenn; Lopez, Angel F

2006-01-01

Pleiotropism is a hallmark of cytokines and growth factors; yet, the underlying mechanisms are not clearly understood. We have identified a motif in the granulocyte macrophage-colony-stimulating factor receptor composed of a tyrosine and a serine residue that functions as a binary switch for the independent regulation of multiple biological activities. Signalling occurs either through Ser585 at lower cytokine concentrations, leading to cell survival only, or through Tyr577 at higher cytokine concentrations, leading to cell survival as well as proliferation, differentiation or functional activation. The phosphorylation of Ser585 and Tyr577 is mutually exclusive and occurs via a unidirectional mechanism that involves protein kinase A and tyrosine kinases, respectively, and is deregulated in at least some leukemias. We have identified similar Tyr/Ser motifs in other cell surface receptors, suggesting that such signalling switches may play important roles in generating specificity and pleiotropy in other biological systems. PMID:16437163
Transcriptome Analysis of an Insecticide Resistant Housefly Strain: Insights about SNPs and Regulatory Elements in Cytochrome P450 Genes.

PubMed

Mahmood, Khalid; Højland, Dorte H; Asp, Torben; Kristensen, Michael

2016-01-01

Insecticide resistance in the housefly, Musca domestica, has been investigated for more than 60 years. It will enter a new era after the recent publication of the housefly genome and the development of multiple next generation sequencing technologies. The genetic background of the xenobiotic response can now be investigated in greater detail. Here, we investigate the 454-pyrosequencing transcriptome of the spinosad-resistant 791spin strain in relation to the housefly genome, with a focus on P450 genes. The de novo assembly of clean reads gave 35,834 contigs consisting of 21,780 sequences of the spinosad-resistant strain. A total of 3,648 sequences were annotated with an Enzyme Commission (EC) number and mapped to 124 KEGG pathways, with metabolic processes as the most highly represented pathway. One hundred and twenty contigs were annotated as P450s, covering 44 different P450 genes of the housefly. Eight differentially expressed P450 genes were identified and investigated for SNPs, CpG islands and common regulatory motifs in promoter and coding regions. Functional annotation clustering of metabolism-related genes and motif analysis of P450s revealed their association with epigenetic, transcription and gene expression related functions. The sequence variation analysis identified 12 SNPs, eight of them in cyp6d1. There is variation in the location, size and frequency of CpG islands, and specific motifs were also identified in these P450s. Moreover, the identified motifs were associated with GO terms and transcription factors using bioinformatic tools. Transcriptome data of a spinosad-resistant strain, together with genome data, provide fundamental support for future research to understand the evolution of resistance in houseflies. Here, we report for the first time the SNPs, CpG islands and common regulatory motifs in differentially expressed P450s. Taken together, our findings will serve as a stepping stone to advance understanding of the mechanism and role of P450s in xenobiotic detoxification.
Different binding motifs of the celiac disease-associated HLA molecules DQ2.5, DQ2.2, and DQ7.5 revealed by relative quantitative proteomics of endogenous peptide repertoires.

PubMed

Bergseng, Elin; Dørum, Siri; Arntzen, Magnus Ø; Nielsen, Morten; Nygård, Ståle; Buus, Søren; de Souza, Gustavo A; Sollid, Ludvig M

2015-02-01

Celiac disease is caused by intolerance to cereal gluten proteins, and HLA-DQ molecules are involved in the disease pathogenesis by presentation of gluten peptides to CD4(+) T cells. The α- or β-chain sharing HLA molecules DQ2.5, DQ2.2, and DQ7.5 display different risks for the disease. It was recently demonstrated that T cells of DQ2.5 and DQ2.2 patients recognize distinct sets of gluten epitopes, suggesting that these two DQ2 variants select different peptides for display. To explore whether this is the case, we performed a comprehensive comparison of the endogenous self-peptides bound to HLA-DQ molecules of B-lymphoblastoid cell lines. Peptides were eluted from affinity-purified HLA molecules of nine cell lines and subjected to quadrupole orbitrap mass spectrometry and MaxQuant software analysis. Altogether, 12,712 endogenous peptides were identified at very different relative abundances. Hierarchical clustering of normalized quantitative data demonstrated significant differences in repertoires of peptides between the three DQ variant molecules. The neural network-based method, NNAlign, was used to identify peptide-binding motifs. The binding motifs of DQ2.5 and DQ7.5 concurred with previously established binding motifs. The binding motif of DQ2.2 was strikingly different from that of DQ2.5 with position P3 being a major anchor having a preference for threonine and serine. This is notable as three recently identified epitopes of gluten recognized by T cells of DQ2.2 celiac patients harbor serine at position P3. This study demonstrates that relative quantitative comparison of endogenous peptides sampled from our protein metabolism by HLA molecules provides clues to understand HLA association with disease.
Evolutionary gradient of predicted nuclear localization signals (NLS)-bearing proteins in genomes of family Planctomycetaceae.

PubMed

Guo, Min; Yang, Ruifu; Huang, Chen; Liao, Qiwen; Fan, Guangyi; Sun, Chenghang; Lee, Simon Ming-Yuen

2017-04-04

The nuclear envelope is considered a key classification marker that distinguishes prokaryotes from eukaryotes. However, this marker does not apply to the family Planctomycetaceae, which has intracellular spaces divided by lipidic intracytoplasmic membranes (ICMs). The nuclear localization signal (NLS), a short stretch of amino acid sequence, directs the transport of proteins from the cytoplasm into the nucleus and is also associated with the development of the nuclear envelope. We attempted to investigate the NLS motifs in Planctomycetaceae genomes to demonstrate the potential molecular transition in the development of the intracellular membrane system. In this study, we identified NLS-like motifs that have the same amino acid compositions as experimentally identified NLSs in genomes of 11 representative species of family Planctomycetaceae. A total of 15 NLS types and 170 NLS-bearing proteins were detected in the 11 strains. To determine the molecular transformation, we compared NLS-bearing protein abundances in the 11 representative Planctomycetaceae genomes with those in the genomes of 16 taxonomically varied microorganisms: nine bacteria, two archaea and five fungi. In the 27 strains, 29 NLS types and 1101 NLS-bearing proteins were identified; principal component analysis showed a significant transitional gradient from bacteria to Planctomycetaceae to fungi in their NLS-bearing protein abundance profiles. Then, we clustered the 993 non-redundant NLS-bearing proteins into 181 families and annotated the metabolic pathways in which they are involved. Afterwards, we aligned the ten types of NLS motifs from the 13 families containing NLS-bearing proteins among bacteria, Planctomycetaceae or fungi, considering their diversity, length and origin. A transition towards increased complexity from non-planctomycete bacteria to Planctomycetaceae to archaea and fungi was detected based on the complexity of the 10 types of NLS-like motifs in the 13 NLS-bearing protein families. The results of this study reveal that Planctomycetaceae separates slightly from the members of non-planctomycete bacteria but still has substantial differences from fungi, based on the NLS-like motifs and NLS-bearing protein analysis.
Functional Genomics Analysis of Big Data Identifies Novel Peroxisome Proliferator-Activated Receptor γ Target Single Nucleotide Polymorphisms Showing Association With Cardiometabolic Outcomes.

PubMed

Richardson, Kris; Schnitzler, Gavin R; Lai, Chao-Qiang; Ordovas, Jose M

2015-12-01

Cardiovascular disease and type 2 diabetes mellitus represent overlapping diseases where a large portion of the variation attributable to genetics remains unexplained. An important player in their pathogenesis is peroxisome proliferator-activated receptor γ (PPARγ), which is involved in lipid and glucose metabolism and maintenance of metabolic homeostasis. We used a functional genomics methodology to interrogate human chromatin immunoprecipitation-sequencing, genome-wide association studies, and expression quantitative trait locus data to inform selection of candidate functional single nucleotide polymorphisms (SNPs) falling in PPARγ motifs. We derived 27 328 chromatin immunoprecipitation-sequencing peaks for PPARγ in human adipocytes through meta-analysis of 3 data sets. The PPARγ consensus motif showed greatest enrichment and mapped to 8637 peaks. We identified 146 SNPs in these motifs. This number was significantly less than would be expected by chance, and Inference of Natural Selection from Interspersed Genomically coHerent elemenTs analysis indicated that these motifs are under weak negative selection. A screen of these SNPs against genome-wide association studies for cardiometabolic traits revealed significant enrichment with 16 SNPs. A screen against the MuTHER expression quantitative trait locus data revealed 8 of these were significantly associated with altered gene expression in human adipose, more than would be expected by chance. Several SNPs fall close to, or are linked by expression quantitative trait locus to, lipid-metabolism loci including CYP26A1. We demonstrated the use of functional genomics to identify SNPs of potential function. Specifically, SNPs within PPARγ motifs that bind PPARγ in adipocytes are significantly associated with cardiometabolic disease and with the regulation of transcription in adipose. This method may be used to uncover functional SNPs that do not reach significance thresholds in the agnostic approach of genome-wide association studies. © 2015 American Heart Association, Inc.
The human RNA-binding protein and E3 ligase MEX-3C binds the MEX-3-recognition element (MRE) motif with high affinity.

PubMed

Yang, Lingna; Wang, Chongyuan; Li, Fudong; Zhang, Jiahai; Nayab, Anam; Wu, Jihui; Shi, Yunyu; Gong, Qingguo

2017-09-29

MEX-3 is a K-homology (KH) domain-containing RNA-binding protein first identified as a translational repressor in Caenorhabditis elegans, and its four orthologs (MEX-3A-D) in human and mouse were subsequently found to have E3 ubiquitin ligase activity mediated by a RING domain and critical for RNA degradation. Current evidence implicates human MEX-3C in many essential biological processes and suggests a strong connection with immune diseases and carcinogenesis. The highly conserved dual KH domains in MEX-3 proteins enable RNA binding and are essential for the recognition of the 3'-UTR and post-transcriptional regulation of MEX-3 target transcripts. However, the molecular mechanisms of translational repression and the consensus RNA sequence recognized by the MEX-3C KH domain are unknown. Here, using X-ray crystallography and isothermal titration calorimetry, we investigated the RNA-binding activity and selectivity of human MEX-3C dual KH domains. Our high-resolution crystal structures of individual KH domains complexed with a noncanonical U-rich and a GA-rich RNA sequence revealed that the KH1/2 domains of human MEX-3C bound MRE10, a 10-mer RNA (5'-CAGAGUUUAG-3') consisting of an eight-nucleotide MEX-3-recognition element (MRE) motif, with high affinity. Of note, we also identified a consensus RNA motif recognized by human MEX-3C. The potential RNA-binding sites in the 3'-UTR of the human leukocyte antigen serotype (HLA-A2) mRNA were mapped with this RNA-binding motif and further confirmed by fluorescence polarization. The binding motif identified here will provide valuable information for future investigations of the functional pathways controlled by human MEX-3C and for predicting potential mRNAs regulated by this enzyme. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

Mutations in a Highly Conserved Motif of nsp1β Protein Attenuate the Innate Immune Suppression Function of Porcine Reproductive and Respiratory Syndrome Virus

PubMed Central

Li, Yanhua; Shyu, Duan-Liang; Shang, Pengcheng; Bai, Jianfa; Ouyang, Kang; Dhakal, Santosh; Hiremath, Jagadish; Binjawadagi, Basavaraj

2016-01-01

ABSTRACT Porcine reproductive and respiratory syndrome virus (PRRSV) nonstructural protein 1β (nsp1β) is a multifunctional viral protein, which is involved in suppressing the host innate immune response and activating a unique −2/−1 programmed ribosomal frameshifting (PRF) signal for the expression of frameshifting products. In this study, site-directed mutagenesis analysis showed that the R128A or R129A mutation introduced into a highly conserved motif (123GKYLQRRLQ131) reduced the ability of nsp1β to suppress interferon beta (IFN-β) activation and also impaired nsp1β's function as a PRF transactivator. Three recombinant viruses, vR128A, vR129A, and vRR129AA, carrying single or double mutations in the GKYLQRRLQ motif were characterized. In comparison to the wild-type (WT) virus, vR128A and vR129A showed slightly reduced growth abilities, while the vRR129AA mutant had a significantly reduced growth ability in infected cells. Consistent with the attenuated growth phenotype in vitro, pigs infected with nsp1β mutants had lower levels of viremia than did WT virus-infected pigs. Compared to the WT virus in infected cells, all three mutated viruses stimulated high levels of IFN-α expression and exhibited a reduced ability to suppress the mRNA expression of selected interferon-stimulated genes (ISGs). In pigs infected with nsp1β mutants, IFN-α production was increased in the lungs at early time points postinfection, which was correlated with increased innate NK cell function. Furthermore, the augmented innate response was consistent with the increased production of IFN-γ in pigs infected with mutated viruses. These data demonstrate that residues R128 and R129 are critical for nsp1β function and that modifying these key residues in the GKYLQRRLQ motif attenuates virus growth ability and improves the innate and adaptive immune responses in infected animals. 
IMPORTANCE PRRSV infection induces poor antiviral innate IFN and cytokine responses, which results in weak adaptive immunity. One of the strategies in next-generation vaccine construction is to manipulate viral proteins/genetic elements involved in antagonizing the host immune response. PRRSV nsp1β was identified to be a strong innate immune antagonist. In this study, two basic amino acids, R128 and R129, in a highly conserved GKYLQRRLQ motif were determined to be critical for nsp1β function. Mutations introduced into these two residues attenuated virus growth and improved the innate and adaptive immune responses of infected animals. Technologies developed in this study could be broadly applied to current commercial PRRSV modified live-virus (MLV) vaccines and other candidate vaccines. PMID:26792733
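
The residue numbering in the record above can be checked with a short sketch: with the motif written as 123GKYLQRRLQ131, residue 128 and residue 129 should both be arginine. The helper `apply_mutation` and the constant names are hypothetical, introduced only to illustrate how substitution labels such as R128A map onto the motif string; they are not taken from the study.

```python
MOTIF = "GKYLQRRLQ"  # conserved motif quoted in the abstract
MOTIF_START = 123    # residue number of the first G (from 123GKYLQRRLQ131)


def apply_mutation(motif, start, mutation):
    """Apply a substitution written as <wild type><position><new>, e.g. 'R128A'.

    Positions are 1-based protein residue numbers; `start` is the residue
    number of the first character of `motif`.
    """
    wild_type, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position - start
    if not 0 <= index < len(motif):
        raise ValueError(f"{mutation}: position outside the motif")
    if motif[index] != wild_type:
        raise ValueError(f"{mutation}: expected {wild_type}, found {motif[index]}")
    return motif[:index] + new + motif[index + 1:]


r128a = apply_mutation(MOTIF, MOTIF_START, "R128A")
r129a = apply_mutation(MOTIF, MOTIF_START, "R129A")
double = apply_mutation(r128a, MOTIF_START, "R129A")
last_residue = MOTIF_START + len(MOTIF) - 1

print(r128a, r129a, double, last_residue)
```

The wild-type check inside `apply_mutation` is the useful part: it fails loudly if a mutation label and the assumed numbering disagree.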
A cis-regulatory module activating transcription in the suspensor contains five cis-regulatory elements

DOE PAGES

Henry, Kelli F.; Kawashima, Tomokazu; Goldberg, Robert B.

2015-03-22

Little is known about the molecular mechanisms by which the embryo proper and suspensor of plant embryos activate specific gene sets shortly after fertilization. We analyzed the upstream region of the Scarlet Runner Bean (Phaseolus coccineus) G564 gene in order to understand how genes are activated specifically in the suspensor during early embryo development. Previously, we showed that a 54-bp fragment of the G564 upstream region is sufficient for suspensor transcription and contains at least three required cis-regulatory sequences, including the 10-bp motif (5'-GAAAAGCGAA-3'), the 10 bp-like motif (5'-GAAAAACGAA-3'), and Region 2 motif (partial sequence 5'-TTGGT-3'). Here, we use site-directed mutagenesis experiments in transgenic tobacco globular-stage embryos to identify two additional cis-regulatory elements within the 54-bp cis-regulatory module that are required for G564 suspensor transcription: the Fifth motif (5'-GAGTTA-3') and a third 10-bp-related sequence (5'-GAAAACCACA-3'). Further deletion of the 54-bp fragment revealed that a 47-bp fragment containing the five motifs (the 10-bp, 10-bp-like, 10-bp-related, Region 2 and Fifth motifs) is sufficient for suspensor transcription, and represents a cis-regulatory module. A consensus sequence for each type of motif was determined by comparing motif sequences shown to activate suspensor transcription. Phylogenetic analyses suggest that the regulation of G564 is evolutionarily conserved. Lastly, a homologous cis-regulatory module was found upstream of the G564 ortholog in the Common Bean (Phaseolus vulgaris), indicating that the regulation of G564 is evolutionarily conserved in closely related bean species.
A cis-regulatory module activating transcription in the suspensor contains five cis-regulatory elements

DOE Office of Scientific and Technical Information (OSTI.GOV)

Henry, Kelli F.; Kawashima, Tomokazu; Goldberg, Robert B.

Little is known about the molecular mechanisms by which the embryo proper and suspensor of plant embryos activate specific gene sets shortly after fertilization. We analyzed the upstream region of the Scarlet Runner Bean (Phaseolus coccineus) G564 gene in order to understand how genes are activated specifically in the suspensor during early embryo development. Previously, we showed that a 54-bp fragment of the G564 upstream region is sufficient for suspensor transcription and contains at least three required cis-regulatory sequences, including the 10-bp motif (5'-GAAAAGCGAA-3'), the 10 bp-like motif (5'-GAAAAACGAA-3'), and Region 2 motif (partial sequence 5'-TTGGT-3'). Here, we use site-directed mutagenesis experiments in transgenic tobacco globular-stage embryos to identify two additional cis-regulatory elements within the 54-bp cis-regulatory module that are required for G564 suspensor transcription: the Fifth motif (5'-GAGTTA-3') and a third 10-bp-related sequence (5'-GAAAACCACA-3'). Further deletion of the 54-bp fragment revealed that a 47-bp fragment containing the five motifs (the 10-bp, 10-bp-like, 10-bp-related, Region 2 and Fifth motifs) is sufficient for suspensor transcription, and represents a cis-regulatory module. A consensus sequence for each type of motif was determined by comparing motif sequences shown to activate suspensor transcription. Phylogenetic analyses suggest that the regulation of G564 is evolutionarily conserved. Lastly, a homologous cis-regulatory module was found upstream of the G564 ortholog in the Common Bean (Phaseolus vulgaris), indicating that the regulation of G564 is evolutionarily conserved in closely related bean species.
FISim: A new similarity measure between transcription factor binding sites based on the fuzzy integral

PubMed Central

Garcia, Fernando; Lopez, Francisco J; Cano, Carlos; Blanco, Armando

2009-01-01

Background: Regulatory motifs describe sets of related transcription factor binding sites (TFBSs) and can be represented as position frequency matrices (PFMs). De novo identification of TFBSs is a crucial problem in computational biology which includes the issue of comparing putative motifs with one another and with motifs that are already known. The relative importance of each nucleotide within a given position in the PFMs should be considered in order to compute PFM similarities. Furthermore, biological data are inherently noisy and imprecise. Fuzzy set theory is particularly suitable for modeling imprecise data, whereas fuzzy integrals are highly appropriate for representing the interaction among different information sources. Results: We propose FISim, a new similarity measure between PFMs, based on the fuzzy integral of the distance of the nucleotides with respect to the information content of the positions. Unlike existing methods, FISim is designed to consider the higher contribution of better conserved positions to the binding affinity. FISim provides excellent results when dealing with sets of randomly generated motifs, and outperforms the remaining methods when handling real datasets of related motifs. Furthermore, we propose a new cluster methodology based on kernel theory together with FISim to obtain groups of related motifs potentially bound by the same TFs, providing more robust results than existing approaches. Conclusion: FISim corrects a design flaw of the most popular methods, whose measures favour similarity of low information content positions. We use our measure to successfully identify motifs that describe binding sites for the same TF and to solve real-life problems. In this study the reliability of fuzzy technology for motif comparison tasks is proven. PMID:19615102
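
The abstract above weights positions by their information content. As a minimal sketch of that quantity only (not of FISim or its fuzzy integral), the standard Shannon information content of each PFM column can be computed as follows, assuming a uniform background over A, C, G and T. The function name `information_content`, the `pseudocount` parameter and the toy matrix are illustrative assumptions.

```python
import math


def information_content(pfm, pseudocount=0.0):
    """Per-position information content in bits for a PFM.

    `pfm` is a list of columns; each column maps A, C, G, T to a count.
    Assumes a uniform background, so the maximum is log2(4) = 2 bits.
    """
    bits = []
    for column in pfm:
        total = sum(column.get(b, 0) for b in "ACGT") + 4 * pseudocount
        entropy = 0.0
        for base in "ACGT":
            p = (column.get(base, 0) + pseudocount) / total
            if p > 0:
                entropy -= p * math.log2(p)
        bits.append(2.0 - entropy)
    return bits


# Toy matrix, invented for this sketch.
pfm = [
    {"A": 10, "C": 0, "G": 0, "T": 0},  # fully conserved position
    {"A": 5, "C": 5, "G": 0, "T": 0},   # two equally likely bases
    {"A": 5, "C": 5, "G": 5, "T": 5},   # uninformative position
]
ic = information_content(pfm)
print([round(x, 3) for x in ic])
```

A similarity measure that ignores these values treats the third column like the first, which is the kind of design flaw the abstract attributes to popular methods.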
A cis-regulatory module activating transcription in the suspensor contains five cis-regulatory elements.

PubMed

Henry, Kelli F; Kawashima, Tomokazu; Goldberg, Robert B

2015-06-01

Little is known about the molecular mechanisms by which the embryo proper and suspensor of plant embryos activate specific gene sets shortly after fertilization. We analyzed the upstream region of the Scarlet Runner Bean (Phaseolus coccineus) G564 gene in order to understand how genes are activated specifically in the suspensor during early embryo development. Previously, we showed that a 54-bp fragment of the G564 upstream region is sufficient for suspensor transcription and contains at least three required cis-regulatory sequences, including the 10-bp motif (5'-GAAAAGCGAA-3'), the 10 bp-like motif (5'-GAAAAACGAA-3'), and Region 2 motif (partial sequence 5'-TTGGT-3'). Here, we use site-directed mutagenesis experiments in transgenic tobacco globular-stage embryos to identify two additional cis-regulatory elements within the 54-bp cis-regulatory module that are required for G564 suspensor transcription: the Fifth motif (5'-GAGTTA-3') and a third 10-bp-related sequence (5'-GAAAACCACA-3'). Further deletion of the 54-bp fragment revealed that a 47-bp fragment containing the five motifs (the 10-bp, 10-bp-like, 10-bp-related, Region 2 and Fifth motifs) is sufficient for suspensor transcription, and represents a cis-regulatory module. A consensus sequence for each type of motif was determined by comparing motif sequences shown to activate suspensor transcription. Phylogenetic analyses suggest that the regulation of G564 is evolutionarily conserved. A homologous cis-regulatory module was found upstream of the G564 ortholog in the Common Bean (Phaseolus vulgaris), indicating that the regulation of G564 is evolutionarily conserved in closely related bean species.
Two alternative ways of start site selection in human norovirus reinitiation of translation.

PubMed

Luttermann, Christine; Meyers, Gregor

2014-04-25

The calicivirus minor capsid protein VP2 is expressed via termination/reinitiation. This process depends on an upstream sequence element denoted termination upstream ribosomal binding site (TURBS). We have shown for feline calicivirus and rabbit hemorrhagic disease virus that the TURBS contains three sequence motifs essential for reinitiation. Motif 1 is conserved among caliciviruses and is complementary to a sequence in the 18S rRNA, leading to the model that hybridization between motif 1 and 18S rRNA tethers the post-termination ribosome to the mRNA. Motif 2 and motif 2* are proposed to establish a secondary structure positioning the ribosome relative to the start site of the terminal ORF. Here, we analyzed human norovirus (huNV) sequences for the presence and importance of these motifs. The three motifs were identified by sequence analyses in the region upstream of the VP2 start site, and we showed that these motifs are essential for reinitiation of huNV VP2 translation. More detailed analyses revealed that the site of reinitiation is not fixed to a single codon and does not need to be an AUG, even though this codon is clearly preferred. Interestingly, we were able to show that reinitiation can occur at AUG codons downstream of the canonical start/stop site in huNV and feline calicivirus but not in rabbit hemorrhagic disease virus. Although reinitiation at the original start site is independent of the Kozak context, downstream initiation exhibits requirements for start site sequence context known for linear scanning. These analyses on start codon recognition give a more detailed insight into this fascinating mechanism of gene expression.
Complex Interplay among DNA Modification, Noncoding RNA Expression and Protein-Coding RNA Expression in Salvia miltiorrhiza Chloroplast Genome

PubMed Central

Chen, Haimei; Zhang, Jianhui; Yuan, George; Liu, Chang

2014-01-01

Salvia miltiorrhiza is one of the most widely used medicinal plants. As a first step to develop a chloroplast-based genetic engineering method for the over-production of active components from S. miltiorrhiza, we have analyzed the genome, transcriptome, and base modifications of the S. miltiorrhiza chloroplast. Total genomic DNA and RNA were extracted from fresh leaves and then subjected to strand-specific RNA-Seq and Single-Molecule Real-Time (SMRT) sequencing analyses. Mapping the RNA-Seq reads to the genome assembly allowed us to determine the relative expression levels of 80 protein-coding genes. In addition, we identified 19 polycistronic transcription units and 136 putative antisense and intergenic noncoding RNA (ncRNA) genes. Comparison of the abundance of protein-coding transcripts (cRNA) with and without overlapping antisense ncRNAs (asRNA) suggests that the presence of asRNA is associated with increased cRNA abundance (p<0.05). Using the SMRT Portal software (v1.3.2), 2687 potential DNA modification sites and two potential DNA modification motifs were predicted. The two motifs include a TATA box–like motif (CPGDMM1, “TATANNNATNA”), and an unknown motif (CPGDMM2 “WNYANTGAW”). Specifically, 35 of the 97 CPGDMM1 motifs (36.1%) and 91 of the 369 CPGDMM2 motifs (24.7%) were found to be significantly modified (p<0.01). Analysis of genes downstream of the CPGDMM1 motif revealed the significantly increased abundance of ncRNA genes that are less than 400 bp away from the significantly modified CPGDMM1 motif (p<0.01). Taken together, the present study revealed a complex interplay among DNA modifications, ncRNA and cRNA expression in the chloroplast genome. PMID:24914614
Complex interplay among DNA modification, noncoding RNA expression and protein-coding RNA expression in Salvia miltiorrhiza chloroplast genome.

PubMed

Chen, Haimei; Zhang, Jianhui; Yuan, George; Liu, Chang

2014-01-01

Salvia miltiorrhiza is one of the most widely used medicinal plants. As a first step to develop a chloroplast-based genetic engineering method for the over-production of active components from S. miltiorrhiza, we have analyzed the genome, transcriptome, and base modifications of the S. miltiorrhiza chloroplast. Total genomic DNA and RNA were extracted from fresh leaves and then subjected to strand-specific RNA-Seq and Single-Molecule Real-Time (SMRT) sequencing analyses. Mapping the RNA-Seq reads to the genome assembly allowed us to determine the relative expression levels of 80 protein-coding genes. In addition, we identified 19 polycistronic transcription units and 136 putative antisense and intergenic noncoding RNA (ncRNA) genes. Comparison of the abundance of protein-coding transcripts (cRNA) with and without overlapping antisense ncRNAs (asRNA) suggests that the presence of asRNA is associated with increased cRNA abundance (p<0.05). Using the SMRT Portal software (v1.3.2), 2687 potential DNA modification sites and two potential DNA modification motifs were predicted. The two motifs include a TATA box-like motif (CPGDMM1, "TATANNNATNA"), and an unknown motif (CPGDMM2 "WNYANTGAW"). Specifically, 35 of the 97 CPGDMM1 motifs (36.1%) and 91 of the 369 CPGDMM2 motifs (24.7%) were found to be significantly modified (p<0.01). Analysis of genes downstream of the CPGDMM1 motif revealed the significantly increased abundance of ncRNA genes that are less than 400 bp away from the significantly modified CPGDMM1 motif (p<0.01). Taken together, the present study revealed a complex interplay among DNA modifications, ncRNA and cRNA expression in the chloroplast genome.
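The two modification motifs above are written in IUPAC degenerate notation. The sketch below shows one way such a motif could be expanded into a regular expression and located in a sequence. It is not the SMRT Portal procedure: the helper names and the toy sequence are hypothetical, and only the forward strand is scanned. The last line rechecks the two percentages quoted in the abstract.

```python
import re

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def motif_regex(motif):
    """Compile a degenerate motif; the lookahead lets overlapping sites be found."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")

def find_sites(motif, seq):
    """Return (0-based start, matched text) for every forward-strand occurrence."""
    return [(m.start(), m.group(1)) for m in motif_regex(motif).finditer(seq.upper())]

def percent(part, total):
    return round(100.0 * part / total, 1)

# Hypothetical toy sequence built to contain one instance of each motif.
toy = "GGTATACCCATGACCAGCATTGATGG"
cpgdmm1_sites = find_sites("TATANNNATNA", toy)
cpgdmm2_sites = find_sites("WNYANTGAW", toy)
print(cpgdmm1_sites)  # [(2, 'TATACCCATGA')]
print(cpgdmm2_sites)  # [(15, 'AGCATTGAT')]
print(percent(35, 97), percent(91, 369))  # 36.1 24.7
```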
Affinity and specificity of interactions between Nedd4 isoforms and the epithelial Na+ channel.

PubMed

Henry, Pauline C; Kanelis, Voula; O'Brien, M Christine; Kim, Brian; Gautschi, Ivan; Forman-Kay, Julie; Schild, Laurent; Rotin, Daniela

2003-05-30

The epithelial Na+ channel (alphabetagammaENaC) regulates salt and fluid homeostasis and blood pressure. Each ENaC subunit contains a PY motif (PPXY) that binds to the WW domains of Nedd4, a Hect family ubiquitin ligase containing 3-4 WW domains and usually a C2 domain. It has been proposed that Nedd4-2, but not Nedd4-1, isoforms can bind to and suppress ENaC activity. Here we challenge this notion and show that, instead, the presence of a unique WW domain (WW3*) in either Nedd4-2 or Nedd4-1 determines high affinity interactions and the ability to suppress ENaC. WW3* from either Nedd4-2 or Nedd4-1 binds ENaC-PY motifs equally well (e.g. Kd approximately 10 microm for alpha- or betaENaC, 3-6-fold higher affinity than WW4), as determined by intrinsic tryptophan fluorescence. Moreover, dNedd4-1, which naturally contains a WW3* instead of WW2, is able to suppress ENaC function equally well as Nedd4-2. Homology models of the WW3*.betaENaC-PY complex revealed that a Pro and Ala conserved in all WW3*, but not other Nedd4-WW domains, help form the binding pocket for PY motif prolines. Extensive contacts are formed between the betaENaC-PY motif and the Pro in WW3*, and the small Ala creates a large pocket to accommodate the peptide. Indeed, mutating the conserved Pro and Ala in WW3* reduces binding affinity 2-3-fold. Additionally, we demonstrate that mutations in PY motif residues that form contacts with the WW domain based on our previously solved structure either abolish or severely reduce binding affinity to the WW domain and that the extent of binding correlates with the level of ENaC suppression. Independently, we show that a peptide encompassing the PY motif of sgk1, previously proposed to bind to Nedd4-2 and alter its ability to regulate ENaC, does not bind (or binds poorly) the WW domains of Nedd4-2. 
Collectively, these results suggest that high affinity of WW domain-PY-motif interactions rather than affiliation with Nedd4-1/Nedd4-2 is critical for ENaC suppression by Nedd4 proteins.
Finding functional features in Saccharomyces genomes by phylogenetic footprinting.

PubMed

Cliften, Paul; Sudarsanam, Priya; Desikan, Ashwin; Fulton, Lucinda; Fulton, Bob; Majors, John; Waterston, Robert; Cohen, Barak A; Johnston, Mark

2003-07-04

The sifting and winnowing of DNA sequence that occur during evolution cause nonfunctional sequences to diverge, leaving phylogenetic footprints of functional sequence elements in comparisons of genome sequences. We searched for such footprints among the genome sequences of six Saccharomyces species and identified potentially functional sequences. Comparison of these sequences allowed us to revise the catalog of yeast genes and identify sequence motifs that may be targets of transcriptional regulatory proteins. Some of these conserved sequence motifs reside upstream of genes with similar functional annotations or similar expression patterns or those bound by the same transcription factor and are thus good candidates for functional regulatory sequences.
Survey of protein–DNA interactions in Aspergillus oryzae on a genomic scale

PubMed Central

Wang, Chao; Lv, Yangyong; Wang, Bin; Yin, Chao; Lin, Ying; Pan, Li

2015-01-01

The genome-scale delineation of in vivo protein–DNA interactions is key to understanding genome function. Only ∼5% of transcription factors (TFs) in the Aspergillus genus have been identified using traditional methods. Although the Aspergillus oryzae genome contains >600 TFs, knowledge of the in vivo genome-wide TF-binding sites (TFBSs) in aspergilli remains limited because of the lack of high-quality antibodies. We investigated the landscape of in vivo protein–DNA interactions across the A. oryzae genome through coupling the DNase I digestion of intact nuclei with massively parallel sequencing and the analysis of cleavage patterns in protein–DNA interactions at single-nucleotide resolution. The resulting map identified overrepresented de novo TF-binding motifs from genomic footprints, and provided the detailed chromatin remodeling patterns and the distribution of digital footprints near transcription start sites. The TFBSs of 19 known Aspergillus TFs were also identified based on DNase I digestion data surrounding potential binding sites in conjunction with TF binding specificity information. We observed that the cleavage patterns of TFBSs were dependent on the orientation of TF motifs and independent of strand orientation, consistent with the DNA shape features of binding motifs with flanking sequences. PMID:25883143
Comparative Phosphoproteomics Reveals the Role of AmpC β-lactamase Phosphorylation in the Clinical Imipenem-resistant Strain Acinetobacter baumannii SK17.

PubMed

Lai, Juo-Hsin; Yang, Jhih-Tian; Chern, Jeffy; Chen, Te-Li; Wu, Wan-Ling; Liao, Jiahn-Haur; Tsai, Shih-Feng; Liang, Suh-Yuen; Chou, Chi-Chi; Wu, Shih-Hsiung

2016-01-01

Nosocomial infectious outbreaks caused by multidrug-resistant Acinetobacter baumannii have emerged as a serious threat to human health. Phosphoproteomics of pathogenic bacteria has been used to identify the mechanisms of bacterial virulence and antimicrobial resistance. In this study, we used a shotgun strategy combined with high-accuracy mass spectrometry to analyze the phosphoproteomics of the imipenem-susceptible strain SK17-S and -resistant strain SK17-R. We identified 410 phosphosites on 248 unique phosphoproteins in SK17-S and 285 phosphosites on 211 unique phosphoproteins in SK17-R. The distributions of the Ser/Thr/Tyr/Asp/His phosphosites in SK17-S and SK17-R were 47.0%/27.6%/12.4%/8.0%/4.9% versus 41.4%/29.5%/17.5%/6.7%/4.9%, respectively. The Ser-90 phosphosite, located on the catalytic motif S(88)VS(90)K of the AmpC β-lactamase, was first identified in SK17-S. Based on site-directed mutagenesis, the nonphosphorylatable mutant S90A was found to be more resistant to imipenem, whereas the phosphorylation-simulated mutant S90D was sensitive to imipenem. Additionally, the S90A mutant protein exhibited higher β-lactamase activity and conferred greater bacterial protection against imipenem in SK17-S compared with the wild-type. In sum, our results revealed that in A. baumannii, Ser-90 phosphorylation of AmpC negatively regulates both β-lactamase activity and the ability to counteract the antibiotic effects of imipenem. These findings highlight the impact of phosphorylation-mediated regulation in antibiotic-resistant bacteria on future drug design and new therapies. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Comparative Phosphoproteomics Reveals the Role of AmpC β-lactamase Phosphorylation in the Clinical Imipenem-resistant Strain Acinetobacter baumannii SK17*

PubMed Central

Lai, Juo-Hsin; Yang, Jhih-Tian; Chern, Jeffy; Chen, Te-Li; Wu, Wan-Ling; Liao, Jiahn-Haur; Tsai, Shih-Feng; Liang, Suh-Yuen; Chou, Chi-Chi

2016-01-01

Nosocomial infectious outbreaks caused by multidrug-resistant Acinetobacter baumannii have emerged as a serious threat to human health. Phosphoproteomics of pathogenic bacteria has been used to identify the mechanisms of bacterial virulence and antimicrobial resistance. In this study, we used a shotgun strategy combined with high-accuracy mass spectrometry to analyze the phosphoproteomics of the imipenem-susceptible strain SK17-S and -resistant strain SK17-R. We identified 410 phosphosites on 248 unique phosphoproteins in SK17-S and 285 phosphosites on 211 unique phosphoproteins in SK17-R. The distributions of the Ser/Thr/Tyr/Asp/His phosphosites in SK17-S and SK17-R were 47.0%/27.6%/12.4%/8.0%/4.9% versus 41.4%/29.5%/17.5%/6.7%/4.9%, respectively. The Ser-90 phosphosite, located on the catalytic motif S88VS90K of the AmpC β-lactamase, was first identified in SK17-S. Based on site-directed mutagenesis, the nonphosphorylatable mutant S90A was found to be more resistant to imipenem, whereas the phosphorylation-simulated mutant S90D was sensitive to imipenem. Additionally, the S90A mutant protein exhibited higher β-lactamase activity and conferred greater bacterial protection against imipenem in SK17-S compared with the wild-type. In sum, our results revealed that in A. baumannii, Ser-90 phosphorylation of AmpC negatively regulates both β-lactamase activity and the ability to counteract the antibiotic effects of imipenem. These findings highlight the impact of phosphorylation-mediated regulation in antibiotic-resistant bacteria on future drug design and new therapies. PMID:26499836
A comprehensive proteomics and genomics analysis reveals novel transmembrane proteins in human platelets and mouse megakaryocytes including G6b-B, a novel ITIM protein

PubMed Central

Senis, Yotis A.; Tomlinson, Michael G.; García, Ángel; Dumon, Stephanie; Heath, Victoria L.; Herbert, John; Cobbold, Stephen P.; Spalton, Jennifer C.; Ayman, Sinem; Antrobus, Robin; Zitzmann, Nicole; Bicknell, Roy; Frampton, Jon; Authi, Kalwant; Martin, Ashley; Wakelam, Michael J.O.; Watson, Stephen P.

2007-01-01

Summary: The platelet surface is poorly characterized due to the low abundance of many membrane proteins and the lack of specialist tools for their investigation. In this study we have identified novel human platelet and mouse megakaryocyte membrane proteins using specialist proteomic and genomic approaches. Three separate methods were used to enrich platelet surface proteins prior to identification by liquid chromatography and tandem mass spectrometry: lectin affinity chromatography; biotin/NeutrAvidin affinity chromatography; and free flow electrophoresis. Many known, abundant platelet surface transmembrane proteins and several novel proteins were identified using each receptor enrichment strategy. In total, two or more unique peptides were identified for 46, 68 and 22 surface membrane, intracellular membrane and membrane proteins of unknown sub-cellular localization, respectively. The majority of these were single transmembrane proteins. To complement the proteomic studies, we analysed the transcriptome of a highly purified preparation of mature primary mouse megakaryocytes using serial analysis of gene expression in view of the increasing importance of mutant mouse models in establishing protein function in platelets. This approach identified all of the major classes of platelet transmembrane receptors, including multi-transmembrane proteins. Strikingly, 17 of the 25 most megakaryocyte-specific genes (relative to 30 other SAGE libraries) were transmembrane proteins, illustrating the unique nature of the megakaryocyte/platelet surface. The list of novel plasma membrane proteins identified using proteomics includes the immunoglobulin superfamily member G6b, which undergoes extensive alternate splicing. Specific antibodies were used to demonstrate expression of the G6b-B isoform, which contains an immunoreceptor tyrosine-based inhibition motif. G6b-B undergoes tyrosine phosphorylation and association with the SH2-containing phosphatase, SHP-1, in stimulated platelets suggesting that it may play a novel role in limiting platelet activation. PMID:17186946
DOE Office of Scientific and Technical Information (OSTI.GOV)

Yu, Bingke; Cheng, Hui-Chun; Brautigam, Chad A.

Vibrio parahaemolyticus protein L (VopL) is an actin nucleation factor that induces stress fibers when injected into eukaryotic host cells. VopL contains three N-terminal Wiskott-Aldrich homology 2 (WH2) motifs and a unique VopL C-terminal domain (VCD). We describe crystallographic and biochemical analyses of filament nucleation by VopL. The WH2 element of VopL does not nucleate on its own and requires the VCD for activity. The VCD forms a U-shaped dimer in the crystal, stabilized by a terminal coiled coil. Dimerization of the WH2 motifs contributes strongly to nucleation activity, as do contacts of the VCD to actin. Our data lead to a model in which VopL stabilizes primarily lateral (short-pitch) contacts between actin monomers to create the base of a two-stranded filament. Stabilization of lateral contacts may be a common feature of actin filament nucleation by WH2-based factors.
Covering of Discrete Quasiperiodic Sets: Concepts and Theory

NASA Astrophysics Data System (ADS)

Kramer, Peter

The packing of congruent, convex, impenetrable bodies in 3-space has obvious practical applications. Tilings are, in a sense, optimal packings, leaving no space between the bodies. Their applications range from practical tilings or tessellations of walls and areas of ground, through structure determination in crystallography and the physics of crystalline matter, to aperiodic tilings and to the mathematical analysis of topological manifolds and their applications in cosmology. In many applications, a local motif is uniquely related to a body or geometric object. The geometric arrangement then generates a pattern with this motif. In covering, one allows the overlap of the geometric objects. This survey of approaches to covering shows the variety of pathways opened up in this new field. In the theory of covering there arise a number of distinctions, some of which will be taken up in the other contributions to this book.
Crystal structures of 6-chloroindan-1-one and 6-bromoindan-1-one exhibit different intermolecular packing interactions

PubMed Central

Caruso, Alessio; Blair, Benjamin; Tanski, Joseph M.

2016-01-01

The two title compounds are analogs of 1-indanone that are substituted at the 6-position with chlorine and bromine. Although very similar in molecular structure, the crystal structures are not isomorphous and reveal that 6-chloroindan-1-one, C9H7ClO (I), and 6-bromoindan-1-one, C9H7BrO (II), exhibit unique intermolecular packing motifs. The molecules of the chloro analog (I) pack with a herringbone packing motif of C—H⋯O interactions, whereas the bromo derivative (II) packs with offset face-to-face π-stacking, C—H⋯O, C—H⋯Br and Br⋯O interactions. Compound (II) was refined as a two-component non-merohedral twin, BASF 0.0762 (5). PMID:27840702
Identification of early zygotic genes in the yellow fever mosquito Aedes aegypti and discovery of a motif involved in early zygotic genome activation.

PubMed

Biedler, James K; Hu, Wanqi; Tae, Hongseok; Tu, Zhijian

2012-01-01

During early embryogenesis the zygotic genome is transcriptionally silent and all mRNAs present are of maternal origin. The maternal-zygotic transition marks the time over which embryogenesis changes its dependence from maternal RNAs to zygotically transcribed RNAs. Here we present the first systematic investigation of early zygotic genes (EZGs) in a mosquito species and focus on genes involved in the onset of transcription during 2-4 hr. We used transcriptome sequencing to identify the "pure" (without maternal expression) EZGs by analyzing transcripts from four embryonic time ranges of 0-2, 2-4, 4-8, and 8-12 hr, which includes the time of cellular blastoderm formation and up to the start of gastrulation. Blast of 16,789 annotated transcripts vs. the transcriptome reads revealed evidence for 63 (P<0.001) and 143 (P<0.05) nonmaternally derived transcripts having a significant increase in expression at 2-4 hr. One third of the 63 EZG transcripts do not have predicted introns compared to 10% of all Ae. aegypti genes. We have confirmed by RT-PCR that zygotic transcription starts as early as 2-3 hours. A degenerate motif VBRGGTA was found to be overrepresented in the upstream sequences of the identified EZGs using a motif identification software called SCOPE. We find evidence for homology between this motif and the TAGteam motif found in Drosophila that has been implicated in EZG activation. A 38 bp sequence in the proximal upstream sequence of a kinesin light chain EZG (KLC2.1) contains two copies of the mosquito motif. This sequence was shown to support EZG transcription by luciferase reporter assays performed on injected early embryos, and confers early zygotic activity to a heterologous promoter from a divergent mosquito species. The results of these studies are consistent with the model of early zygotic genome activation via transcriptional activators, similar to what has been found recently in Drosophila.
A Genome-Wide Survey of the Microsatellite Content of the Globe Artichoke Genome and the Development of a Web-Based Database

PubMed Central

Portis, Ezio; Portis, Flavio; Valente, Luisa; Moglia, Andrea; Barchi, Lorenzo; Lanteri, Sergio; Acquadro, Alberto

2016-01-01

The recently acquired genome sequence of globe artichoke (Cynara cardunculus var. scolymus) has been used to catalog the genome’s content of simple sequence repeat (SSR) markers. More than 177,000 perfect SSRs were revealed, equivalent to an overall density across the genome of 244.5 SSRs/Mbp, but some 224,000 imperfect SSRs were also identified. About 21% of these SSRs were complex (two stretches of repeats separated by <100 nt). Some 73% of the SSRs were composed of dinucleotide motifs. The SSRs were categorized for the numbers of repeats present, their overall length and were allocated to their linkage group. A total of 4,761 perfect and 6,583 imperfect SSRs were present in 3,781 genes (14.11% of the total), corresponding to an overall density across the gene space of 32.5 and 44.9 SSRs/Mbp for perfect and imperfect motifs, respectively. A putative function has been assigned, using the gene ontology approach, to the set of genes harboring at least one SSR. The same search parameters were applied to reveal the SSR content of 14 other plant species for which genome sequence is available. Certain species-specific SSR motifs were identified, along with a hexa-nucleotide motif shared only with the other two Compositae species (sunflower (Helianthus annuus) and horseweed (Conyza canadensis)) included in the study. Finally, a database, called “Cynara cardunculus MicroSatellite DataBase” (CyMSatDB) was developed to provide a searchable interface to the SSR data. CyMSatDB facilitates the retrieval of SSR markers, as well as suggested forward and reverse primers, on the basis of genomic location, genomic vs genic context, perfect vs imperfect repeat, motif type, motif sequence and repeat number. The SSR markers were validated via an in silico based PCR analysis adopting two available assembled transcriptomes, derived from contrasting globe artichoke accessions, as templates. PMID:27648830
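To make the idea of a perfect SSR and of an SSRs/Mbp density concrete, here is a minimal regex-based sketch. The minimum repeat counts and the toy sequence are assumptions for illustration, not the search parameters used in the study, and the function names are hypothetical.

```python
import re

# Minimum number of repeat units per motif length (assumed thresholds).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (0-based start, repeat unit, repeat count) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-mer followed by at least n-1 further exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter unit ("AA", "ATAT").
            if any(unit == unit[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return sorted(hits)

def density_per_mbp(count, span_bp):
    """SSR count normalised to one million base pairs of sequence."""
    return count / (span_bp / 1e6)

# Hypothetical toy sequence: an (AT)8 dinucleotide SSR and a (GAA)5 trinucleotide SSR.
toy = "GGC" + "AT" * 8 + "CCG" + "GAA" * 5 + "TT"
ssrs = find_perfect_ssrs(toy)
print(ssrs)  # [(3, 'AT', 8), (22, 'GAA', 5)]
```

Imperfect and complex SSRs, which the abstract also counts, would need mismatch-tolerant matching and merging of neighbouring hits, which this sketch does not attempt.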
Systems analysis of cis-regulatory motifs in C4 photosynthesis genes using maize and rice leaf transcriptomic data during a process of de-etiolation.

PubMed

Xu, Jiajia; Bräutigam, Andrea; Weber, Andreas P M; Zhu, Xin-Guang

2016-09-01

Identification of potential cis-regulatory motifs controlling the development of C4 photosynthesis is a major focus of current research. In this study, we used time-series RNA-seq data collected from etiolated maize and rice leaf tissues sampled during a de-etiolation process to systematically characterize the expression patterns of C4-related genes and to further identify potential cis elements in five different genomic regions (i.e. promoter, 5'UTR, 3'UTR, intron, and coding sequence) of C4 orthologous genes. The results demonstrate that although most of the C4 genes show similar expression patterns, a number of them, including chloroplast dicarboxylate transporter 1, aspartate aminotransferase, and triose phosphate transporter, show shifted expression patterns compared with their C3 counterparts. A number of conserved short DNA motifs between maize C4 genes and their rice orthologous genes were identified not only in the promoter, 5'UTR, 3'UTR, and coding sequences, but also in the introns of core C4 genes. We also identified cis-regulatory motifs that exist in maize C4 genes and also in genes showing similar expression patterns as maize C4 genes but that do not exist in rice C3 orthologs, suggesting a possible recruitment of pre-existing cis-elements from genes unrelated to C4 photosynthesis into C4 photosynthesis genes during C4 evolution. © The Author 2016. Published by Oxford University Press on behalf of the Society for Experimental Biology.

A comparative hidden Markov model analysis pipeline identifies proteins characteristic of cereal-infecting fungi

PubMed Central

2013-01-01

Background: Fungal pathogens cause devastating losses in economically important cereal crops by utilising pathogen proteins to infect host plants. Secreted pathogen proteins are referred to as effectors and have thus far been identified by selecting small, cysteine-rich peptides from the secretome despite increasing evidence that not all effectors share these attributes. Results: We take advantage of the availability of sequenced fungal genomes and present an unbiased method for finding putative pathogen proteins and secreted effectors in a query genome via comparative hidden Markov model analyses followed by unsupervised protein clustering. Our method returns experimentally validated fungal effectors in Stagonospora nodorum and Fusarium oxysporum as well as the N-terminal Y/F/WxC-motif from the barley powdery mildew pathogen. Application to the cereal pathogen Fusarium graminearum reveals a secreted phosphorylcholine phosphatase that is characteristic of hemibiotrophic and necrotrophic cereal pathogens and shares an ancient selection process with bacterial plant pathogens. Three F. graminearum protein clusters are found with an enriched secretion signal. One of these putative effector clusters contains proteins that share a [SG]-P-C-[KR]-P sequence motif in the N-terminal and show features not commonly associated with fungal effectors. This motif is conserved in secreted pathogenic Fusarium proteins and a prime candidate for functional testing. Conclusions: Our pipeline has successfully uncovered conservation patterns, putative effectors and motifs of fungal pathogens that would have been overlooked by existing approaches that identify effectors as small, secreted, cysteine-rich peptides. It can be applied to any pathogenic proteome data, such as microbial pathogen data of plants and other organisms. PMID:24252298
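The [SG]-P-C-[KR]-P motif above is written in a PROSITE-like notation, which maps almost directly onto a regular expression. The sketch below handles only single residues, bracketed alternatives and the wildcard x. The function name and both peptide strings are hypothetical, and this is not part of the authors' HMM pipeline.

```python
import re

def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern (elements joined by '-') to a regex.

    Supports single residues, [..] alternatives and 'x' for any residue only.
    """
    parts = []
    for element in pattern.split("-"):
        parts.append("." if element.lower() == "x" else element)
    return re.compile("".join(parts))

# Hypothetical protein fragments, invented for illustration.
spckp = prosite_to_regex("[SG]-P-C-[KR]-P")
hit = spckp.search("MKFSLLAVASLLGSPCKPTTA")
print(hit.start(), hit.group(0))  # 13 SPCKP

# The Y/F/WxC motif from the same abstract, written here as [YFW]-x-C.
yxc = prosite_to_regex("[YFW]-x-C")
hit2 = yxc.search("MHFACLL")
print(hit2.start(), hit2.group(0))  # 2 FAC
```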
Computational study of the fibril organization of polyglutamine repeats reveals a common motif identified in beta-helices.

PubMed

Zanuy, David; Gunasekaran, Kannan; Lesk, Arthur M; Nussinov, Ruth

2006-04-21

The formation of fibril aggregates by long polyglutamine sequences is assumed to play a major role in neurodegenerative diseases such as Huntington. Here, we model peptides rich in glutamine, through a series of molecular dynamics simulations. Starting from a rigid nanotube-like conformation, we have obtained a new conformational template that shares structural features of a tubular helix and of a beta-helix conformational organization. Our new model can be described as a super-helical arrangement of flat beta-sheet segments linked by planar turns or bends. Interestingly, our comprehensive analysis of the Protein Data Bank reveals that this is a common motif in beta-helices (termed beta-bend), although it has not been identified so far. The motif is based on the alternation of beta-sheet and helical conformation as the protein sequence is followed from the N to the C termini (beta-alpha(R)-beta-polyPro-beta). We further identify this motif in the ssNMR structure of the protofibril of the amyloidogenic peptide Abeta(1-40). The recurrence of the beta-bend suggests a general mode of connecting long parallel beta-sheet segments that would allow the growth of partially ordered fibril structures. The design allows the peptide backbone to change direction with a minimal loss of main chain hydrogen bonds. The identification of a coherent organization beyond that of the beta-sheet segments in different folds rich in parallel beta-sheets suggests a higher degree of ordered structure in protein fibrils, in agreement with their low solubility and dense molecular packing.
Learning cellular sorting pathways using protein interactions and sequence motifs.

PubMed

Lin, Tien-Ho; Bar-Joseph, Ziv; Murphy, Robert F

2011-11-01

Proper subcellular localization is critical for proteins to perform their roles in cellular functions. Proteins are transported by different cellular sorting pathways, some of which take a protein through several intermediate locations until reaching its final destination. The pathway a protein is transported through is determined by carrier proteins that bind to specific sequence motifs. In this article, we present a new method that integrates protein interaction and sequence motif data to model how proteins are sorted through these sorting pathways. We use a hidden Markov model (HMM) to represent protein sorting pathways. The model is able to determine intermediate sorting states and to assign carrier proteins and motifs to the sorting pathways. In simulation studies, we show that the method can accurately recover an underlying sorting model. Using data for yeast, we show that our model leads to accurate prediction of subcellular localization. We also show that the pathways learned by our model recover many known sorting pathways and correctly assign proteins to the path they utilize. The learned model identified new pathways and their putative carriers and motifs and these may represent novel protein sorting mechanisms. Supplementary results and software implementation are available from http://murphylab.web.cmu.edu/software/2010_RECOMB_pathways/.
A kinesin-1 binding motif in vaccinia virus that is widespread throughout the human genome

PubMed Central

Dodding, Mark P; Mitter, Richard; Humphries, Ashley C; Way, Michael

2011-01-01

Transport of cargoes by kinesin-1 is essential for many cellular processes. Nevertheless, the number of proteins known to recruit kinesin-1 via its cargo binding light chain (KLC) is still quite small. We also know relatively little about the molecular features that define kinesin-1 binding. We now show that a bipartite tryptophan-based kinesin-1 binding motif, originally identified in Calsyntenin is present in A36, a vaccinia integral membrane protein. This bipartite motif in A36 is required for kinesin-1-dependent transport of the virus to the cell periphery. Bioinformatic analysis reveals that related bipartite tryptophan-based motifs are present in over 450 human proteins. Using vaccinia as a surrogate cargo, we show that regions of proteins containing this motif can function to recruit KLC and promote virus transport in the absence of A36. These proteins interact with the kinesin light chain outside the context of infection and have distinct preferences for KLC1 and KLC2. Our observations demonstrate that KLC binding can be conferred by a common set of features that are found in a wide range of proteins associated with diverse cellular functions and human diseases. PMID:21915095
Stem/Progenitor Cell Proteoglycans Decorated with 7-D-4, 4-C-3 and 3-B-3(-) Chondroitin Sulphate Motifs Are Morphogenetic Markers Of Tissue Development.

PubMed

Hayes, Anthony J; Smith, Susan M; Caterson, Bruce; Melrose, James

2018-06-11

This study reviewed the occurrence of chondroitin sulphate (CS) motifs 4-C-3, 7-D-4 and 3-B-3(-) which are expressed by progenitor cells in tissues undergoing morphogenesis. These motifs have a transient early expression pattern during tissue development and also appear in mature tissues during pathological remodeling and attempted repair processes by activated adult stem cells. The CS motifs are information and recognition modules, which may regulate cellular behavior and delineate stem cell niches in developmental tissues. One of the difficulties in determining the precise role of stem cells in tissue development and repair processes is their short engraftment period and the lack of specific markers, which differentiate the activated stem cell lineages from the resident cells. The CS sulphation motifs 7-D-4, 4-C-3 and 3-B-3 (-) decorate cell surface proteoglycans on activated stem/progenitor cells and appear to identify these cells in transitional areas of tissue development and in tissue repair and may be applicable to determining a more precise role for stem cells in tissue morphogenesis. This article is protected by copyright. All rights reserved. © 2018 AlphaMed Press.
The CcpA regulon of Streptococcus suis reveals novel insights into the regulation of the streptococcal central carbon metabolism by binding of CcpA to two distinct binding motifs.

PubMed

Willenborg, Jörg; de Greeff, Astrid; Jarek, Michael; Valentin-Weigand, Peter; Goethe, Ralph

2014-04-01

Streptococcus suis (S. suis) is a neglected zoonotic streptococcus causing fatal diseases in humans and in pigs. The transcriptional regulator CcpA (catabolite control protein A) is involved in the metabolic adaptation to different carbohydrate sources and virulence of S. suis and other pathogenic streptococci. In this study, we determined the DNA binding characteristics of CcpA and identified the CcpA regulon during growth of S. suis. Electrophoretic mobility shift analyses showed promiscuous DNA binding of CcpA to cognate cre sites in vitro. In contrast, sequencing of immunoprecipitated chromatin revealed two specific consensus motifs, a pseudo-palindromic cre motif (WWGAAARCGYTTTCWW) and a novel cre2 motif (TTTTYHWDHHWWTTTY), within the regulatory elements of the genes directly controlled by CcpA. Via these elements CcpA regulates expression of genes involved in carbohydrate uptake and conversion, and in addition in important metabolic pathways of the central carbon metabolism, like glycolysis, mixed-acid fermentation, and the fragmentary TCA cycle. Furthermore, our analyses provide evidence that CcpA regulates the genes of the central carbon metabolism by binding either the pseudo-palindromic cre motif or the cre2 motif in a HPr(Ser)∼P independent conformation. © 2014 John Wiley & Sons Ltd.
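The two consensus motifs quoted in this abstract are written in IUPAC nucleotide ambiguity codes, so they can be turned into regular expressions and searched for in a sequence. The following is a minimal sketch of that idea, not the authors' method: the function names and the toy sequence are hypothetical, and only the forward strand is scanned.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

# Consensus motifs as quoted in the abstract.
CRE = "WWGAAARCGYTTTCWW"
CRE2 = "TTTTYHWDHHWWTTTY"


def iupac_to_regex(consensus):
    """Translate an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_sites(consensus, sequence):
    """Return (start, matched_text) for every match, including overlapping ones."""
    # The lookahead lets the scan report matches that overlap each other.
    pattern = re.compile("(?=(" + iupac_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical toy sequence with one site that fits the cre consensus.
toy = "GGCC" + "ATGAAAGCGCTTTCAT" + "GGCC"
print(find_sites(CRE, toy))
```

A real scan would also search the reverse complement and would usually tolerate mismatches, which an exact regex does not.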
Identification, occurrence, and validation of DRE and ABRE Cis-regulatory motifs in the promoter regions of genes of Arabidopsis thaliana.

PubMed

Mishra, Sonal; Shukla, Aparna; Upadhyay, Swati; Sanchita; Sharma, Pooja; Singh, Seema; Phukan, Ujjal J; Meena, Abha; Khan, Feroz; Tripathi, Vineeta; Shukla, Rakesh Kumar; Shrama, Ashok

2014-04-01

Plants possess a complex co-regulatory network which helps them to elicit a response under diverse adverse conditions. We used an in silico approach to identify the genes with both DRE and ABRE motifs in their promoter regions in Arabidopsis thaliana. Our results showed that Arabidopsis contains a set of 2,052 genes with ABRE and DRE motifs in their promoter regions. Approximately 72% or more of the total predicted 2,052 genes had a gap distance of less than 400 bp between DRE and ABRE motifs. For positional orientation of the DRE and ABRE motifs, we found that the DR form (one in direct and the other one in reverse orientation) was more prevalent than other forms. These predicted 2,052 genes include 155 transcription factors. Using microarray data from The Arabidopsis Information Resource (TAIR) database, we present 44 transcription factors out of 155 which are upregulated by more than twofold in response to osmotic stress and ABA treatment. Fifty-one of the transcripts predicted above were validated using semiquantitative expression analysis to support the microarray data in TAIR. Taken together, we report a set of genes containing both DRE and ABRE motifs in their promoter regions in A. thaliana, which can be useful to understand the role of ABA under osmotic stress conditions. © 2013 Institute of Botany, Chinese Academy of Sciences.
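The gap-distance and orientation analysis described in this abstract can be illustrated with a short sketch. It is only an illustration under stated assumptions: the abstract does not give the motif sequences, so the core patterns below (`[AG]CCGAC` for DRE and `ACGTG` for ABRE) are assumed, and the function names and toy promoter are hypothetical.

```python
import re

# ASSUMPTION: core patterns chosen for illustration; the abstract does not state them.
DRE_CORE = "[AG]CCGAC"
ABRE_CORE = "ACGTG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def motif_hits(pattern, promoter):
    """Return sorted (start, end, strand) hits in forward-strand coordinates."""
    promoter = promoter.upper()
    n = len(promoter)
    scan = re.compile("(?=(" + pattern + "))")
    hits = []
    for m in scan.finditer(promoter):
        hits.append((m.start(), m.start() + len(m.group(1)), "+"))
    for m in scan.finditer(revcomp(promoter)):
        end = n - m.start()  # map reverse-strand position back to the forward strand
        hits.append((end - len(m.group(1)), end, "-"))
    return sorted(hits)


def closest_pair(promoter, pat_a=DRE_CORE, pat_b=ABRE_CORE):
    """Return (gap_bp, strand_a, strand_b) for the closest non-overlapping pair, or None."""
    best = None
    for a_start, a_end, a_strand in motif_hits(pat_a, promoter):
        for b_start, b_end, b_strand in motif_hits(pat_b, promoter):
            gap = max(a_start, b_start) - min(a_end, b_end)
            if gap < 0:
                continue  # skip overlapping hits
            candidate = (gap, a_strand, b_strand)
            if best is None or candidate < best:
                best = candidate
    return best


# Hypothetical toy promoter: one assumed DRE core, 5 bp of spacer, one assumed ABRE core.
toy_promoter = "TT" + "ACCGAC" + "TTTTT" + "ACGTG" + "TT"
print(closest_pair(toy_promoter))
```

A pair reported with one `+` and one `-` strand would correspond to one motif in direct and the other in reverse orientation; a gene would pass a distance filter such as the 400 bp threshold mentioned above when `gap_bp` is below that value.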
Cellular Localization and Characterization of Cytosolic Binding Partners for Gla Domain-containing Proteins PRRG4 and PRRG2*

PubMed Central

Yazicioglu, Mustafa N.; Monaldini, Luca; Chu, Kirk; Khazi, Fayaz R.; Murphy, Samuel L.; Huang, Heshu; Margaritis, Paris; High, Katherine A.

2013-01-01

The genes encoding a family of proteins termed proline-rich γ-carboxyglutamic acid (PRRG) proteins were identified and characterized more than a decade ago, but their functions remain unknown. These novel membrane proteins have an extracellular γ-carboxyglutamic acid (Gla) protein domain and cytosolic WW binding motifs. We screened WW domain arrays for cytosolic binding partners for PRRG4 and identified novel protein-protein interactions for the protein. We also uncovered a new WW binding motif in PRRG4 that is essential for these newly found protein-protein interactions. Several of the PRRG-interacting proteins we identified are essential for a variety of physiologic processes. Our findings indicate possible novel and previously unidentified functions for PRRG proteins. PMID:23873930
SUMOylation target sites at the C terminus protect Axin from ubiquitination and confer protein stability

PubMed Central

Kim, Min Jung; Chia, Ian V.; Costantini, Frank

2008-01-01

Axin is a scaffold protein for the β-catenin destruction complex, and a negative regulator of canonical Wnt signaling. Previous studies implicated the six C-terminal amino acids (C6 motif) in the ability of Axin to activate c-Jun N-terminal kinase, and identified them as a SUMOylation target. Deletion of the C6 motif of mouse Axin in vivo reduced the steady-state protein level, which caused embryonic lethality. Here, we report that this deletion (Axin-ΔC6) causes a reduced half-life in mouse embryonic fibroblasts and an increased susceptibility to ubiquitination in HEK 293T cells. We confirmed the C6 motif as a SUMOylation target in vitro, and found that mutating the C-terminal SUMOylation target residues increased the susceptibility of Axin to polyubiquitination and reduced its steady-state level. Heterologous SUMOylation target sites could replace C6 in providing this protective effect. These findings suggest that SUMOylation of the C6 motif may prevent polyubiquitination, thus increasing the stability of Axin. Although C6 deletion also caused increased association of Axin with Dvl-1, this interaction was not altered by mutating the lysine residues in C6, nor could heterologous SUMOylation motifs replace the C6 motif in this assay. Therefore, some other specific property of the C6 motif seems to reduce the interaction of Axin with Dvl-1.—Kim, M. J., Chia, I. V., Costantini, F. SUMOylation target sites at the C terminus protect Axin from ubiquitination and confer protein stability. PMID:18632848
Circuit Motifs for Contrast-Adaptive Differentiation in Early Sensory Systems: The Role of Presynaptic Inhibition and Short-Term Plasticity

PubMed Central

Zhang, Danke; Wu, Si; Rasch, Malte J.

2015-01-01

In natural signals, such as the luminance value across a visual scene, abrupt changes in intensity value are often more relevant to an organism than intensity values at other positions and times. Thus, to reduce redundancy, sensory systems are specialized to detect the times and amplitudes of informative abrupt changes in the input stream rather than coding the intensity values at all times. In theory, a system that responds transiently to fast changes is called a differentiator. In principle, several different neural circuit mechanisms exist that are capable of responding transiently to abrupt input changes. However, it is unclear which circuit would be best suited for early sensory systems, where the dynamic range of the natural input signals can be very wide. We here compare the properties of different simple neural circuit motifs for implementing signal differentiation. We found that a circuit motif based on presynaptic inhibition (PI) is unique in the sense that the vesicle resources in the presynaptic site can be stably maintained over a wide range of stimulus intensities, making PI a biophysically plausible mechanism to implement a differentiator with a very wide dynamical range. Moreover, by additionally considering short-term plasticity (STP), differentiation becomes contrast adaptive in the PI-circuit but not in other potential neural circuit motifs. Numerical simulations show that the behavior of the adaptive PI-circuit is consistent with experimental observations suggesting that adaptive presynaptic inhibition might be a good candidate neural mechanism to achieve differentiation in early sensory systems. PMID:25723493
β-hairpin-mediated nucleation of polyglutamine amyloid formation

PubMed Central

Kar, Karunakar; Hoop, Cody L.; Drombosky, Kenneth W.; Baker, Matthew A.; Kodali, Ravindra; Arduini, Irene; van der Wel, Patrick C. A.; Horne, W. Seth; Wetzel, Ronald

2013-01-01

The conformational preferences of polyglutamine (polyQ) sequences are of major interest because of their central importance in the expanded CAG repeat diseases that include Huntington’s disease (HD). Here we explore the response of various biophysical parameters to the introduction of β-hairpin motifs within polyQ sequences. These motifs (trpzip, disulfide, D-Pro-Gly, Coulombic attraction, L-Pro-Gly) enhance formation rates and stabilities of amyloid fibrils with degrees of effectiveness well-correlated with their known abilities to enhance β-hairpin formation in other peptides. These changes led to decreases in the critical nucleus for amyloid formation from a value of n* = 4 for a simple, unbroken Q23 sequence to approximate unitary n* values for similar length polyQs containing β-hairpin motifs. At the same time, the morphologies, secondary structures, and bioactivities of the resulting fibrils were essentially unchanged from simple polyQ aggregates. In particular, the signature pattern of SSNMR 13C Gln resonances that appears to be unique to polyQ amyloid is replicated exactly in fibrils from a β-hairpin polyQ. Importantly, while β-hairpin motifs do produce enhancements in the equilibrium constant for nucleation in aggregation reactions, these Kn* values remain quite low (~10^−10) and there is no evidence for significant embellishment of β-structure within the monomer ensemble. The results indicate an important role for β-turns in the nucleation mechanism and structure of polyQ amyloid and have implications for the nature of the toxic species in expanded CAG repeat diseases. PMID:23353826
Recoding method that removes inhibitory sequences and improves HIV gene expression

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rabadan, Raul; Krasnitz, Michael; Robins, Harlan

The invention relates to inhibitory nucleotide signal sequences or "INS" sequences in the genomes of lentiviruses. In particular the invention relates to the AGG motif present in all viral genomes. The AGG motif may have an inhibitory effect on a virus, for example by reducing the levels of, or maintaining low steady-state levels of, viral RNAs in host cells, and inducing and/or maintaining viral latency. In one aspect, the invention provides vaccines that contain, or are produced from, viral nucleic acids in which the AGG sequences have been mutated. In another aspect, the invention provides methods and compositions for affecting the function of the AGG motif, and methods for identifying other INS sequences in viral genomes.
Dynamic motifs in socio-economic networks

NASA Astrophysics Data System (ADS)

Zhang, Xin; Shao, Shuai; Stanley, H. Eugene; Havlin, Shlomo

2014-12-01

Socio-economic networks are of central importance in economic life. We develop a method of identifying and studying motifs in socio-economic networks by focusing on “dynamic motifs,” i.e., evolutionary connection patterns that, because of “node acquaintances” in the network, occur much more frequently than random patterns. We examine two evolving bipartite networks: i) the world-wide commercial ship chartering market and ii) the ship build-to-order market. We find similar dynamic motifs in both bipartite networks, even though they describe different economic activities. We also find that “influence” and “persistence” are strong factors in the interaction behavior of organizations. When two companies are doing business with the same customer, it is highly probable that another customer who currently has a business relationship with only one of these two companies will become a customer of the second in the future. This is the effect of influence. Persistence means that companies with close business ties to customers tend to maintain their relationships over a long period of time.
G4 motifs affect origin positioning and efficiency in two vertebrate replicators

PubMed Central

Valton, Anne-Laure; Hassan-Zadeh, Vahideh; Lema, Ingrid; Boggetto, Nicole; Alberti, Patrizia; Saintomé, Carole; Riou, Jean-François; Prioleau, Marie-Noëlle

2014-01-01

DNA replication ensures the accurate duplication of the genome at each cell cycle. It begins at specific sites called replication origins. Genome-wide studies in vertebrates have recently identified a consensus G-rich motif potentially able to form G-quadruplexes (G4) in most replication origins. However, there is no experimental evidence to demonstrate that G4 are actually required for replication initiation. We show here, with two model origins, that G4 motifs are required for replication initiation. Two G4 motifs cooperate in one of our model origins. The other contains only one critical G4, and its orientation determines the precise position of the replication start site. Point mutations affecting the stability of this G4 in vitro also impair origin function. Finally, this G4 is not sufficient for origin activity and must cooperate with a 200-bp cis-regulatory element. In conclusion, our study strongly supports the predicted essential role of G4 in replication initiation. PMID:24521668
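Candidate G4 motifs of the kind discussed in this abstract are often located with a simple text pattern before any experimental test. The sketch below uses a widely used heuristic, four runs of at least three guanines separated by loops of 1 to 7 nucleotides; this pattern is an assumption brought in for illustration and is not stated in the abstract, and the function names are hypothetical. Both strands are scanned because the abstract reports that G4 orientation matters.

```python
import re

# ASSUMPTION: a common heuristic for putative G-quadruplex sequences
# (four G-runs of >= 3, loops of 1-7 nt); not taken from the abstract.
G4_PATTERN = re.compile(
    r"(?=(G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}))"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def g4_candidates(seq):
    """Return (strand, start_on_scanned_strand, matched_text) for each candidate."""
    seq = seq.upper()
    hits = []
    for strand, scanned in (("+", seq), ("-", revcomp(seq))):
        for m in G4_PATTERN.finditer(scanned):
            hits.append((strand, m.start(), m.group(1)))
    return hits


# Hypothetical toy sequence with one candidate on the forward strand.
print(g4_candidates("AAGGGTGGGTGGGTGGGAA"))
```

A pattern match only flags a sequence that could fold into a G4; the abstract's point is that stability and function still have to be tested, for example by point mutation.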
Discovery of a Regulatory Motif for Human Satellite DNA Transcription in Response to BATF2 Overexpression.

PubMed

Bai, Xuejia; Huang, Wenqiu; Zhang, Chenguang; Niu, Jing; Ding, Wei

2016-03-01

One of the basic leucine zipper transcription factors, BATF2, has been found to suppress cancer growth and migration. However, little is known about the genes downstream of BATF2. HeLa cells were stably transfected with BATF2, then chromatin immunoprecipitation-sequencing was employed to identify the DNA motifs responsive to BATF2. Comprehensive bioinformatics analyses indicated that the most significant motif discovered, TTCCATT[CT]GATTCCATTC[AG]AT, was primarily distributed among the chromosome centromere regions and mostly within human type II satellite DNA. Such motifs were able to prime the transcription of type II satellite DNA in a directional and asymmetrical manner. Consistently, satellite II transcription was up-regulated in BATF2-overexpressing cells. The present study provides insight into understanding the role of BATF2 in tumours and the importance of satellite DNA in the maintenance of genomic stability. Copyright© 2016 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.
The Thiamin Pyrophosphate-Motif

NASA Technical Reports Server (NTRS)

Dominiak, Paulina M.; Ciszak, Ewa M.

2003-01-01

Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits, two catalytic centers, common amino acid sequence, and specific contacts to provide a flip-flop, or alternate site, mechanism of action. Each catalytic center [PP:PYR] is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core [PP:PYR]* within these enzymes. Analysis of the structural elements of this catalytic core reveals a novel definition of the common amino acid sequences, which are GX@&(G)@XXGQ, and GDGX25-30 within the PP-domain, and the E&(G)@XXG@ within the PYR-domain, where Q corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.
Identification and characterization of a cis-regulatory element for zygotic gene expression in Chlamydomonas reinhardtii

DOE PAGES

Hamaji, Takashi; Lopez, David; Pellegrini, Matteo; ...

2016-03-26

Upon fertilization Chlamydomonas reinhardtii zygotes undergo a program of differentiation into a diploid zygospore that is accompanied by transcription of hundreds of zygote-specific genes. We identified a distinct sequence motif we term a zygotic response element (ZYRE) that is highly enriched in promoter regions of C. reinhardtii early zygotic genes. A luciferase reporter assay was used to show that native ZYRE motifs within the promoter of zygotic gene ZYS3 or intron of zygotic gene DMT4 are necessary for zygotic induction. A synthetic luciferase reporter with a minimal promoter was used to show that ZYRE motifs introduced upstream are sufficient to confer zygotic upregulation, and that ZYRE-controlled zygotic transcription is dependent on the homeodomain transcription factor GSP1. Furthermore, we predict that ZYRE motifs will correspond to binding sites for the homeodomain proteins GSP1-GSM1 that heterodimerize and activate zygotic gene expression in early zygotes.
Copper-catalyzed azide-alkyne cycloaddition (click chemistry)-based Detection of Global Pathogen-host AMPylation on Self-assembled Human Protein Microarrays*

PubMed Central

Yu, Xiaobo; Woolery, Andrew R.; Luong, Phi; Hao, Yi Heng; Grammel, Markus; Westcott, Nathan; Park, Jin; Wang, Jie; Bian, Xiaofang; Demirkan, Gokhan; Hang, Howard C.; Orth, Kim; LaBaer, Joshua

2014-01-01

AMPylation (adenylylation) is a recently discovered mechanism employed by infectious bacteria to regulate host cell signaling. However, despite significant effort, only a few host targets have been identified, limiting our understanding of how these pathogens exploit this mechanism to control host cells. Accordingly, we developed a novel nonradioactive AMPylation screening platform using high-density cell-free protein microarrays displaying human proteins produced by human translational machinery. We screened 10,000 unique human proteins with Vibrio parahaemolyticus VopS and Histophilus somni IbpAFic2, and identified many new AMPylation substrates. Two of these, Rac2 and Rac3, were confirmed in vivo as bona fide substrates during infection with Vibrio parahaemolyticus. We also mapped the site of AMPylation of a non-GTPase substrate, LyGDI, to threonine 51, in a region regulated by Src kinase, and demonstrated that AMPylation prevented its phosphorylation by Src. Our results greatly expanded the repertoire of potential host substrates for bacterial AMPylators, determined their recognition motif, and revealed the first pathogen-host interaction AMPylation network. This approach can be extended to identify novel substrates of AMPylators with different domains or in different species and readily adapted for other post-translational modifications. PMID:25073739
New archetypes in self-assembled Phe-Phe motif induced nanostructures from nucleoside conjugated-diphenylalanines.

PubMed

Datta, Dhrubajyoti; Tiwari, Omshanker; Ganesh, Krishna N

2018-02-15

During the last two decades, the molecular self-assembly of the short peptide diphenylalanine (Phe-Phe) motif has attracted increasing focus due to its unique morphological structure and utility for potential applications in biomaterial chemistry, sensors and bioelectronics. Due to the ease of their synthetic modifications and a plethora of available experimental tools, the self-assembly of free and protected diphenylalanine scaffolds (H-Phe-Phe-OH, Boc-Phe-Phe-OH and Boc-Phe-Phe-OMe) has unfurled interesting tubular, vesicular or fibrillar morphologies. Developing on this theme, here we attempt to examine the effect of structure and properties (hydrophobic and H-bonding) modifying the functional C-terminus conjugated substituents on Boc-Phe-Phe on its self-assembly process. The consequent self-sorting due to H-bonding, van der Waals force and π-π interactions, generates monodisperse nano-vesicles from these peptides characterized via their SEM, HRTEM, AFM pictures and DLS experiments. The stability of these vesicles to different external stimuli such as pH and temperature, encapsulation of fluorescent probes inside the vesicles and their release by external trigger are reported. The results point to a new direction in the study and applications of the Phe-Phe motif to rationally engineer new functional nano-architectures.

Sequence motifs and prokaryotic expression of the reptilian paramyxovirus fusion protein

USGS Publications Warehouse

Franke, J.; Batts, W.N.; Ahne, W.; Kurath, G.; Winton, J.R.

2006-01-01

Fourteen reptilian paramyxovirus isolates were chosen to represent the known extent of genetic diversity among this novel group of viruses. Selected regions of the fusion (F) gene were sequenced, analyzed and compared. The F gene of all isolates contained conserved motifs homologous to those described for other members of the family Paramyxoviridae including: signal peptide, transmembrane domain, furin cleavage site, fusion peptide, N-linked glycosylation sites, and two heptad repeats, the second of which (HRB-LZ) had the characteristics of a leucine zipper. Selected regions of the fusion gene of isolate Gono-GER85 were inserted into a prokaryotic expression system to generate three recombinant protein fragments of various sizes. The longest recombinant protein was cleaved by furin into two fragments of predicted length. Western blot analysis with virus-neutralizing rabbit-antiserum against this isolate demonstrated that only the longest construct reacted with the antiserum. This construct was unique in containing 30 additional C-terminal amino acids that included most of the HRB-LZ. These results indicate that the F genes of reptilian paramyxoviruses contain highly conserved motifs typical of other members of the family and suggest that the HRB-LZ domain of the reptilian paramyxovirus F protein contains a linear antigenic epitope. © Springer-Verlag 2005.
Identification and classification of hubs in brain networks.

PubMed

Sporns, Olaf; Honey, Christopher J; Kötter, Rolf

2007-10-17

Brain regions in the mammalian cerebral cortex are linked by a complex network of fiber bundles. These inter-regional networks have previously been analyzed in terms of their node degree, structural motif, path length and clustering coefficient distributions. In this paper we focus on the identification and classification of hub regions, which are thought to play pivotal roles in the coordination of information flow. We identify hubs and characterize their network contributions by examining motif fingerprints and centrality indices for all regions within the cerebral cortices of both the cat and the macaque. Motif fingerprints capture the statistics of local connection patterns, while measures of centrality identify regions that lie on many of the shortest paths between parts of the network. Within both cat and macaque networks, we find that a combination of degree, motif participation, betweenness centrality and closeness centrality allows for reliable identification of hub regions, many of which have previously been functionally classified as polysensory or multimodal. We then classify hubs as either provincial (intra-cluster) hubs or connector (inter-cluster) hubs, and proceed to show that lesioning hubs of each type from the network produces opposite effects on the small-world index. Our study presents an approach to the identification and classification of putative hub regions in brain networks on the basis of multiple network attributes and charts potential links between the structural embedding of such regions and their functional roles.
RGAugury: a pipeline for genome-wide prediction of resistance gene analogs (RGAs) in plants.

PubMed

Li, Pingchuan; Quan, Xiande; Jia, Gaofeng; Xiao, Jin; Cloutier, Sylvie; You, Frank M

2016-11-02

Resistance gene analogs (RGAs), such as NBS-encoding proteins, receptor-like protein kinases (RLKs) and receptor-like proteins (RLPs), are potential R-genes that contain specific conserved domains and motifs. Thus, RGAs can be predicted based on their conserved structural features using bioinformatics tools. Computer programs have been developed for the identification of individual domains and motifs from the protein sequences of RGAs but none offer a systematic assessment of the different types of RGAs. A user-friendly and efficient pipeline is needed for large-scale genome-wide RGA predictions of the growing number of sequenced plant genomes. An integrative pipeline, named RGAugury, was developed to automate RGA prediction. The pipeline first identifies RGA-related protein domains and motifs, namely nucleotide binding site (NB-ARC), leucine rich repeat (LRR), transmembrane (TM), serine/threonine and tyrosine kinase (STTK), lysin motif (LysM), coiled-coil (CC) and Toll/Interleukin-1 receptor (TIR). RGA candidates are identified and classified into four major families based on the presence of combinations of these RGA domains and motifs: NBS-encoding, TM-CC, and membrane associated RLP and RLK. All time-consuming analyses of the pipeline are parallelized to improve performance. The pipeline was evaluated using the well-annotated Arabidopsis genome. A total of 98.5, 85.2, and 100% of the reported NBS-encoding genes, membrane associated RLPs and RLKs were validated, respectively. The pipeline was also successfully applied to predict RGAs for 50 sequenced plant genomes. A user-friendly web interface was implemented to ease command line operations, facilitate visualization and simplify result management for multiple datasets. RGAugury is an efficient, integrative bioinformatics tool for large scale genome-wide identification of RGAs. It is freely available at Bitbucket: https://bitbucket.org/yaanlpc/rgaugury.
Structural motif screening reveals a novel, conserved carbohydrate-binding surface in the pathogenesis-related protein PR-5d.

PubMed

Doxey, Andrew C; Cheng, Zhenyu; Moffatt, Barbara A; McConkey, Brendan J

2010-08-03

Aromatic amino acids play a critical role in protein-glycan interactions. Clusters of surface aromatic residues and their features may therefore be useful in distinguishing glycan-binding sites as well as predicting novel glycan-binding proteins. In this work, a structural bioinformatics approach was used to screen the Protein Data Bank (PDB) for coplanar aromatic motifs similar to those found in known glycan-binding proteins. The proteins identified in the screen were significantly associated with carbohydrate-related functions according to gene ontology (GO) enrichment analysis, and predicted motifs were found frequently within novel folds and glycan-binding sites not included in the training set. In addition to numerous binding sites predicted in structural genomics proteins of unknown function, one novel prediction was a surface motif (W34/W36/W192) in the tobacco pathogenesis-related protein, PR-5d. Phylogenetic analysis revealed that the surface motif is exclusive to a subfamily of PR-5 proteins from the Solanaceae family of plants, and is completely absent in more distant homologs. To confirm PR-5d's insoluble-polysaccharide binding activity, a cellulose-pulldown assay of tobacco proteins was performed and PR-5d was identified in the cellulose-binding fraction by mass spectrometry. Based on the combined results, we propose that the putative binding site in PR-5d may be an evolutionary adaptation of Solanaceae plants including potato, tomato, and tobacco, towards defense against cellulose-containing pathogens such as species of the deadly oomycete genus, Phytophthora. More generally, the results demonstrate that coplanar aromatic clusters on protein surfaces are a structural signature of glycan-binding proteins, and can be used to computationally predict novel glycan-binding proteins from 3D structure.
Transmissible Gastroenteritis Coronavirus Genome Packaging Signal Is Located at the 5′ End of the Genome and Promotes Viral RNA Incorporation into Virions in a Replication-Independent Process

PubMed Central

Morales, Lucia; Mateos-Gomez, Pedro A.; Capiscol, Carmen; del Palacio, Lorena; Sola, Isabel

2013-01-01

Preferential RNA packaging in coronaviruses involves the recognition of viral genomic RNA, a crucial process for viral particle morphogenesis mediated by RNA-specific sequences, known as packaging signals. An essential packaging signal component of transmissible gastroenteritis coronavirus (TGEV) has been further delimited to the first 598 nucleotides (nt) from the 5′ end of its RNA genome, by using recombinant viruses transcribing subgenomic mRNA that included potential packaging signals. The integrity of the entire sequence domain was necessary because deletion of any of the five structural motifs defined within this region abrogated specific packaging of this viral RNA. One of these RNA motifs was the stem-loop SL5, a highly conserved motif in coronaviruses located at nucleotide positions 106 to 136. Partial deletion or point mutations within this motif also abrogated packaging. Using TGEV-derived defective minigenomes replicated in trans by a helper virus, we have shown that TGEV RNA packaging is a replication-independent process. Furthermore, the last 494 nt of the genomic 3′ end were not essential for packaging, although this region increased packaging efficiency. TGEV RNA sequences identified as necessary for viral genome packaging were not sufficient to direct packaging of a heterologous sequence derived from the green fluorescent protein gene. These results indicated that TGEV genome packaging is a complex process involving many factors in addition to the identified RNA packaging signal. The identification of well-defined RNA motifs within the TGEV RNA genome that are essential for packaging will be useful for designing packaging-deficient biosafe coronavirus-derived vectors and providing new targets for antiviral therapies. PMID:23966403
Analysis of zinc binding sites in protein crystal structures.

PubMed

Alberts, I L; Nadassy, K; Wodak, S J

1998-08-01

The geometrical properties of zinc binding sites in a dataset of high quality protein crystal structures deposited in the Protein Data Bank have been examined to identify important differences between zinc sites that are directly involved in catalysis and those that play a structural role. Coordination angles in the zinc primary coordination sphere are compared with ideal values for each coordination geometry, and zinc coordination distances are compared with those in small zinc complexes from the Cambridge Structural Database as a guide of expected trends. We find that distances and angles in the primary coordination sphere are in general close to the expected (or ideal) values. Deviations occur primarily for oxygen coordinating atoms and are found to be mainly due to H-bonding of the oxygen coordinating ligand to protein residues, bidentate binding arrangements, and multi-zinc sites. We find that H-bonding of oxygen containing residues (or water) to zinc bound histidines is almost universal in our dataset and defines the elec-His-Zn motif. Analysis of the stereochemistry shows that carboxyl elec-His-Zn motifs are geometrically rigid, while water elec-His-Zn motifs show the most geometrical variation. As catalytic motifs have a higher proportion of carboxyl elec atoms than structural motifs, they provide a more rigid framework for zinc binding. This is understood biologically, as a small distortion in the zinc position in an enzyme can have serious consequences on the enzymatic reaction. We also analyze the sequence pattern of the zinc ligands and residues that provide elecs, and identify conserved hydrophobic residues in the endopeptidases that also appear to contribute to stabilizing the catalytic zinc site. A zinc binding template in protein crystal structures is derived from these observations.
Thioredoxin reductase regulates AP-1 activity as well as thioredoxin nuclear localization via active cysteines in response to ionizing radiation.

PubMed

Karimpour, Shervin; Lou, Junyang; Lin, Lilie L; Rene, Luis M; Lagunas, Lucio; Ma, Xinrong; Karra, Sreenivasu; Bradbury, C Matthew; Markovina, Stephanie; Goswami, Prabhat C; Spitz, Douglas R; Hirota, Kiichi; Kalvakolanu, Dhananjaya V; Yodoi, Junji; Gius, David

2002-09-12

A recently identified class of signaling factors uses critical cysteine motif(s) that act as redox-sensitive 'sulfhydryl switches' to reversibly modulate specific signal transduction cascades regulating downstream proteins with similar redox-sensitive sites. For example, signaling factors such as redox factor-1 (Ref-1) and transcription factors such as the AP-1 complex both contain redox-sensitive cysteine motifs that regulate activity in response to oxidative stress. The mammalian thioredoxin reductase-1 (TR) is an oxidoreductase selenocysteine-containing flavoprotein that also appears to regulate multiple downstream intracellular redox-sensitive proteins. Since ionizing radiation (IR) induces oxidative stress as well as increases AP-1 DNA-binding activity via the activation of Ref-1, the potential roles of TR and thioredoxin (TRX) in the regulation of AP-1 activity in response to IR were investigated. Permanently transfected cell lines that overexpress wild type TR demonstrated constitutive increases in AP-1 DNA-binding activity as well as AP-1-dependent reporter gene expression, relative to vector control cells. In contrast, permanently transfected cell lines expressing a TR gene with the active site cysteine motif deleted were unable to induce AP-1 activity or reporter gene expression in response to IR. Transient genetic overexpression of either the TR wild type or dominant-negative genes demonstrated similar results using a transient assay system. One mechanism through which TR regulates AP-1 activity appears to involve TRX sub-cellular localization, with no change in the total TRX content of the cell. These results identify a novel function of the TR enzyme as a signaling factor in the regulation of AP-1 activity via a cysteine motif located in the protein.
Identification of new members of the MAPK gene family in plants shows diverse conserved domains and novel activation loop variants.

PubMed

Mohanta, Tapan Kumar; Arora, Pankaj Kumar; Mohanta, Nibedita; Parida, Pratap; Bae, Hanhong

2015-02-06

Mitogen Activated Protein Kinase (MAPK) signaling is of critical importance in plants and other eukaryotic organisms. The MAPK cascade plays an indispensable role in the growth and development of plants, as well as in biotic and abiotic stress responses. The MAPKs constitute the most downstream module of the three-tier MAPK cascade and are phosphorylated by upstream MAP kinase kinases (MAPKK), which are in turn phosphorylated by MAP kinase kinase kinases (MAPKKK). The MAPKs play pivotal roles in regulation of many cytoplasmic and nuclear substrates, thus regulating several biological processes. A total of 589 MAPK genes were identified from the genome-wide analysis of 40 species. The sequence analysis has revealed the presence of several N- and C-terminal conserved domains. The MAPKs were previously believed to be characterized by the presence of TEY/TDY activation loop motifs. The present study showed that, in addition to the activation loop TEY/TDY motifs, MAPKs also contain MEY, TEM, TQM, TRM, TVY, TSY, TEC and TQY activation loop motifs. In the phylogenetic analysis, all predicted MAPKs clustered into six different groups (group A, B, C, D, E and F), and all predicted MAPKs were assigned specific names based on their orthology-based evolutionary relationships with Arabidopsis or Oryza MAPKs. We conducted global analysis of the MAPK gene family of plants from lower eukaryotes to higher eukaryotes and analyzed their genomic and evolutionary aspects. Our study showed the presence of several new activation loop motifs and diverse conserved domains in MAPKs. Further study of the newly identified activation loop motifs can provide additional information regarding the downstream signaling cascade activated in response to a wide array of stress conditions, as well as plant growth and development.
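As a rough illustration of how the activation loop variants named in the abstract above could be tallied in a protein sequence, the following Python sketch counts the ten listed tripeptides. The helper name and the toy sequence are hypothetical, and the study's actual pipeline is not described in this record. A naive tripeptide scan also matches anywhere in the protein, not only inside the activation loop, so real use would need to restrict the search to the kinase domain.

```python
from collections import Counter

# Activation loop tripeptides named in the abstract (TEY/TDY plus the new variants).
ACTIVATION_LOOP_MOTIFS = (
    "TEY", "TDY", "MEY", "TEM", "TQM", "TRM", "TVY", "TSY", "TEC", "TQY",
)

def count_activation_loop_motifs(protein_seq):
    """Count every occurrence of each listed tripeptide (hypothetical helper)."""
    seq = protein_seq.upper()
    counts = Counter()
    # Slide a 3-residue window along the sequence.
    for i in range(len(seq) - 2):
        tripeptide = seq[i:i + 3]
        if tripeptide in ACTIVATION_LOOP_MOTIFS:
            counts[tripeptide] += 1
    return counts

# Made-up sequence for illustration only; it is not a real MAPK.
toy_sequence = "MKAVTEYVVTRWYRAPELTDYG"
result = count_activation_loop_motifs(toy_sequence)
```

Here `result` holds one count for each motif found in the toy sequence; motifs that never occur are simply absent from the counter and read as zero.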
Interactions of HIPPI, a molecular partner of Huntingtin interacting protein HIP1, with the specific motif present at the putative promoter sequence of the caspase-1, caspase-8 and caspase-10 genes.

PubMed

Majumder, P; Choudhury, A; Banerjee, M; Lahiri, A; Bhattacharyya, N P

2007-08-01

To investigate the mechanism of increased expression of caspase-1 caused by exogenous Hippi, observed earlier in HeLa and Neuro2A cells, in this work we identified a specific motif AAAGACATG (-101 to -93) at the caspase-1 gene upstream sequence where HIPPI could bind. Various mutations in this specific sequence compromised the interaction, showing the specificity of the interactions. In the luciferase reporter assay, when the reporter gene was driven by caspase-1 gene upstream sequences (-151 to -92) with the mutation G to T at position -98, luciferase activity was decreased significantly in green fluorescent protein-Hippi-expressing HeLa cells in comparison to that obtained with the wild-type caspase-1 gene 60 bp upstream sequence, indicating the biological significance of such binding. It was observed that the C-terminal 'pseudo' death effector domain of HIPPI interacted with the 60 bp (-151 to -92) upstream sequence of the caspase-1 gene containing the motif. We further observed that expression of caspase-8 and caspase-10 was increased in green fluorescent protein-Hippi-expressing HeLa cells. In addition, HIPPI interacted in vitro with putative promoter sequences of these genes, containing a similar motif. In summary, we identified a novel function of HIPPI; it binds to specific upstream sequences of the caspase-1, caspase-8 and caspase-10 genes and alters the expression of the genes. This result shows the motif-specific interaction of HIPPI with DNA, and indicates that it could act as a transcription regulator.
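To make the motif search in the abstract above concrete, here is a minimal Python sketch that locates AAAGACATG in a DNA string and tolerates a chosen number of mismatches. If the motif spans -101 to -93 as stated, the G to T change at -98 falls on the fourth base of the motif, which is what the mutant below imitates. The function names and the flanking bases are assumptions made for illustration; the real caspase-1 upstream sequence is not reproduced here.

```python
MOTIF = "AAAGACATG"  # motif reported in the abstract

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def find_motif(seq, motif=MOTIF, max_mismatches=0):
    """Return (start, mismatches) for every window within the mismatch limit."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        distance = hamming(seq[i:i + len(motif)], motif)
        if distance <= max_mismatches:
            hits.append((i, distance))
    return hits

# Hypothetical flanks around the motif; only the 9-mer comes from the abstract.
wild_type = "CCTT" + "AAAGACATG" + "CGT"
mutant = "CCTT" + "AAATACATG" + "CGT"  # G -> T at the fourth motif base

exact_wt = find_motif(wild_type)
exact_mut = find_motif(mutant)
near_mut = find_motif(mutant, max_mismatches=1)
```

An exact search finds the wild-type site but not the mutant one, while allowing one mismatch recovers the mutated site, which is one simple way to relate point mutations to a reference motif.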
Canonical Bcl-2 motifs of the Na+/K+ pump revealed by the BH3 mimetic chelerythrine: early signal transducers of apoptosis?

PubMed

Lauf, Peter K; Heiny, Judith; Meller, Jarek; Lepera, Michael A; Koikov, Leonid; Alter, Gerald M; Brown, Thomas L; Adragna, Norma C

2013-01-01

Chelerythrine [CET], a protein kinase C [PKC] inhibitor, is a pro-apoptotic BH3-mimetic binding to BH1-like motifs of Bcl-2 proteins. CET action was examined on PKC phosphorylation-dependent membrane transporters (Na+/K+ pump/ATPase [NKP, NKA], Na+-K+-2Cl- [NKCC] and K+-Cl- [KCC] cotransporters, and channel-supported K+ loss) in human lens epithelial cells [LECs]. K+ loss and K+ uptake, using Rb+ as congener, were measured by atomic absorption/emission spectrophotometry with NKP and NKCC inhibitors, and Cl- replacement by NO3- to determine KCC. 3H-Ouabain binding was performed on a pig renal NKA in the presence and absence of CET. Bcl-2 protein and NKA sequences were aligned and motifs identified and mapped using PROSITE in conjunction with BLAST alignments and analysis of conservation and structural similarity based on prediction of secondary and crystal structures. CET inhibited NKP and NKCC by >90% (IC50 values ~35 and ~15 μM, respectively) without significant KCC activity change, and stimulated K+ loss by ~35% at 10-30 μM. Neither ATP levels nor phosphorylation of the NKA α1 subunit changed. 3H-ouabain was displaced from pig renal NKA only at 100-fold higher CET concentrations than the ligand. Sequence alignments of NKA with BH1- and BH3-like motifs containing pro-survival Bcl-2 and BclXl proteins showed more than one BH1-like motif within NKA for interaction with CET or with BH3 motifs. One NKA BH1-like motif (ARAAEILARDGPN) was also found in all P-type ATPases. Also, NKA possessed a second motif similar to that near the BH3 region of Bcl-2. Findings support the hypothesis that CET inhibits NKP by binding to BH1-like motifs and disrupting the α1 subunit catalytic activity through conformational changes. By interacting with Bcl-2 proteins through their complementary BH1- or BH3-like motifs, NKP proteins may be sensors of normal and pathological cell functions, becoming important yet unrecognized signal transducers in the initial phases of apoptosis. CET action on NKCC1 and K+ channels may involve PKC-regulated mechanisms; however, limited sequence homologies to BH1-like motifs cannot exclude direct effects.
In-Silico Identification Of Micro-Loops In Myelodysplastic Syndromes

NASA Astrophysics Data System (ADS)

Beck, Dominik; Brandl, Miriam; Pham, Tuan D.; Chang, Chung-Che; Zhou, Xiaobo

2011-06-01

Micro-loops are regulatory network motifs that leverage transcriptional and posttranscriptional control to effectively regulate the transcriptome. In this paper a regulatory network for Myelodysplastic Syndromes (MDSs) was constructed from the literature and publicly available data sources. The network was filtered using data from deep-sequencing of small RNAs, exon and microarrays. Motif discovery showed that micro-loops might exist in MDS. We further used the identified micro-loops and performed basic network analysis to identify the known disease gene RUNX1/AML, as well as miRNA family hsa-mir-181. This suggested that the concept of micro-loops can be applied to enhance disease gene identification and biomarker discovery.
The most common Chinese rhesus macaque MHC class I molecule shares peptide binding repertoire with the HLA-B7 supertype

PubMed Central

Solomon, Christopher; Southwood, Scott; Hoof, Ilka; Rudersdorf, Richard; Peters, Bjoern; Sidney, John; Pinilla, Clemencia; Marcondes, Maria Cecilia Garibaldi; Ling, Binhua; Marx, Preston; Sette, Alessandro

2010-01-01

Of the two rhesus macaque subspecies used for AIDS studies, the Simian immunodeficiency virus-infected Indian rhesus macaque (Macaca mulatta) is the most established model of HIV infection, providing both insight into pathogenesis and a system for testing novel vaccines. Despite the Chinese rhesus macaque potentially being a more relevant model for AIDS outcomes than the Indian rhesus macaque, the Chinese-origin rhesus macaques have not been well-characterized for their major histocompatibility complex (MHC) composition and function, reducing their greater utilization. In this study, we characterized a total of 50 unique Chinese rhesus macaques from several varying origins for their entire MHC class I allele composition and identified a total of 58 unique complete MHC class I sequences. Only nine of the sequences had been associated with Indian rhesus macaques, and 28/58 (48.3%) of the sequences identified were novel. From all MHC alleles detected, we prioritized Mamu-A1*02201 for functional characterization based on its higher frequency of expression. Upon the development of MHC/peptide binding assays and definition of its associated motif, we revealed that this allele shares peptide binding characteristics with the HLA-B7 supertype, the most frequent supertype in human populations. These studies provide the first functional characterization of an MHC class I molecule in the context of Chinese rhesus macaques and the first instance of HLA-B7 analogy for rhesus macaques. Electronic supplementary material The online version of this article (doi:10.1007/s00251-010-0450-3) contains supplementary material, which is available to authorized users. PMID:20480161
An Amino Acid Code for Irregular and Mixed Protein Packing

PubMed Central

Joo, Hyun; Chavan, Archana; Fraga, Keith; Tsai, Jerry

2015-01-01

To advance our understanding of protein tertiary structure, the development of the knob-socket model is completed with an analysis of packing in irregular coil and turn secondary structure as well as between mixed secondary structures. The knob-socket model simplifies packing based on repeated patterns of 2 motifs: a 3-residue socket for packing within 2° structure and a 4-residue knob-socket for 3° packing. For coil and turn secondary structure, knob-sockets allow identification of a correlation between amino acid composition and tertiary arrangements in space. Coil contributes almost as much as α-helices to tertiary packing. Irregular secondary structure involves 3-residue cliques of consecutive contacting residues or XYZ sockets. In irregular sockets, Gly, Pro, Asp and Ser are favored, while Cys, His, Met and Trp are not. For irregular knobs, the preference order is Arg, Asp, Pro, Asn, Thr, Leu, and Gly, while Cys, His, Met and Trp are not preferred. In mixed packing, the knob amino acid preferences are a function of the socket that they are packing into, whereas the amino acid composition of the sockets does not depend on the secondary structure of the knob. A unique motif of a coil knob with an XYZ β-sheet socket may potentially function to inhibit β-sheet extension. In addition, analysis of the preferred crossing angles for strands within a β-sheet and mixed α-helices/β-sheets identifies canonical packing patterns useful in protein design. Lastly, the knob-socket model abstracts the complexity of protein tertiary structure into an intuitive packing surface topology map. PMID:26370334
High-throughput analysis of the protein sequence-stability landscape using a quantitative "yeast surface two-hybrid" system and fragment reconstitution

PubMed Central

Dutta, Sanjib; Koide, Akiko; Koide, Shohei

2008-01-01

Stability evaluation of many mutants can lead to a better understanding of the sequence determinants of a structural motif and of factors governing protein stability and protein evolution. The traditional biophysical analysis of protein stability is low throughput, limiting our ability to widely explore the sequence space in a quantitative manner. In this study, we have developed a high-throughput library screening method for quantifying stability changes, which is based on protein fragment reconstitution and yeast surface display. Our method exploits the thermodynamic linkage between protein stability and fragment reconstitution and the ability of the yeast surface display technique to quantitatively evaluate protein-protein interactions. The method was applied to a fibronectin type III (FN3) domain. Characterization of fragment reconstitution was facilitated by the co-expression of two FN3 fragments, thus establishing a "yeast surface two-hybrid" method. Importantly, our method does not rely on competition between clones and thus eliminates a common limitation of high-throughput selection methods in which the most stable variants are predominantly recovered. Thus, it allows for the isolation of sequences that exhibit a desired level of stability. We identified over one hundred unique sequences for a β-bulge motif, which was significantly more informative than natural sequences of the FN3 family in revealing the sequence determinants for the β-bulge. Our method provides a powerful means to rapidly assess stability of many variants, to systematically assess contribution of different factors to protein stability and to enhance protein stability. PMID:18674545
Interaction of p190A RhoGAP with eIF3A and Other Translation Preinitiation Factors Suggests a Role in Protein Biosynthesis.

PubMed

Parasuraman, Prasanna; Mulligan, Peter; Walker, James A; Li, Bihua; Boukhali, Myriam; Haas, Wilhelm; Bernards, Andre

2017-02-17

The negative regulator of Rho family GTPases, p190A RhoGAP, is one of six mammalian proteins harboring so-called FF motifs. To explore the function of these and other p190A segments, we identified interacting proteins by tandem mass spectrometry. Here we report that endogenous human p190A, but not its 50% identical p190B paralog, associates with all 13 eIF3 subunits and several other translational preinitiation factors. The interaction involves the first FF motif of p190A and the winged helix/PCI domain of eIF3A, is enhanced by serum stimulation and reduced by phosphatase treatment. The p190A/eIF3A interaction is unaffected by mutating phosphorylated p190A-Tyr308, but disrupted by a S296A mutation, targeting the only other known phosphorylated residue in the first FF domain. The p190A-eIF3 complex is distinct from eIF3 complexes containing S6K1 or mammalian target of rapamycin (mTOR), and appears to represent an incomplete preinitiation complex lacking several subunits. Based on these findings we propose that p190A may affect protein translation by controlling the assembly of functional preinitiation complexes. Whether such a role helps to explain why, unique among the large family of RhoGAPs, p190A exhibits a significantly increased mutation rate in cancer remains to be determined. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Distinct structural features of the peroxide response regulator from group A Streptococcus drive DNA binding.

PubMed

Lin, Chang Sheng-Huei; Chao, Shi-Yu; Hammel, Michal; Nix, Jay C; Tseng, Hsiao-Ling; Tsou, Chih-Cheng; Fei, Chun-Hsien; Chiou, Huo-Sheng; Jeng, U-Ser; Lin, Yee-Shin; Chuang, Woei-Jer; Wu, Jiunn-Jong; Wang, Shuying

2014-01-01

Group A streptococcus (GAS, Streptococcus pyogenes) is a strict human pathogen that causes severe, invasive diseases. GAS does not produce catalase, but has an ability to resist killing by reactive oxygen species (ROS) through novel mechanisms. The peroxide response regulator (PerR), a member of ferric uptake regulator (Fur) family, plays a key role for GAS to cope with oxidative stress by regulating the expression of multiple genes. Our previous studies have found that expression of an iron-binding protein, Dpr, is under the direct control of PerR. To elucidate the molecular interactions of PerR with its cognate promoter, we have carried out structural studies on PerR and PerR-DNA complex. By combining crystallography and small-angle X-ray scattering (SAXS), we confirmed that the determined PerR crystal structure reflects its conformation in solution. Through mutagenesis and biochemical analysis, we have identified DNA-binding residues suggesting that PerR binds to the dpr promoter at the per box through a winged-helix motif. Furthermore, we have performed SAXS analysis and resolved the molecular architecture of PerR-DNA complex, in which two 30 bp DNA fragments wrap around two PerR homodimers by interacting with the adjacent positively-charged winged-helix motifs. Overall, we provide structural insights into molecular recognition of DNA by PerR and define the hollow structural arrangement of PerR-30bpDNA complex, which displays a unique topology distinct from currently proposed DNA-binding models for Fur family regulators.
Transcriptome Analysis of Honeybee (Apis Mellifera) Haploid and Diploid Embryos Reveals Early Zygotic Transcription during Cleavage

PubMed Central

Pires, Camilla Valente; Freitas, Flávia Cristina de Paula; Cristino, Alexandre S.; Dearden, Peter K.; Simões, Zilá Luz Paulino

2016-01-01

In honeybees, the haplodiploid sex determination system promotes a unique embryogenesis process wherein females develop from fertilized eggs and males develop from unfertilized eggs. However, the developmental strategies of honeybees during early embryogenesis are virtually unknown. Similar to most animals, the honeybee oocytes are supplied with proteins and regulatory elements that support early embryogenesis. As the embryo develops, the zygotic genome is activated and zygotic products gradually replace the preloaded maternal material. The analysis of small RNA and mRNA libraries of mature oocytes and embryos originated from fertilized and unfertilized eggs has allowed us to explore the gene expression dynamics in the first steps of development and during the maternal-to-zygotic transition (MZT). We localized a short sequence motif identified as TAGteam motif and hypothesized to play a similar role in honeybees as in fruit flies, which includes the timing of early zygotic expression (MZT), a function sustained by the presence of the zelda ortholog, which is the main regulator of genome activation. Predicted microRNA (miRNA)-target interactions indicated that there were specific regulators of haploid and diploid embryonic development and an overlap of maternal and zygotic gene expression during the early steps of embryogenesis. Although a number of functions are highly conserved during the early steps of honeybee embryogenesis, the results showed that zygotic genome activation occurs earlier in honeybees than in Drosophila based on the presence of three primary miRNAs (pri-miRNAs) (ame-mir-375, ame-mir-34 and ame-mir-263b) during the cleavage stage in haploid and diploid embryonic development. PMID:26751956
Genome sequencing and comparative genomics of honey bee microsporidia, Nosema apis reveal novel insights into host-parasite interactions.

PubMed

Chen, Yan ping; Pettis, Jeffery S; Zhao, Yan; Liu, Xinyue; Tallon, Luke J; Sadzewicz, Lisa D; Li, Renhua; Zheng, Huoqing; Huang, Shaokang; Zhang, Xuan; Hamilton, Michele C; Pernal, Stephen F; Melathopoulos, Andony P; Yan, Xianghe; Evans, Jay D

2013-07-05

The microsporidia parasite Nosema contributes to the steep global decline of honey bees that are critical pollinators of food crops. There are two species of Nosema that have been found to infect honey bees, Nosema apis and N. ceranae. Genome sequencing of N. apis and comparative genome analysis with N. ceranae, a fully sequenced microsporidia species, reveal novel insights into host-parasite interactions underlying the parasite infections. We applied the whole-genome shotgun sequencing approach to sequence and assemble the genome of N. apis which has an estimated size of 8.5 Mbp. We predicted 2,771 protein-coding genes and predicted the function of each putative protein using the Gene Ontology. The comparative genomic analysis led to identification of 1,356 orthologs that are conserved between the two Nosema species and genes that are unique characteristics of the individual species, thereby providing a list of virulence factors and new genetic tools for studying host-parasite interactions. We also identified a highly abundant motif in the upstream promoter regions of N. apis genes. This motif is also conserved in N. ceranae and other microsporidia species and likely plays a role in gene regulation across the microsporidia. The availability of the N. apis genome sequence is a significant addition to the rapidly expanding body of microsporidian genomic data which has been improving our understanding of eukaryotic genome diversity and evolution in a broad sense. The predicted virulent genes and transcriptional regulatory elements are potential targets for innovative therapeutics to break down the life cycle of the parasite.
Identification of TTAGGG-binding proteins in Neurospora crassa, a fungus with vertebrate-like telomere repeats.

PubMed

Casas-Vila, Núria; Scheibe, Marion; Freiwald, Anja; Kappei, Dennis; Butter, Falk

2015-11-17

To date, telomere research in fungi has mainly focused on Saccharomyces cerevisiae and Schizosaccharomyces pombe, despite the fact that both yeasts have degenerated telomeric repeats in contrast to the canonical TTAGGG motif found in vertebrates and also several other fungi. Using label-free quantitative proteomics, we here investigate the telosome of Neurospora crassa, a fungus with canonical telomeric repeats. We show that at least six of the candidates detected in our screen are direct TTAGGG-repeat binding proteins. While three of the direct interactors (NCU03416 [ncTbf1], NCU01991 [ncTbf2] and NCU02182 [ncTay1]) feature the known myb/homeobox DNA interaction domain also found in the vertebrate telomeric factors, we additionally show that a zinc-finger protein (NCU07846) and two proteins without any annotated DNA-binding domain (NCU02644 and NCU05718) are also direct double-strand TTAGGG binders. We further find two single-strand binders (NCU02404 [ncGbp2] and NCU07735 [ncTcg1]). By quantitative label-free interactomics we identify TTAGGG-binding proteins in Neurospora crassa, suggesting candidates for telomeric factors that are supported by phylogenomic comparison with yeast species. Intriguingly, homologs in yeast species with degenerated telomeric repeats are also TTAGGG-binding proteins, e.g. in S. cerevisiae Tbf1 recognizes the TTAGGG motif found in its subtelomeres. However, there is also a subset of proteins that is not conserved. While a rudimentary core TTAGGG-recognition machinery may be conserved across yeast species, our data suggests Neurospora as an emerging model organism with unique features.
Conserved Non-Coding Regulatory Signatures in Arabidopsis Co-Expressed Gene Modules

PubMed Central

Spangler, Jacob B.; Ficklin, Stephen P.; Luo, Feng; Freeling, Michael; Feltus, F. Alex

2012-01-01

Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome. PMID:23024789

The Caenorhabditis elegans vulva: A post-embryonic gene regulatory network controlling organogenesis

PubMed Central

Ririe, Ted O.; Fernandes, Jolene S.; Sternberg, Paul W.

2008-01-01

The Caenorhabditis elegans vulva is an elegant model for dissecting a gene regulatory network (GRN) that directs postembryonic organogenesis. The mature vulva comprises seven cell types (vulA, vulB1, vulB2, vulC, vulD, vulE, and vulF), each with its own unique pattern of spatial and temporal gene expression. The mechanisms that specify these cell types in a precise spatial pattern are not well understood. Using reverse genetic screens, we identified novel components of the vulval GRN, including nhr-113 in vulA. Several transcription factors (lin-11, lin-29, cog-1, egl-38, and nhr-67) interact with each other and act in concert to regulate target gene expression in the diverse vulval cell types. For example, egl-38 (Pax2/5/8) stabilizes the vulF fate by positively regulating vulF characteristics and by inhibiting characteristics associated with the neighboring vulE cells. nhr-67 and egl-38 regulate cog-1, helping restrict its expression to vulE. Computational approaches have been successfully used to identify functional cis-regulatory motifs in the zmp-1 (zinc metalloproteinase) promoter. These results provide an overview of the regulatory network architecture for each vulval cell type. PMID:19104047
Rare k-mer DNA: Identification of sequence motifs and prediction of CpG island and promoter.

PubMed

Mohamed Hashim, Ezzeddin Kamil; Abdullah, Rosni

2015-12-21

Empirical analysis of k-mer DNA has been proven an effective tool for finding unique patterns in DNA sequences, which can lead to the discovery of potential sequence motifs. In an extensive study of empirical k-mer DNA on hundreds of organisms, the researchers found that unique multi-modal k-mer spectra occur only in the genomes of organisms from the tetrapod clade, which includes all mammals. The multi-modality is caused by the formation of the two lowest modes, where the k-mers under them are referred to as rare k-mers. The suppression of the two lowest modes (or the rare k-mers) can be attributed to the CG dinucleotide inclusions in them. Apart from that, the rare k-mers are selectively distributed in certain genomic features of CpG Island (CGI), promoter, 5' UTR, and exon. We correlated the rare k-mers with hundreds of annotated features using several bioinformatic tools, performed further intrinsic rare k-mer analyses within the correlated features, and modeled the elucidated rare k-mer clustering feature into a classifier to predict the correlated CGI and promoter features. Our correlation results show that rare k-mers are highly associated with several annotated features of CGI, promoter, 5' UTR, and open chromatin regions. Our intrinsic results show that rare k-mers have several unique topological, compositional, and clustering properties in CGI and promoter features. Finally, the performances of our RWC (rare-word clustering) method in predicting the CGI and promoter features are ranked among the top three, in eight of the CGI and promoter evaluations, among eight of the benchmarked datasets. Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.
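The k-mer spectrum idea in the abstract above can be sketched in a few lines of Python: count every k-mer, then tabulate how many distinct k-mers share each occurrence count. This is only an illustration under stated assumptions. The function names and the toy sequence are hypothetical, and the fixed count threshold used for "rare" is a stand-in, since the paper defines rare k-mers through the lowest modes of the spectrum, not through a simple cutoff.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count all overlapping k-mers in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def kmer_spectrum(counts):
    """Map occurrence count -> number of distinct k-mers with that count."""
    return Counter(counts.values())

def rare_kmers(counts, max_count):
    """K-mers seen at most max_count times (assumed threshold, see lead-in)."""
    return sorted(kmer for kmer, n in counts.items() if n <= max_count)

# Toy sequence chosen so that a CG-containing 2-mer is among the rare ones.
toy = "ATATATCGATAT"
counts = kmer_counts(toy, 2)
spectrum = kmer_spectrum(counts)
rare = rare_kmers(counts, max_count=1)
```

On a real genome the spectrum would be built for a larger k and inspected for separate modes before deciding which k-mers count as rare.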
Novel Carvedilol Analogs that Suppress Store Overload Induced Ca2+ Release

PubMed Central

Smith, Chris D.; Wang, Aixia; Vembaiyan, Kannan; Zhang, Jingqun; Xie, Cuihong; Zhou, Qiang; Wu, Guogen; Wayne Chen, S. R.; Back, Thomas G.

2013-01-01

Carvedilol is a uniquely effective drug for the treatment of cardiac arrhythmias in patients with heart failure. This activity is in part due to its ability to inhibit store overload-induced calcium release (SOICR) through the RyR2 channel. We describe the synthesis, characterization and bioassay of ca. 100 compounds based on the carvedilol motif in order to identify features that correlate with and optimize SOICR inhibition. A single cell bioassay was employed based on the RyR2-R4496C mutant HEK-293 cell line, in which calcium release from the endoplasmic reticulum through the defective channel was measured. IC50 values for SOICR inhibition were thus obtained. The compounds investigated contained modifications to the three principal subunits of carvedilol, including the carbazole and catechol moieties, as well as the linker chain containing the β-amino alcohol functionality. The SAR results indicate that significant alterations are tolerated in each of the three subunits. PMID:24124794
Intramolecular hydrophobic interactions are critical mediators of STAT5 dimerization

NASA Astrophysics Data System (ADS)

Fahrenkamp, Dirk; Li, Jinyu; Ernst, Sabrina; Schmitz-van de Leur, Hildegard; Chatain, Nicolas; Küster, Andrea; Koschmieder, Steffen; Lüscher, Bernhard; Rossetti, Giulia; Müller-Newen, Gerhard

2016-10-01

STAT5 is an essential transcription factor in hematopoiesis, which is activated through tyrosine phosphorylation in response to cytokine stimulation. Constitutive activation of STAT5 is a hallmark of myeloid and lymphoblastic leukemia. Using homology modeling and molecular dynamics simulations, a model of the STAT5 phosphotyrosine-SH2 domain interface was generated providing first structural information on the activated STAT5 dimer including a sequence, for which no structural information is available for any of the STAT proteins. We identified a novel intramolecular interaction mediated through F706, adjacent to the phosphotyrosine motif, and a unique hydrophobic interface on the surface of the SH2 domain. Analysis of corresponding STAT5 mutants revealed that this interaction is dispensable for Epo receptor-mediated phosphorylation of STAT5 but essential for dimer formation and subsequent nuclear accumulation. Moreover, the herein presented model clarifies molecular mechanisms of recently discovered leukemic STAT5 mutants and will help to guide future drug development.
CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats.

PubMed

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2007-07-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) constitute a particular family of tandem repeats found in a wide range of prokaryotic genomes (half of eubacteria and almost all archaea). They consist of a succession of highly conserved direct repeats (DRs) varying in size from 23 to 47 bp, separated by similarly sized unique sequences (spacers), usually of viral origin. A CRISPR cluster is flanked on one side by an AT-rich sequence called the leader, which is assumed to be a transcriptional promoter. Recent studies suggest that this structure represents a putative RNA-interference-based immune system. Here we describe CRISPRFinder, a web service offering tools to (i) detect CRISPRs, including the shortest ones (one or two motifs); (ii) define DRs and extract spacers; (iii) get the flanking sequences to determine the leader; (iv) BLAST spacers against the GenBank database; and (v) check whether the DR is found elsewhere in sequenced prokaryotic genomes. CRISPRFinder is freely accessible at http://crispr.u-psud.fr/Server/CRISPRfinder.php.
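The repeat-spacer-repeat layout described in this record can be sketched naively in Python. This is not CRISPRFinder's algorithm, which the abstract does not describe. The sketch only looks for exact copies of a fixed-length window that recur after a spacer-sized gap. The function names, the default repeat length of 23 bp (the lower bound quoted above), and the 20 to 50 bp spacer window are assumptions for illustration.

```python
def find_repeat_pairs(seq, dr_len=23, min_spacer=20, max_spacer=50):
    """Return (first_start, second_start, repeat) for exact repeats of length
    dr_len whose copies are separated by min_spacer..max_spacer bases (0-based)."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - dr_len + 1):
        repeat = seq[i:i + dr_len]
        lo = i + dr_len + min_spacer   # earliest allowed start of the second copy
        hi = i + dr_len + max_spacer   # latest allowed start of the second copy
        # str.find only reports matches lying entirely inside seq[lo:hi + dr_len]
        j = seq.find(repeat, lo, hi + dr_len)
        if j != -1:
            hits.append((i, j, repeat))
    return hits


def spacer_between(seq, hit):
    """Extract the sequence lying between the two repeat copies of a hit."""
    first_start, second_start, repeat = hit
    return seq.upper()[first_start + len(repeat):second_start]


if __name__ == "__main__":
    dr = "GTTTCAATCCACGCGCCCACGCG"              # made-up 23 bp repeat
    spacer = "ATGCATTTAGGCCTAAGCTTAGGATCCGAT"   # made-up 30 bp spacer
    seq = "TTTTT" + dr + spacer + dr + "TTTTT"
    for hit in find_repeat_pairs(seq):
        print(hit[0], hit[1], spacer_between(seq, hit))
```

A practical detector would also have to tolerate mismatches between repeat copies and try every repeat length in the 23 to 47 bp range, which this sketch deliberately leaves out.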
Diverse functions of myosin VI elucidated by an isoform-specific α-helix domain

PubMed Central

Magistrati, Elisa; Molteni, Erika; Lupia, Michela; Soffientini, Paolo; Rottner, Klemens; Cavallaro, Ugo; Pozzoli, Uberto; Mapelli, Marina; Walters, Kylie J.; Polo, Simona

2016-01-01

Myosin VI functions in endocytosis and cell motility. Alternative splicing of myosin VI mRNA generates two distinct isoform types, myosin VI(short) and myosin VI(long), which differ in the C-terminal region. Their physiological and pathological role remains unknown. Here we identified an isoform-specific regulatory helix, named α2-linker that defines specific conformations and hence determines the target selectivity of human myosin VI. The presence of the α2-linker structurally defines a novel clathrin-binding domain that is unique to myosin VI(long) and masks the known RRL interaction motif. This finding is relevant to ovarian cancer, where alternative myosin VI splicing is aberrantly regulated, and exon skipping dictates cell addiction to myosin VI(short) for tumor cell migration. The RRL interactor optineurin contributes to this process by selectively binding myosin VI(short). Thus the α2-linker acts like a molecular switch that assigns myosin VI to distinct endocytic (myosin VI(long)) or migratory (myosin VI(short)) functional roles. PMID:26950368
Diverse functions of myosin VI elucidated by an isoform-specific α-helix domain.

PubMed

Wollscheid, Hans-Peter; Biancospino, Matteo; He, Fahu; Magistrati, Elisa; Molteni, Erika; Lupia, Michela; Soffientini, Paolo; Rottner, Klemens; Cavallaro, Ugo; Pozzoli, Uberto; Mapelli, Marina; Walters, Kylie J; Polo, Simona

2016-04-01

Myosin VI functions in endocytosis and cell motility. Alternative splicing of myosin VI mRNA generates two distinct isoform types, myosin VI(short) and myosin VI(long), which differ in the C-terminal region. Their physiological and pathological roles remain unknown. Here we identified an isoform-specific regulatory helix, named the α2-linker, that defines specific conformations and hence determines the target selectivity of human myosin VI. The presence of the α2-linker structurally defines a new clathrin-binding domain that is unique to myosin VI(long) and masks the known RRL interaction motif. This finding is relevant to ovarian cancer, in which alternative myosin VI splicing is aberrantly regulated, and exon skipping dictates cell addiction to myosin VI(short) in tumor-cell migration. The RRL interactor optineurin contributes to this process by selectively binding myosin VI(short). Thus, the α2-linker acts like a molecular switch that assigns myosin VI to distinct endocytic (myosin VI(long)) or migratory (myosin VI(short)) functional roles.
Two-Dimensional Stoichiometric Boron Oxides as a Versatile Platform for Electronic Structure Engineering.

PubMed

Zhang, Ruiqi; Li, Zhenyu; Yang, Jinlong

2017-09-21

Oxides of two-dimensional (2D) atomic crystals have been widely studied due to their unique properties. In most 2D oxides, oxygen acts as a functional group, which makes it difficult to control the degree of oxidation. Because borophene is an electron-deficient system, it is expected that oxygen will be intrinsically incorporated into the basal plane of borophene, forming stoichiometric 2D boron oxide (BO) structures. By using first-principles global optimization, we systematically explore structures and properties of 2D BO systems with well-defined degrees of oxidation. Stable B-O-B and OB3 tetrahedron structure motifs are identified in these structures. Interesting properties, such as strong linear dichroism, Dirac node-line (DNL) semimetallicity, and negative differential resistance, have been predicted for these systems. Our results demonstrate that 2D BO represents a versatile platform for electronic structure engineering via tuning the stoichiometric degree of oxidation, which leads to various technological applications.
Intramolecular hydrophobic interactions are critical mediators of STAT5 dimerization

PubMed Central

Fahrenkamp, Dirk; Li, Jinyu; Ernst, Sabrina; Schmitz-Van de Leur, Hildegard; Chatain, Nicolas; Küster, Andrea; Koschmieder, Steffen; Lüscher, Bernhard; Rossetti, Giulia; Müller-Newen, Gerhard

2016-01-01

STAT5 is an essential transcription factor in hematopoiesis, which is activated through tyrosine phosphorylation in response to cytokine stimulation. Constitutive activation of STAT5 is a hallmark of myeloid and lymphoblastic leukemia. Using homology modeling and molecular dynamics simulations, a model of the STAT5 phosphotyrosine-SH2 domain interface was generated providing first structural information on the activated STAT5 dimer including a sequence, for which no structural information is available for any of the STAT proteins. We identified a novel intramolecular interaction mediated through F706, adjacent to the phosphotyrosine motif, and a unique hydrophobic interface on the surface of the SH2 domain. Analysis of corresponding STAT5 mutants revealed that this interaction is dispensable for Epo receptor-mediated phosphorylation of STAT5 but essential for dimer formation and subsequent nuclear accumulation. Moreover, the herein presented model clarifies molecular mechanisms of recently discovered leukemic STAT5 mutants and will help to guide future drug development. PMID:27752093
Structural constraints in the packaging of bluetongue virus genomic segments

PubMed Central

Burkhardt, Christiane; Sung, Po-Yu; Celma, Cristina C.

2014-01-01

The mechanism used by bluetongue virus (BTV) to ensure the sorting and packaging of its 10 genomic segments is still poorly understood. In this study, we investigated the packaging constraints for two BTV genomic segments from two different serotypes. Segment 4 (S4) of BTV serotype 9 was mutated sequentially and packaging of mutant ssRNAs was investigated by two newly developed RNA packaging assay systems, one in vivo and the other in vitro. Modelling of the mutated ssRNA followed by biochemical data analysis suggested that a conformational motif formed by interaction of the 5′ and 3′ ends of the molecule was necessary and sufficient for packaging. A similar structural signal was also identified in S8 of BTV serotype 1. Furthermore, the same conformational analysis of secondary structures for positive-sense ssRNAs was used to generate a chimeric segment that maintained the putative packaging motif but contained unrelated internal sequences. This chimeric segment was packaged successfully, confirming that the motif identified directs the correct packaging of the segment. PMID:24980574
A common antigenic motif recognized by naturally occurring human VH5-51/VL4-1 anti-tau antibodies with distinct functionalities.

PubMed

Apetri, Adrian; Crespo, Rosa; Juraszek, Jarek; Pascual, Gabriel; Janson, Roosmarijn; Zhu, Xueyong; Zhang, Heng; Keogh, Elissa; Holland, Trevin; Wadia, Jay; Verveen, Hanneke; Siregar, Berdien; Mrosek, Michael; Taggenbrock, Renske; van Ameijde, Jeroen; Inganäs, Hanna; van Winsen, Margot; Koldijk, Martin H; Zuijdgeest, David; Borgers, Marianne; Dockx, Koen; Stoop, Esther J M; Yu, Wenli; Brinkman-van der Linden, Els C; Ummenthum, Kimberley; van Kolen, Kristof; Mercken, Marc; Steinbacher, Stefan; de Marco, Donata; Hoozemans, Jeroen J; Wilson, Ian A; Koudstaal, Wouter; Goudsmit, Jaap

2018-05-31

Misfolding and aggregation of tau protein are closely associated with the onset and progression of Alzheimer's Disease (AD). By interrogating IgG+ memory B cells from asymptomatic donors with tau peptides, we have identified two somatically mutated VH5-51/VL4-1 antibodies. One of these, CBTAU-27.1, binds to the aggregation motif in the R3 repeat domain and blocks the aggregation of tau into paired helical filaments (PHFs) by sequestering monomeric tau. The other, CBTAU-28.1, binds to the N-terminal insert region and inhibits the spreading of tau seeds and mediates the uptake of tau aggregates into microglia by binding PHFs. Crystal structures revealed that the combination of VH5-51 and VL4-1 recognizes a common Pro-Xn-Lys motif driven by germline-encoded hotspot interactions while the specificity and thereby functionality of the antibodies are defined by the CDR3 regions. Affinity improvement led to improvement in functionality, identifying their epitopes as new targets for therapy and prevention of AD.
Targeting malaria parasite proteins to the erythrocyte.

PubMed

Templeton, Thomas J; Deitsch, Kirk W

2005-09-01

The intraerythrocytic stages of the protozoan parasite Plasmodium falciparum reside within a parasitophorous vacuole (PV) and set up unique "extraparasite, intraerythrocyte" protein-trafficking pathways that target parasite-encoded proteins to the erythrocyte cytoplasm and cell surface. Two recent articles report the identification of trafficking motifs that regulate the transport of parasite-encoded proteins across the PV. These articles greatly aid the annotation of the parasite "secretome" catalog of proteins that are targeted to the erythrocyte cytoplasm or cell membrane.
Crystal structure of Toll-like receptor adaptor MAL/TIRAP reveals the molecular basis for signal transduction and disease protection

PubMed Central

Valkov, Eugene; Stamp, Anna; DiMaio, Frank; Baker, David; Verstak, Brett; Roversi, Pietro; Kellie, Stuart; Sweet, Matthew J.; Mansell, Ashley; Gay, Nicholas J.; Martin, Jennifer L.; Kobe, Bostjan

2011-01-01

Initiation of the innate immune response requires agonist recognition by pathogen-recognition receptors such as the Toll-like receptors (TLRs). Toll/interleukin-1 receptor (TIR) domain-containing adaptors are critical in orchestrating the signal transduction pathways after TLR and interleukin-1 receptor activation. Myeloid differentiation primary response gene 88 (MyD88) adaptor-like (MAL)/TIR domain-containing adaptor protein (TIRAP) is involved in bridging MyD88 to TLR2 and TLR4 in response to bacterial infection. Genetic studies have associated a number of unique single-nucleotide polymorphisms in MAL with protection against invasive microbial infection, but a molecular understanding has been hampered by a lack of structural information. The present study describes the crystal structure of MAL TIR domain. Significant structural differences exist in the overall fold of MAL compared with other TIR domain structures: A sequence motif comprising a β-strand in other TIR domains instead corresponds to a long loop, placing the functionally important “BB loop” proline motif in a unique surface position in MAL. The structure suggests possible dimerization and MyD88-interacting interfaces, and we confirm the key interface residues by coimmunoprecipitation using site-directed mutants. Jointly, our results provide a molecular and structural basis for the role of MAL in TLR signaling and disease protection. PMID:21873236
What Makes a Bacterial Species Pathogenic?: Comparative Genomic Analysis of the Genus Leptospira

PubMed Central

Fouts, Derrick E.; Matthias, Michael A.; Adhikarla, Haritha; Adler, Ben; Amorim-Santos, Luciane; Berg, Douglas E.; Bulach, Dieter; Buschiazzo, Alejandro; Chang, Yung-Fu; Galloway, Renee L.; Haake, David A.; Haft, Daniel H.; Hartskeerl, Rudy; Ko, Albert I.; Levett, Paul N.; Matsunaga, James; Mechaly, Ariel E.; Monk, Jonathan M.; Nascimento, Ana L. T.; Nelson, Karen E.; Palsson, Bernhard; Peacock, Sharon J.; Picardeau, Mathieu; Ricaldi, Jessica N.; Thaipandungpanit, Janjira; Wunder, Elsio A.; Yang, X. Frank; Zhang, Jun-Jie; Vinetz, Joseph M.

2016-01-01

Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field. Infectious Leptospira, comprised of pathogenic and intermediately pathogenic Leptospira, evolutionarily diverged from non-infectious, saprophytic Leptospira, as demonstrated by the following computational biology analyses: 1) the definitive taxonomy and evolutionary relatedness among all known Leptospira species; 2) genomically-predicted metabolic reconstructions that indicate novel adaptation of infectious Leptospira to mammals, including sialic acid biosynthesis, pathogen-specific porphyrin metabolism and the first-time demonstration of cobalamin (B12) autotrophy as a bacterial virulence factor; 3) CRISPR/Cas systems demonstrated only to be present in pathogenic Leptospira, suggesting a potential mechanism for this clade’s refractoriness to gene targeting; 4) finding Leptospira pathogen-specific specialized protein secretion systems; 5) novel virulence-related genes/gene families such as the Virulence Modifying (VM) (PF07598 paralogs) proteins and pathogen-specific adhesins; 6) discovery of novel, pathogen-specific protein modification and secretion mechanisms including unique lipoprotein signal peptide motifs, Sec-independent twin arginine protein secretion motifs, and the absence of certain canonical signal recognition particle proteins from all Leptospira; and 7) demonstration of infectious Leptospira-specific signal-responsive gene expression, motility and chemotaxis systems.
By identifying large scale changes in infectious (pathogenic and intermediately pathogenic) vs. non-infectious Leptospira, this work provides new insights into the evolution of a genus of bacterial pathogens. This work will be a comprehensive roadmap for understanding leptospirosis pathogenesis. More generally, it provides new insights into mechanisms by which bacterial pathogens adapt to mammalian hosts. PMID:26890609
What Makes a Bacterial Species Pathogenic?: Comparative Genomic Analysis of the Genus Leptospira.

PubMed

Fouts, Derrick E; Matthias, Michael A; Adhikarla, Haritha; Adler, Ben; Amorim-Santos, Luciane; Berg, Douglas E; Bulach, Dieter; Buschiazzo, Alejandro; Chang, Yung-Fu; Galloway, Renee L; Haake, David A; Haft, Daniel H; Hartskeerl, Rudy; Ko, Albert I; Levett, Paul N; Matsunaga, James; Mechaly, Ariel E; Monk, Jonathan M; Nascimento, Ana L T; Nelson, Karen E; Palsson, Bernhard; Peacock, Sharon J; Picardeau, Mathieu; Ricaldi, Jessica N; Thaipandungpanit, Janjira; Wunder, Elsio A; Yang, X Frank; Zhang, Jun-Jie; Vinetz, Joseph M

2016-02-01

Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field. Infectious Leptospira, comprised of pathogenic and intermediately pathogenic Leptospira, evolutionarily diverged from non-infectious, saprophytic Leptospira, as demonstrated by the following computational biology analyses: 1) the definitive taxonomy and evolutionary relatedness among all known Leptospira species; 2) genomically-predicted metabolic reconstructions that indicate novel adaptation of infectious Leptospira to mammals, including sialic acid biosynthesis, pathogen-specific porphyrin metabolism and the first-time demonstration of cobalamin (B12) autotrophy as a bacterial virulence factor; 3) CRISPR/Cas systems demonstrated only to be present in pathogenic Leptospira, suggesting a potential mechanism for this clade's refractoriness to gene targeting; 4) finding Leptospira pathogen-specific specialized protein secretion systems; 5) novel virulence-related genes/gene families such as the Virulence Modifying (VM) (PF07598 paralogs) proteins and pathogen-specific adhesins; 6) discovery of novel, pathogen-specific protein modification and secretion mechanisms including unique lipoprotein signal peptide motifs, Sec-independent twin arginine protein secretion motifs, and the absence of certain canonical signal recognition particle proteins from all Leptospira; and 7) demonstration of infectious Leptospira-specific signal-responsive gene expression, motility and chemotaxis systems.
By identifying large scale changes in infectious (pathogenic and intermediately pathogenic) vs. non-infectious Leptospira, this work provides new insights into the evolution of a genus of bacterial pathogens. This work will be a comprehensive roadmap for understanding leptospirosis pathogenesis. More generally, it provides new insights into mechanisms by which bacterial pathogens adapt to mammalian hosts.
Non-canonical binding interactions of the RNA recognition motif (RRM) domains of P34 protein modulate binding within the 5S ribonucleoprotein particle (5S RNP).

PubMed

Kamina, Anyango D; Williams, Noreen

2017-01-01

RNA binding proteins are involved in many aspects of RNA metabolism. In Trypanosoma brucei, our laboratory has identified two trypanosome-specific RNA binding proteins P34 and P37 that are involved in the maturation of the 60S subunit during ribosome biogenesis. These proteins are part of the T. brucei 5S ribonucleoprotein particle (5S RNP) and P34 binds to 5S ribosomal RNA (rRNA) and ribosomal protein L5 through its N-terminus and its RNA recognition motif (RRM) domains. We generated truncated P34 proteins to determine these domains' interactions with 5S rRNA and L5. Our analyses demonstrate that RRM1 of P34 mediates the majority of binding with 5S rRNA and the N-terminus together with RRM1 contribute the most to binding with L5. We determined that the consensus ribonucleoprotein (RNP) 1 and 2 sequences, characteristic of canonical RRM domains, are not fully conserved in the RRM domains of P34. However, the aromatic amino acids previously described to mediate base stacking interactions with their RNA target are conserved in both of the RRM domains of P34. Surprisingly, mutation of these aromatic residues did not disrupt but instead enhanced 5S rRNA binding. However, we identified four arginine residues located in RRM1 of P34 that strongly impact L5 binding. These mutational analyses of P34 suggest that the binding site for 5S rRNA and L5 are near each other and specific residues within P34 regulate the formation of the 5S RNP. These studies show the unique way that the domains of P34 mediate binding with the T. brucei 5S RNP.
Non-canonical binding interactions of the RNA recognition motif (RRM) domains of P34 protein modulate binding within the 5S ribonucleoprotein particle (5S RNP)

PubMed Central

Kamina, Anyango D.; Williams, Noreen

2017-01-01

RNA binding proteins are involved in many aspects of RNA metabolism. In Trypanosoma brucei, our laboratory has identified two trypanosome-specific RNA binding proteins P34 and P37 that are involved in the maturation of the 60S subunit during ribosome biogenesis. These proteins are part of the T. brucei 5S ribonucleoprotein particle (5S RNP) and P34 binds to 5S ribosomal RNA (rRNA) and ribosomal protein L5 through its N-terminus and its RNA recognition motif (RRM) domains. We generated truncated P34 proteins to determine these domains’ interactions with 5S rRNA and L5. Our analyses demonstrate that RRM1 of P34 mediates the majority of binding with 5S rRNA and the N-terminus together with RRM1 contribute the most to binding with L5. We determined that the consensus ribonucleoprotein (RNP) 1 and 2 sequences, characteristic of canonical RRM domains, are not fully conserved in the RRM domains of P34. However, the aromatic amino acids previously described to mediate base stacking interactions with their RNA target are conserved in both of the RRM domains of P34. Surprisingly, mutation of these aromatic residues did not disrupt but instead enhanced 5S rRNA binding. However, we identified four arginine residues located in RRM1 of P34 that strongly impact L5 binding. These mutational analyses of P34 suggest that the binding site for 5S rRNA and L5 are near each other and specific residues within P34 regulate the formation of the 5S RNP. These studies show the unique way that the domains of P34 mediate binding with the T. brucei 5S RNP. PMID:28542332
In cell mutational interference mapping experiment (in cell MIME) identifies the 5' polyadenylation signal as a dual regulator of HIV-1 genomic RNA production and packaging.

PubMed

Smyth, Redmond P; Smith, Maureen R; Jousset, Anne-Caroline; Despons, Laurence; Laumond, Géraldine; Decoville, Thomas; Cattenoz, Pierre; Moog, Christiane; Jossinet, Fabrice; Mougel, Marylène; Paillart, Jean-Christophe; von Kleist, Max; Marquet, Roland

2018-05-18

Non-coding RNA regulatory elements are important for viral replication, making them promising targets for therapeutic intervention. However, regulatory RNA is challenging to detect and characterise using classical structure-function assays. Here, we present in cell Mutational Interference Mapping Experiment (in cell MIME) as a way to define RNA regulatory landscapes at single nucleotide resolution under native conditions. In cell MIME is based on (i) random mutation of an RNA target, (ii) expression of mutated RNA in cells, (iii) physical separation of RNA into functional and non-functional populations, and (iv) high-throughput sequencing to identify mutations affecting function. We used in cell MIME to define RNA elements within the 5' region of the HIV-1 genomic RNA (gRNA) that are important for viral replication in cells. We identified three distinct RNA motifs controlling intracellular gRNA production, and two distinct motifs required for gRNA packaging into virions. Our analysis reveals the 73AAUAAA78 polyadenylation motif within the 5' PolyA domain as a dual regulator of gRNA production and gRNA packaging, and demonstrates that a functional polyadenylation signal is required for viral packaging even though it negatively affects gRNA production.
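The coordinate notation 73AAUAAA78 in this record gives the 1-based first and last positions of the motif. A minimal Python sketch of locating such a motif and reporting its coordinates in that style follows. The function name `motif_positions` and the toy sequence are assumptions for illustration; the sequence is not HIV-1 genomic RNA.

```python
def motif_positions(rna, motif="AAUAAA"):
    """Return 1-based (start, end) coordinates of every occurrence of motif,
    including overlapping ones. DNA-style T is read as U."""
    rna = rna.upper().replace("T", "U")
    motif = motif.upper().replace("T", "U")
    hits = []
    start = rna.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        start = rna.find(motif, start + 1)  # step by one to keep overlapping hits
    return hits


if __name__ == "__main__":
    # Toy sequence: 72 filler bases, then the motif, then 10 more filler bases.
    toy = "G" * 72 + "AAUAAA" + "G" * 10
    for start, end in motif_positions(toy):
        print(f"{start}AAUAAA{end}")
```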
In cell mutational interference mapping experiment (in cell MIME) identifies the 5′ polyadenylation signal as a dual regulator of HIV-1 genomic RNA production and packaging

PubMed Central

Smith, Maureen R; Jousset, Anne-Caroline; Despons, Laurence; Laumond, Géraldine; Decoville, Thomas; Cattenoz, Pierre; Moog, Christiane; Jossinet, Fabrice; Mougel, Marylène; Paillart, Jean-Christophe

2018-01-01

Non-coding RNA regulatory elements are important for viral replication, making them promising targets for therapeutic intervention. However, regulatory RNA is challenging to detect and characterise using classical structure-function assays. Here, we present in cell Mutational Interference Mapping Experiment (in cell MIME) as a way to define RNA regulatory landscapes at single nucleotide resolution under native conditions. In cell MIME is based on (i) random mutation of an RNA target, (ii) expression of mutated RNA in cells, (iii) physical separation of RNA into functional and non-functional populations, and (iv) high-throughput sequencing to identify mutations affecting function. We used in cell MIME to define RNA elements within the 5′ region of the HIV-1 genomic RNA (gRNA) that are important for viral replication in cells. We identified three distinct RNA motifs controlling intracellular gRNA production, and two distinct motifs required for gRNA packaging into virions. Our analysis reveals the 73AAUAAA78 polyadenylation motif within the 5′ PolyA domain as a dual regulator of gRNA production and gRNA packaging, and demonstrates that a functional polyadenylation signal is required for viral packaging even though it negatively affects gRNA production. PMID:29514260
