conserved domain database: Topics by Science.gov

Sample records for conserved domain database

CDD/SPARCLE: functional classification of proteins via subfamily domain architectures.

PubMed

Marchler-Bauer, Aron; Bo, Yu; Han, Lianyi; He, Jane; Lanczycki, Christopher J; Lu, Shennan; Chitsaz, Farideh; Derbyshire, Myra K; Geer, Renata C; Gonzales, Noreen R; Gwadz, Marc; Hurwitz, David I; Lu, Fu; Marchler, Gabriele H; Song, James S; Thanki, Narmada; Wang, Zhouxi; Yamashita, Roxanne A; Zhang, Dachuan; Zheng, Chanjuan; Geer, Lewis Y; Bryant, Stephen H

2017-01-04

NCBI's Conserved Domain Database (CDD) aims at annotating biomolecular sequences with the location of evolutionarily conserved protein domain footprints, and functional sites inferred from such footprints. An archive of pre-computed domain annotation is maintained for proteins tracked by NCBI's Entrez database, and live search services are offered as well. CDD curation staff supplements a comprehensive collection of protein domain and protein family models, which have been imported from external providers, with representations of selected domain families that are curated in-house and organized into hierarchical classifications of functionally distinct families and sub-families. CDD also supports comparative analyses of protein families via conserved domain architectures, and a recent curation effort focuses on providing functional characterizations of distinct subfamily architectures using SPARCLE: Subfamily Protein Architecture Labeling Engine. CDD can be accessed at https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
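The abstract above notes that CDD records are tracked through NCBI's Entrez system. As a minimal sketch of what a programmatic lookup might look like, the snippet below only builds an E-utilities ESearch URL; the endpoint path, the `cdd` database name and the `retmode=json` parameter are taken from NCBI's public E-utilities conventions rather than from the abstract, and the helper name `cdd_esearch_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (an assumption not stated in the abstract).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def cdd_esearch_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that queries the Entrez 'cdd' database.

    Only the URL is constructed; no network request is made here.
    """
    params = {"db": "cdd", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Print the URL so it can be opened in a browser or passed to an HTTP client.
    print(cdd_esearch_url("SH2 domain"))
```

Keeping URL construction separate from the request makes the query easy to inspect before any traffic is sent to NCBI.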
HMMerThread: detecting remote, functional conserved domains in entire genomes by combining relaxed sequence-database searches with fold recognition.

PubMed

Bradshaw, Charles Richard; Surendranath, Vineeth; Henschel, Robert; Mueller, Matthias Stefan; Habermann, Bianca Hermine

2011-03-10

Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. 
The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de.
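The two-stage idea described in this abstract (a relaxed sequence search followed by fold recognition to discard false positives) can be illustrated with a toy filter. This is a sketch under stated assumptions, not the HMMerThread implementation: the `Hit` record, the `fold_score` field and both thresholds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    protein: str
    domain: str
    evalue: float      # E-value from the sequence-based domain search (lower is better)
    fold_score: float  # hypothetical fold-recognition confidence in [0, 1]


def two_stage_filter(hits, relaxed_evalue=10.0, min_fold_score=0.8):
    """Keep hits that pass a relaxed E-value cutoff and are then
    confirmed by an independent fold-recognition score."""
    # Stage 1: a deliberately permissive sequence-similarity cutoff.
    candidates = [h for h in hits if h.evalue <= relaxed_evalue]
    # Stage 2: structural support removes sequence-only false positives.
    return [h for h in candidates if h.fold_score >= min_fold_score]


if __name__ == "__main__":
    hits = [
        Hit("P1", "HLH", 5.0, 0.9),    # weak sequence hit, structurally supported
        Hit("P2", "HLH", 50.0, 0.95),  # fails even the relaxed cutoff
        Hit("P3", "SH2", 1e-3, 0.2),   # strong sequence hit, no fold support
    ]
    for hit in two_stage_filter(hits):
        print(hit.protein, hit.domain)
```

The point of the design is that neither stage is trusted alone: the relaxed cutoff raises sensitivity, and the second stage restores specificity.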
HMMerThread: Detecting Remote, Functional Conserved Domains in Entire Genomes by Combining Relaxed Sequence-Database Searches with Fold Recognition

PubMed Central

Bradshaw, Charles Richard; Surendranath, Vineeth; Henschel, Robert; Mueller, Matthias Stefan; Habermann, Bianca Hermine

2011-01-01

Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. 
The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de. PMID:21423752
Database resources of the National Center for Biotechnology Information.

PubMed

2016-01-04

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Database resources of the National Center for Biotechnology Information.

PubMed

2015-01-01

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures

PubMed Central

2012-01-01

Background The NCBI Conserved Domain Database (CDD) consists of a collection of multiple sequence alignments of protein domains that are at various stages of being manually curated into evolutionary hierarchies based on conserved and divergent sequence and structural features. These domain models are annotated to provide insights into the relationships between sequence, structure and function via web-based BLAST searches. Results Here we automate the generation of conserved domain (CD) hierarchies using a combination of heuristic and Markov chain Monte Carlo (MCMC) sampling procedures and starting from a (typically very large) multiple sequence alignment. This procedure relies on statistical criteria to define each hierarchy based on the conserved and divergent sequence patterns associated with protein functional-specialization. At the same time this facilitates the sequence and structural annotation of residues that are functionally important. These statistical criteria also provide a means to objectively assess the quality of CD hierarchies, a non-trivial task considering that the protein subgroups are often very distantly related—a situation in which standard phylogenetic methods can be unreliable. Our aim here is to automatically generate (typically sub-optimal) hierarchies that, based on statistical criteria and visual comparisons, are comparable to manually curated hierarchies; this serves as the first step toward the ultimate goal of obtaining optimal hierarchical classifications. A plot of runtimes for the most time-intensive (non-parallelizable) part of the algorithm indicates a nearly linear time complexity so that, even for the extremely large Rossmann fold protein class, results were obtained in about a day. Conclusions This approach automates the rapid creation of protein domain hierarchies and thus will eliminate one of the most time consuming aspects of conserved domain database curation. 
At the same time, it also facilitates protein domain annotation by identifying those pattern residues that most distinguish each protein domain subgroup from other related subgroups. PMID:22726767
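To make the notion of "pattern residues that most distinguish each protein domain subgroup" concrete, the sketch below ranks alignment columns by how strongly a subgroup's most common residue is enriched relative to the other sequences. It is a deliberately simple frequency difference, swapped in for illustration; it is not the heuristic/MCMC procedure of the paper, and the function name and scoring rule are assumptions.

```python
from collections import Counter


def distinguishing_columns(subgroup, others, top=3):
    """Rank alignment columns by enrichment of the subgroup's most common
    residue compared with the remaining sequences.

    Both inputs are lists of equal-length aligned sequences.
    Returns (column index, residue, score) tuples, best first.
    """
    length = len(subgroup[0])
    scores = []
    for col in range(length):
        counts = Counter(seq[col] for seq in subgroup)
        residue, n = counts.most_common(1)[0]
        f_sub = n / len(subgroup)
        f_other = sum(seq[col] == residue for seq in others) / len(others)
        scores.append((f_sub - f_other, col, residue))
    # Highest enrichment first; ties are broken by column order.
    scores.sort(key=lambda t: (-t[0], t[1]))
    return [(col, residue, round(score, 3)) for score, col, residue in scores[:top]]


if __name__ == "__main__":
    subgroup = ["ACDK", "ACDK", "ACEK"]
    others = ["ACDR", "GCDR", "ACDR"]
    print(distinguishing_columns(subgroup, others, top=2))
```

In this toy alignment the last column separates the two groups completely, so it ranks first; a statistical model would additionally weigh sample size and background composition.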
Database resources of the National Center for Biotechnology Information

PubMed Central

Wheeler, David L.; Barrett, Tanya; Benson, Dennis A.; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Kenton, David L.; Khovayko, Oleg; Lipman, David J.; Madden, Thomas L.; Maglott, Donna R.; Ostell, James; Pruitt, Kim D.; Schuler, Gregory D.; Schriml, Lynn M.; Sequeira, Edwin; Sherry, Stephen T.; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Suzek, Tugba O.; Tatusov, Roman; Tatusova, Tatiana A.; Wagner, Lukas; Yaschenko, Eugene

2006-01-01

In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, SAGEmap, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of the resources can be accessed through the NCBI home page. PMID:16381840
Database resources of the National Center for Biotechnology Information.

PubMed

Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bolton, Evan; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; Dicuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Krasnov, Sergey; Landsman, David; Lipman, David J; Lu, Zhiyong; Madden, Thomas L; Madej, Tom; Maglott, Donna R; Marchler-Bauer, Aron; Miller, Vadim; Karsch-Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Wang, Yanli; Wilbur, W John; Yaschenko, Eugene; Ye, Jian

2012-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
Database resources of the National Center for Biotechnology Information

PubMed Central

Acland, Abigail; Agarwala, Richa; Barrett, Tanya; Beck, Jeff; Benson, Dennis A.; Bollin, Colleen; Bolton, Evan; Bryant, Stephen H.; Canese, Kathi; Church, Deanna M.; Clark, Karen; DiCuccio, Michael; Dondoshansky, Ilya; Federhen, Scott; Feolo, Michael; Geer, Lewis Y.; Gorelenkov, Viatcheslav; Hoeppner, Marilu; Johnson, Mark; Kelly, Christopher; Khotomlianski, Viatcheslav; Kimchi, Avi; Kimelman, Michael; Kitts, Paul; Krasnov, Sergey; Kuznetsov, Anatoliy; Landsman, David; Lipman, David J.; Lu, Zhiyong; Madden, Thomas L.; Madej, Tom; Maglott, Donna R.; Marchler-Bauer, Aron; Karsch-Mizrachi, Ilene; Murphy, Terence; Ostell, James; O'Sullivan, Christopher; Panchenko, Anna; Phan, Lon; Preuss, Don; Pruitt, Kim D.; Rubinstein, Wendy; Sayers, Eric W.; Schneider, Valerie; Schuler, Gregory D.; Sequeira, Edwin; Sherry, Stephen T.; Shumway, Martin; Sirotkin, Karl; Siyan, Karanjit; Slotta, Douglas; Soboleva, Alexandra; Soussov, Vladimir; Starchenko, Grigory; Tatusova, Tatiana A.; Trawick, Bart W.; Vakatov, Denis; Wang, Yanli; Ward, Minghong; Wilbur, W. John; Yaschenko, Eugene; Zbicz, Kerry

2014-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, PubReader, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Primer-BLAST, COBALT, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, ClinVar, MedGen, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All these resources can be accessed through the NCBI home page. PMID:24259429
Database resources of the National Center for Biotechnology

PubMed Central

Wheeler, David L.; Church, Deanna M.; Federhen, Scott; Lash, Alex E.; Madden, Thomas L.; Pontius, Joan U.; Schuler, Gregory D.; Schriml, Lynn M.; Sequeira, Edwin; Tatusova, Tatiana A.; Wagner, Lukas

2003-01-01

In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, PubMed, PubMed Central (PMC), LocusLink, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR (e-PCR), Open Reading Frame (ORF) Finder, Reference Sequence (RefSeq), UniGene, HomoloGene, ProtEST, Database of Single Nucleotide Polymorphisms (dbSNP), Human/Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes and related tools, the Map Viewer, Model Maker (MM), Evidence Viewer (EV), Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov. PMID:12519941
Database resources of the National Center for Biotechnology Information

PubMed Central

2015-01-01

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. PMID:25398906
Database resources of the National Center for Biotechnology Information

PubMed Central

2016-01-01

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:26615191
Structure-Based Design of Molecules to Reactivate Tumor-Derived p53 Mutations

DTIC Science & Technology

2007-06-01

cluster in conserved regions or “hot spots” (Hainaut and Hollstein, 2000). Missense mutations leading to amino acid changes are the most common p53...domain stabilization compounds. Analysis of the residue-specific temperature factors of the high resolution core domain structure, coupled with a...second scoring results, 13 compounds (10 from the SPECS database and 3 from the TimTec database) were selected for further analysis using solution
TOPDOM: database of conservatively located domains and motifs in proteins.

PubMed

Varga, Julia; Dobson, László; Tusnády, Gábor E

2016-09-01

The TOPDOM database, originally created as a collection of domains and motifs located consistently on the same side of the membranes in α-helical transmembrane proteins, has been updated and extended by taking into consideration consistently localized domains and motifs in globular proteins, too. By taking advantage of the recently developed CCTOP algorithm to determine the type of a protein and predict topology in case of transmembrane proteins, and by applying a thorough search for domains and motifs as well as utilizing the most up-to-date version of all source databases, we managed to reach a 6-fold increase in the size of the whole database and a 2-fold increase in the number of transmembrane proteins. The TOPDOM database is available at http://topdom.enzim.hu. The webpage utilizes the common Apache, PHP5 and MySQL software to provide the user interface for accessing and searching the database. The database itself is generated on a high performance computer. Contact: tusnady.gabor@ttk.mta.hu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
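The core criterion in this abstract, a domain or motif that is found consistently on the same side of the membrane, can be sketched as a simple majority check over observed occurrences. The data layout, the `in`/`out` side labels and both thresholds below are hypothetical and are not taken from TOPDOM itself.

```python
from collections import Counter


def conservatively_located(occurrences, min_fraction=0.9, min_count=3):
    """Select domains whose occurrences fall predominantly on one side.

    occurrences maps a domain name to a list of side labels
    (for example 'in' or 'out'), one label per observed occurrence.
    Returns {domain: (majority side, fraction on that side)}.
    """
    selected = {}
    for domain, sides in occurrences.items():
        if len(sides) < min_count:
            continue  # too few observations to call the location consistent
        side, n = Counter(sides).most_common(1)[0]
        fraction = n / len(sides)
        if fraction >= min_fraction:
            selected[domain] = (side, fraction)
    return selected


if __name__ == "__main__":
    observed = {
        "domA": ["in", "in", "in", "in"],
        "domB": ["in", "out", "in", "out"],
        "domC": ["out"],
    }
    print(conservatively_located(observed))
```

Requiring a minimum number of occurrences guards against calling a domain consistently located on the basis of a single protein.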
Database resources of the National Center for Biotechnology Information

PubMed Central

Wheeler, David L.; Barrett, Tanya; Benson, Dennis A.; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Feolo, Michael; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Khovayko, Oleg; Landsman, David; Lipman, David J.; Madden, Thomas L.; Maglott, Donna R.; Miller, Vadim; Ostell, James; Pruitt, Kim D.; Schuler, Gregory D.; Shumway, Martin; Sequeira, Edwin; Sherry, Steven T.; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusov, Roman L.; Tatusova, Tatiana A.; Wagner, Lukas; Yaschenko, Eugene

2008-01-01

In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data available through NCBI's web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace, Assembly, and Short Read Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Database of Genotype and Phenotype, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting the web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:18045790
Database resources of the National Center for Biotechnology Information

PubMed Central

Sayers, Eric W.; Barrett, Tanya; Benson, Dennis A.; Bolton, Evan; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M.; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Krasnov, Sergey; Landsman, David; Lipman, David J.; Lu, Zhiyong; Madden, Thomas L.; Madej, Tom; Maglott, Donna R.; Marchler-Bauer, Aron; Miller, Vadim; Karsch-Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D.; Schuler, Gregory D.; Sequeira, Edwin; Sherry, Stephen T.; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A.; Wagner, Lukas; Wang, Yanli; Wilbur, W. John; Yaschenko, Eugene; Ye, Jian

2012-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:22140104
Database resources of the National Center for Biotechnology Information

PubMed Central

2013-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page. PMID:23193264
Database resources of the National Center for Biotechnology Information.

PubMed

Wheeler, David L; Barrett, Tanya; Benson, Dennis A; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Geer, Lewis Y; Kapustin, Yuri; Khovayko, Oleg; Landsman, David; Lipman, David J; Madden, Thomas L; Maglott, Donna R; Ostell, James; Miller, Vadim; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Steven T; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusov, Roman L; Tatusova, Tatiana A; Wagner, Lukas; Yaschenko, Eugene

2007-01-01

In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace and Assembly Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Viral Genotyping Tools, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
Database resources of the National Center for Biotechnology Information.

PubMed

Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Feolo, Michael; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Landsman, David; Lipman, David J; Madden, Thomas L; Maglott, Donna R; Miller, Vadim; Mizrachi, Ilene; Ostell, James; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Yaschenko, Eugene; Ye, Jian

2009-01-01

In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
Database resources of the National Center for Biotechnology Information.

PubMed

Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bolton, Evan; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Landsman, David; Lipman, David J; Lu, Zhiyong; Madden, Thomas L; Madej, Tom; Maglott, Donna R; Marchler-Bauer, Aron; Miller, Vadim; Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Wang, Yanli; Wilbur, W John; Yaschenko, Eugene; Ye, Jian

2011-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Electronic PCR, OrfFinder, Splign, ProSplign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), IBIS, Biosystems, Peptidome, OMSSA, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
A conserved gene family encodes transmembrane proteins with fibronectin, immunoglobulin and leucine-rich repeat domains (FIGLER)

PubMed Central

Munfus, Delicia L; Haga, Christopher L; Burrows, Peter D; Cooper, Max D

2007-01-01

Background In mouse the cytokine interleukin-7 (IL-7) is required for generation of B lymphocytes, but human IL-7 does not appear to have this function. A bioinformatics approach was therefore used to identify IL-7 receptor related genes in the hope of identifying the elusive human cytokine. Results Our database search identified a family of nine gene candidates, which we have provisionally named fibronectin immunoglobulin leucine-rich repeat (FIGLER). The FIGLER 1–9 genes are predicted to encode type I transmembrane glycoproteins with 6–12 leucine-rich repeats (LRR), a C2 type Ig domain, a fibronectin type III domain, a hydrophobic transmembrane domain, and a cytoplasmic domain containing one to four tyrosine residues. Members of this multichromosomal gene family possess 20–47% overall amino acid identity and are differentially expressed in cell lines and primary hematopoietic lineage cells. Genes for FIGLER homologs were identified in macaque, orangutan, chimpanzee, mouse, rat, dog, chicken, toad, and puffer fish databases. The non-human FIGLER homologs share 38–99% overall amino acid identity with their human counterpart. Conclusion The extracellular domain structure and absence of recognizable cytoplasmic signaling motifs in members of the highly conserved FIGLER gene family suggest a trophic or cell adhesion function for these molecules. PMID:17854505
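The identity ranges quoted in this abstract (20–47% within the family, 38–99% between human and non-human homologs) are pairwise percent identities. As a minimal sketch of how such a figure can be computed from two already-aligned sequences, the function below counts matching residues over the columns where neither sequence has a gap. The function name, the gap character and the choice of denominator are assumptions made for illustration; the abstract does not state which alignment tool or identity definition the authors used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment and use '-'
    for gaps. Other conventions divide by alignment length or by the shorter
    sequence length and will give different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not FIGLER sequences).
print(percent_identity("ACDE-", "ACDFG"))
```

Because the denominator changes the result, identity values from different studies are only comparable when the same convention is used.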
[Genome-wide identification and expression analysis of the WRKY gene family in peach].

PubMed

Gu, Yan-bing; Ji, Zhi-rui; Chi, Fu-mei; Qiao, Zhuang; Xu, Cheng-nan; Zhang, Jun-xiang; Zhou, Zong-shan; Dong, Qing-long

2016-03-01

The WRKY transcription factors are one of the largest families of transcriptional regulators and play diverse regulatory roles in biotic and abiotic stress responses and in plant growth and development. In this study, the WRKY DNA-binding domain (Pfam accession PF03106), downloaded from the Pfam protein families database, was used to identify WRKY genes in the peach (Prunus persica 'Lovell') genome with HMMER 3.0. The resulting amino acid sequences were analyzed with the DNAMAN 5.0, WebLogo 3, MEGA 5.1, MapInspect and MEME bioinformatics software. A total of 61 WRKY genes were found in the peach genome. Our phylogenetic analysis classified the peach WRKY genes into three groups: Ⅰ, Ⅱ and Ⅲ. The WRKY N-terminal and C-terminal domains of Group Ⅰ (group Ⅰ-N and group Ⅰ-C) were monophyletic. Group Ⅱ was subdivided into five distinct clades (group Ⅱ-a, Ⅱ-b, Ⅱ-c, Ⅱ-d and Ⅱ-e). Our domain analysis indicated that the WRKY regions contained a highly conserved heptapeptide stretch, WRKYGQK, at their N-terminus, followed by a zinc-finger motif. The chromosome mapping analysis showed that peach WRKY genes were distributed at different densities over 8 chromosomes. The intron-exon structure analysis revealed that the structures of the WRKY genes were highly conserved in peach. The conserved motif analysis showed that motifs 1, 2 and 3, which specify the WRKY domain, were present in all peach WRKY proteins; that motif 5, an unknown domain, was present in group Ⅱ-d; and that two WRKY domains were assigned to Group Ⅰ. SqRT-PCR and qRT-PCR results indicated that 16 PpWRKY genes were expressed in roots, stems, leaves, flowers and fruits at various expression levels. Our analysis thus identified the PpWRKY gene family, and future functional studies are needed to reveal the specific roles of its members.
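The conserved heptapeptide WRKYGQK mentioned in this abstract can be located with a plain string search. The sketch below is a deliberately simplified stand-in and not the authors' method: the study used a profile HMM search (HMMER 3.0 with PF03106), which also detects diverged domains that an exact match would miss. The function names and toy sequences are hypothetical.

```python
import re
from typing import List

# Exact-match pattern for the heptapeptide named in the abstract.
WRKY_HEPTAPEPTIDE = re.compile(r"WRKYGQK")


def find_wrky_motifs(protein: str) -> List[int]:
    """Return the 0-based start position of every exact WRKYGQK match."""
    return [m.start() for m in WRKY_HEPTAPEPTIDE.finditer(protein.upper())]


def count_wrky_motifs(protein: str) -> int:
    """Number of exact heptapeptide matches; the abstract reports that
    Group I proteins carry two WRKY domains."""
    return len(find_wrky_motifs(protein))


# Toy sequences (hypothetical, not peach proteins).
one_domain = "MAAWRKYGQKTTT"
two_domains = "AAWRKYGQKCCCCWRKYGQKDD"
print(find_wrky_motifs(one_domain), count_wrky_motifs(two_domains))
```

An exact match is useful only as a quick sanity check on candidates returned by a profile search, since it ignores the zinc-finger motif and any heptapeptide variants.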
A Fast Alignment-Free Approach for De Novo Detection of Protein Conserved Regions

PubMed Central

Abnousi, Armen; Broschat, Shira L.; Kalyanaraman, Ananth

2016-01-01

Background Identifying conserved regions in protein sequences is a fundamental operation, occurring in numerous sequence-driven analysis pipelines. It is used as a way to decode domain-rich regions within proteins, to compute protein clusters, to annotate sequence function, and to compute evolutionary relationships among protein sequences. A number of approaches exist for identifying and characterizing protein families based on their domains, and because domains represent conserved portions of a protein sequence, the primary computation involved in protein family characterization is identification of such conserved regions. However, identifying conserved regions from large collections (millions) of protein sequences presents significant challenges. Methods In this paper we present a new, alignment-free method for detecting conserved regions in protein sequences called NADDA (No-Alignment Domain Detection Algorithm). Our method exploits the abundance of exact matching short subsequences (k-mers) to quickly detect conserved regions, and the power of machine learning is used to improve the prediction accuracy of detection. We present a parallel implementation of NADDA using the MapReduce framework and show that our method is highly scalable. Results We have compared NADDA with Pfam and InterPro databases. For known domains annotated by Pfam, accuracy is 83%, sensitivity 96%, and specificity 44%. For sequences with new domains not present in the training set an average accuracy of 63% is achieved when compared to Pfam. A boost in results in comparison with InterPro demonstrates the ability of NADDA to capture conserved regions beyond those present in Pfam. We have also compared NADDA with ADDA and MKDOM2, assuming Pfam as ground-truth. On average NADDA shows comparable accuracy, more balanced sensitivity and specificity, and being alignment-free, is significantly faster. 
Excluding the one-time cost of training, runtimes on a single processor were 49s, 10,566s, and 456s for NADDA, ADDA, and MKDOM2, respectively, for a data set comprised of approximately 2500 sequences. PMID:27552220
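The core idea stated in this abstract, using exact-matching k-mers shared across many sequences as a signal of conservation, can be illustrated with a short sketch. This is not the NADDA implementation: it omits the machine-learning step and the MapReduce parallelization, and the function names, the value of k and the support threshold are all assumptions chosen for the toy example.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def kmer_support(sequences: List[str], k: int = 3) -> List[List[int]]:
    """For every k-mer start position in every sequence, count how many
    sequences in the set (the sequence itself included) contain that k-mer."""
    owners: Dict[str, Set[int]] = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(idx)
    return [
        [len(owners[seq[i:i + k]]) for i in range(len(seq) - k + 1)]
        for seq in sequences
    ]


def conserved_regions(profile: List[int], k: int, min_support: int) -> List[Tuple[int, int]]:
    """Turn a support profile into residue intervals (start, end), end exclusive,
    covered by at least one k-mer whose support reaches min_support."""
    length = len(profile) + k - 1
    covered = [False] * length
    for i, support in enumerate(profile):
        if support >= min_support:
            for j in range(i, i + k):
                covered[j] = True
    regions: List[Tuple[int, int]] = []
    start = None
    for j, flag in enumerate(covered):
        if flag and start is None:
            start = j
        elif not flag and start is not None:
            regions.append((start, j))
            start = None
    if start is not None:
        regions.append((start, length))
    return regions


# Toy sequences (hypothetical) sharing one embedded segment.
seqs = ["AAAWRKYGQKTT", "CCWRKYGQKGG", "DDDDWRKYGQKE"]
profiles = kmer_support(seqs, k=3)
print(conserved_regions(profiles[0], k=3, min_support=3))
```

Building the k-mer index is a single pass over the input, which is consistent with the abstract's point that avoiding alignment makes the approach fast; the abstract's accuracy figures come from the full method, not from a threshold rule like this one.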
Sequence analysis of serum albumins reveals the molecular evolution of ligand recognition properties.

PubMed

Fanali, Gabriella; Ascenzi, Paolo; Bernardi, Giorgio; Fasano, Mauro

2012-01-01

Serum albumin (SA) is a circulating protein providing a depot and carrier for many endogenous and exogenous compounds. At least seven major binding sites have been identified by structural and functional investigations, mainly in human SA. SA is conserved in vertebrates, with at least 49 entries in protein sequence databases. Multiple sequence analysis of this set of entries leads to the definition of a cladistic tree for the molecular evolution of SA orthologs in vertebrates, showing the clustering of the considered species, with lamprey SAs (Lethenteron japonicum and Petromyzon marinus) in a separate outgroup. Sequence analysis aimed at searching for conserved domains revealed that most SA sequences are made up of three repeated domains (about 600 residues), as extensively characterized for human SA. In contrast, lamprey SAs are giant proteins (about 1400 residues) comprising seven repeated domains. The phylogenetic analysis of the SA family reveals a stringent correlation with the taxonomic classification of the species available in sequence databases. A focused inspection of the sequences of ligand binding sites in SA revealed that in all sites most residues involved in ligand binding are conserved, although the versatility towards different ligands could be peculiar to higher organisms. Moreover, the analysis of molecular links between the different sites suggests that allosteric modulation mechanisms could be restricted to higher vertebrates.
Distribution and cluster analysis of predicted intrinsically disordered protein Pfam domains

PubMed Central

Williams, Robert W; Xue, Bin; Uversky, Vladimir N; Dunker, A Keith

2013-01-01

The Pfam database groups regions of proteins by how well hidden Markov models (HMMs) can be trained to recognize similarities among them. Conservation pressure is probably in play here. The Pfam seed training set includes sequence and structure information, being drawn largely from the PDB. A long standing hypothesis among intrinsically disordered protein (IDP) investigators has held that conservation pressures are also at play in the evolution of different kinds of intrinsic disorder, but we find that predicted intrinsic disorder (PID) is not always conserved across Pfam domains. Here we analyze distributions and clusters of PID regions in 193024 members of the version 23.0 Pfam seed database. To include the maximum information available for proteins that remain unfolded in solution, we employ the 10 linearly independent Kidera factors [1–3] for the amino acids, combined with PONDR [4] predictions of disorder tendency, to transform the sequences of these Pfam members into an 11 column matrix where the number of rows is the length of each Pfam region. Cluster analyses of the set of all regions, including those that are folded, show 6 groupings of domains. Cluster analyses of domains with mean VSL2b scores greater than 0.5 (half predicted disorder or more) show at least 3 separated groups. It is hypothesized that grouping sets into shorter sequences with more uniform length will reveal more information about intrinsic disorder and lead to more finely structured and perhaps more accurate predictions. HMMs could be trained to include this information. PMID:28516017
Formin homology 2 domains occur in multiple contexts in angiosperms

PubMed Central

Cvrčková, Fatima; Novotný, Marian; Pícková, Denisa; Žárský, Viktor

2004-01-01

Background Involvement of conservative molecular modules and cellular mechanisms in the widely diversified processes of eukaryotic cell morphogenesis leads to the intriguing question: how do similar proteins contribute to dissimilar morphogenetic outputs. Formins (FH2 proteins) play a central part in the control of actin organization and dynamics, providing a good example of evolutionarily versatile use of a conserved protein domain in the context of a variety of lineage-specific structural and signalling interactions. Results In order to identify possible plant-specific sequence features within the FH2 protein family, we performed a detailed analysis of angiosperm formin-related sequences available in public databases, with particular focus on the complete Arabidopsis genome and the nearly finished rice genome sequence. This has led to revision of the current annotation of half of the 22 Arabidopsis formin-related genes. Comparative analysis of the two plant genomes revealed a good conservation of the previously described two subfamilies of plant formins (Class I and Class II), as well as several subfamilies within them that appear to predate the separation of monocot and dicot plants. Moreover, a number of plant Class II formins share an additional conserved domain, related to the protein phosphatase/tensin/auxilin fold. However, considerable inter-species variability sets limits to generalization of any functional conclusions reached on a single species such as Arabidopsis. Conclusions The plant-specific domain context of the conserved FH2 domain, as well as plant-specific features of the domain itself, may reflect distinct functional requirements in plant cells. The variability of formin structures found in plants far exceeds that known from both fungi and metazoans, suggesting a possible contribution of FH2 proteins in the evolution of the plant type of multicellularity. PMID:15256004
CORAL: aligning conserved core regions across domain families.

PubMed

Fong, Jessica H; Marchler-Bauer, Aron

2009-08-01

Homologous protein families share highly conserved sequence and structure regions that are frequent targets for comparative analysis of related proteins and families. Many protein families, such as the curated domain families in the Conserved Domain Database (CDD), exhibit similar structural cores. To improve accuracy in aligning such protein families, we propose a profile-profile method CORAL that aligns individual core regions as gap-free units. CORAL computes optimal local alignment of two profiles with heuristics to preserve continuity within core regions. We benchmarked its performance on curated domains in CDD, which have pre-defined core regions, against COMPASS, HHalign and PSI-BLAST, using structure superpositions and comprehensive curator-optimized alignments as standards of truth. CORAL improves alignment accuracy on core regions over general profile methods, returning a balanced score of 0.57 for over 80% of all domain families in CDD, compared with the highest balanced score of 0.45 from other methods. Further, CORAL provides E-values to aid in detecting homologous protein families and, by respecting block boundaries, produces alignments with improved 'readability' that facilitate manual refinement. CORAL will be included in future versions of the NCBI Cn3D/CDTree software, which can be downloaded at http://www.ncbi.nlm.nih.gov/Structure/cdtree/cdtree.shtml. Supplementary data are available at Bioinformatics online.
[Identification of new conserved and variable regions in the 16S rRNA gene of acetic acid bacteria and acetobacteraceae family].

PubMed

Chakravorty, S; Sarkar, S; Gachhui, R

2015-01-01

The Acetobacteraceae family of the class Alpha Proteobacteria comprises bacteria that are highly tolerant of sugar and acid. The Acetic Acid Bacteria are the economically most significant group of this family because of their association with food products such as vinegar and wine. Acetobacteraceae are often hard to culture in laboratory conditions and they also maintain very low abundances in their natural habitats. Identification of the organisms in such environments therefore depends greatly on modern tools of molecular biology, which require a thorough knowledge of specific conserved gene sequences that may act as primers and/or probes. Moreover, unconserved domains in genes also serve as markers for differentiating closely related genera. In bacteria, the 16S rRNA gene is an ideal candidate for such conserved and variable domains. In order to study the conserved and variable domains of the 16S rRNA gene of Acetic Acid Bacteria and the Acetobacteraceae family, sequences from publicly available databases were aligned and compared. Near complete sequences of the gene were also obtained from Kombucha tea biofilm, a known Acetobacteraceae family habitat, in order to corroborate the domains obtained from the alignment studies. The study indicated that the degree of conservation in the gene is significantly higher among the Acetic Acid Bacteria than in the whole Acetobacteraceae family. It was also observed that the previously described hypervariable regions V1, V3, V5, V6 and V7 were more or less conserved in the family, and that the spans of the variable regions are quite distinct as well.
[Family of ribosomal proteins S1 contains unique conservative domain].

PubMed

Deriusheva, E I; Machulin, A V; Selivanova, O M; Serdiuk, I N

2010-01-01

The number of amino acid residues in ribosomal protein S1 differs among bacteria, ranging from 111 (Spiroplasma kunkelii) to 863 (Treponema pallidum). Traditionally, and because no three-dimensional structure of this protein is available, its architecture is represented as repeating S1 domains, the number of which depends on the protein's length. The number of domains and their boundaries are recorded in specialized databases such as SMART, Pfam and PROSITE; however, these data may differ considerably for the same protein. To determine the number of domains and their boundaries, a new approach based on the analysis of predicted secondary structure (PsiPred) was used. This approach revealed structural domains in the amino acid sequences of S1 proteins, with the number of domains varying from one to six. Alignment of S1 proteins containing different numbers of domains with the S1 RNA-binding domain of Escherichia coli PNPase showed that, in the family of ribosomal proteins S1, one domain has maximal homology with the S1 domain from PNPase. This conserved domain shifts position along the polypeptide chain and, in proteins containing different numbers of domains, is located according to a specific pattern. In this domain, as in the S1 domain from PNPase, residues Phe-19, Phe-22, His-34, Asp-64 and Arg-68 are clustered on the surface and form an RNA-binding site.
Expression of Anaplasma marginale ankyrin repeat-containing proteins during infection of the mammalian host and tick vector

USDA-ARS's Scientific Manuscript database

Using searches of the NCBI conserved domain database and SMART genomic architecture analysis, we identified three ankyrin repeat-containing genes in Anaplasma marginale: AM705, AM926 and AM638. Recombinant protein was used to immunize mice and generate fusion hybridomas secreting protein-specific mo...
Proteins with an Euonymus lectin-like domain are ubiquitous in Embryophyta

PubMed Central

2009-01-01

Background Cloning of the Euonymus lectin led to the discovery of a novel domain that also occurs in some stress-induced plant proteins. The distribution and the diversity of proteins with an Euonymus lectin (EUL) domain were investigated using detailed analysis of sequences in publicly accessible genome and transcriptome databases. Results Comprehensive in silico analyses indicate that the recently identified Euonymus europaeus lectin domain represents a conserved structural unit of a novel family of putative carbohydrate-binding proteins, which will further be referred to as the Euonymus lectin (EUL) family. The EUL domain is widespread among plants. Analysis of retrieved sequences revealed that some sequences consist of a single EUL domain linked to an unrelated N-terminal domain, whereas others comprise two tandemly arrayed EUL domains. A new classification system for these lectins is proposed based on the overall domain architecture. Evolutionary relationships among the sequences with EUL domains are discussed. Conclusion The identification of the EUL family provides the first evidence for the occurrence in terrestrial plants of a highly conserved plant-specific domain. The widespread distribution of the EUL domain contrasts strikingly with the more limited or even narrow distribution of most other lectin domains found in plants. The apparent omnipresence of the EUL domain is indicative of a universal role of this lectin domain in plants. Although there is unambiguous evidence that several EUL domains possess carbohydrate-binding activity, further research is required to corroborate the carbohydrate-binding properties of different members of the EUL family. PMID:19930663
NPIDB: Nucleic acid-Protein Interaction DataBase.

PubMed

Kirsanov, Dmitry D; Zanegina, Olga N; Aksianov, Evgeniy A; Spirin, Sergei A; Karyagina, Anna S; Alexeevski, Andrei V

2013-01-01

The Nucleic acid-Protein Interaction DataBase (http://npidb.belozersky.msu.ru/) contains information derived from structures of DNA-protein and RNA-protein complexes extracted from the Protein Data Bank (3846 complexes in October 2012). It provides a web interface and a set of tools for extracting biologically meaningful characteristics of nucleoprotein complexes. The content of the database is updated weekly. The current version of the Nucleic acid-Protein Interaction DataBase is an upgrade of the version published in 2007. The improvements include a new web interface, new tools for calculation of intermolecular interactions, a classification of SCOP families that contains DNA-binding protein domains and data on conserved water molecules on the DNA-protein interface.
A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3

PubMed Central

Dietmann, Sabine; Park, Jong; Notredame, Cedric; Heger, Andreas; Lappe, Michael; Holm, Liisa

2001-01-01

The Dali Domain Dictionary (http://www.ebi.ac.uk/dali/domain) is a numerical taxonomy of all known structures in the Protein Data Bank (PDB). The taxonomy is derived fully automatically from measurements of structural, functional and sequence similarities. Here, we report the extension of the classification to match the traditional four hierarchical levels corresponding to: (i) supersecondary structural motifs (attractors in fold space), (ii) the topology of globular domains (fold types), (iii) remote homologues (functional families) and (iv) homologues with sequence identity above 25% (sequence families). The computational definitions of attractors and functional families are new. In September 2000, the Dali classification contained 10 531 PDB entries comprising 17 101 chains, which were partitioned into five attractor regions, 1375 fold types, 2582 functional families and 3724 domain sequence families. Sequence families were further associated with 99 582 unique homologous sequences in the HSSP database, which increases the number of effectively known structures several-fold. The resulting database contains the description of protein domain architecture, the definition of structural neighbours around each known structure, the definition of structurally conserved cores and a comprehensive library of explicit multiple alignments of distantly related protein families. PMID:11125048
Cloning and analysis of DnaJ family members in the silkworm, Bombyx mori.

PubMed

Li, Yinü; Bu, Cuiyu; Li, Tiantian; Wang, Shibao; Jiang, Feng; Yi, Yongzhu; Yang, Huipeng; Zhang, Zhifang

2016-01-15

Heat shock proteins (Hsps) are involved in a variety of critical biological functions, including protein folding, degradation, and translocation and macromolecule assembly, and act as molecular chaperones during periods of stress by binding to other proteins. Using expressed sequence tag (EST) and silkworm (Bombyx mori) transcriptome databases, we identified 27 cDNA sequences encoding the conserved J domain, which is found in DnaJ-type Hsps. Of the 27 J domain-containing sequences, 25 were complete cDNA sequences. We divided them into three types according to the number and presence of conserved domains. By analyzing the gene structures, intron numbers, and conserved domains and constructing a phylogenetic tree, we found that the DnaJ family had undergone convergent evolution, obtaining new domains to expand the diversity of its family members. The acquisition of the new DnaJ domains most likely occurred prior to the evolutionary divergence of prokaryotes and eukaryotes. The expression of DnaJ genes in the silkworm was generally higher in the fat body. The tissue distribution of DnaJ1 proteins was detected by western blotting, demonstrating that in the fifth-instar larvae, the DnaJ1 proteins were expressed at their highest levels in hemocytes, followed by the fat body and head. We also found that the DnaJ1 transcripts were likely differentially translated in different tissues. Using immunofluorescence cytochemistry, we revealed that in the blood cells, DnaJ1 was mainly localized in the cytoplasm. Copyright © 2015 Elsevier B.V. All rights reserved.
Cry-Bt identifier: a biological database for PCR detection of Cry genes present in transgenic plants.

PubMed

Singh, Vinay Kumar; Ambwani, Sonu; Marla, Soma; Kumar, Anil

2009-10-23

We describe the development of a user-friendly tool that assists in the retrieval of information relating to Cry genes in transgenic crops. The tool also helps in the detection of transformed Cry genes from Bacillus thuringiensis present in transgenic plants by providing suitably designed primers for PCR identification of these genes. The tool, built on a relational database model, enables easy retrieval of information from the database with simple user queries. The tool also enables users to access related information about Cry genes present in various databases by interacting with different sources (nucleotide sequences, protein sequences, sequence comparison tools, published literature, conserved domains, evolutionary and structural data). http://insilicogenomics.in/Cry-btIdentifier/welcome.html.
Rebelling for a Reason: Protein Structural “Outliers”

PubMed Central

Arumugam, Gandhimathi; Nair, Anu G.; Hariharaputran, Sridhar; Ramanathan, Sowdhamini

2013-01-01

Analysis of structural variation in domain superfamilies can reveal constraints in protein evolution which aids protein structure prediction and classification. Structure-based sequence alignment of distantly related proteins, organized in PASS2 database, provides clues about structurally conserved regions among different functional families. Some superfamily members show large structural differences which are functionally relevant. This paper analyses the impact of structural divergence on function for multi-member superfamilies, selected from the PASS2 superfamily alignment database. Functional annotations within superfamilies, with structural outliers or ‘rebels’, are discussed in the context of structural variations. Overall, these data reinforce the idea that functional similarities cannot be extrapolated from mere structural conservation. The implication for fold-function prediction is that the functional annotations can only be inherited with very careful consideration, especially at low sequence identities. PMID:24073209
Predicting chromatin architecture from models of polymer physics.

PubMed

Bianco, Simona; Chiariello, Andrea M; Annunziatella, Carlo; Esposito, Andrea; Nicodemi, Mario

2017-03-01

We review the picture of large-scale 3D chromatin organization emerging from the analysis of Hi-C data and polymer modeling. In higher mammals, Hi-C contact maps reveal a complex higher-order organization, extending from the sub-Mb to chromosomal scales, hierarchically folded in a structure of domains-within-domains (metaTADs). The domain folding hierarchy is partially conserved throughout differentiation, and deeply correlated to epigenomic features. Rearrangements in the metaTAD topology relate to gene expression modifications: in particular, in neuronal differentiation models, topologically associated domains (TADs) tend to have coherent expression changes within architecturally conserved metaTAD niches. To identify the nature of architectural domains and their molecular determinants within a principled approach, we discuss models based on polymer physics. We show that basic concepts of interacting polymer physics explain chromatin spatial organization across chromosomal scales and cell types. The 3D structure of genomic loci can be derived with high accuracy and its molecular determinants identified by crossing information with epigenomic databases. In particular, we illustrate the case of the Sox9 locus, linked to human congenital disorders. The model's in silico predictions of the effects of genomic rearrangements are confirmed by available 5C data. This can help establish new diagnostic tools for diseases linked to chromatin mis-folding, such as congenital disorders and cancer.
Atomic interaction networks in the core of protein domains and their native folds.

PubMed

Soundararajan, Venkataramanan; Raman, Rahul; Raguram, S; Sasisekharan, V; Sasisekharan, Ram

2010-02-23

Vastly divergent sequences populate a majority of protein folds. In the quest to identify features that are conserved within protein domains belonging to the same fold, we set out to examine the entire protein universe on a fold-by-fold basis. We report that the atomic interaction network in the solvent-unexposed core of protein domains is fold-conserved, extraordinary sequence divergence notwithstanding. Further, we find that this feature, termed protein core atomic interaction network (or PCAIN), is significantly distinguishable across different folds, thus appearing to be a "signature" of a domain's native fold. As part of this study, we computed the PCAINs for 8698 representative protein domains from families across the 1018 known protein folds to construct our seed database, and an automated framework was developed for PCAIN-based characterization of the protein fold universe. A test set of randomly selected domains that are not in the seed database was classified with over 97% accuracy, independent of sequence divergence. As an application of this novel fold signature, a PCAIN-based scoring scheme was developed for comparative (homology-based) structure prediction, with 1-2 angstroms (mean 1.61 Å) Cα RMSD generally observed between computed structures and reference crystal structures. Our results are consistent across the full spectrum of test domains including those from recent CASP experiments and most notably in the 'twilight' and 'midnight' zones wherein <30% and <10% target-template sequence identity prevails (mean twilight RMSD of 1.69 Å). We further demonstrate the utility of the PCAIN protocol to derive biological insight into protein structure-function relationships, by modeling the structure of the YopM effector novel E3 ligase (NEL) domain from the plague-causative bacterium Yersinia pestis and discussing its implications for host adaptive and innate immune modulation by the pathogen. 
Considering the several high-throughput, sequence-identity-independent applications demonstrated in this work, we suggest that the PCAIN is a fundamental fold feature that could be a valuable addition to the arsenal of protein modeling and analysis tools.

Atomic Interaction Networks in the Core of Protein Domains and Their Native Folds

PubMed Central

Soundararajan, Venkataramanan; Raman, Rahul; Raguram, S.; Sasisekharan, V.; Sasisekharan, Ram

2010-01-01

Vastly divergent sequences populate a majority of protein folds. In the quest to identify features that are conserved within protein domains belonging to the same fold, we set out to examine the entire protein universe on a fold-by-fold basis. We report that the atomic interaction network in the solvent-unexposed core of protein domains is fold-conserved, extraordinary sequence divergence notwithstanding. Further, we find that this feature, termed protein core atomic interaction network (or PCAIN), is significantly distinguishable across different folds, thus appearing to be a “signature” of a domain's native fold. As part of this study, we computed the PCAINs for 8698 representative protein domains from families across the 1018 known protein folds to construct our seed database, and an automated framework was developed for PCAIN-based characterization of the protein fold universe. A test set of randomly selected domains that are not in the seed database was classified with over 97% accuracy, independent of sequence divergence. As an application of this novel fold signature, a PCAIN-based scoring scheme was developed for comparative (homology-based) structure prediction, with 1–2 angstroms (mean 1.61 Å) Cα RMSD generally observed between computed structures and reference crystal structures. Our results are consistent across the full spectrum of test domains including those from recent CASP experiments and most notably in the ‘twilight’ and ‘midnight’ zones wherein <30% and <10% target-template sequence identity prevails (mean twilight RMSD of 1.69 Å). We further demonstrate the utility of the PCAIN protocol to derive biological insight into protein structure-function relationships, by modeling the structure of the YopM effector novel E3 ligase (NEL) domain from the plague-causative bacterium Yersinia pestis and discussing its implications for host adaptive and innate immune modulation by the pathogen. Considering the several high-throughput, sequence-identity-independent applications demonstrated in this work, we suggest that the PCAIN is a fundamental fold feature that could be a valuable addition to the arsenal of protein modeling and analysis tools. PMID:20186337
Sequence conservation from human to prokaryotes of Surf1, a protein involved in cytochrome c oxidase assembly, deficient in Leigh syndrome.

PubMed

Poyau, A; Buchet, K; Godinot, C

1999-12-03

The human SURF1 gene, encoding a protein involved in cytochrome c oxidase (COX) assembly, is mutated in most patients presenting Leigh syndrome associated with COX deficiency. Proteins homologous to the human Surf1 have been identified in nine eukaryotes and six prokaryotes using database alignment tools, structure prediction and/or cDNA sequencing. Their sequence comparison revealed a remarkable Surf1 conservation during evolution and put forward at least four highly conserved domains that should be essential for Surf1 function. In Paracoccus denitrificans, the Surf1 homologue is found in the quinol oxidase operon, suggesting that Surf1 is associated with a primitive quinol oxidase which belongs to the same superfamily as cytochrome oxidase.
NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins

PubMed Central

Pruitt, Kim D.; Tatusova, Tatiana; Maglott, Donna R.

2005-01-01

The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) provides a non-redundant collection of sequences representing genomic data, transcripts and proteins. Although the goal is to provide a comprehensive dataset representing the complete sequence information for any given species, the database pragmatically includes sequence data that are currently publicly available in the archival databases. The database incorporates data from over 2400 organisms and includes over one million proteins representing significant taxonomic diversity spanning prokaryotes, eukaryotes and viruses. Nucleotide and protein sequences are explicitly linked, and the sequences are linked to other resources including the NCBI Map Viewer and Gene. Sequences are annotated to include coding regions, conserved domains, variation, references, names, database cross-references, and other features using a combined approach of collaboration and other input from the scientific community, automated annotation, propagation from GenBank and curation by NCBI staff. PMID:15608248
Database resources of the National Center for Biotechnology Information: 2002 update

PubMed Central

Wheeler, David L.; Church, Deanna M.; Lash, Alex E.; Leipe, Detlef D.; Madden, Thomas L.; Pontius, Joan U.; Schuler, Gregory D.; Schriml, Lynn M.; Tatusova, Tatiana A.; Wagner, Lukas; Rapp, Barbara A.

2002-01-01

In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI’s web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, HomoloGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing, Human MapViewer, Human–Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB) and the Conserved Domain Database (CDD). Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. PMID:11752242
V-SINEs: A New Superfamily of Vertebrate SINEs That Are Widespread in Vertebrate Genomes and Retain a Strongly Conserved Segment within Each Repetitive Unit

PubMed Central

Ogiwara, Ikuo; Miya, Masaki; Ohshima, Kazuhiko; Okada, Norihiro

2002-01-01

We have identified a new superfamily of vertebrate short interspersed repetitive elements (SINEs), designated V-SINEs, that are widespread in fishes and frogs. Each V-SINE includes a central conserved domain preceded by a 5′-end tRNA-related region and followed by a potentially recombinogenic (TG)n tract, with a 3′ tail derived from the 3′ untranslated region (UTR) of the corresponding partner long interspersed repetitive element (LINE) that encodes a functional reverse transcriptase. The central domain is strongly conserved and is even found in SINEs in the lamprey genome, suggesting that V-SINEs might be ∼550 Myr old or older in view of the timing of divergence of the lamprey lineage from the bony fish lineage. The central conserved domain might have been subject to some form of positive selection. Although the contemporary 3′ tails of V-SINEs differ from one another, it is possible that the original 3′ tail might have been replaced, via recombination, by the 3′ tails of more active partner LINEs, thereby retaining retropositional activity and the ability to survive for long periods on the evolutionary time scale. It seems plausible that V-SINEs may have some function(s) that have been maintained by the coevolution of SINEs and LINEs during the evolution of vertebrates. [The sequences reported in this paper have been deposited in the DDBJ/GenBank database under accession nos. AB072981–AB073004. Supplemental figures are available online at http://www.genome.org.] PMID:11827951
PASS2: an automated database of protein alignments organised as structural superfamilies.

PubMed

Bhaduri, Anirban; Pugalenthi, Ganesan; Sowdhamini, Ramanathan

2004-04-02

The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated structural alignments for evolutionarily related, sequentially distant proteins. An automated and updated version of PASS2 is in direct correspondence with SCOP 1.63 and consists of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single-member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalencies have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden Markov models of superfamilies based on the alignments and useful links to other databases. Probabilistic models and sensitive position-specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members based on both sequence and structural dissimilarities. Clustering of members allows us to understand diversification of the family members. The search engine has been improved for simpler browsing of the database. The database resolves alignments among the structural domains consisting of an evolutionarily diverged set of sequences. Availability of reliable sequence alignments of distantly related proteins despite poor sequence identity and single-member superfamilies permits better sampling of structures in libraries for fold recognition of new sequences and for the understanding of protein structure-function relationships of individual superfamilies. PASS2 is accessible at http://www.ncbs.res.in/~faculty/mini/campass/pass2.html
Kinase Pathway Database: An Integrated Protein-Kinase and NLP-Based Protein-Interaction Resource

PubMed Central

Koike, Asako; Kobayashi, Yoshiyuki; Takagi, Toshihisa

2003-01-01

Protein kinases play a crucial role in the regulation of cellular functions. Various kinds of information about these molecules are important for understanding signaling pathways and organism characteristics. We have developed the Kinase Pathway Database, an integrated database involving major completely sequenced eukaryotes. It contains the classification of protein kinases and their functional conservation, ortholog tables among species, protein–protein, protein–gene, and protein–compound interaction data, domain information, and structural information. It also provides an automatic pathway graphic image interface. The protein, gene, and compound interactions are automatically extracted from abstracts for all genes and proteins by natural-language processing (NLP). The method of automatic extraction uses phrase patterns and the GENA protein, gene, and compound name dictionary, which was developed by our group. With this database, pathways are easily compared among species using data with more than 47,000 protein interactions and protein kinase ortholog tables. The database is available for querying and browsing at http://kinasedb.ontology.ims.u-tokyo.ac.jp/. PMID:12799355
Metallopeptidases of Toxoplasma gondii: in silico identification and gene expression.

PubMed

Escotte-Binet, Sandie; Huguenin, Antoine; Aubert, Dominique; Martin, Anne-Pascaline; Kaltenbach, Matthieu; Florent, Isabelle; Villena, Isabelle

2018-01-01

Metallopeptidases are a family of proteins with domains that remain highly conserved throughout evolution. These hydrolases require divalent metal cation(s) to activate the water molecule in order to carry out their catalytic action on peptide bonds by nucleophilic attack. Metallopeptidases from parasitic protozoa, including Toxoplasma, are investigated because of their crucial role in parasite biology. In the present study, we screened the T. gondii database using PFAM motifs specific for metallopeptidases in association with the MEROPS peptidase Database (release 10.0). In all, 49 genes encoding proteins with metallopeptidase signatures were identified in the Toxoplasma genome. An Interpro Search enabled us to uncover their domain/motif organization, and orthologs with the highest similarity by BLAST were used for annotation. These 49 Toxoplasma metallopeptidases clustered into 15 families described in the MEROPS database. Experimental expression analysis of their genes in the tachyzoite stage revealed transcription for all genes studied. Further research on the role of these peptidases should increase our knowledge of basic Toxoplasma biology and provide opportunities to identify novel therapeutic targets. This type of study would also open a path towards the comparative biology of apicomplexans. © S. Escotte-Binet et al., published by EDP Sciences, 2018.
ApiEST-DB: analyzing clustered EST data of the apicomplexan parasites.

PubMed

Li, Li; Crabtree, Jonathan; Fischer, Steve; Pinney, Deborah; Stoeckert, Christian J; Sibley, L David; Roos, David S

2004-01-01

ApiEST-DB (http://www.cbil.upenn.edu/paradbs-servlet/) provides integrated access to publicly available EST data from protozoan parasites in the phylum Apicomplexa. The database currently incorporates a total of nearly 100,000 ESTs from several parasite species of clinical and/or veterinary interest, including Eimeria tenella, Neospora caninum, Plasmodium falciparum, Sarcocystis neurona and Toxoplasma gondii. To facilitate analysis of these data, EST sequences were clustered and assembled to form consensus sequences for each organism, and these assemblies were then subjected to automated annotation via similarity searches against protein and domain databases. The underlying relational database infrastructure, Genomics Unified Schema (GUS), enables complex biologically based queries, facilitating validation of gene models, identification of alternative splicing, detection of single nucleotide polymorphisms, identification of stage-specific genes and recognition of phylogenetically conserved and phylogenetically restricted sequences.
5SRNAdb: an information resource for 5S ribosomal RNAs.

PubMed

Szymanski, Maciej; Zielezinski, Andrzej; Barciszewski, Jan; Erdmann, Volker A; Karlowski, Wojciech M

2016-01-04

Ribosomal 5S RNA (5S rRNA) is the ubiquitous RNA component found in the large subunit of ribosomes in all known organisms. Due to its small size, abundance and evolutionary conservation, 5S rRNA has for many years been used as a model molecule in studies on RNA structure, RNA-protein interactions and molecular phylogeny. 5SRNAdb (http://combio.pl/5srnadb/) is the first database that provides a high quality reference set of ribosomal 5S RNAs (5S rRNA) across three domains of life. Here, we give an overview of new developments in the database and associated web tools since 2002, including updates to database content, curation processes and user web interfaces. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Structural diversity of domain superfamilies in the CATH database.

PubMed

Reeves, Gabrielle A; Dallman, Timothy J; Redfern, Oliver C; Akpor, Adrian; Orengo, Christine A

2006-07-14

The CATH database of domain structures has been used to explore the structural variation of homologous domains in 294 well populated domain structure superfamilies, each containing at least three sequence diverse relatives. Our analyses confirm some previously detected trends relating sequence divergence to structural variation but for a much larger dataset and in some superfamilies the new data reveal exceptional structural variation. Use of a new algorithm (2DSEC) to analyse variability in secondary structure compositions across a superfamily sheds new light on how structures evolve. 2DSEC detects inserted secondary structures that embellish the core of conserved secondary structures found throughout the superfamily. Analysis showed that for 56% of highly populated superfamilies (>9 sequence diverse relatives), there are twofold or more increases in the numbers of secondary structures in some relatives. In some families fivefold increases occur, sometimes modifying the fold of the domain. Manual inspection of secondary structure insertions or embellishments in 48 particularly variable superfamilies revealed that although these insertions were usually discontiguous in the sequence they were often co-located in 3D resulting in a larger structural motif that often modified the geometry of the active site or the surface conformation promoting diverse domain partnerships and protein interactions. These observations, supported by automatic analysis of all well populated CATH families, suggest that accretion of small secondary structure insertions may provide a simple mechanism for evolving new functions in diverse relatives. Some layered domain architectures (e.g. mainly-beta and alpha-beta sandwiches) that recur highly in the genomes more frequently exploit these types of embellishments to modify function. In these architectures, aggregation occurs most often at the edges, top or bottom of the beta-sheets. 
Information on structural variability across domain superfamilies has been made available through the CATH Dictionary of Homologous Structures (DHS).
Database resources of the National Center for Biotechnology Information

PubMed Central

Wheeler, David L.; Church, Deanna M.; Lash, Alex E.; Leipe, Detlef D.; Madden, Thomas L.; Pontius, Joan U.; Schuler, Gregory D.; Schriml, Lynn M.; Tatusova, Tatiana A.; Wagner, Lukas; Rapp, Barbara A.

2001-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI’s Web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, HomoloGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing, Human MapViewer, GeneMap’99, Human–Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, Cancer Genome Anatomy Project (CGAP), SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB) and the Conserved Domain Database (CDD). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov. PMID:11125038
The conservation pattern of short linear motifs is highly correlated with the function of interacting protein domains.

PubMed

Ren, Siyuan; Yang, Guang; He, Youyu; Wang, Yiguo; Li, Yixue; Chen, Zhengjun

2008-10-01

Many well-represented domains recognize primary sequences usually less than 10 amino acids in length, called Short Linear Motifs (SLiMs). Accurate prediction of SLiMs has been difficult because they are short (often < 10 amino acids) and highly degenerate. In this study, we combined scoring matrixes derived from peptide library and conservation analysis to identify protein classes enriched of functional SLiMs recognized by SH2, SH3, PDZ and S/T kinase domains. Our combined approach revealed that SLiMs are highly conserved in proteins from functional classes that are known to interact with a specific domain, but that they are not conserved in most other protein groups. We found that SLiMs recognized by SH2 domains were highly conserved in receptor kinases/phosphatases, adaptor molecules, and tyrosine kinases/phosphatases, that SLiMs recognized by SH3 domains were highly conserved in cytoskeletal and cytoskeletal-associated proteins, that SLiMs recognized by PDZ domains were highly conserved in membrane proteins such as channels and receptors, and that SLiMs recognized by S/T kinase domains were highly conserved in adaptor molecules, S/T kinases/phosphatases, and proteins involved in transcription or cell cycle control. We studied Tyr-SLiMs recognized by SH2 domains in more detail, and found that SH2-recognized Tyr-SLiMs on the cytoplasmic side of membrane proteins are more highly conserved than those on the extra-cellular side. Also, we found that SH2-recognized Tyr-SLiMs that are associated with SH3 motifs and a tyrosine kinase phosphorylation motif are more highly conserved. The interactome of protein domains is reflected by the evolutionary conservation of SLiMs recognized by these domains. Combining scoring matrixes derived from peptide libraries and conservation analysis, we would be able to find those protein groups that are more likely to interact with specific domains.
The conservation pattern of short linear motifs is highly correlated with the function of interacting protein domains

PubMed Central

Ren, Siyuan; Yang, Guang; He, Youyu; Wang, Yiguo; Li, Yixue; Chen, Zhengjun

2008-01-01

Background Many well-represented domains recognize primary sequences usually less than 10 amino acids in length, called Short Linear Motifs (SLiMs). Accurate prediction of SLiMs has been difficult because they are short (often < 10 amino acids) and highly degenerate. In this study, we combined scoring matrixes derived from peptide library and conservation analysis to identify protein classes enriched of functional SLiMs recognized by SH2, SH3, PDZ and S/T kinase domains. Results Our combined approach revealed that SLiMs are highly conserved in proteins from functional classes that are known to interact with a specific domain, but that they are not conserved in most other protein groups. We found that SLiMs recognized by SH2 domains were highly conserved in receptor kinases/phosphatases, adaptor molecules, and tyrosine kinases/phosphatases, that SLiMs recognized by SH3 domains were highly conserved in cytoskeletal and cytoskeletal-associated proteins, that SLiMs recognized by PDZ domains were highly conserved in membrane proteins such as channels and receptors, and that SLiMs recognized by S/T kinase domains were highly conserved in adaptor molecules, S/T kinases/phosphatases, and proteins involved in transcription or cell cycle control. We studied Tyr-SLiMs recognized by SH2 domains in more detail, and found that SH2-recognized Tyr-SLiMs on the cytoplasmic side of membrane proteins are more highly conserved than those on the extra-cellular side. Also, we found that SH2-recognized Tyr-SLiMs that are associated with SH3 motifs and a tyrosine kinase phosphorylation motif are more highly conserved. Conclusion The interactome of protein domains is reflected by the evolutionary conservation of SLiMs recognized by these domains. Combining scoring matrixes derived from peptide libraries and conservation analysis, we would be able to find those protein groups that are more likely to interact with specific domains. PMID:18828911
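The combination of matrix-based motif scoring and conservation analysis described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the position-specific matrix below is a made-up toy loosely modelled on the PxxP core recognised by SH3 domains, and the function names, threshold, and ortholog sequences are hypothetical. The sketch assumes the ortholog sequences are already aligned without gaps, so a motif sits at the same offset in each.

```python
def score_window(window, matrix):
    """Sum the per-position scores of a peptide window (unlisted residues score 0)."""
    return sum(matrix[i].get(residue, 0.0) for i, residue in enumerate(window))


def motif_hits(sequence, matrix, threshold):
    """Slide the matrix along the sequence and keep windows scoring >= threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = score_window(window, matrix)
        if score >= threshold:
            hits.append((start, window, score))
    return hits


def conservation_fraction(orthologs, start, matrix, threshold):
    """Fraction of (pre-aligned, ungapped) orthologs that keep the motif at `start`."""
    width = len(matrix)
    kept = sum(
        1 for seq in orthologs
        if score_window(seq[start:start + width], matrix) >= threshold
    )
    return kept / len(orthologs)


# Hypothetical toy matrix: rewards proline at positions 1 and 4 (PxxP-like).
TOY_MATRIX = [{"P": 2.0}, {}, {}, {"P": 2.0}]

hits = motif_hits("AAPLKPAA", TOY_MATRIX, threshold=4.0)
fraction = conservation_fraction(["AAPLKPAA", "AAPLKAAA"], 2, TOY_MATRIX, 4.0)
```

A real analysis would use matrices derived from peptide-library data and a proper multiple alignment; the sketch only shows how a motif score and a conservation fraction can be computed side by side.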
How national context, project design, and local community characteristics influence success in community-based conservation projects.

PubMed

Brooks, Jeremy S; Waylen, Kerry A; Borgerhoff Mulder, Monique

2012-12-26

Community-based conservation (CBC) promotes the idea that conservation success requires engaging with, and providing benefits for, local communities. However, CBC projects are neither consistently successful nor free of controversy. Innovative recent studies evaluating the factors associated with success and failure typically examine only a single resource domain, have limited geographic scope, consider only one outcome, or ignore the nested nature of socioecological systems. To remedy these issues, we use a global comparative database of CBC projects identified by systematic review to evaluate success in four outcome domains (attitudes, behaviors, ecological, economic) and explore synergies and trade-offs among these outcomes. We test hypotheses about how features of the national context, project design, and local community characteristics affect these measures of success. Using bivariate analyses and multivariate proportional odds logistic regressions within a multilevel analysis and model-fitting framework, we show that project design, particularly capacity-building in local communities, is associated with success across all outcomes. In addition, some characteristics of the local community in which projects are conducted, such as tenure regimes and supportive cultural beliefs and institutions, are important for project success. Surprisingly, there is little evidence that national context systematically influences project outcomes. We also find evidence of synergies between pairs of outcomes, particularly between ecological and economic success. We suggest that well-designed and implemented projects can overcome many of the obstacles imposed by local and national conditions to succeed in multiple domains.
Structural analysis of key gap junction domains--Lessons from genome data and disease-linked mutants.

PubMed

Bai, Donglin

2016-02-01

A gap junction (GJ) channel is formed by docking of two GJ hemichannels and each of these hemichannels is a hexamer of connexins. All connexin genes have been identified in human, mouse, and rat genomes and their homologous genes in many other vertebrates are available in public databases. The protein sequences of these connexins align well with high sequence identity in the same connexin across different species. Domains in closely related connexins and several residues in all known connexins are also well-conserved. These conserved residues form signatures (also known as sequence logos) in these domains and are likely to play important biological functions. In this review, the sequence logos of individual connexins, groups of connexins with common ancestors, and all connexins are analyzed to visualize natural evolutionary variations and the hot spots for human disease-linked mutations. Several gap junction domains are homologous, likely forming similar structures essential for their function. The availability of a high resolution Cx26 GJ structure and the subsequently-derived homology structure models for other connexin GJ channels elevated our understanding of sequence logos at the three-dimensional GJ structure level, thus facilitating the understanding of how disease-linked connexin mutants might impair GJ structure and function. This knowledge will enable the design of complementary variants to rescue disease-linked mutants. Copyright © 2015 Elsevier Ltd. All rights reserved.
Domain architecture conservation in orthologs

PubMed Central

2011-01-01

Background As orthologous proteins are expected to retain function more often than other homologs, they are often used for functional annotation transfer between species. However, ortholog identification methods do not take into account changes in domain architecture, which are likely to modify a protein's function. By domain architecture we refer to the sequential arrangement of domains along a protein sequence. To assess the level of domain architecture conservation among orthologs, we carried out a large-scale study of such events between human and 40 other species spanning the entire evolutionary range. We designed a score to measure domain architecture similarity and used it to analyze differences in domain architecture conservation between orthologs and paralogs relative to the conservation of primary sequence. We also statistically characterized the extents of different types of domain swapping events across pairs of orthologs and paralogs. Results The analysis shows that orthologs exhibit greater domain architecture conservation than paralogous homologs, even when differences in average sequence divergence are compensated for, for homologs that have diverged beyond a certain threshold. We interpret this as an indication of a stronger selective pressure on orthologs than paralogs to retain the domain architecture required for the proteins to perform a specific function. In general, orthologs as well as the closest paralogous homologs have very similar domain architectures, even at large evolutionary separation. The most common domain architecture changes observed in both ortholog and paralog pairs involved insertion/deletion of new domains, while domain shuffling and segment duplication/deletion were very infrequent. Conclusions On the whole, our results support the hypothesis that function conservation between orthologs demands higher domain architecture conservation than other types of homologs, relative to primary sequence conservation. 
This supports the notion that orthologs are functionally more similar than other types of homologs at the same evolutionary distance. PMID:21819573
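The abstract mentions a score for domain architecture similarity but does not define it. As a hedged illustration only, the sketch below treats an architecture as the ordered list of domain identifiers along a protein and compares two lists with the matching-block ratio from Python's standard `difflib`. This is a stand-in, not the score used in the study; the function name and the example domain names are assumptions.

```python
from difflib import SequenceMatcher


def architecture_similarity(arch_a, arch_b):
    """Return a 0..1 similarity between two ordered lists of domain identifiers.

    Uses difflib's ratio, 2 * matches / (len(arch_a) + len(arch_b)), where
    matches are counted over order-preserving matching blocks.
    """
    if not arch_a and not arch_b:
        return 1.0  # two domain-free proteins are trivially identical
    matcher = SequenceMatcher(None, arch_a, arch_b, autojunk=False)
    return matcher.ratio()


# Hypothetical example: an ortholog pair where one protein lost its N-terminal domain.
full = ["SH3", "SH2", "Pkinase"]
truncated = ["SH2", "Pkinase"]
similarity = architecture_similarity(full, truncated)
```

Because the ratio depends on domain order, a shuffled architecture scores lower than one that differs only by a terminal insertion or deletion, which loosely mirrors the distinction the abstract draws between indels and domain shuffling.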
Functional Characterization of the Vitamin K2 Biosynthetic Enzyme UBIAD1

PubMed Central

Hirota, Yoshihisa; Nakagawa, Kimie; Sawada, Natsumi; Okuda, Naoko; Suhara, Yoshitomo; Uchino, Yuri; Kimoto, Takashi; Funahashi, Nobuaki; Kamao, Maya; Tsugawa, Naoko; Okano, Toshio

2015-01-01

UbiA prenyltransferase domain-containing protein 1 (UBIAD1) plays a significant role in vitamin K2 (MK-4) synthesis. We investigated the enzymological properties of UBIAD1 using microsomal fractions from Sf9 cells expressing UBIAD1 by analysing MK-4 biosynthetic activity. With regard to UBIAD1 enzyme reaction conditions, highest MK-4 synthetic activity was demonstrated under basic conditions at a pH between 8.5 and 9.0, with a DTT ≥0.1 mM. In addition, we found that geranyl pyrophosphate and farnesyl pyrophosphate were also recognized as a side-chain source and served as a substrate for prenylation. Furthermore, lipophilic statins were found to directly inhibit the enzymatic activity of UBIAD1. We analysed the amino acid sequence homologies across the menA and UbiA families to identify conserved structural features of UBIAD1 proteins and focused on four highly conserved domains. We prepared protein mutants deficient in the four conserved domains to evaluate enzyme activity. Because no enzyme activity was detected in the mutants deficient in the UBIAD1 conserved domains, these four domains were considered to play an essential role in enzymatic activity. We also measured enzyme activities using point mutants of the highly conserved amino acids in these domains to elucidate their respective functions. We found that the conserved domain I is a substrate recognition site that undergoes a structural change after substrate binding. The conserved domain II is a redox domain site containing a CxxC motif. The conserved domain III is a hinge region important as a catalytic site for the UBIAD1 enzyme. The conserved domain IV is a binding site for Mg2+/isoprenyl side-chain. In this study, we provide a molecular mapping of the enzymological properties of UBIAD1. PMID:25874989
Application of Wavelet Transform for PDZ Domain Classification

PubMed Central

Daqrouq, Khaled; Alhmouz, Rami; Balamesh, Ahmed; Memic, Adnan

2015-01-01

PDZ domains have been identified as part of an array of signaling proteins that are often unrelated, except for the well-conserved structural PDZ domain they contain. These domains have been linked to many disease processes including common avian influenza, as well as very rare conditions such as Fraser and Usher syndromes. Historically, based on the interactions and the nature of the bonds they form, PDZ domains have most often been classified into one of three classes (class I, class II and others, i.e. class III), a division that depends directly on their binding partner. In this study, we report on three unique feature extraction approaches based on the bigram and trigram occurrence and existence rearrangements within the domain's primary amino acid sequences in assisting PDZ domain classification. Wavelet packet transform (WPT) and Shannon entropy denoted by wavelet entropy (WE) feature extraction methods were proposed. Using 115 unique human and mouse PDZ domains, the existence rearrangement approach yielded a high recognition rate (78.34%), which outperformed our occurrence-rearrangement-based method. With a validation technique, the recognition rate was 81.41%. The method reported for PDZ domain classification from primary sequences proved to be an encouraging approach for obtaining consistent classification results. We anticipate that by increasing the database size, we can further improve feature extraction and correct classification. PMID:25860375
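As a rough illustration of the kind of sequence features this abstract describes, the sketch below counts bigram occurrences in an amino acid sequence, derives a binary "existence" vector from those counts, and computes the Shannon entropy of the bigram distribution. It is a minimal Python sketch under our own assumptions: the function names and the toy peptide are hypothetical, and it does not reproduce the authors' rearrangement scheme or the wavelet packet transform step.

```python
import math
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 possible bigrams in a fixed order, so feature vectors are comparable.
BIGRAMS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

def bigram_occurrence(seq):
    """Count how often each overlapping bigram appears in seq."""
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    return [counts.get(bg, 0) for bg in BIGRAMS]

def bigram_existence(seq):
    """Binary vector: 1 if a bigram appears at least once, else 0."""
    return [1 if c > 0 else 0 for c in bigram_occurrence(seq)]

def shannon_entropy(counts):
    """Shannon entropy (in bits) of a count vector, ignoring zero entries."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

toy = "GLGFSIAGG"  # hypothetical peptide, not a real PDZ domain
occ = bigram_occurrence(toy)
exi = bigram_existence(toy)
entropy = shannon_entropy(occ)
```

The occurrence vector keeps how often each bigram is seen, while the existence vector keeps only whether it is seen; which of the two carries more signal for a given classifier is an empirical question.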
Crystal Structure of the CLOCK Transactivation Domain Exon19 in Complex with a Repressor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hou, Zhiqiang; Su, Lijing; Pei, Jimin

In the canonical clock model, CLOCK:BMAL1-mediated transcriptional activation is feedback regulated by its repressors CRY and PER and, in association with other coregulators, ultimately generates oscillatory gene expression patterns. How CLOCK:BMAL1 interacts with coregulator(s) is not well understood. Here we report the crystal structures of the mouse CLOCK transactivating domain Exon19 in complex with CIPC, a potent circadian repressor that functions independently of CRY and PER. The Exon19:CIPC complex adopts a three-helical coiled-coil bundle conformation containing two Exon19 helices and one CIPC. Unique to Exon19:CIPC, three highly conserved polar residues, Asn341 of CIPC and Gln544 of the two Exon19 helices, are located at the mid-section of the coiled-coil bundle interior and form hydrogen bonds with each other. Combining results from protein database search, sequence analysis, and mutagenesis studies, we discovered for the first time that CLOCK Exon19:CIPC interaction is a conserved transcription regulatory mechanism among mammals, fish, flies, and other invertebrates.

JAIL: a structure-based interface library for macromolecules.

PubMed

Günther, Stefan; von Eichborn, Joachim; May, Patrick; Preissner, Robert

2009-01-01

The increasing number of solved macromolecules provides a solid number of 3D interfaces, if all types of molecular contacts are being considered. JAIL annotates three different kinds of macromolecular interfaces, those between interacting protein domains, interfaces of different protein chains and interfaces between proteins and nucleic acids. This results in a total number of about 184,000 database entries. All the interfaces can easily be identified by a detailed search form or by a hierarchical tree that describes the protein domain architectures classified by the SCOP database. Visual inspection of the interfaces is possible via an interactive protein viewer. Furthermore, large scale analyses are supported by an implemented sequential and by a structural clustering. Similar interfaces as well as non-redundant interfaces can be easily picked out. Additionally, the sequential conservation of binding sites was also included in the database and is retrievable via Jmol. A comprehensive download section allows the composition of representative data sets with user defined parameters. The huge data set in combination with various search options allow a comprehensive view on all interfaces between macromolecules included in the Protein Data Bank (PDB). The download of the data sets supports numerous further investigations in macromolecular recognition. JAIL is publicly available at http://bioinformatics.charite.de/jail.
Domain fusion analysis by applying relational algebra to protein sequence and domain databases

PubMed Central

Truong, Kevin; Ikura, Mitsuhiko

2003-01-01

Background: Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful. Results: This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at http://calcium.uhnres.utoronto.ca/pi. Conclusion: As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time. PMID:12734020
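To make the idea of domain fusion analysis in standard SQL more concrete, here is a minimal sketch that uses Python's built-in sqlite3 module. It looks for two separate proteins in one organism whose domains co-occur in a single composite protein elsewhere. The table name `hits`, its columns, the toy rows and the query itself are all assumptions made for illustration; they are not the authors' schema or query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (protein TEXT, organism TEXT, domain TEXT)")
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?)",
    [
        # Hypothetical toy data: protein, organism, predicted domain.
        ("P1", "orgA", "DomX"),
        ("P2", "orgA", "DomY"),
        ("P3", "orgA", "DomZ"),
        ("C1", "orgB", "DomX"),  # C1 is a composite carrying DomX and DomY
        ("C1", "orgB", "DomY"),
    ],
)

FUSION_QUERY = """
SELECT DISTINCT a.protein, b.protein, c1.protein
FROM hits AS a
JOIN hits AS b  ON a.organism = b.organism AND a.protein < b.protein
JOIN hits AS c1 ON c1.domain = a.domain
               AND c1.protein NOT IN (a.protein, b.protein)
JOIN hits AS c2 ON c2.domain = b.domain AND c2.protein = c1.protein
WHERE a.domain <> b.domain
  -- neither query protein may already carry its partner's domain
  AND NOT EXISTS (SELECT 1 FROM hits x
                  WHERE x.protein = a.protein AND x.domain = b.domain)
  AND NOT EXISTS (SELECT 1 FROM hits x
                  WHERE x.protein = b.protein AND x.domain = a.domain)
ORDER BY a.protein, b.protein, c1.protein
"""

# Each row is (protein 1, protein 2, composite protein linking them).
linked = conn.execute(FUSION_QUERY).fetchall()
print(linked)
```

Because the whole analysis is a set of joins over one table of domain hits, it can simply be rerun whenever the underlying sequence or domain tables are refreshed, which is the property the abstract highlights.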
AIM: a comprehensive Arabidopsis interactome module database and related interologs in plants.

PubMed

Wang, Yi; Thilmony, Roger; Zhao, Yunjun; Chen, Guoping; Gu, Yong Q

2014-01-01

Systems biology analysis of protein modules is important for understanding the functional relationships between proteins in the interactome. Here, we present a comprehensive database named AIM for Arabidopsis (Arabidopsis thaliana) interactome modules. The database contains almost 250,000 modules that were generated using multiple analysis methods and integration of microarray expression data. All the modules in AIM are well annotated using multiple gene function knowledge databases. AIM provides a user-friendly interface for different types of searches and offers a powerful graphical viewer for displaying module networks linked to the enrichment annotation terms. Both interactive Venn diagram and power graph viewer are integrated into the database for easy comparison of modules. In addition, predicted interologs from other plant species (homologous proteins from different species that share a conserved interaction module) are available for each Arabidopsis module. AIM is a powerful systems biology platform for obtaining valuable insights into the function of proteins in Arabidopsis and other plants using the modules of the Arabidopsis interactome. Database URL: http://probes.pw.usda.gov/AIM. Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.
Phylogenomic and Domain Analysis of Iterative Polyketide Synthases in Aspergillus Species

PubMed Central

Lin, Shu-Hsi; Yoshimoto, Miwa; Lyu, Ping-Chiang; Tang, Chuan-Yi; Arita, Masanori

2012-01-01

Aspergillus species are industrially and agriculturally important as fermentors and as producers of various secondary metabolites. Among them, fungal polyketides such as lovastatin and melanin are considered a gold mine for bioactive compounds. We used a phylogenomic approach to investigate the distribution of iterative polyketide synthases (PKS) in eight sequenced Aspergilli and classified over 250 fungal genes. Their genealogy by the conserved ketosynthase (KS) domain revealed three large groups of nonreducing PKS, one group inside bacterial PKS, and more than 9 small groups of reducing PKS. Polyphyly of nonribosomal peptide synthase (NRPS)-PKS genes raised questions regarding the recruitment of the elegant conjugation machinery. High rates of gene duplication and divergence were frequent. All data are accessible through our web database at http://metabolomics.jp/wiki/Category:PK. PMID:22844193
Domain fusion analysis by applying relational algebra to protein sequence and domain databases.

PubMed

Truong, Kevin; Ikura, Mitsuhiko

2003-05-06

Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful. This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at http://calcium.uhnres.utoronto.ca/pi. As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time.
Defining and predicting structurally conserved regions in protein superfamilies

PubMed Central

Huang, Ivan K.; Grishin, Nick V.

2013-01-01

Motivation: The structures of homologous proteins are generally better conserved than their sequences. This phenomenon is demonstrated by the prevalence of structurally conserved regions (SCRs) even in highly divergent protein families. Defining SCRs requires the comparison of two or more homologous structures and is affected by their availability and divergence, and our ability to deduce structurally equivalent positions among them. In the absence of multiple homologous structures, it is necessary to predict SCRs of a protein using information from only a set of homologous sequences and (if available) a single structure. Accurate SCR predictions can benefit homology modelling and sequence alignment. Results: Using pairwise DaliLite alignments among a set of homologous structures, we devised a simple measure of structural conservation, termed structural conservation index (SCI). SCI was used to distinguish SCRs from non-SCRs. A database of SCRs was compiled from 386 SCOP superfamilies containing 6489 protein domains. Artificial neural networks were then trained to predict SCRs with various features deduced from a single structure and homologous sequences. Assessment of the predictions via a 5-fold cross-validation method revealed that predictions based on features derived from a single structure perform similarly to ones based on homologous sequences, while combining sequence and structural features was optimal in terms of accuracy (0.755) and Matthews correlation coefficient (0.476). These results suggest that even without information from multiple structures, it is still possible to effectively predict SCRs for a protein. Finally, inspection of the structures with the worst predictions pinpoints difficulties in SCR definitions. Availability: The SCR database and the prediction server can be found at http://prodata.swmed.edu/SCR. 
Contact: 91huangi@gmail.com or grishin@chop.swmed.edu Supplementary information: Supplementary data are available at Bioinformatics Online PMID:23193223
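For readers less familiar with the two metrics quoted in this abstract, the following Python sketch shows how accuracy and the Matthews correlation coefficient (MCC) are computed from a binary confusion matrix. The function names and the example counts are hypothetical and are not taken from the paper.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical counts for positions predicted as SCR vs. non-SCR.
tp, tn, fp, fn = 40, 35, 15, 10
acc = accuracy(tp, tn, fp, fn)
mcc = matthews_cc(tp, tn, fp, fn)
print(f"accuracy={acc:.3f}, MCC={mcc:.3f}")
```

MCC ranges from -1 to 1 and stays informative when the two classes are unbalanced, which is presumably why it is reported alongside plain accuracy.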
A bioinformatic survey of distribution, conservation, and probable functions of LuxR solo regulators in bacteria.

PubMed

Subramoni, Sujatha; Florez Salcedo, Diana Vanessa; Suarez-Moreno, Zulma R

2015-01-01

LuxR solo transcriptional regulators contain both an autoinducer binding domain (ABD; N-terminal) and a DNA binding Helix-Turn-Helix domain (HTH; C-terminal), but are not associated with a cognate N-acyl homoserine lactone (AHL) synthase coding gene in the same genome. Although a few LuxR solos have been characterized, their distributions as well as their role in bacterial signal perception and other processes are poorly understood. In this study we have carried out a systematic survey of distribution of all ABD containing LuxR transcriptional regulators (QS domain LuxRs) available in the InterPro database (IPR005143), and identified those lacking a cognate AHL synthase. These LuxR solos were then analyzed regarding their taxonomical distribution, predicted functions of neighboring genes and the presence of complete AHL-QS systems in the genomes that carry them. Our analyses reveal the presence of one or multiple predicted LuxR solos in many proteobacterial genomes carrying QS domain LuxRs, some of them harboring genes for one or more AHL-QS circuits. The presence of LuxR solos in bacteria occupying diverse environments suggests potential ecological functions for these proteins beyond AHL and interkingdom signaling. Based on gene context and the conservation levels of invariant amino acids of ABD, we have classified LuxR solos into functionally meaningful groups or putative orthologs. Surprisingly, putative LuxR solos were also found in a few non-proteobacterial genomes which are not known to carry AHL-QS systems. Multiple predicted LuxR solos in the same genome appeared to have different levels of conservation of invariant amino acid residues of ABD questioning their binding to AHLs. In summary, this study provides a detailed overview of distribution of LuxR solos and their probable roles in bacteria with genome sequence information.
A bioinformatic survey of distribution, conservation, and probable functions of LuxR solo regulators in bacteria

PubMed Central

Subramoni, Sujatha; Florez Salcedo, Diana Vanessa; Suarez-Moreno, Zulma R.

2015-01-01

LuxR solo transcriptional regulators contain both an autoinducer binding domain (ABD; N-terminal) and a DNA binding Helix-Turn-Helix domain (HTH; C-terminal), but are not associated with a cognate N-acyl homoserine lactone (AHL) synthase coding gene in the same genome. Although a few LuxR solos have been characterized, their distributions as well as their role in bacterial signal perception and other processes are poorly understood. In this study we have carried out a systematic survey of distribution of all ABD containing LuxR transcriptional regulators (QS domain LuxRs) available in the InterPro database (IPR005143), and identified those lacking a cognate AHL synthase. These LuxR solos were then analyzed regarding their taxonomical distribution, predicted functions of neighboring genes and the presence of complete AHL-QS systems in the genomes that carry them. Our analyses reveal the presence of one or multiple predicted LuxR solos in many proteobacterial genomes carrying QS domain LuxRs, some of them harboring genes for one or more AHL-QS circuits. The presence of LuxR solos in bacteria occupying diverse environments suggests potential ecological functions for these proteins beyond AHL and interkingdom signaling. Based on gene context and the conservation levels of invariant amino acids of ABD, we have classified LuxR solos into functionally meaningful groups or putative orthologs. Surprisingly, putative LuxR solos were also found in a few non-proteobacterial genomes which are not known to carry AHL-QS systems. Multiple predicted LuxR solos in the same genome appeared to have different levels of conservation of invariant amino acid residues of ABD questioning their binding to AHLs. In summary, this study provides a detailed overview of distribution of LuxR solos and their probable roles in bacteria with genome sequence information. PMID:25759807
Metabolic pathway reconstruction of eugenol to vanillin bioconversion in Aspergillus niger

PubMed Central

Srivastava, Suchita; Luqman, Suaib; Khan, Feroz; Chanotiya, Chandan S; Darokar, Mahendra P

2010-01-01

Identifying missing genes or proteins that participate as enzymes in metabolic pathways is of great interest. One such class of pathway is involved in the eugenol to vanillin bioconversion. Our goal is to develop an integral approach for identifying the topology of a reference or known pathway in another organism. We successfully identified the missing enzymes and then reconstructed the vanillin biosynthetic pathway in Aspergillus niger. The procedure combines enzyme sequence similarity searches through BLAST homology search with ortholog detection through the COG and KEGG databases. Conservation of protein domains and motifs was searched through the CDD, PFAM and PROSITE databases. Predictions regarding how proteins act in the pathway were validated experimentally and also compared with reported data. The bioconversion of vanillin was screened on UV-TLC plates and later confirmed through GC and GC-MS techniques. We applied a procedure for identifying missing enzymes on the basis of conserved functional motifs and then reconstructed the metabolic pathway in the target organism. Using the vanillin biosynthetic pathway of Pseudomonas fluorescens as a case study, we indicate how this approach can be used to reconstruct the reference pathway in A. niger; the results were later validated experimentally through chromatography and spectroscopy techniques. PMID:20978605
The impact of p53 protein core domain structural alteration on ovarian cancer survival.

PubMed

Rose, Stephen L; Robertson, Andrew D; Goodheart, Michael J; Smith, Brian J; DeYoung, Barry R; Buller, Richard E

2003-09-15

Although survival with a p53 missense mutation is highly variable, p53-null mutation is an independent adverse prognostic factor for advanced stage ovarian cancer. By evaluating ovarian cancer survival based upon a structure function analysis of the p53 protein, we tested the hypothesis that not all missense mutations are equivalent. The p53 gene was sequenced from 267 consecutive ovarian cancers. The effect of individual missense mutations on p53 structure was analyzed using the International Agency for Research on Cancer p53 Mutational Database, which specifies the effects of p53 mutations on p53 core domain structure. Mutations in the p53 core domain were classified as either explained or not explained in structural or functional terms by their predicted effects on protein folding, protein-DNA contacts, or mutation in highly conserved residues. Null mutations were classified by their mechanism of origin. Mutations were sequenced from 125 tumors. Effects of 62 of the 82 missense mutations (76%) could be explained by alterations in the p53 protein. Twenty-three (28%) of the explained mutations occurred in highly conserved regions of the p53 core protein. Twenty-two nonsense point mutations and 21 frameshift null mutations were sequenced. Survival was independent of missense mutation type and mechanism of null mutation. The hypothesis that not all missense mutations are equivalent is, therefore, rejected. Furthermore, p53 core domain structural alteration secondary to missense point mutation is not functionally equivalent to a p53-null mutation. The poor prognosis associated with p53-null mutation is independent of the mutation mechanism.
SALAD database: a motif-based database of protein annotations for plant comparative genomics

PubMed Central

Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi

2010-01-01

Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209 529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named ‘SALAD on ARRAYs’ to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis. PMID:19854933
SALAD database: a motif-based database of protein annotations for plant comparative genomics.

PubMed

Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi

2010-01-01

Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209,529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named 'SALAD on ARRAYs' to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis.
Multidisciplinary Information System of Assyrian Cuneiform Tablets Enhancing New Research Possibilities via Heterogeneous Data in Records

NASA Astrophysics Data System (ADS)

Valach, J.; Štefcová, P.; Bruna, R.; Zemánek, P.

2017-08-01

This paper outlines a recently started project dedicated to the creation and development of an information system for cuneiform tablets. The contribution deals with the architecture of a virtual collection of cuneiform tablets, conceived as a complex system combining and integrating several domains of information obtained from various types of analyses. The research team includes experts in collection conservation together with philologists and researchers in 3D scanning and physical measurement. Multidisciplinary databases like the one described represent a new tool in digital humanities and help to improve the accessibility of collections to the public and researchers.
DIMA 3.0: Domain Interaction Map.

PubMed

Luo, Qibin; Pagel, Philipp; Vilne, Baiba; Frishman, Dmitrij

2011-01-01

Domain Interaction MAp (DIMA, available at http://webclu.bio.wzw.tum.de/dima) is a database of predicted and known interactions between protein domains. It integrates 5807 structurally known interactions imported from the iPfam and 3did databases and 46,900 domain interactions predicted by four computational methods: domain phylogenetic profiling, the domain pair exclusion algorithm, correlated mutations, and domain interaction prediction in a discriminative way. Additionally, predictions are filtered to exclude those domain pairs that are reported as non-interacting by the Negatome database. The DIMA Web site allows users to calculate domain interaction networks either for a domain of interest or for entire organisms, and to explore them interactively using the Flash-based Cytoscape Web software.
Coiled-coil length: Size does matter.

PubMed

Surkont, Jaroslaw; Diekmann, Yoan; Ryder, Pearl V; Pereira-Leal, Jose B

2015-12-01

Protein evolution is governed by processes that alter primary sequence but also the length of proteins. Protein length may change in different ways, but insertions, deletions and duplications are the most common. An optimal protein size is a trade-off between sequence extension, which may change protein stability or lead to acquisition of a new function, and shrinkage that decreases the metabolic cost of protein synthesis. Despite the general tendency for length conservation across orthologous proteins, the propensity to accept insertions and deletions is heterogeneous along the sequence. For example, protein regions rich in repetitive peptide motifs are well known to vary extensively in length across species. Here, we analyze length conservation of coiled-coils, domains formed by a ubiquitous, repetitive peptide motif present in all domains of life, that frequently plays a structural role in the cell. We observed that, despite the repetitive nature, the length of coiled-coil domains is generally highly conserved throughout the tree of life, even when the remaining parts of the protein change, including globular domains. Length conservation is independent of primary amino acid sequence variation, and represents a conservation of domain physical size. This suggests that the conservation of domain size is due to functional constraints. © 2015 Wiley Periodicals, Inc.
Bioinformatics analysis of the phytoene synthase gene in cabbage (Brassica oleracea var. capitata)

NASA Astrophysics Data System (ADS)

Sun, Bo; Jiang, Min; Xue, Shengling; Zheng, Aihong; Zhang, Fen; Tang, Haoru

2018-04-01

Phytoene synthase (PSY) is an important enzyme in carotenoid biosynthesis. Here, the Brassica oleracea var. capitata PSY (BocPSY) gene sequences were obtained from the Brassica database (BRAD) and subjected to bioinformatics analysis. The BocPSY1, BocPSY2 and BocPSY3 genes map to chromosomes 2, 3 and 9 and contain open reading frames of 1,248 bp, 1,266 bp and 1,275 bp that encode proteins of 415, 421 and 424 amino acids, respectively. Subcellular localization prediction placed all BocPSY proteins in the chloroplast. The conserved domain of the BocPSY protein is PLN02632. Homology analysis indicates that the levels of identity among BocPSYs were all more than 85%, and the PSY protein is apparently conserved during plant evolution. The findings of the present study provide a molecular basis for the elucidation of PSY gene function in cabbage.
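The reported open reading frame and protein lengths can be cross-checked with simple arithmetic: an ORF of n base pairs holds n/3 codons, and if the stop codon is counted within the ORF the protein has one residue fewer. The short Python sketch below performs that check. The helper name `protein_length` is hypothetical, and counting the stop codon as part of the ORF is our assumption.

```python
def protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # Assumption: the stop codon is counted in the ORF but encodes no residue.
    return codons - 1 if includes_stop else codons

orfs = {"BocPSY1": 1248, "BocPSY2": 1266, "BocPSY3": 1275}
lengths = {name: protein_length(bp) for name, bp in orfs.items()}
print(lengths)  # {'BocPSY1': 415, 'BocPSY2': 421, 'BocPSY3': 424}
```

Under that assumption the three lengths agree with the 415, 421 and 424 residues given in the abstract.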
Improving pairwise comparison of protein sequences with domain co-occurrence

PubMed Central

Gascuel, Olivier

2018-01-01

Comparing and aligning protein sequences is an essential task in bioinformatics. More specifically, local alignment tools like BLAST are widely used for identifying conserved protein sub-sequences, which likely correspond to protein domains or functional motifs. However, to limit the number of false positives, these tools are used with stringent sequence-similarity thresholds and hence can miss several hits, especially for species that are phylogenetically distant from reference organisms. A solution to this problem is then to integrate additional contextual information into the procedure. Here, we propose to use domain co-occurrence to increase the sensitivity of pairwise sequence comparisons. Domain co-occurrence is a strong feature of proteins, since most protein domains tend to appear with a limited number of other domains on the same protein. We propose a method to take this information into account in a typical BLAST analysis and to construct new domain families on the basis of these results. We used Plasmodium falciparum as a case study to evaluate our method. The experimental findings showed an increase of 14% in the number of significant BLAST hits and an increase of 25% in the proteome area that can be covered with a domain. Our method identified 2240 new domains for which, in most cases, no model of the Pfam database could be linked. Moreover, our study of the quality of the new domains in terms of alignment and physicochemical properties shows that it is close to that of standard Pfam domains. Source code of the proposed approach and supplementary data are available at: https://gite.lirmm.fr/menichelli/pairwise-comparison-with-cooccurrence PMID:29293498
A conserved domain in the NH2 terminus important for assembly and functional expression of pacemaker channels.

PubMed

Tran, Neil; Proenza, Catherine; Macri, Vincenzo; Petigara, Fiona; Sloan, Erin; Samler, Shannon; Accili, Eric A

2002-11-15

Pacemaker channels are formed by co-assembly of hyperpolarization-activated cyclic nucleotide-gated (HCN) subunits. Previously, we suggested that the NH(2) termini of the mouse HCN2 isoform were important for subunit co-assembly and functional channel expression. Using an alignment strategy together with yeast two-hybrid assays, patch clamp electrophysiology, and confocal imaging, we have now identified a domain within the NH(2) terminus of the HCN2 subunit that is responsible for interactions between NH(2) termini and promoting the trafficking of functional channels to the plasma membrane. This domain is composed of 52 amino acids, is located adjacent to the putative first transmembrane segment, and is highly conserved among the mammalian HCN isoforms. This conserved domain, but not the remaining unconserved NH(2)-terminal regions of HCN2, specifically interacted with itself in yeast two-hybrid assays. Moreover, the conserved domain was important for expression of currents. Whereas relatively normal whole cell HCN2 currents were produced by channels containing only the conserved domain, further deletion of this region, leaving only a more polar and putative coiled-coil segment, eliminated HCN2 currents and resulted in proteins that localized predominantly in perinuclear compartments. Thus, we suggest that this conserved domain is the critical NH(2)-terminal determinant of subunit co-assembly and trafficking of pacemaker channels.
Analysis of the linker region joining the adenylation and carrier protein domains of the modular nonribosomal peptide synthetases.

PubMed

Miller, Bradley R; Sundlov, Jesse A; Drake, Eric J; Makin, Thomas A; Gulick, Andrew M

2014-10-01

Nonribosomal peptide synthetases (NRPSs) are multimodular proteins capable of producing important peptide natural products. Using an assembly line process, the amino acid substrate and peptide intermediates are passed between the active sites of different catalytic domains of the NRPS while bound covalently to a peptidyl carrier protein (PCP) domain. Examination of the linker sequences that join the NRPS adenylation and PCP domains identified several conserved proline residues that are not found in standalone adenylation domains. We examined the roles of these proline residues and neighboring conserved sequences through mutagenesis and biochemical analysis of the reaction catalyzed by the adenylation domain and the fully reconstituted NRPS pathway. In particular, we identified a conserved LPxP motif at the start of the adenylation-PCP linker. The LPxP motif interacts with a region on the adenylation domain to stabilize a critical catalytic lysine residue belonging to the A10 motif that immediately precedes the linker. Further, this interaction with the C-terminal subdomain of the adenylation domain may coordinate movement of the PCP with the conformational change of the adenylation domain. Through this work, we extend the conserved A10 motif of the adenylation domain and identify residues that enable proper adenylation domain function. © 2014 Wiley Periodicals, Inc.
A protein relational database and protein family knowledge bases to facilitate structure-based design analyses.

PubMed

Mobilio, Dominick; Walker, Gary; Brooijmans, Natasja; Nilakantan, Ramaswamy; Denny, R Aldrin; Dejoannis, Jason; Feyfant, Eric; Kowticwar, Rupesh K; Mankala, Jyoti; Palli, Satish; Punyamantula, Sairam; Tatipally, Maneesh; John, Reji K; Humblet, Christine

2010-08-01

The Protein Data Bank is the most comprehensive source of experimental macromolecular structures. It can, however, be difficult at times to locate relevant structures with the Protein Data Bank search interface. This is particularly true when searching for complexes containing specific interactions between protein and ligand atoms. Moreover, searching within a family of proteins can be tedious. For example, one cannot search for some conserved residue as residue numbers vary across structures. We describe herein three databases, Protein Relational Database, Kinase Knowledge Base, and Matrix Metalloproteinase Knowledge Base, containing protein structures from the Protein Data Bank. In Protein Relational Database, atom-atom distances between protein and ligand have been precalculated allowing for millisecond retrieval based on atom identity and distance constraints. Ring centroids, centroid-centroid and centroid-atom distances and angles have also been included permitting queries for pi-stacking interactions and other structural motifs involving rings. Other geometric features can be searched through the inclusion of residue pair and triplet distances. In Kinase Knowledge Base and Matrix Metalloproteinase Knowledge Base, the catalytic domains have been aligned into common residue numbering schemes. Thus, by searching across Protein Relational Database and Kinase Knowledge Base, one can easily retrieve structures wherein, for example, a ligand of interest is making contact with the gatekeeper residue.
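The idea of precalculating protein-ligand atom distances so that later queries reduce to an indexed lookup can be sketched as follows. This is a minimal sketch using an in-memory SQLite table; the schema, atom labels and coordinates are hypothetical and do not reflect the actual layout of the Protein Relational Database.

```python
import math
import sqlite3

# Hypothetical toy coordinates: (structure id, atom label, x, y, z) in angstroms.
protein_atoms = [("1abc", "MET:SD", 0.0, 0.0, 0.0), ("1abc", "LYS:NZ", 5.0, 0.0, 0.0)]
ligand_atoms = [("1abc", "N1", 3.0, 0.0, 0.0), ("1abc", "O2", 5.0, 2.5, 0.0)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contact "
    "(structure TEXT, protein_atom TEXT, ligand_atom TEXT, distance REAL)"
)

# Precompute every protein-ligand pair distance once, at load time.
for p_struct, p_name, *p_xyz in protein_atoms:
    for l_struct, l_name, *l_xyz in ligand_atoms:
        if p_struct == l_struct:
            conn.execute(
                "INSERT INTO contact VALUES (?, ?, ?, ?)",
                (p_struct, p_name, l_name, math.dist(p_xyz, l_xyz)),
            )

# An index on (atom identity, distance) keeps constraint queries fast.
conn.execute("CREATE INDEX idx_contact ON contact (protein_atom, distance)")

def contacts(protein_atom, max_distance):
    """Return (structure, ligand atom, distance) rows within max_distance."""
    rows = conn.execute(
        "SELECT structure, ligand_atom, distance FROM contact "
        "WHERE protein_atom = ? AND distance <= ? ORDER BY distance",
        (protein_atom, max_distance),
    )
    return rows.fetchall()

print(contacts("LYS:NZ", 3.0))
```

The trade-off is storage for speed: the pair table grows with the product of protein and ligand atom counts, but each query avoids recomputing any geometry.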

NovelFam3000 – Uncharacterized human protein domains conserved across model organisms

PubMed Central

Kemmer, Danielle; Podowski, Raf M; Arenillas, David; Lim, Jonathan; Hodges, Emily; Roth, Peggy; Sonnhammer, Erik LL; Höög, Christer; Wasserman, Wyeth W

2006-01-01

Background Despite significant efforts from the research community, an extensive portion of the proteins encoded by human genes lack an assigned cellular function. Most metazoan proteins are composed of structural and/or functional domains, of which many appear in multiple proteins. Once a domain is characterized in one protein, the presence of a similar sequence in an uncharacterized protein serves as a basis for inference of function. Thus knowledge of a domain's function, or the protein within which it arises, can facilitate the analysis of an entire set of proteins. Description From the Pfam domain database, we extracted uncharacterized protein domains represented in proteins from humans, worms, and flies. A data centre was created to facilitate the analysis of the uncharacterized domain-containing proteins. The centre both provides researchers with links to dispersed internet resources containing gene-specific experimental data and enables them to post relevant experimental results or comments. For each human gene in the system, a characterization score is posted, allowing users to track the progress of characterization over time or to identify for study uncharacterized domains in well-characterized genes. As a test of the system, a subset of 39 domains was selected for analysis and the experimental results posted to the NovelFam3000 system. For 25 human protein members of these 39 domain families, detailed sub-cellular localizations were determined. Specific observations are presented based on the analysis of the integrated information provided through the online NovelFam3000 system. Conclusion Consistent experimental results between multiple members of a domain family allow for inferences of the domain's functional role. We unite bioinformatics resources and experimental data in order to accelerate the functional characterization of scarcely annotated domain families. PMID:16533400
Analysis of hepatitis B virus preS1 variability and prevalence of the rs2296651 polymorphism in a Spanish population

PubMed Central

Casillas, Rosario; Tabernero, David; Gregori, Josep; Belmonte, Irene; Cortese, Maria Francesca; González, Carolina; Riveiro-Barciela, Mar; López, Rosa Maria; Quer, Josep; Esteban, Rafael; Buti, Maria; Rodríguez-Frías, Francisco

2018-01-01

AIM To determine the variability/conservation of the domain of hepatitis B virus (HBV) preS1 region that interacts with sodium-taurocholate cotransporting polypeptide (hereafter, NTCP-interacting domain) and the prevalence of the rs2296651 polymorphism (S267F, NTCP variant) in a Spanish population. METHODS Serum samples from 246 individuals were included and divided into 3 groups: patients with chronic HBV infection (CHB) (n = 41, 73% Caucasians), patients with resolved HBV infection (n = 100, 100% Caucasians) and an HBV-uninfected control group (n = 105, 100% Caucasians). Variability/conservation of the amino acid (aa) sequences of the NTCP-interacting domain, (aa 2-48 in viral genotype D) and a highly conserved preS1 domain associated with virion morphogenesis (aa 92-103 in viral genotype D) were analyzed by next-generation sequencing and compared in 18 CHB patients with viremia > 4 log IU/mL. The rs2296651 polymorphism was determined in all individuals in all 3 groups using an in-house real-time PCR melting curve analysis. RESULTS The HBV preS1 NTCP-interacting domain showed a high degree of conservation among the examined viral genomes especially between aa 9 and 21 (in the genotype D consensus sequence). As compared with the virion morphogenesis domain, the NTCP-interacting domain had a smaller proportion of HBV genotype-unrelated changes comprising > 1% of the quasispecies (25.5% vs 31.8%), but a larger proportion of genotype-associated viral polymorphisms (34% vs 27.3%), according to consensus sequences from GenBank patterns of HBV genotypes A to H. Variation/conservation in both domains depended on viral genotype, with genotype C being the most highly conserved and genotype E the most variable (limited finding, only 2 genotype E included). Of note, proline residues were highly conserved in both domains, and serine residues showed changes only to threonine or tyrosine in the virion morphogenesis domain. 
The rs2296651 polymorphism was not detected in any participant. CONCLUSION In our CHB population, the NTCP-interacting domain was highly conserved, particularly the proline residues and essential amino acids related with the NTCP interaction, and the prevalence of rs2296651 was low/null. PMID:29456407
Genome empowerment for the Puerto Rican parrot – Amazona vittata

PubMed Central

2012-01-01

A unique community-funded project in Puerto Rico has launched whole-genome sequencing of the critically endangered Puerto Rican Parrot (Amazona vittata), with interpretation by genome bioinformaticians and students, and deposition into public online databases. This is the first article that focuses on the whole genome of a parrot species, one endemic to the USA and recently threatened with extinction. It provides invaluable conservation tools and a vivid example of hopeful prospects for future genome assessment of so many new species. It also demonstrates inventive ways for smaller institutions to contribute to a field largely considered the domain of large sequencing centers. PMID:23587407
D-MATRIX: A web tool for constructing weight matrix of conserved DNA motifs

PubMed Central

Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

2009-01-01

Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Current genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although motif pattern databases and tools for genome-level prediction exist, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined set of aligned motif sequences and the motif width. To retrieve known motif sequences, users can access commonly used databases such as TFD, RegulonDB, DBTBS, and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the resulting matrix can be converted into different file formats according to user requirements. It makes it possible to identify conserved motifs in co-regulated genes or in a whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known SOS-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. Availability http://203.190.147.116/dmatrix/ PMID:19759861
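To make the notion of a weight matrix concrete, the sketch below builds a count matrix and a log-odds matrix from a set of aligned sites of equal width. It is a generic textbook construction, not the D-MATRIX algorithm itself (which is implemented in C-Sharp and is not described in detail here); the pseudocount, the uniform background frequency and the toy sites are all assumptions.

```python
import math

BASES = "ACGT"

def count_matrix(sites):
    """Count each base at each column of a set of aligned, equal-width sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all aligned sites must have the same width")
    counts = {base: [0] * width for base in BASES}
    for site in sites:
        for column, base in enumerate(site.upper()):
            counts[base][column] += 1  # raises KeyError on non-ACGT symbols
    return counts

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert counts to log2(frequency / background), smoothed by a pseudocount."""
    n_sites = sum(counts[base][0] for base in BASES)
    total = n_sites + pseudocount * len(BASES)
    return {
        base: [math.log2((c + pseudocount) / total / background) for c in counts[base]]
        for base in BASES
    }

# Toy aligned sites for illustration only (not real LexA binding sites).
sites = ["ACGT", "ACGA", "ACTT", "TCGT"]
counts = count_matrix(sites)
pwm = log_odds_matrix(counts)
for base in BASES:
    print(base, counts[base], [round(score, 2) for score in pwm[base]])
```

Positive scores mark bases that are enriched at a position relative to the background; the pseudocount keeps unobserved bases from producing a score of minus infinity.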
D-MATRIX: a web tool for constructing weight matrix of conserved DNA motifs.

PubMed

Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

2009-07-27

Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Current genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although motif pattern databases and tools for genome-level prediction exist, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined set of aligned motif sequences and the motif width. To retrieve known motif sequences, users can access commonly used databases such as TFD, RegulonDB, DBTBS, and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the resulting matrix can be converted into different file formats according to user requirements. It makes it possible to identify conserved motifs in co-regulated genes or in a whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known SOS-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. http://203.190.147.116/dmatrix/
Development of 2010 national land cover database for the Nepal.

PubMed

Uddin, Kabir; Shrestha, Him Lal; Murthy, M S R; Bajracharya, Birendra; Shrestha, Basanta; Gilani, Hammad; Pradhan, Sudip; Dangol, Bikash

2015-01-15

Land cover and its change analysis across the Hindu Kush Himalayan (HKH) region is recognized as an urgent need to support diverse issues of environmental conservation. This study presents the first and most complete national land cover database of Nepal, prepared using public domain Landsat TM data of 2010 and a replicable methodology. The study estimated that 39.1% of Nepal is covered by forests and 29.83% by agriculture. Patch and edge forests, constituting 23.4% of national forest cover, revealed proximate biotic interferences over the forests. Core forests constituted 79.3% of forests within protected areas, whereas 63% of the area outside protected areas was under core forests. Physiographic region-wise forest fragmentation analysis revealed specific conservation requirements for productive hill and mid-mountain regions. Comparative analysis with a Landsat TM-based global land cover product showed differences of the order of 30-60% among different land cover classes, stressing the need for significant improvements before national-level adoption. An online web-based land cover validation tool was developed for continual improvement of the land cover product. The potential use of the data set for national and regional level sustainable land use planning strategies and for meeting several global commitments is also highlighted. Copyright © 2014 Elsevier Ltd. All rights reserved.
Identification of hidden relationships from the coupling of hydrophobic cluster analysis and domain architecture information.

PubMed

Faure, Guilhem; Callebaut, Isabelle

2013-07-15

Describing domain architecture is a critical step in the functional characterization of proteins. However, some orphan domains do not match any profile stored in dedicated domain databases and are thereby difficult to analyze. We present here a novel approach, called TREMOLO-HCA, for the analysis of orphan domain sequences, inspired by our experience in the use of Hydrophobic Cluster Analysis (HCA). Hidden relationships between protein sequences can be more easily identified from the PSI-BLAST results, using information on domain architecture, HCA plots and the conservation degree of amino acids that may participate in the protein core. This can reveal remote relationships with known families of domains, as illustrated here with the identification of a hidden Tudor tandem in the human BAHCC1 protein and a hidden ET domain in the Saccharomyces cerevisiae Taf14p and human AF9 proteins. The results obtained in such a way are consistent with those provided by HHPRED, based on pairwise comparisons of HMMs. Our approach can, however, be applied even in the absence of domain profiles or known 3D structures for the identification of novel families of domains. It can also be used in a reverse way for refining domain profiles, by starting from known protein domain families and identifying highly divergent members, hitherto considered as orphans. We provide a possible integration of this approach in an open TREMOLO-HCA package, which is fully implemented in Python v2.7 and is available on request. Instructions are available at http://www.impmc.upmc.fr/~callebau/tremolohca.html. Contact: isabelle.callebaut@impmc.upmc.fr. Supplementary data are available at Bioinformatics online.
Domain duplication, divergence, and loss events in vertebrate Msx paralogs reveal phylogenomically informed disease markers

PubMed Central

Finnerty, John R; Mazza, Maureen E; Jezewski, Peter A

2009-01-01

Background Msx originated early in animal evolution and is implicated in human genetic disorders. To reconstruct the functional evolution of Msx and inform the study of human mutations, we analyzed the phylogeny and synteny of 46 metazoan Msx proteins and tracked the duplication, diversification and loss of conserved motifs. Results Vertebrate Msx sequences sort into distinct Msx1, Msx2 and Msx3 clades. The sister-group relationship between MSX1 and MSX2 reflects their derivation from the 4p/5q chromosomal paralogon, a derivative of the original "MetaHox" cluster. We demonstrate physical linkage between Msx and other MetaHox genes (Hmx, NK1, Emx) in a cnidarian. Seven conserved domains, including two Groucho repression domains (N- and C-terminal), were present in the ancestral Msx. In cnidarians, the Groucho domains are highly similar. In vertebrate Msx1, the N-terminal Groucho domain is conserved, while the C-terminal domain diverged substantially, implying a novel function. In vertebrate Msx2 and Msx3, the C-terminal domain was lost. MSX1 mutations associated with ectodermal dysplasia or orofacial clefting disorders map to conserved domains in a non-random fashion. Conclusion Msx originated from a MetaHox ancestor that also gave rise to Tlx, Demox, NK, and possibly EHGbox, Hox and ParaHox genes. Duplication, divergence or loss of domains played a central role in the functional evolution of Msx. Duplicated domains allow pleiotropically expressed proteins to evolve new functions without disrupting existing interaction networks. Human missense sequence variants reside within evolutionarily conserved domains, likely disrupting protein function. This phylogenomic evaluation of candidate disease markers will inform clinical and functional studies. PMID:19154605
Domain duplication, divergence, and loss events in vertebrate Msx paralogs reveal phylogenomically informed disease markers.

PubMed

Finnerty, John R; Mazza, Maureen E; Jezewski, Peter A

2009-01-20

Msx originated early in animal evolution and is implicated in human genetic disorders. To reconstruct the functional evolution of Msx and inform the study of human mutations, we analyzed the phylogeny and synteny of 46 metazoan Msx proteins and tracked the duplication, diversification and loss of conserved motifs. Vertebrate Msx sequences sort into distinct Msx1, Msx2 and Msx3 clades. The sister-group relationship between MSX1 and MSX2 reflects their derivation from the 4p/5q chromosomal paralogon, a derivative of the original "MetaHox" cluster. We demonstrate physical linkage between Msx and other MetaHox genes (Hmx, NK1, Emx) in a cnidarian. Seven conserved domains, including two Groucho repression domains (N- and C-terminal), were present in the ancestral Msx. In cnidarians, the Groucho domains are highly similar. In vertebrate Msx1, the N-terminal Groucho domain is conserved, while the C-terminal domain diverged substantially, implying a novel function. In vertebrate Msx2 and Msx3, the C-terminal domain was lost. MSX1 mutations associated with ectodermal dysplasia or orofacial clefting disorders map to conserved domains in a non-random fashion. Msx originated from a MetaHox ancestor that also gave rise to Tlx, Demox, NK, and possibly EHGbox, Hox and ParaHox genes. Duplication, divergence or loss of domains played a central role in the functional evolution of Msx. Duplicated domains allow pleiotropically expressed proteins to evolve new functions without disrupting existing interaction networks. Human missense sequence variants reside within evolutionarily conserved domains, likely disrupting protein function. This phylogenomic evaluation of candidate disease markers will inform clinical and functional studies.
The effectiveness of marine reserve systems constructed using different surrogates of biodiversity.

PubMed

Sutcliffe, P R; Klein, C J; Pitcher, C R; Possingham, H P

2015-06-01

Biological sampling in marine systems is often limited, and the cost of acquiring new data is high. We sought to assess whether systematic reserves designed using abiotic domains adequately conserve a comprehensive range of species in a tropical marine inter-reef system. We based our assessment on data from the Great Barrier Reef, Australia. We designed reserve systems aiming to conserve 30% of each species based on 4 abiotic surrogate types (abiotic domains; weighted abiotic domains; pre-defined bioregions; and random selection of areas). We evaluated each surrogate in scenarios with and without cost (cost to fishery) and clumping (size of conservation area) constraints. To measure the efficacy of each reserve system for conservation purposes, we evaluated how well 842 species collected at 1155 sites across the Great Barrier Reef seabed were represented in each reserve system. When reserve design included both cost and clumping constraints, the mean proportion of species reaching the conservation target was 20-27% higher for reserve systems that were biologically informed than reserves designed using unweighted environmental data. All domains performed substantially better than random, except when there were no spatial or economic constraints placed on the system design. Under the scenario with no constraints, the mean proportion of species reaching the conservation target ranged from 98.5% to 99.99% across all surrogate domains, whereas the range was 90-96% across all domains when both cost and clumping were considered. This proportion did not change considerably between scenarios where one constraint was imposed and scenarios where both cost and clumping constraints were considered. We conclude that representative reserve systems can be designed using abiotic domains; however, there are substantial benefits if some biological information is incorporated. © 2015 Society for Conservation Biology.
BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results.

PubMed

Worley, K C; Wiese, B A; Smith, R F

1995-09-01

BEAUTY (BLAST enhanced alignment utility) is an enhanced version of the NCBI's BLAST data base search tool that facilitates identification of the functions of matched sequences. We have created new data bases of conserved regions and functional domains for protein sequences in NCBI's Entrez data base, and BEAUTY allows this information to be incorporated directly into BLAST search results. A Conserved Regions Data Base, containing the locations of conserved regions within Entrez protein sequences, was constructed by (1) clustering the entire data base into families, (2) aligning each family using our PIMA multiple sequence alignment program, and (3) scanning the multiple alignments to locate the conserved regions within each aligned sequence. A separate Annotated Domains Data Base was constructed by extracting the locations of all annotated domains and sites from sequences represented in the Entrez, PROSITE, BLOCKS, and PRINTS data bases. BEAUTY performs a BLAST search of those Entrez sequences with conserved regions and/or annotated domains. BEAUTY then uses the information from the Conserved Regions and Annotated Domains data bases to generate, for each matched sequence, a schematic display that allows one to directly compare the relative locations of (1) the conserved regions, (2) annotated domains and sites, and (3) the locally aligned regions matched in the BLAST search. In addition, BEAUTY search results include World-Wide Web hypertext links to a number of external data bases that provide a variety of additional types of information on the function of matched sequences. This convenient integration of protein families, conserved regions, annotated domains, alignment displays, and World-Wide Web resources greatly enhances the biological informativeness of sequence similarity searches. 
BEAUTY searches can be performed remotely on our system using the "BCM Search Launcher" World-Wide Web pages (URL: http://gc.bcm.tmc.edu:8088/search-launcher/launcher.html).
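Step (3) of the Conserved Regions Data Base construction, scanning a multiple alignment to locate conserved regions, can be sketched in a simplified form as follows. The conservation criterion used here (fraction of sequences sharing the most common residue, plus a minimum run length) and its thresholds are assumptions chosen for illustration; the abstract does not state the criteria BEAUTY actually applied to PIMA alignments.

```python
from collections import Counter

def conserved_regions(alignment, min_identity=0.8, min_length=3):
    """Return (start, end) column ranges, 0-based and end-exclusive, where every
    column has one non-gap residue shared by at least min_identity of sequences."""
    n_seqs = len(alignment)
    n_cols = len(alignment[0])
    regions = []
    start = None
    # Iterate one column past the end so a run touching the last column is closed.
    for col in range(n_cols + 1):
        conserved = False
        if col < n_cols:
            residue, count = Counter(row[col] for row in alignment).most_common(1)[0]
            conserved = residue != "-" and count / n_seqs >= min_identity
        if conserved and start is None:
            start = col
        elif not conserved and start is not None:
            if col - start >= min_length:
                regions.append((start, col))
            start = None
    return regions

# Toy alignment for illustration only.
toy_alignment = [
    "MKVLAAGTWQ",
    "MKVLSPGTWQ",
    "MKVLT-GTWE",
]
print(conserved_regions(toy_alignment))
```

Lowering `min_identity` merges partially conserved columns into longer regions, so the threshold controls how strict the notion of "conserved" is.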
A conserved tryptophan within the WRDPLVDID domain of yeast Pah1 phosphatidate phosphatase is required for its in vivo function in lipid metabolism.

PubMed

Park, Yeonhee; Han, Gil-Soo; Carman, George M

2017-12-01

PAH1-encoded phosphatidate phosphatase, which catalyzes the dephosphorylation of phosphatidate to produce diacylglycerol at the endoplasmic reticulum membrane, plays a major role in controlling the utilization of phosphatidate for the synthesis of triacylglycerol or membrane phospholipids. The conserved N-LIP and haloacid dehalogenase-like domains of Pah1 are required for phosphatidate phosphatase activity and the in vivo function of the enzyme. Its non-conserved regions, which are located between the conserved domains and at the C terminus, contain sites for phosphorylation by multiple protein kinases. Truncation analyses of the non-conserved regions showed that they are not essential for the catalytic activity of Pah1 and its physiological functions (e.g., triacylglycerol synthesis). This analysis also revealed that the C-terminal region contains a previously unrecognized WRDPLVDID domain (residues 637-645) that is conserved in yeast, mice, and humans. The deletion of this domain had no effect on the catalytic activity of Pah1 but caused the loss of its in vivo function. Site-specific mutational analyses of the conserved residues within WRDPLVDID indicated that Trp-637 plays a crucial role in Pah1 function. This work also demonstrated that the catalytic activity of Pah1 is required but is not sufficient for its in vivo functions. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
The Metarhizium anisopliae trp1 gene: cloning and regulatory analysis.

PubMed

Staats, Charley Christian; Silva, Marcia Suzana Nunes; Pinto, Paulo Marcos; Vainstein, Marilene Henning; Schrank, Augusto

2004-07-01

The trp1 gene from the entomopathogenic fungus Metarhizium anisopliae, cloned by heterologous hybridization with the plasmid carrying the trpC gene from Aspergillus nidulans, was sequence characterized. The predicted translation product has the conserved catalytic domains of glutamine amidotransferase (G domain), indoleglycerolphosphate synthase (C domain), and phosphoribosyl anthranilate isomerase (F domain) organized as NH2-G-C-F-COOH. The ORF is interrupted by a single intron of 60 nt that is position conserved in relation to trp genes from Ascomycetes and length conserved in relation to Basidiomycetes species. RT-PCR analysis suggests constitutive expression of trp1 gene in M. anisopliae.
The WRKY transcription factor family in Brachypodium distachyon.

PubMed

Tripathi, Prateek; Rabara, Roel C; Langum, Tanner J; Boken, Ashley K; Rushton, Deena L; Boomsma, Darius D; Rinerson, Charles I; Rabara, Jennifer; Reese, R Neil; Chen, Xianfeng; Rohila, Jai S; Rushton, Paul J

2012-06-22

A complete assembled genome sequence of wheat is not yet available. Therefore, model plant systems for wheat are very valuable. Brachypodium distachyon (Brachypodium) is such a system. The WRKY family of transcription factors is one of the most important families of plant transcriptional regulators with members regulating important agronomic traits. Studies of WRKY transcription factors in Brachypodium and wheat therefore promise to lead to new strategies for wheat improvement. We have identified and manually curated the WRKY transcription factor family from Brachypodium using a pipeline designed to identify all potential WRKY genes. 86 WRKY transcription factors were found, a total higher than all other current databases. We therefore propose that our numbering system (BdWRKY1-BdWRKY86) becomes the standard nomenclature. In the JGI v1.0 assembly of Brachypodium with the MIPS/JGI v1.0 annotation, nine of the transcription factors have no gene model and eleven gene models are probably incorrectly predicted. In total, twenty WRKY transcription factors (23.3%) do not appear to have accurate gene models. To facilitate use of our data, we have produced The Database of Brachypodium distachyon WRKY Transcription Factors. Each WRKY transcription factor has a gene page that includes predicted protein domains from MEME analyses. These conserved protein domains reflect possible input and output domains in signaling. The database also contains a BLAST search function where a large dataset of WRKY transcription factors, published genes, and an extensive set of wheat ESTs can be searched. We also produced a phylogram containing the WRKY transcription factor families from Brachypodium, rice, Arabidopsis, soybean, and Physcomitrella patens, together with published WRKY transcription factors from wheat. This phylogenetic tree provides evidence for orthologues, co-orthologues, and paralogues of Brachypodium WRKY transcription factors. 
The description of the WRKY transcription factor family in Brachypodium that we report here provides a framework for functional genomics studies in an important model system. Our database is a resource for both Brachypodium and wheat studies and ultimately projects aimed at improving wheat through manipulation of WRKY transcription factors.
The WRKY transcription factor family in Brachypodium distachyon

PubMed Central

2012-01-01

Background A complete assembled genome sequence of wheat is not yet available. Therefore, model plant systems for wheat are very valuable. Brachypodium distachyon (Brachypodium) is such a system. The WRKY family of transcription factors is one of the most important families of plant transcriptional regulators with members regulating important agronomic traits. Studies of WRKY transcription factors in Brachypodium and wheat therefore promise to lead to new strategies for wheat improvement. Results We have identified and manually curated the WRKY transcription factor family from Brachypodium using a pipeline designed to identify all potential WRKY genes. 86 WRKY transcription factors were found, a total higher than all other current databases. We therefore propose that our numbering system (BdWRKY1-BdWRKY86) becomes the standard nomenclature. In the JGI v1.0 assembly of Brachypodium with the MIPS/JGI v1.0 annotation, nine of the transcription factors have no gene model and eleven gene models are probably incorrectly predicted. In total, twenty WRKY transcription factors (23.3%) do not appear to have accurate gene models. To facilitate use of our data, we have produced The Database of Brachypodium distachyon WRKY Transcription Factors. Each WRKY transcription factor has a gene page that includes predicted protein domains from MEME analyses. These conserved protein domains reflect possible input and output domains in signaling. The database also contains a BLAST search function where a large dataset of WRKY transcription factors, published genes, and an extensive set of wheat ESTs can be searched. We also produced a phylogram containing the WRKY transcription factor families from Brachypodium, rice, Arabidopsis, soybean, and Physcomitrella patens, together with published WRKY transcription factors from wheat. This phylogenetic tree provides evidence for orthologues, co-orthologues, and paralogues of Brachypodium WRKY transcription factors. 
Conclusions The description of the WRKY transcription factor family in Brachypodium that we report here provides a framework for functional genomics studies in an important model system. Our database is a resource for both Brachypodium and wheat studies and ultimately projects aimed at improving wheat through manipulation of WRKY transcription factors. PMID:22726208
Conserved Domains in the Transformer Protein Act Complementary to Regulate Sex-Specific Splicing of Its Own Pre-mRNA.

PubMed

Tanaka, Arisa; Aoki, Fugaku; Suzuki, Masataka G

2018-05-26

The transformer (tra) gene, which is a female-determining master gene in the housefly Musca domestica, acts as a memory device for sex determination via its auto-regulatory function, i.e., through the contribution of the TRA protein to female-specific splicing of its own pre-mRNA. The TRA protein contains 4 small domains that are specifically conserved among TRA proteins (domains 1-4). Domain 2, also named the TRA-CAM domain, is the most conserved, but its function remains unknown. To examine whether these domains are involved in the auto-regulatory function, we performed in vitro splicing assays using a tra minigene containing a partial genomic sequence of the M. domestica tra (Mdtra) gene. Co-transfection of the Mdtra minigene and an MdTRA protein expression vector into cultured insect cells strongly induced female-specific splicing of the minigene. A series of deletion mutation analyses demonstrated that these domains act complementarily to induce female-specific splicing. Domain 1 and the TRA-CAM domain were necessary for the female-specific splicing when the MdTRA protein lacked both domains 3 and 4. In this situation, mutation of the well-conserved 3 amino acids (GEG) in the TRA-CAM domain significantly reduced the female-specific splicing activity of MdTRA. GST pull-down analyses demonstrated that the MdTRA protein was specifically enriched on the male-specific exonic region (exon 2b), which contains the putative TRA/TRA-2 binding sites, and that the GEG mutation disrupts this enrichment. Since the MdTRA protein interacts with its own pre-mRNA through TRA-2, our findings suggest that the conserved amino acid residues in the TRA-CAM domain may be crucial for the interaction between MdTRA and TRA-2, enhancing MdTRA recruitment on its pre-mRNA to induce female-specific splicing of tra in the housefly. © 2018 S. Karger AG, Basel.
Bioinformatics analysis of the ζ-carotene desaturase gene in cabbage (Brassica oleracea var. capitata)

NASA Astrophysics Data System (ADS)

Sun, Bo; Zheng, Aihong; Jiang, Min; Xue, Shengling; Zhang, Fen; Tang, Haoru

2018-04-01

ζ-carotene desaturase (ZDS) is an important enzyme in carotenoid biosynthesis. Here, the Brassica oleracea var. capitata ZDS (BocZDS) gene sequences were obtained from the Brassica database (BRAD) and subjected to bioinformatics analysis. The BocZDS gene mapped to Scaffold000363 and contains an open reading frame of 1,686 bp that encodes a 561-amino acid protein with a calculated molecular mass of 62.00 kD and an isoelectric point (pI) of 8.2. Subcellular localization prediction placed the BocZDS gene product in the chloroplast. The conserved domain of the BocZDS protein is PLN02487, indicating that it is a member of the zeta-carotene desaturase family. Homology analysis indicates that the ZDS protein is conserved during plant evolution and that BocZDS is most closely related to ZDS from B. oleracea var. oleracea, B. napus, and B. rapa. The findings of the present study provide a molecular basis for the elucidation of ZDS gene function in cabbage.
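The ORF length and protein length quoted above are consistent with each other, as a quick check shows. The sketch below assumes that the 1,686 bp figure includes one stop codon and uses a rule-of-thumb average residue mass of about 110 Da; both are assumptions rather than values from the abstract, and the function names are illustrative only.

```python
# Consistency check for the ORF length and protein size quoted in the abstract.
# Assumptions (not from the abstract): the ORF length includes one stop codon,
# and ~110 Da is used as a rough average mass per amino acid residue.

AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb value, chosen for illustration


def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of encoded amino acids for an ORF of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


def approx_mass_kda(n_residues: int) -> float:
    """Rough molecular mass estimate in kDa from the residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    n = protein_length_from_orf(1686)
    print(n)                             # 561
    print(round(approx_mass_kda(n), 1))  # 61.7
```

The rough estimate of about 61.7 kDa sits close to the calculated 62.00 kD reported in the abstract, which is as much agreement as a per-residue average can be expected to give.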
RNA polymerase II conserved protein domains as platforms for protein-protein interactions

PubMed Central

García-López, M Carmen

2011-01-01

RNA polymerase II establishes many protein-protein interactions with transcriptional regulators to coordinate gene expression, but little is known about protein domains involved in the contact with them. We use a new approach to look for conserved regions of the RNA pol II of S. cerevisiae located at the surface of the structure of the complex, hypothesizing that they might be involved in the interaction with transcriptional regulators. We defined five different conserved domains and demonstrate that all of them make contact with transcriptional regulators. PMID:21922063
ECOD: An Evolutionary Classification of Protein Domains

PubMed Central

Kinch, Lisa N.; Pei, Jimin; Shi, Shuoyong; Kim, Bong-Hyun; Grishin, Nick V.

2014-01-01

Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or “fold”). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies. PMID:25474468
ECOD: an evolutionary classification of protein domains.

PubMed

Cheng, Hua; Schaeffer, R Dustin; Liao, Yuxing; Kinch, Lisa N; Pei, Jimin; Shi, Shuoyong; Kim, Bong-Hyun; Grishin, Nick V

2014-12-01

Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or "fold"). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.

Structure of the SPRY domain of the human RNA helicase DDX1, a putative interaction platform within a DEAD-box protein

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kellner, Julian N.; Meinhart, Anton, E-mail: anton.meinhart@mpimf-heidelberg.mpg.de

The structure of the SPRY domain of the human RNA helicase DDX1 was determined at 2.0 Å resolution. The SPRY domain provides a putative protein–protein interaction platform within DDX1 that differs from other SPRY domains in its structure and conserved regions. The human RNA helicase DDX1 in the DEAD-box family plays an important role in RNA processing and has been associated with HIV-1 replication and tumour progression. Whereas previously described DEAD-box proteins have a structurally conserved core, DDX1 shows a unique structural feature: a large SPRY-domain insertion in its RecA-like consensus fold. SPRY domains are known to function as protein–protein interaction platforms. Here, the crystal structure of the SPRY domain of human DDX1 (hDSPRY) is reported at 2.0 Å resolution. The structure reveals two layers of concave, antiparallel β-sheets that stack onto each other and a third β-sheet beneath the β-sandwich. A comparison with SPRY-domain structures from other eukaryotic proteins showed that the general β-sandwich fold is conserved; however, differences were detected in the loop regions, which were identified in other SPRY domains to be essential for interaction with cognate partners. In contrast, in hDSPRY these loop regions are not strictly conserved across species. Interestingly, though, a conserved patch of positive surface charge is found that may replace the connecting loops as a protein–protein interaction surface. The data presented here comprise the first structural information on DDX1 and provide insights into the unique domain architecture of this DEAD-box protein. By providing the structure of a putative interaction domain of DDX1, this work will serve as a basis for further studies of the interaction network within the hetero-oligomeric complexes of DDX1 and of its recruitment to the HIV-1 Rev protein as a viral replication factor.
Methods and apparatus for constructing and implementing a universal extension module for processing objects in a database

NASA Technical Reports Server (NTRS)

Li, Chung-Sheng (Inventor); Smith, John R. (Inventor); Chang, Yuan-Chi (Inventor); Jhingran, Anant D. (Inventor); Padmanabhan, Sriram K. (Inventor); Hsiao, Hui-I (Inventor); Choy, David Mun-Hien (Inventor); Lin, Jy-Jine James (Inventor); Fuh, Gene Y. C. (Inventor); Williams, Robin (Inventor)

2004-01-01

Methods and apparatus for providing a multi-tier object-relational database architecture are disclosed. In one illustrative embodiment of the present invention, a multi-tier database architecture comprises an object-relational database engine as a top tier, one or more domain-specific extension modules as a bottom tier, and one or more universal extension modules as a middle tier. The individual extension modules of the bottom tier operationally connect with the one or more universal extension modules which, themselves, operationally connect with the database engine. The domain-specific extension modules preferably provide such functions as search, index, and retrieval services of images, video, audio, time series, web pages, text, XML, spatial data, etc. The domain-specific extension modules may include one or more IBM DB2 extenders, Oracle data cartridges and/or Informix datablades, although other domain-specific extension modules may be used.
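The three-tier arrangement described above can be pictured with a small sketch. Everything in it is hypothetical: the class names, the `register` and `search` methods, and the in-memory text search are illustrative stand-ins and are not taken from the patent or from any DB2, Oracle, or Informix API. It only shows the connection pattern, in which the engine talks to a universal module and the universal module routes to domain-specific modules.

```python
from typing import Callable, Dict, List

SearchFn = Callable[[str], List[str]]


class DomainSpecificExtension:
    """Bottom tier (hypothetical): search for one kind of data, e.g. text or images."""

    def __init__(self, data_type: str, search_fn: SearchFn) -> None:
        self.data_type = data_type
        self._search_fn = search_fn

    def search(self, query: str) -> List[str]:
        return self._search_fn(query)


class UniversalExtensionModule:
    """Middle tier (hypothetical): one uniform interface over many bottom-tier modules."""

    def __init__(self) -> None:
        self._extensions: Dict[str, DomainSpecificExtension] = {}

    def register(self, extension: DomainSpecificExtension) -> None:
        self._extensions[extension.data_type] = extension

    def search(self, data_type: str, query: str) -> List[str]:
        if data_type not in self._extensions:
            raise KeyError(f"no extension registered for {data_type!r}")
        return self._extensions[data_type].search(query)


class DatabaseEngine:
    """Top tier (hypothetical): connects only to the universal module."""

    def __init__(self, universal: UniversalExtensionModule) -> None:
        self._universal = universal

    def query(self, data_type: str, query: str) -> List[str]:
        return self._universal.search(data_type, query)


if __name__ == "__main__":
    texts = ["conserved domain database", "heritage building database"]
    universal = UniversalExtensionModule()
    universal.register(
        DomainSpecificExtension("text", lambda q: [t for t in texts if q in t])
    )
    engine = DatabaseEngine(universal)
    print(engine.query("text", "domain"))  # ['conserved domain database']
```

In this sketch, adding support for a new data type means registering one more bottom-tier module; the engine's own interface does not change.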
Comprehensively Surveying Structure and Function of RING Domains from Drosophila melanogaster

PubMed Central

Wu, Yuehao; Wan, Fusheng; Huang, Chunhong; Jie, Kemin

2011-01-01

Using a complete set of RING domains from Drosophila melanogaster, all the solved RING domains and cocrystal structures of RING-containing ubiquitin-ligases (RING-E3) and ubiquitin-conjugating enzyme (E2) pairs, we analyzed RING domain structures from their primary to quaternary structures. The results showed that: i) most RING domains of Drosophila melanogaster have putative orthologs in the human (118/139, 84.9%); ii) of the 118 orthologous pairs from Drosophila melanogaster and the human, 117 pairs (117/118, 99.2%) were found to retain entirely uniform domain architectures, and only Iap2/Diap2 experienced evolutionary expansion of domain architecture; iii) 4 evolutionarily and structurally conserved regions (SCRs) are responsible for homologous folding of RING domains at the superfamily level; iv) besides the conserved Cys/His residues chelating zinc ions, 6 equivalent residues (4 hydrophobic and 2 polar residues) in the SCRs show good consensus and conservation; these 4 SCRs function in the structural positioning of the 6 equivalent residues as determinants for RING-E3 catalysis; v) the RING proteins localized to the nucleus, multiple subcellular compartments, membranes and mitochondria number 42 (42/139, 30.2%), 71 (71/139, 51.1%), 22 (22/139, 15.8%) and 4 (4/139, 2.9%), respectively; vi) CG15104 (Topors) and CG1134 (Mul1) in C3HC4, and CG3929 (Deltex) in C3H2C3 seem to display broader E2 binding profiles than other RING-E3s; vii) analysis of the intermolecular interfaces of E2/RING-E3 complexes indicates that residues directly interacting with E2s are all from the SCRs in RING domains. Of the 6 residues, 2 hydrophobic ones contribute to constructing the conserved hydrophobic core, while the other 2 hydrophobic and 2 polar residues directly participate in E2/RING-E3 interactions.
Based on sequence and structural data, SCRs, conserved equivalent residues and features of intermolecular interfaces were extracted, highlighting the presence of a nucleus for RING domain fold and formation of catalytic core in which related residues and regions exhibit preferential evolutionary conservation. PMID:21912646
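The proportions quoted in this abstract can be re-derived from the counts it gives. The short sketch below does only that; the dictionary keys and the `percent` helper are illustrative names, and the counts are copied from the abstract.

```python
# Re-derive the percentages quoted in the abstract from its own counts.

TOTAL_RING_PROTEINS = 139

# Localization counts as reported in item v) of the abstract.
localization_counts = {
    "nucleus": 42,
    "multiple subcellular compartments": 71,
    "membrane": 22,
    "mitochondrion": 4,
}


def percent(part: int, whole: int) -> float:
    """Percentage of part in whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    # The four localization classes account for all 139 proteins.
    print(sum(localization_counts.values()) == TOTAL_RING_PROTEINS)
    for name, count in localization_counts.items():
        print(name, percent(count, TOTAL_RING_PROTEINS))
    print("orthologs", percent(118, 139))
    print("uniform architecture", percent(117, 118))
```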
The structural role of the zinc ion can be dispensable in prokaryotic zinc-finger domains

PubMed Central

Baglivo, Ilaria; Russo, Luigi; Esposito, Sabrina; Malgieri, Gaetano; Renda, Mario; Salluzzo, Antonio; Di Blasio, Benedetto; Isernia, Carla; Fattorusso, Roberto; Pedone, Paolo V.

2009-01-01

The recent characterization of the prokaryotic Cys2His2 zinc-finger domain, identified in Ros protein from Agrobacterium tumefaciens, has demonstrated that, although possessing a similar zinc coordination sphere, this domain is structurally very different from its eukaryotic counterpart. A search in the databases has identified ≈300 homologues with a high sequence identity to the Ros protein, including the amino acids that form the extensive hydrophobic core in Ros. Surprisingly, the Cys2His2 zinc coordination sphere is generally poorly conserved in the Ros homologues, raising the question of whether the zinc ion is always preserved in these proteins. Here, we present a functional and structural study of a point mutant of Ros protein, Ros56–142C82D, in which the second coordinating cysteine is replaced by an aspartate, 5 previously-uncharacterized representative Ros homologues from Mesorhizobium loti, and 2 mutants of the homologues. Our results indicate that the prokaryotic zinc-finger domain, which in Ros protein tetrahedrally coordinates Zn(II) through the typical Cys2His2 coordination, in Ros homologues can either exploit a CysAspHis2 coordination sphere, previously never described in DNA binding zinc finger domains to our knowledge, or lose the metal, while still preserving the DNA-binding activity. We demonstrate that this class of prokaryotic zinc-finger domains is structurally very adaptable, and surprisingly single mutations can transform a zinc-binding domain into a nonzinc-binding domain and vice versa, without affecting the DNA-binding ability. In light of our findings an evolutionary link between the prokaryotic and eukaryotic zinc-finger domains, based on bacteria-to-eukaryota horizontal gene transfer, is discussed. PMID:19369210
A proposed model for the flowering signaling pathway of sugarcane under photoperiodic control.

PubMed

Coelho, C P; Costa Netto, A P; Colasanti, J; Chalfun-Júnior, A

2013-04-25

Molecular analysis of floral induction in Arabidopsis has identified several flowering time genes related to 4 response networks defined by the autonomous, gibberellin, photoperiod, and vernalization pathways. Although grass flowering processes include ancestral functions shared by both mono- and dicots, they have developed their own mechanisms to transmit floral induction signals. Despite its high production capacity and its important role in biofuel production, almost no information is available about the flowering process in sugarcane. We searched the Sugarcane Expressed Sequence Tags database to look for elements of the flowering signaling pathway under photoperiodic control. Sequences showing significant similarity to flowering time genes of other species were clustered, annotated, and analyzed for conserved domains. Multiple alignments comparing the sequences found in the sugarcane database and those from other species were performed and their phylogenetic relationship assessed using the MEGA 4.0 software. Electronic Northerns were run with Cluster and TreeView programs, allowing us to identify putative members of the photoperiod-controlled flowering pathway of sugarcane.
Regulated expression of a novel TCP domain transcription factor indicates an involvement in the control of meristem activation processes in Solanum tuberosum.

PubMed

Faivre-Rampant, Odile; Bryan, Glenn J; Roberts, Alison G; Milbourne, Daniel; Viola, Roberto; Taylor, Mark A

2004-04-01

In this study, the aim was to determine whether TCP transcription factors are implicated in meristem activation in potato (Solanum tuberosum). By searching a database of potato EST sequences, with a sequence characteristically conserved in TCP domains, a potato tcp gene was identified. A BAC clone containing the tcp sequence was isolated and the genomic sequence was determined. Using a CAPS marker assay, the potato tcp gene (sttcp1) was mapped to chromosome 8. In dormant buds, relatively high levels of sttcp1-specific transcript were detected by in situ hybridization. By contrast, in sprouting buds, no expression of sttcp1 could be detected. Furthermore, an inverse relationship between axillary bud size and the steady-state level of the sttcp1 transcript was demonstrated. In non-growing buds exhibiting correlative inhibition, sttcp1-specific transcript levels were also relatively high, but rapidly decreased when apical dominance was removed by excision of the apical bud.
When a domain isn’t a domain, and why it’s important to properly filter proteins in databases

PubMed Central

Towse, Clare-Louise; Daggett, Valerie

2013-01-01

Summary: Membership in a protein domain database does not a domain make; a feature we realized when generating a consensus view of protein fold space with our Consensus Domain Dictionary (CDD). This dictionary was used to select representative structures for characterization of the protein dynameome: the Dynameomics initiative. Through this endeavor we rejected a surprising 40% of the 1695 folds in the CDD as being non-autonomous folding units. Although some of this was due to the challenges of grouping similar fold topologies, the dissonance between the cataloguing and structural qualification of protein domains remains surprising. Another potential factor is previously overlooked intrinsic disorder; predicted estimates suggest 40% of proteins to have either local or global disorder. One thing is clear, filtering a structural database and ensuring a consistent definition for protein domains is crucial, and caution is prescribed when generalizations of globular domains are drawn from unfiltered protein domain datasets. PMID:23108912
Large ensemble and large-domain hydrologic modeling: Insights from SUMMA applications in the Columbia River Basin

NASA Astrophysics Data System (ADS)

Ou, G.; Nijssen, B.; Nearing, G. S.; Newman, A. J.; Mizukami, N.; Clark, M. P.

2016-12-01

The Structure for Unifying Multiple Modeling Alternatives (SUMMA) provides a unifying modeling framework for process-based hydrologic modeling by defining a general set of conservation equations for mass and energy, with the capability to incorporate multiple choices for spatial discretizations and flux parameterizations. In this study, we provide a first demonstration of large-scale hydrologic simulations using SUMMA through an application to the Columbia River Basin (CRB) in the northwestern United States and Canada for a multi-decadal simulation period. The CRB is discretized into 11,723 hydrologic response units (HRUs) according to the United States Geological Survey Geospatial Fabric. The soil parameters are derived from the Natural Resources Conservation Service Soil Survey Geographic (SSURGO) Database. The land cover parameters are based on the National Land Cover Database from the year 2001 created by the Multi-Resolution Land Characteristics (MRLC) Consortium. The forcing data, including hourly air pressure, temperature, specific humidity, wind speed, precipitation, and shortwave and longwave radiation, are based on Phase 2 of the North American Land Data Assimilation System (NLDAS-2) and averaged for each HRU. The simulation results are compared to simulations with the Variable Infiltration Capacity (VIC) model and the Precipitation Runoff Modeling System (PRMS). We are particularly interested in SUMMA's capability to mimic model behaviors of the other two models through the selection of appropriate model parameterizations in SUMMA.
Prevalence of the F-type lectin domain.

PubMed

Bishnoi, Ritika; Khatri, Indu; Subramanian, Srikrishna; Ramya, T N C

2015-08-01

F-type lectins are fucolectins with characteristic fucose and calcium-binding sequence motifs and a unique lectin fold (the "F-type" fold). F-type lectins are phylogenetically widespread with selective distribution. Several eukaryotic F-type lectins have been biochemically and structurally characterized, and the F-type lectin domain (FLD) has also been studied in the bacterial proteins, Streptococcus mitis lectinolysin and Streptococcus pneumoniae SP2159. However, there is little knowledge about the extent of occurrence of FLDs and their domain organization, especially, in bacteria. We have now mined the extensive genomic sequence information available in the public databases with sensitive sequence search techniques in order to exhaustively survey prokaryotic and eukaryotic FLDs. We report 437 FLD sequence clusters (clustered at 80% sequence identity) from eukaryotic, eubacterial and viral proteins. Domain architectures are diverse but mostly conserved in closely related organisms, and domain organizations of bacterial FLD-containing proteins are very different from their eukaryotic counterparts, suggesting unique specialization of FLDs to suit different requirements. Several atypical phylogenetic associations hint at lateral transfer. Among eukaryotes, we observe an expansion of FLDs in terms of occurrence and domain organization diversity in the taxa Mollusca, Hemichordata and Branchiostomi, perhaps coinciding with greater emphasis on innate immune strategies in these organisms. The naturally occurring FLDs with diverse domain organizations that we have identified here will be useful for future studies aimed at creating designer molecular platforms for directing desired biological activities to fucosylated glycoconjugates in target niches. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
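The abstract reports clusters formed at 80% sequence identity but does not say which tool or algorithm was used. As a rough illustration of what such a threshold means, the sketch below applies a simple greedy scheme: sequences are visited longest first, and each one joins the first cluster whose representative it matches at or above the threshold. The ungapped, position-by-position identity measure and the toy sequences are assumptions made for brevity; real pipelines use proper alignments.

```python
from typing import List


def ungapped_identity(a: str, b: str) -> float:
    """Toy identity: matching positions divided by the longer length (no alignment)."""
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def greedy_cluster(seqs: List[str], threshold: float = 0.8) -> List[List[str]]:
    """Greedy incremental clustering; the first member of each cluster is its representative."""
    clusters: List[List[str]] = []
    # Longest sequences first, so that representatives are the longest members.
    for seq in sorted(seqs, key=len, reverse=True):
        for cluster in clusters:
            if ungapped_identity(seq, cluster[0]) >= threshold:
                cluster.append(seq)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters.append([seq])
    return clusters


if __name__ == "__main__":
    toy = ["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC", "AAAAAAAATT", "AAAAAAATTT"]
    for cluster in greedy_cluster(toy):
        print(cluster)
```

With the toy input, the sequence differing from the first representative at three of ten positions (70% identity) falls below the threshold and starts its own cluster, giving three clusters in total.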
C&RE-SLC: Database for conservation and renewable energy activities

NASA Astrophysics Data System (ADS)

Cavallo, J. D.; Tompkins, M. M.; Fisher, A. G.

1992-08-01

The Western Area Power Administration (Western) requires all its long-term power customers to implement programs that promote the conservation of electric energy or facilitate the use of renewable energy resources. The hope is that these measures could significantly reduce the amount of environmental damage associated with electricity production. As part of preparing the environmental impact statement for Western's Electric Power Marketing Program, Argonne National Laboratory constructed a database of the conservation and renewable energy activities in which Western's Salt Lake City customers are involved. The database provides information on types of conservation and renewable energy activities and allows for comparisons of activities being conducted at different utilities in the Salt Lake City region. Sorting the database allows Western's Salt Lake City customers to be classified so the various activities offered by different classes of utilities can be identified; for example, comparisons can be made between municipal utilities and cooperatives or between large and small customers. The information included in the database was collected from customer planning documents in the files of Western's Salt Lake City office.
Conservation and diversification of Msx protein in metazoan evolution.

PubMed

Takahashi, Hirokazu; Kamiya, Akiko; Ishiguro, Akira; Suzuki, Atsushi C; Saitou, Naruya; Toyoda, Atsushi; Aruga, Jun

2008-01-01

Msx (/msh) family genes encode homeodomain (HD) proteins that control ontogeny in many animal species. We compared the structures of Msx genes from a wide range of Metazoa (Porifera, Cnidaria, Nematoda, Arthropoda, Tardigrada, Platyhelminthes, Mollusca, Brachiopoda, Annelida, Echiura, Echinodermata, Hemichordata, and Chordata) to gain an understanding of the role of these genes in phylogeny. Exon-intron boundary analysis suggested that the position of the intron located N-terminally to the HDs was widely conserved in all the genes examined, including those of cnidarians. Amino acid (aa) sequence comparison revealed 3 new evolutionarily conserved domains, as well as very strong conservation of the HDs. Two of the three domains were associated with Groucho-like protein binding in both a vertebrate and a cnidarian Msx homolog, suggesting that the interaction between Groucho-like proteins and Msx proteins was established in eumetazoan ancestors. Pairwise comparison among the collected HDs and their C-flanking aa sequences revealed that the degree of sequence conservation varied depending on the animal taxa from which the sequences were derived. Highly conserved Msx genes were identified in the Vertebrata, Cephalochordata, Hemichordata, Echinodermata, Mollusca, Brachiopoda, and Anthozoa. The wide distribution of the conserved sequences in the animal phylogenetic tree suggested that metazoan ancestors had already acquired a set of conserved domains of the current Msx family genes. Interestingly, although strongly conserved sequences were recovered from the Vertebrata, Cephalochordata, and Anthozoa, the sequences from the Urochordata and Hydrozoa showed weak conservation. Because the Vertebrata-Cephalochordata-Urochordata and Anthozoa-Hydrozoa represent sister groups in the Chordata and Cnidaria, respectively, Msx sequence diversification may have occurred differentially in the course of evolution. 
We speculate that selective loss of the conserved domains in Msx family proteins contributed to the diversification of animal body organization.
Using complementary approaches to identify trans-domain nuclear gene transfers in the extremophile Galdieria sulphuraria (Rhodophyta).

PubMed

Pandey, Ravi S; Saxena, Garima; Bhattacharya, Debashish; Qiu, Huan; Azad, Rajeev K

2017-02-01

Identification of horizontal gene transfers (HGTs) has primarily relied on phylogenetic tree based methods, which require a rich sampling of sequenced genomes to ensure a reliable inference. Because the success of phylogenetic approaches depends on the breadth and depth of the database, researchers usually apply stringent filters to detect only the most likely gene transfers in the genomes of interest. One such study focused on a highly conservative estimate of trans-domain gene transfers in the extremophile eukaryote, Galdieria sulphuraria (Galdieri) Merola (Rhodophyta), by applying multiple filters in their phylogenetic pipeline. This led to the identification of 75 inter-domain acquisitions from Bacteria or Archaea. Because of the evolutionary, ecological, and potential biotechnological significance of foreign genes in algae, alternative approaches and pipelines complementing phylogenetics are needed for a more comprehensive assessment of HGT. We present here a novel pipeline that uncovered 17 novel foreign genes of prokaryotic origin in G. sulphuraria, results that are supported by multiple lines of evidence including composition-based, comparative data, and phylogenetics. These genes encode a variety of potentially adaptive functions, from metabolite transport to DNA repair. © 2016 Phycological Society of America.
The C-terminal region of Ge-1 presents conserved structural features required for P-body localization.

PubMed

Jinek, Martin; Eulalio, Ana; Lingel, Andreas; Helms, Sigrun; Conti, Elena; Izaurralde, Elisa

2008-10-01

The removal of the 5' cap structure by the DCP1-DCP2 decapping complex irreversibly commits eukaryotic mRNAs to degradation. In human cells, the interaction between DCP1 and DCP2 is bridged by the Ge-1 protein. Ge-1 contains an N-terminal WD40-repeat domain connected by a low-complexity region to a conserved C-terminal domain. It was reported that the C-terminal domain interacts with DCP2 and mediates Ge-1 oligomerization and P-body localization. To understand the molecular basis for these functions, we determined the three-dimensional crystal structure of the most conserved region of the Drosophila melanogaster Ge-1 C-terminal domain. The region adopts an all alpha-helical fold related to ARM- and HEAT-repeat proteins. Using structure-based mutants we identified an invariant surface residue affecting P-body localization. The conservation of critical surface and structural residues suggests that the C-terminal region adopts a similar fold with conserved functions in all members of the Ge-1 protein family.
[Conserved motifs in voltage sensing proteins].

PubMed

Wang, Chang-He; Xie, Zhen-Li; Lv, Jian-Wei; Yu, Zhi-Dan; Shao, Shu-Li

2012-08-25

This paper aimed to study conserved motifs of voltage sensing proteins (VSPs) and establish a voltage sensing model. All VSPs were collected from the UniProt database using a comprehensive keyword search followed by manual curation, and the results indicated that there are only two types of known VSPs, voltage gated ion channels and voltage dependent phosphatases. All the VSPs have a common domain of four helical transmembrane segments (TMS, S1-S4), which constitute the voltage sensing module of the VSPs. The S1 segment was shown to be responsible for membrane targeting and insertion of these proteins, while the S2-S4 segments, which can sense membrane potential, determine protein properties. Conserved motifs/residues of each TMS and their functional significance were identified using profile-to-profile sequence alignments. Conserved motifs in these four segments are strikingly similar for all VSPs; in particular, the conserved motif [RK]-X(2)-R-X(2)-R-X(2)-[RK] was present in all the S4 segments, with positively charged arginine (R) alternating with two hydrophobic or uncharged residues. Movement of these arginines across the membrane electric field is the core mechanism by which the VSPs detect changes in membrane potential. The negatively charged aspartate (D) in the S3 segment is universally conserved in all the VSPs, suggesting that the aspartate residue may be involved in voltage sensing properties of VSPs as well as the electrostatic interactions with the positively charged residues in the S4 segment, which may enhance the thermodynamic stability of the S4 segments in plasma membrane.
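The S4 motif above is written in a PROSITE-like notation, which maps directly onto a regular expression. The sketch below converts the small subset of that notation used here (single residues, bracketed residue classes, X for any residue, and a repeat count in parentheses) and scans a sequence with it. The function name and the test sequence are made up for illustration; the sequence is a toy string, not a real S4 segment.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern such as '[RK]-X(2)-R' to a regex."""
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(.+?)(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"empty pattern element in {pattern!r}")
        core, count = m.group(1), m.group(2)
        if core in ("X", "x"):
            core = "."  # X stands for any residue
        elif not re.fullmatch(r"[A-Z]|\[[A-Z]+\]", core):
            raise ValueError(f"unsupported pattern element: {element!r}")
        parts.append(core + (f"{{{count}}}" if count else ""))
    return "".join(parts)


if __name__ == "__main__":
    s4_regex = prosite_to_regex("[RK]-X(2)-R-X(2)-R-X(2)-[RK]")
    print(s4_regex)  # [RK].{2}R.{2}R.{2}[RK]

    toy_sequence = "AAKLLRVVRIIKAA"  # made-up sequence for illustration
    hit = re.search(s4_regex, toy_sequence)
    if hit:
        print(hit.start(), hit.group())  # 2 KLLRVVRIIK
```

Note that `re.search` reports only the first hit; finding overlapping occurrences would need a different scanning strategy.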
The BID Domain of Type IV Secretion Substrates Forms a Conserved Four-Helix Bundle Topped with a Hook.

PubMed

Stanger, Frédéric V; de Beer, Tjaart A P; Dranow, David M; Schirmer, Tilman; Phan, Isabelle; Dehio, Christoph

2017-01-03

The BID (Bep intracellular delivery) domain functions as secretion signal in a subfamily of protein substrates of bacterial type IV secretion (T4S) systems. It mediates transfer of (1) relaxases and the attached DNA during bacterial conjugation, and (2) numerous Bartonella effector proteins (Beps) during protein transfer into host cells infected by pathogenic Bartonella species. Furthermore, BID domains of Beps have often evolved secondary effector functions within host cells. Here, we provide crystal structures for three representative BID domains and describe a novel conserved fold characterized by a compact, antiparallel four-helix bundle topped with a hook. The conserved hydrophobic core provides a rigid scaffold to a surface that, despite a few conserved exposed residues and similarities in charge distribution, displays significant variability. We propose that the genuine function of BID domains as T4S signal may primarily depend on their rigid structure, while the plasticity of their surface may facilitate adaptation to secondary effector functions. Copyright © 2016 Elsevier Ltd. All rights reserved.
Geospatial database for heritage building conservation

NASA Astrophysics Data System (ADS)

Basir, W. N. F. W. A.; Setan, H.; Majid, Z.; Chong, A.

2014-02-01

Heritage buildings are icons from the past that exist in the present. Through heritage architecture, we can learn about economic issues and social activities of the past. Nowadays, heritage buildings are under threat from natural disasters, uncertain weather, pollution and other factors. In order to preserve this heritage for future generations, recording and documenting of heritage buildings are required. With the development of information systems and data collection techniques, it is possible to create a 3D digital model. This 3D information plays an important role in recording and documenting heritage buildings. 3D modeling and virtual reality techniques have demonstrated the ability to visualize the real world in 3D. They can provide a better platform for communication and understanding of heritage buildings. Combining 3D modelling with Geographic Information System (GIS) technology will create a database that supports various analyses of spatial data in the form of a 3D model. The objectives of this research are to determine the reliability of the Terrestrial Laser Scanning (TLS) technique for data acquisition of heritage buildings and to develop a geospatial database for heritage building conservation purposes. The result from data acquisition will become a guideline for 3D model development. This 3D model will be exported to the GIS format in order to develop a database for heritage building conservation. In this database, requirements for the heritage building conservation process are included. Through this research, a proper database for storing and documenting heritage building conservation data will be developed.
The ASTRAL Compendium in 2004

DOE R&D Accomplishments Database

Chandonia, John-Marc; Hon, Gary; Walker, Nigel S.; Lo Conte, Loredana; Koehl, Patrice; Levitt, Michael; Brenner, Steven E.

2003-09-15

The ASTRAL compendium provides several databases and tools to aid in the analysis of protein structures, particularly through the use of their sequences. Partially derived from the SCOP database of protein structure domains, it includes sequences for each domain and other resources useful for studying these sequences and domain structures. The current release of ASTRAL contains 54,745 domains, more than three times as many as the initial release four years ago. ASTRAL has undergone major transformations in the past two years. In addition to several complete updates each year, ASTRAL is now updated on a weekly basis with preliminary classifications of domains from newly released PDB structures. These classifications are available as a stand-alone database, as well as available integrated into other ASTRAL databases such as representative subsets. To enhance the utility of ASTRAL to structural biologists, all SCOP domains are now made available as PDB-style coordinate files as well as sequences. In addition to sequences and representative subsets based on SCOP domains, sequences and subsets based on PDB chains are newly included in ASTRAL. Several search tools have been added to ASTRAL to facilitate retrieval of data by individual users and automated methods.
SNAPPI-DB: a database and API of Structures, iNterfaces and Alignments for Protein–Protein Interactions

PubMed Central

Jefferson, Emily R.; Walsh, Thomas P.; Roberts, Timothy J.; Barton, Geoffrey J.

2007-01-01

SNAPPI-DB, a high performance database of Structures, iNterfaces and Alignments of Protein–Protein Interactions, and its associated Java Application Programming Interface (API) is described. SNAPPI-DB contains structural data, down to the level of atom co-ordinates, for each structure in the Protein Data Bank (PDB) together with associated data including SCOP, CATH, Pfam, SWISSPROT, InterPro, GO terms, Protein Quaternary Structures (PQS) and secondary structure information. Domain–domain interactions are stored for multiple domain definitions and are classified by their Superfamily/Family pair and interaction interface. Each set of classified domain–domain interactions has an associated multiple structure alignment for each partner. The API facilitates data access via PDB entries, domains and domain–domain interactions. Rapid development, fast database access and the ability to perform advanced queries without the requirement for complex SQL statements are provided via an object oriented database and the Java Data Objects (JDO) API. SNAPPI-DB contains many features which are not available in other databases of structural protein–protein interactions. It has been applied in three studies on the properties of protein–protein interactions and is currently being employed to train a protein–protein interaction predictor and a functional residue predictor. The database, API and manual are available for download at: . PMID:17202171
New insight into the architecture of oxy-anion pocket in unliganded conformation of GAT domains: A MD-simulation study.

PubMed

Bairagya, Hridoy R; Bansal, Manju

2016-03-01

Human Guanine Monophosphate Synthetase (hGMPS) converts XMP to GMP, and acts as a bifunctional enzyme with N-terminal "glutaminase" (GAT) and C-terminal "synthetase" domain. The enzyme is identified as a potential target for anti-cancer and immunosuppressive therapies. GAT domain of enzyme plays central role in metabolism, and contains conserved catalytic residues Cys104, His190, and Glu192. MD simulation studies on GAT domain suggest that position of oxyanion in unliganded conformation is occupied by one conserved water molecule (W1), which also stabilizes that pocket. This position is occupied by a negatively charged atom of the substrate or ligand in ligand bound crystal structures. In fact, MD simulation study of Ser75 to Val indicates that W1 conserved water molecule is stabilized by Ser75, while Thr152, and His190 also act as anchor residues to maintain appropriate architecture of oxyanion pocket through water mediated H-bond interactions. Possibly, four conserved water molecules stabilize oxyanion hole in unliganded state, but they vacate these positions when the enzyme (hGMPS)-substrate complex is formed. Thus this study not only reveals functionally important role of conserved water molecules in GAT domain, but also highlights essential role of other non-catalytic residues such as Ser75 and Thr152 in this enzymatic domain. The results from this computational study could be of interest to experimental community and provide a testable hypothesis for experimental validation. Conserved sites of water molecules near and at oxyanion hole highlight structural importance of water molecules and suggest a rethink of the conventional definition of chemical geometry of inhibitor binding site. © 2016 Wiley Periodicals, Inc.
Mobile Element Evolution Playing Jigsaw - SINEs in Gastropod and Bivalve Mollusks.

PubMed

Matetovici, Irina; Sajgo, Szilard; Ianc, Bianca; Ochis, Cornelia; Bulzu, Paul; Popescu, Octavian; Damert, Annette

2016-01-06

SINEs (Short INterspersed Elements) are widely distributed among eukaryotes. Some SINE families are organized in superfamilies characterized by a shared central domain. These central domains are conserved across species, classes, and even phyla. Here we report the identification of two novel such superfamilies in the genomes of gastropod and bivalve mollusks. The central conserved domain of the first superfamily is present in SINEs in Caenogastropoda and Vetigastropoda as well as in all four subclasses of Bivalvia. We designated the domain MESC (Romanian for MElc-snail and SCoica-mussel) because it appears to be restricted to snails and mussels. The second superfamily is restricted to Caenogastropoda. Its central conserved domain-Snail-is related to the Nin-DC domain. Furthermore, we provide evidence that a 40-bp subdomain of the SINE V-domain is conserved in SINEs in mollusks and arthropods. It is predicted to form a stable stem-loop structure that is preserved in the context of the overall SINE RNA secondary structure in invertebrates. Our analysis also recovered short retrotransposons with a Long INterspersed Element (LINE)-derived 5' end. These share the body and/or the tail with transfer RNA (tRNA)-derived SINEs within and across species. Finally, we identified CORE SINEs in gastropods and bivalves-extending the distribution range of this superfamily. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Agricultural Conservation Planning Framework: 3. Land Use and Field Boundary Database Development and Structure.

PubMed

Tomer, Mark D; James, David E; Sandoval-Green, Claudette M J

2017-05-01

Conservation planning information is important for identifying options for watershed water quality improvement and can be developed for use at field, farm, and watershed scales. Translation across scales is a key issue impeding progress at watershed scales because watershed improvement goals must be connected with implementation of farm- and field-level conservation practices to demonstrate success. This is particularly true when examining alternatives for "trap and treat" practices implemented at agricultural-field edges to control (or influence) water flows through fields, landscapes, and riparian corridors within agricultural watersheds. We propose that database structures used in developing conservation planning information can achieve translation across conservation-planning scales, and we developed the Agricultural Conservation Planning Framework (ACPF) to enable practical planning applications. The ACPF comprises a planning concept, a database to facilitate field-level and watershed-scale analyses, and an ArcGIS toolbox with Python scripts to identify specific options for placement of conservation practices. This paper appends two prior publications and describes the structure of the ACPF database, which contains land use, crop history, and soils information and is available for download for 6091 HUC12 watersheds located across Iowa, Illinois, Minnesota, and parts of Kansas, Missouri, Nebraska, and Wisconsin and comprises information on 2.74 × 10 agricultural fields (available through /). Sample results examining land use trends across Iowa and Illinois are presented here to demonstrate potential uses of the database. While designed for use with the ACPF toolbox, users are welcome to use the ACPF watershed data in a variety of planning and modeling approaches. Copyright © by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America, Inc.
Mutational analysis of TRAF6 reveals a conserved functional role of the RING dimerization interface and a potentially necessary but insufficient role of RING-dependent TRAF6 polyubiquitination towards NF-κB activation

PubMed Central

Megas, Charilaos; Hatzivassiliou, Eudoxia G.; Yin, Qian; Vignali, Dario A.A.; Mosialos, George

2011-01-01

TRAF6 is an E3 ubiquitin ligase that plays a pivotal role in the activation of NF-κB by innate and adaptive immunity stimuli. TRAF6 consists of a highly conserved carboxyl terminal TRAF-C domain which is preceded by a coiled coil domain and an amino terminal region that contains a RING domain and a series of putative zinc-finger motifs. The TRAF-C domain contributes to TRAF6 oligomerization and mediates the interaction of TRAF6 with upstream signaling molecules whereas the RING domain comprises the core of the ubiquitin ligase catalytic domain. In order to identify structural elements that are important for TRAF6-induced NF-κB activation, mutational analysis of the TRAF-C and RING domains was performed. Alterations of highly conserved residues of the TRAF-C domain of TRAF6 did not affect significantly the ability of the protein to activate NF-κB. On the other hand a number of functionally important residues (L77, Q82, R88, F118, N121 and E126) for the activation of NF-κB were identified within the RING domain of TRAF6. Interestingly, several homologues of these residues in TRAF2 were shown to have a conserved functional role in TRAF2-induced NF-κB activation and lie at the dimerization interface of the RING domain. Finally, whereas alteration of Q82, R88 and F118 compromised both the K63-linked polyubiquitination of TRAF6 and its ability to activate NF-κB, alteration of L77, N121 and E126 diminished the NF-κB activating function of TRAF6 without affecting TRAF6 K63-linked polyubiquitination. Our results support a conserved functional role of the TRAF RING domain dimerization interface and a potentially necessary but insufficient role for RING-dependent TRAF6 K63-linked polyubiquitination towards NF-κB activation in cells. PMID:21185369
Mutational analysis of TRAF6 reveals a conserved functional role of the RING dimerization interface and a potentially necessary but insufficient role of RING-dependent TRAF6 polyubiquitination towards NF-κB activation.

PubMed

Megas, Charilaos; Hatzivassiliou, Eudoxia G; Yin, Qian; Marinopoulou, Elli; Hadweh, Paul; Vignali, Dario A A; Mosialos, George

2011-05-01

TRAF6 is an E3 ubiquitin ligase that plays a pivotal role in the activation of NF-κB by innate and adaptive immunity stimuli. TRAF6 consists of a highly conserved carboxyl terminal TRAF-C domain which is preceded by a coiled coil domain and an amino terminal region that contains a RING domain and a series of putative zinc-finger motifs. The TRAF-C domain contributes to TRAF6 oligomerization and mediates the interaction of TRAF6 with upstream signaling molecules whereas the RING domain comprises the core of the ubiquitin ligase catalytic domain. In order to identify structural elements that are important for TRAF6-induced NF-κB activation, mutational analysis of the TRAF-C and RING domains was performed. Alterations of highly conserved residues of the TRAF-C domain of TRAF6 did not affect significantly the ability of the protein to activate NF-κB. On the other hand a number of functionally important residues (L77, Q82, R88, F118, N121 and E126) for the activation of NF-κB were identified within the RING domain of TRAF6. Interestingly, several homologues of these residues in TRAF2 were shown to have a conserved functional role in TRAF2-induced NF-κB activation and lie at the dimerization interface of the RING domain. Finally, whereas alteration of Q82, R88 and F118 compromised both the K63-linked polyubiquitination of TRAF6 and its ability to activate NF-κB, alteration of L77, N121 and E126 diminished the NF-κB activating function of TRAF6 without affecting TRAF6 K63-linked polyubiquitination. Our results support a conserved functional role of the TRAF RING domain dimerization interface and a potentially necessary but insufficient role for RING-dependent TRAF6 K63-linked polyubiquitination towards NF-κB activation in cells. Copyright © 2010 Elsevier Inc. All rights reserved.
Mobile Element Evolution Playing Jigsaw—SINEs in Gastropod and Bivalve Mollusks

PubMed Central

Matetovici, Irina; Sajgo, Szilard; Ianc, Bianca; Ochis, Cornelia; Bulzu, Paul; Popescu, Octavian; Damert, Annette

2016-01-01

SINEs (Short INterspersed Elements) are widely distributed among eukaryotes. Some SINE families are organized in superfamilies characterized by a shared central domain. These central domains are conserved across species, classes, and even phyla. Here we report the identification of two novel such superfamilies in the genomes of gastropod and bivalve mollusks. The central conserved domain of the first superfamily is present in SINEs in Caenogastropoda and Vetigastropoda as well as in all four subclasses of Bivalvia. We designated the domain MESC (Romanian for MElc—snail and SCoica—mussel) because it appears to be restricted to snails and mussels. The second superfamily is restricted to Caenogastropoda. Its central conserved domain—Snail—is related to the Nin-DC domain. Furthermore, we provide evidence that a 40-bp subdomain of the SINE V-domain is conserved in SINEs in mollusks and arthropods. It is predicted to form a stable stem-loop structure that is preserved in the context of the overall SINE RNA secondary structure in invertebrates. Our analysis also recovered short retrotransposons with a Long INterspersed Element (LINE)-derived 5′ end. These share the body and/or the tail with transfer RNA (tRNA)-derived SINEs within and across species. Finally, we identified CORE SINEs in gastropods and bivalves—extending the distribution range of this superfamily. PMID:26739168
A web-based system architecture for ontology-based data integration in the domain of IT benchmarking

NASA Astrophysics Data System (ADS)

Pfaff, Matthias; Krcmar, Helmut

2018-03-01

In the domain of IT benchmarking (ITBM), a variety of data and information are collected. Although these data serve as the basis for business analyses, no unified semantic representation of such data yet exists. Consequently, data analysis across different distributed data sets and different benchmarks is almost impossible. This paper presents a system architecture and prototypical implementation for an integrated data management of distributed databases based on a domain-specific ontology. To preserve the semantic meaning of the data, the ITBM ontology is linked to data sources and functions as the central concept for database access. Thus, additional databases can be integrated by linking them to this domain-specific ontology and are directly available for further business analyses. Moreover, the web-based system supports the process of mapping ontology concepts to external databases by introducing a semi-automatic mapping recommender and by visualizing possible mapping candidates. The system also provides a natural language interface to easily query linked databases. The expected result of this ontology-based approach of knowledge representation and data access is an increase in knowledge and data sharing in this domain, which will enhance existing business analysis methods.
Insights into the immune manipulation mechanisms of pollen allergens by protein domain profiling.

PubMed

Patel, Seema; Rani, Aruna; Goyal, Arun

2017-10-01

Plant pollens are airborne allergens, as their inhalation causes immune activation, leading to rhinitis, conjunctivitis, sinusitis and oral allergy syndrome. A myriad of pollen proteins belonging to profilin, expansin, polygalacturonase, glucan endoglucosidase, pectin esterase, and lipid transfer protein class have been identified. In the present in silico study, the protein domains of fifteen pollen sequences were extracted from the UniProt database and submitted to the interactive web tool SMART (Simple Modular Architecture Research Tool), for finding the protein domain profiles. Analysis of the data based on custom-made scripts revealed the conservation of pathogenic domains such as OmpH, PROF, PreSET, Bet_v_1, Cpl-7 and GAS2. Further, the retention of critical domains like CHASE2, Galanin, Dak2, DALR_1, HAMP, PWI, EFh, Excalibur, CT, PbH1, HELICc, and Kelch in pollen proteins, much like cockroach allergens and lethal viruses (such as HIV, HCV, Ebola, Dengue and Zika) was observed. Based on the shared motifs in proteins of taxonomically dispersed organisms, it can be hypothesized that allergens and pathogens manipulate the human immune system in a similar manner. Allergens, being inanimate, cannot replicate in human body, and are neutralized by immune system. But, when the allergens are unremitting, the immune system becomes persistently hyper-sensitized, creating an inflammatory milieu. This study is expected to contribute to the understanding of pollen allergenicity and pathogenicity. Copyright © 2017 Elsevier Ltd. All rights reserved.
Missense Mutations in the N-Terminal Domain of Human Phenylalanine Hydroxylase Interfere with Binding of Regulatory Phenylalanine

PubMed Central

Gjetting, Torben; Petersen, Marie; Guldberg, Per; Güttler, Flemming

2001-01-01

Hyperphenylalaninemia due to a deficiency of phenylalanine hydroxylase (PAH) is an autosomal recessive disorder caused by >400 mutations in the PAH gene. Recent work has suggested that the majority of PAH missense mutations impair enzyme activity by causing increased protein instability and aggregation. In this study, we describe an alternative mechanism by which some PAH mutations may render PAH defective. Database searches were used to identify regions in the N-terminal domain of PAH with homology to the regulatory domain of prephenate dehydratase (PDH), the rate-limiting enzyme in the bacterial phenylalanine biosynthesis pathway. Naturally occurring N-terminal PAH mutations are distributed in a nonrandom pattern and cluster within residues 46–48 (GAL) and 65–69 (IESRP), two motifs highly conserved in PDH. To examine whether N-terminal PAH mutations affect the ability of PAH to bind phenylalanine at the regulatory domain, wild-type and five mutant (G46S, A47V, T63P/H64N, I65T, and R68S) forms of the N-terminal domain (residues 2–120) of human PAH were expressed as fusion proteins in Escherichia coli. Binding studies showed that the wild-type form of this domain specifically binds phenylalanine, whereas all mutations abolished or significantly reduced this phenylalanine-binding capacity. Our data suggest that impairment of phenylalanine-mediated activation of PAH may be an important disease-causing mechanism of some N-terminal PAH mutations, which may explain some well-documented genotype-phenotype discrepancies in PAH deficiency. PMID:11326337
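The motif positions and mutation labels quoted in this abstract can be cross-checked with a few lines of Python. The sketch below is illustrative only: the motif coordinates (46–48 GAL, 65–69 IESRP) and the mutation labels come from the abstract, while the function names and the dictionary layout are hypothetical and not part of the study.

```python
import re

# Motif residues as stated in the abstract: 46-48 (GAL) and 65-69 (IESRP).
MOTIFS = {"GAL": range(46, 49), "IESRP": range(65, 70)}

# One substitution, e.g. "G46S": wild-type residue, position, mutant residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'G46S' into (wild_type, position, mutant)."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def motif_hits(label):
    """Return the motifs touched by a label; compound labels use '/'."""
    hits = []
    for part in label.split("/"):
        wild_type, position, _ = parse_mutation(part)
        for name, positions in MOTIFS.items():
            if position in positions:
                # The wild-type residue should agree with the motif letter.
                expected = name[position - positions.start]
                if wild_type != expected:
                    raise ValueError(f"{part}: expected {expected} at {position}")
                hits.append(name)
    return hits


for label in ["G46S", "A47V", "T63P/H64N", "I65T", "R68S"]:
    print(label, motif_hits(label))
```

Under these assumptions, four of the five expressed mutants fall inside one of the two conserved motifs, and the double mutant T63P/H64N lies immediately N-terminal to the IESRP motif.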
Motif discovery with data mining in 3D protein structure databases: discovery, validation and prediction of the U-shape zinc binding ("Huf-Zinc") motif.

PubMed

Maurer-Stroh, Sebastian; Gao, He; Han, Hao; Baeten, Lies; Schymkowitz, Joost; Rousseau, Frederic; Zhang, Louxin; Eisenhaber, Frank

2013-02-01

Data mining in protein databases, derivatives from more fundamental protein 3D structure and sequence databases, has considerable unearthed potential for the discovery of sequence motif--structural motif--function relationships as the finding of the U-shape (Huf-Zinc) motif, originally a small student's project, exemplifies. The metal ion zinc is critically involved in universal biological processes, ranging from protein-DNA complexes and transcription regulation to enzymatic catalysis and metabolic pathways. Proteins have evolved a series of motifs to specifically recognize and bind zinc ions. Many of these, so called zinc fingers, are structurally independent globular domains with discontinuous binding motifs made up of residues mostly far apart in sequence. Through a systematic approach starting from the BRIX structure fragment database, we discovered that there exists another predictable subset of zinc-binding motifs that not only have a conserved continuous sequence pattern but also share a characteristic local conformation, despite being included in totally different overall folds. While this does not allow general prediction of all Zn binding motifs, a HMM-based web server, Huf-Zinc, is available for prediction of these novel, as well as conventional, zinc finger motifs in protein sequences. The Huf-Zinc webserver can be freely accessed through this URL (http://mendel.bii.a-star.edu.sg/METHODS/hufzinc/).
Comparative Sequence and X-Inactivation Analyses of a Domain of Escape in Human Xp11.2 and the Conserved Segment in Mouse

PubMed Central

Tsuchiya, Karen D.; Greally, John M.; Yi, Yajun; Noel, Kevin P.; Truong, Jean-Pierre; Disteche, Christine M.

2004-01-01

We have performed X-inactivation and sequence analyses on 350 kb of sequence from human Xp11.2, a region shown previously to contain a cluster of genes that escape X inactivation, and we compared this region with the region of conserved synteny in mouse. We identified several new transcripts from this region in human and in mouse, which defined the full extent of the domain escaping X inactivation in both species. In human, escape from X inactivation involves an uninterrupted 235-kb domain of multiple genes. Despite highly conserved gene content and order between the two species, Smcx is the only mouse gene from the conserved segment that escapes inactivation. As repetitive sequences are believed to facilitate spreading of X inactivation along the chromosome, we compared the repetitive sequence composition of this region between the two species. We found that long terminal repeats (LTRs) were decreased in the human domain of escape, but not in the majority of the conserved mouse region adjacent to Smcx in which genes were subject to X inactivation, suggesting that these repeats might be excluded from escape domains to prevent spreading of silencing. Our findings indicate that genomic context, as well as gene-specific regulatory elements, interact to determine expression of a gene from the inactive X-chromosome. PMID:15197169
ATGC database and ATGC-COGs: an updated resource for micro- and macro-evolutionary studies of prokaryotic genomes and protein family annotation.

PubMed

Kristensen, David M; Wolf, Yuri I; Koonin, Eugene V

2017-01-04

The Alignable Tight Genomic Clusters (ATGCs) database is a collection of closely related bacterial and archaeal genomes that provides several tools to aid research into evolutionary processes in the microbial world. Each ATGC is a taxonomy-independent cluster of 2 or more completely sequenced genomes that meet the objective criteria of a high degree of local gene order (synteny) and a small number of synonymous substitutions in the protein-coding genes. As such, each ATGC is suited for analysis of microevolutionary variations within a cohesive group of organisms (e.g. species), whereas the entire collection of ATGCs is useful for macroevolutionary studies. The ATGC database includes many forms of pre-computed data, in particular ATGC-COGs (Clusters of Orthologous Genes), multiple sequence alignments, a set of 'index' orthologs representing the most well-conserved members of each ATGC-COG, the phylogenetic tree of the organisms within each ATGC, etc. Although the ATGC database contains several million proteins from thousands of genomes organized into hundreds of clusters (roughly a 4-fold increase since the last version of the ATGC database), it is now built with completely automated methods and will be regularly updated following new releases of the NCBI RefSeq database. The ATGC database is hosted jointly at the University of Iowa at dmk-brain.ecn.uiowa.edu/ATGC/ and the NCBI at ftp.ncbi.nlm.nih.gov/pub/kristensen/ATGC/atgc_home.html. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Evolutionary dynamics of protein domain architecture in plants

PubMed Central

2012-01-01

Background: Protein domains are the structural, functional and evolutionary units of the protein. Protein domain architectures are the linear arrangements of domain(s) in individual proteins. Although the evolutionary history of protein domain architecture has been extensively studied in microorganisms, the evolutionary dynamics of domain architecture in the plant kingdom remains largely undefined. To address this question, we analyzed the lineage-based protein domain architecture content in 14 completed green plant genomes. Results: Our analyses show that all 14 plant genomes maintain similar distributions of species-specific, single-domain, and multi-domain architectures. Approximately 65% of plant domain architectures are universally present in all plant lineages, while the remaining architectures are lineage-specific. Clear examples are seen of both the loss and gain of specific protein architectures in higher plants. There has been a dynamic, lineage-wise expansion of domain architectures during plant evolution. The data suggest that this expansion can be largely explained by changes in nuclear ploidy resulting from rounds of whole genome duplications. Indeed, there has been a decrease in the number of unique domain architectures when the genomes were normalized into a presumed ancestral genome that has not undergone whole genome duplications. Conclusions: Our data show the conservation of universal domain architectures in all available plant genomes, indicating the presence of an evolutionarily conserved, core set of protein components. However, the occurrence of lineage-specific domain architectures indicates that domain architecture diversity has been maintained beyond these core components in plant genomes. Although several features of genome-wide domain architecture content are conserved in plants, the data clearly demonstrate lineage-wise, progressive changes and expansions of individual protein domain architectures, reinforcing the notion that plant genomes have undergone dynamic evolution. PMID:22252370
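The split between universal and lineage-specific architectures described in this abstract can be pictured as a set operation. The following is a minimal sketch, assuming each architecture is represented as an ordered tuple of domain names; the genome labels, domain names and function name are hypothetical toy values, not data or methods from the study.

```python
def classify_architectures(genomes):
    """Split architectures into universal and lineage-specific sets.

    genomes maps a genome name to an iterable of architectures, where an
    architecture is the ordered tuple of domain names along one protein.
    """
    if not genomes:
        return set(), set()
    per_genome = [set(archs) for archs in genomes.values()]
    all_archs = set().union(*per_genome)
    universal = set.intersection(*per_genome)  # present in every genome
    return universal, all_archs - universal


# Hypothetical toy input; order within a tuple matters for an architecture.
toy_genomes = {
    "genome_A": [("Pkinase",), ("F-box", "LRR"), ("RING",)],
    "genome_B": [("Pkinase",), ("F-box", "LRR"), ("LRR", "F-box")],
    "genome_C": [("Pkinase",), ("F-box", "LRR")],
}

universal, lineage_specific = classify_architectures(toy_genomes)
share = len(universal) / (len(universal) + len(lineage_specific))
print(sorted(universal), sorted(lineage_specific), f"{share:.0%} universal")
```

Because architectures are tuples, ("F-box", "LRR") and ("LRR", "F-box") count as different arrangements, which matches the abstract's definition of an architecture as a linear arrangement of domains.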
FishTraits Database

USGS Publications Warehouse

Angermeier, Paul L.; Frimpong, Emmanuel A.

2009-01-01

The need for integrated and widely accessible sources of species traits data to facilitate studies of ecology, conservation, and management has motivated development of traits databases for various taxa. In spite of the increasing number of traits-based analyses of freshwater fishes in the United States, no consolidated database of traits of this group exists publicly, and much useful information on these species is documented only in obscure sources. The largely inaccessible and unconsolidated traits information makes large-scale analysis involving many fishes and/or traits particularly challenging. FishTraits is a database of >100 traits for 809 (731 native and 78 exotic) fish species found in freshwaters of the conterminous United States, including 37 native families and 145 native genera. The database contains information on four major categories of traits: (1) trophic ecology, (2) body size and reproductive ecology (life history), (3) habitat associations, and (4) salinity and temperature tolerances. Information on geographic distribution and conservation status is also included. Together, we refer to the traits, distribution, and conservation status information as attributes. Descriptions of attributes are available here. Many sources were consulted to compile attributes, including state and regional species accounts and other databases.
Topology and weights in a protein domain interaction network--a novel way to predict protein interactions.

PubMed

Wuchty, Stefan

2006-05-23

While the analysis of unweighted biological webs as diverse as genetic, protein and metabolic networks allowed spectacular insights in the inner workings of a cell, biological networks are not only determined by their static grid of links. In fact, we expect that the heterogeneity in the utilization of connections has a major impact on the organization of cellular activities as well. We consider a web of interactions between protein domains of the Protein Family database (PFAM), which are weighted by a probability score. We apply metrics that combine the static layout and the weights of the underlying interactions. We observe that unweighted measures as well as their weighted counterparts largely share the same trends in the underlying domain interaction network. However, we only find weak signals that weights and the static grid of interactions are connected entities. Therefore assuming that a protein interaction is governed by a single domain interaction, we observe strong and significant correlations of the highest scoring domain interaction and the confidence of protein interactions in the underlying interactions of yeast and fly. Modeling an interaction between proteins if we find a high scoring protein domain interaction, we obtain 1,428 protein interactions among 361 proteins in the human malaria parasite Plasmodium falciparum. Assessing their quality by a logistic regression method we observe that increasing confidence of predicted interactions is accompanied by high scoring domain interactions and elevated levels of functional similarity and evolutionary conservation. Our results indicate that probability scores are randomly distributed, allowing us to treat static grid and weights of domain interactions as separate entities. In particular, these findings confirm earlier observations that a protein interaction is a matter of a single interaction event on domain level. As an immediate application, we show a simple way to predict potential protein interactions by utilizing expectation scores of single domain interactions.
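The prediction rule outlined in this abstract, where a protein pair is called interacting when its best-scoring domain pair is high enough, can be sketched as follows. This is an illustration under stated assumptions rather than the author's implementation: the threshold, the toy scores, the protein and domain labels, and both function names are hypothetical.

```python
from itertools import combinations, product


def best_domain_score(domains_a, domains_b, pair_scores):
    """Highest score over all domain pairs between two proteins, or None."""
    best = None
    for da, db in product(domains_a, domains_b):
        # Domain pairs are unordered, so they are keyed by a frozenset.
        score = pair_scores.get(frozenset((da, db)))
        if score is not None and (best is None or score > best):
            best = score
    return best


def predict_interactions(protein_domains, pair_scores, threshold):
    """Predict a protein pair when its best domain-pair score >= threshold."""
    predictions = {}
    for a, b in combinations(sorted(protein_domains), 2):
        score = best_domain_score(protein_domains[a], protein_domains[b], pair_scores)
        if score is not None and score >= threshold:
            predictions[(a, b)] = score
    return predictions


# Hypothetical toy data: domain content per protein and domain-pair scores.
toy_proteins = {"P1": ["D1"], "P2": ["D2", "D3"], "P3": ["D3"]}
toy_scores = {frozenset(("D1", "D2")): 0.9, frozenset(("D2", "D3")): 0.4}

predicted = predict_interactions(toy_proteins, toy_scores, threshold=0.5)
print(predicted)
```

Taking only the maximum over domain pairs mirrors the abstract's assumption that a protein interaction is governed by a single domain interaction; combining several domain pairs would be a different model.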
Structural Basis for Endosomal Targeting by the Bro1 Domain

PubMed Central

Kim, Jaewon; Sitaraman, Sujatha; Hierro, Aitor; Beach, Bridgette M.; Odorizzi, Greg; Hurley, James H.

2010-01-01

Summary: Proteins delivered to the lysosome or the yeast vacuole via late endosomes are sorted by the ESCRT complexes and by associated proteins, including Alix and its yeast homolog Bro1. Alix, Bro1, and several other late endosomal proteins share a conserved 160 residue Bro1 domain whose boundaries, structure, and function have not been characterized. The crystal structure of the Bro1 domain of Bro1 reveals a folded core of 367 residues. The extended Bro1 domain is necessary and sufficient for binding to the ESCRT-III subunit Snf7 and for the recruitment of Bro1 to late endosomes. The structure resembles a boomerang with its concave face filled in and contains a triple tetratricopeptide repeat domain as a substructure. Snf7 binds to a conserved hydrophobic patch on Bro1 that is required for protein complex formation and for the protein-sorting function of Bro1. These results define a conserved mechanism whereby Bro1 domain-containing proteins are targeted to endosomes by Snf7 and its orthologs. PMID:15935782
Structural Insight into the Mechanism of c-di-GMP hydrolysis by EAL domain phosphodiesterases.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tchigvintsev, A.; Xu, X.; Singer, A.

2010-08-01

Cyclic diguanylate (or bis-(3'-5') cyclic dimeric guanosine monophosphate; c-di-GMP) is a ubiquitous second messenger that regulates diverse cellular functions, including motility, biofilm formation, cell cycle progression, and virulence in bacteria. In the cell, degradation of c-di-GMP is catalyzed by highly specific EAL domain phosphodiesterases whose catalytic mechanism is still unclear. Here, we purified 13 EAL domain proteins from various organisms and demonstrated that their catalytic activity is associated with the presence of 10 conserved EAL domain residues. The crystal structure of the TBD1265 EAL domain was determined in the free state (1.8 Å) and in complex with c-di-GMP (2.35 Å), and unveiled the role of conserved residues in substrate binding and catalysis. The structure revealed the presence of two metal ions directly coordinated by six conserved residues, two oxygens of c-di-GMP phosphate, and a potential catalytic water molecule. Our results support a two-metal-ion catalytic mechanism of c-di-GMP hydrolysis by EAL domain phosphodiesterases.
From data repositories to submission portals: rethinking the role of domain-specific databases in CollecTF.

PubMed

Kılıç, Sefa; Sagitova, Dinara M; Wolfish, Shoshannah; Bely, Benoit; Courtot, Mélanie; Ciufo, Stacy; Tatusova, Tatiana; O'Donovan, Claire; Chibucos, Marcus C; Martin, Maria J; Erill, Ivan

2016-01-01

Domain-specific databases are essential resources for the biomedical community, leveraging expert knowledge to curate published literature and provide access to referenced data and knowledge. The limited scope of these databases, however, poses important challenges to their infrastructure, visibility, funding and usefulness to the broader scientific community. CollecTF is a community-oriented database documenting experimentally validated transcription factor (TF)-binding sites in the Bacteria domain. In its quest to become a community resource for the annotation of transcriptional regulatory elements in bacterial genomes, CollecTF aims to move away from the conventional data-repository paradigm of domain-specific databases. Through the adoption of well-established ontologies, identifiers and collaborations, CollecTF has progressively also become a portal for the annotation and submission of information on transcriptional regulatory elements to major biological sequence resources (RefSeq, UniProtKB and the Gene Ontology Consortium). This fundamental change in database conception capitalizes on the domain-specific knowledge of contributing communities to provide high-quality annotations, while leveraging the availability of stable information hubs to promote long-term access and provide high visibility to the data. As a submission portal, CollecTF generates TF-binding site information through direct annotation of RefSeq genome records, definition of TF-based regulatory networks in UniProtKB entries and submission of functional annotations to the Gene Ontology. As a database, CollecTF provides enhanced search and browsing, targeted data exports, binding motif analysis tools and integration with motif discovery and search platforms. This innovative approach will allow CollecTF to focus its limited resources on the generation of high-quality information and the provision of specialized access to the data. Database URL: http://www.collectf.org/. © The Author(s) 2016. Published by Oxford University Press.
Missing Modality Transfer Learning via Latent Low-Rank Constraint.

PubMed

Ding, Zhengming; Shao, Ming; Fu, Yun

2015-11-01

Transfer learning is usually exploited to leverage previously well-learned source domain for evaluating the unknown target domain; however, it may fail if no target data are available in the training stage. This problem arises when the data are multi-modal. For example, the target domain is in one modality, while the source domain is in another. To overcome this, we first borrow an auxiliary database with complete modalities, then consider knowledge transfer across databases and across modalities within databases simultaneously in a unified framework. The contributions are threefold: 1) a latent factor is introduced to uncover the underlying structure of the missing modality from the known data; 2) transfer learning in two directions allows the data alignment between both modalities and databases, giving rise to a very promising recovery; and 3) an efficient solution with theoretical guarantees to the proposed latent low-rank transfer learning algorithm. Comprehensive experiments on multi-modal knowledge transfer with missing target modality verify that our method can successfully inherit knowledge from both auxiliary database and source modality, and therefore significantly improve the recognition performance even when test modality is inaccessible in the training stage.
Structure-Templated Predictions of Novel Protein Interactions from Sequence Information

PubMed Central

Betel, Doron; Breitkreuz, Kevin E; Isserlin, Ruth; Dewar-Darch, Danielle; Tyers, Mike; Hogue, Christopher W. V

2007-01-01

The multitude of functions performed in the cell are largely controlled by a set of carefully orchestrated protein interactions often facilitated by specific binding of conserved domains in the interacting proteins. Interacting domains commonly exhibit distinct binding specificity to short and conserved recognition peptides called binding profiles. Although many conserved domains are known in nature, only a few have well-characterized binding profiles. Here, we describe a novel predictive method known as domain–motif interactions from structural topology (D-MIST) for elucidating the binding profiles of interacting domains. A set of domains and their corresponding binding profiles were derived from extant protein structures and protein interaction data and then used to predict novel protein interactions in yeast. A number of the predicted interactions were verified experimentally, including new interactions of the mitotic exit network, RNA polymerases, nucleotide metabolism enzymes, and the chaperone complex. These results demonstrate that new protein interactions can be predicted exclusively from sequence information. PMID:17892321
Mechanism of Mediator recruitment by tandem Gcn4 activation domains and three Gal11 activator-binding domains.

PubMed

Herbig, Eric; Warfield, Linda; Fish, Lisa; Fishburn, James; Knutson, Bruce A; Moorefield, Beth; Pacheco, Derek; Hahn, Steven

2010-05-01

Targets of the tandem Gcn4 acidic activation domains in transcription preinitiation complexes were identified by site-specific cross-linking. The individual Gcn4 activation domains cross-link to three common targets, Gal11/Med15, Taf12, and Tra1, which are subunits of four conserved coactivator complexes, Mediator, SAGA, TFIID, and NuA4. The Gcn4 N-terminal activation domain also cross-links to the Mediator subunit Sin4/Med16. The contribution of the two Gcn4 activation domains to transcription was gene specific and varied from synergistic to less than additive. Gcn4-dependent genes had a requirement for Gal11 ranging from 10-fold dependence to complete Gal11 independence, while the Gcn4-Taf12 interaction did not significantly contribute to the expression of any gene studied. Complementary methods identified three conserved Gal11 activator-binding domains that bind each Gcn4 activation domain with micromolar affinity. These Gal11 activator-binding domains contribute additively to transcription activation and Mediator recruitment at Gcn4- and Gal11-dependent genes. Although we found that the conserved Gal11 KIX domain contributes to Gal11 function, we found no evidence of specific Gcn4-KIX interaction and conclude that the Gal11 KIX domain does not function by specific interaction with Gcn4. Our combined results show gene-specific coactivator requirements, a surprising redundancy in activator-target interactions, and an activator-coactivator interaction mediated by multiple low-affinity protein-protein interactions.
Differential sequence diversity at merozoite surface protein-1 locus of Plasmodium knowlesi from humans and macaques in Thailand.

PubMed

Putaporntip, Chaturong; Thongaree, Siriporn; Jongwutiwes, Somchai

2013-08-01

To determine the genetic diversity and potential transmission routes of Plasmodium knowlesi, we analyzed the complete nucleotide sequence of the gene encoding the merozoite surface protein-1 of this simian malaria (Pkmsp-1), an asexual blood-stage vaccine candidate, from naturally infected humans and macaques in Thailand. Analysis of Pkmsp-1 sequences from humans (n=12) and monkeys (n=12) reveals five conserved and four variable domains. Most nucleotide substitutions in conserved domains were dimorphic, whereas three of four variable domains contained complex repeats with extensive sequence and size variation. Besides purifying selection in conserved domains, evidence of intragenic recombination scattered across Pkmsp-1 was detected. The number of haplotypes, haplotype diversity, nucleotide diversity and recombination sites of human-derived sequences exceeded those of monkey-derived sequences. Phylogenetic networks based on concatenated conserved sequences of Pkmsp-1 displayed a character pattern that could have arisen from the sampling process or from the presence of two independent routes of P. knowlesi transmission, i.e. from macaques to humans and from humans to humans in Thailand. Copyright © 2013 Elsevier B.V. All rights reserved.

Pictorial materials database: 1200 combinations of pigments, dyes, binders and varnishes designed as a tool for heritage science and conservation

NASA Astrophysics Data System (ADS)

Cavaleri, Tiziana; Buscaglia, Paola; Migliorini, Simonetta; Nervo, Marco; Piccablotto, Gabriele; Piccirillo, Anna; Pisani, Marco; Puglisi, Davide; Vaudan, Dario; Zucco, Massimo

2017-06-01

The conservation of artworks requires a profound knowledge of pictorial materials, their chemical and physical properties and their interaction and/or degradation processes. For this reason, pictorial materials databases are widely used to study and investigate cultural heritage. At the Centre for Conservation and Restoration La Venaria Reale, we prepared a set of about 1200 mock-ups with 173 different pigments and/or dyes, used across all historical periods or as products for conservation, four binders, two varnishes and four different materials for underdrawings. In collaboration with the Laboratorio Analisi Scientifiche of Regione Autonoma Valle d'Aosta, the National Institute of Metrological Research and the Department of Architecture and Design of the Polytechnic of Turin, we created a scientific database, designed as a tool for heritage science and conservation, that is now available online (http://www.centrorestaurovenaria.it/en/areas/diagnostic/pictorial-materials-database). Here, we present a focus on materials for pictorial retouching, where the hyperspectral imaging application, conducted with a prototype of new technology, allowed us to provide a list of pigments that could be more suitable for conservation treatments and pictorial retouching. Then we present the case study of the industrial painting Notte Barbara (1962) by Pinot Gallizio, where the use of the database including modern and contemporary art materials proved very useful and where the fibre optics reflectance spectroscopy technique was decisive for pigment identification. Later in this research, the mock-ups will be exploited to study degradation processes, e.g., the lightfastness, or the possible formation of interaction products, e.g., metal carboxylates.
Identification of functional domains in Arabidopsis thaliana mRNA decapping enzyme (AtDcp2)

PubMed Central

Gunawardana, Dilantha; Cheng, Heung-Chin; Gayler, Kenwyn R.

2008-01-01

The Arabidopsis thaliana decapping enzyme (AtDcp2) was characterized by bioinformatics analysis and by biochemical studies of the enzyme and mutants produced by recombinant expression. Three functionally significant regions were detected: (i) a highly disordered C-terminal region with a putative PSD-95, Discs-large, ZO-1 (PDZ) domain-binding motif, (ii) a conserved Nudix box constituting the putative active site and (iii) a putative RNA binding domain consisting of the conserved Box B and a preceding loop region. Mutation of the putative PDZ domain-binding motif improved the stability of recombinant AtDcp2 and secondary mutants expressed in Escherichia coli. Such recombinant AtDcp2 specifically hydrolysed capped mRNA to produce 7-methyl GDP and decapped RNA. AtDcp2 activity was Mn2+- or Mg2+-dependent and was inhibited by the product 7-methyl GDP. Mutation of the conserved glutamate-154 and glutamate-158 in the Nudix box reduced AtDcp2 activity up to 400-fold and showed that AtDcp2 employs the catalytic mechanism conserved amongst Nudix hydrolases. Unlike many Nudix hydrolases, AtDcp2 is refractory to inhibition by fluoride ions. Decapping was dependent on binding to the mRNA moiety rather than to the 7-methyl diguanosine triphosphate cap of the substrate. Mutational analysis of the putative RNA-binding domain confirmed the functional significance of an 11-residue loop region and the conserved Box B. PMID:18025047
Mutation of domain III and domain VI in L gene conserved domain of Nipah virus

NASA Astrophysics Data System (ADS)

Jalani, Siti Aishah; Ibrahim, Nazlina

2016-11-01

Nipah virus (NiV) is the etiologic agent responsible for respiratory illness and causes fatal encephalitis in humans. The NiV L protein subunit is thought to be responsible for the majority of enzymatic activities involved in viral transcription and replication. The L protein, which is the viral RNA-dependent RNA polymerase, has high sequence homology among negative-sense RNA viruses. In negative-stranded RNA viruses, six conserved domains (domains I-VI) have been determined based on sequence alignment. The domains are separated by variable regions, which suggests that the structure consists of concatenated functional domains. To directly address the roles of domains III and VI, site-directed mutations were constructed by the substitution of bases at sequence positions 2497, 2500, 5528 and 5532. Each mutated L gene can be used in future studies to test its ability to be expressed by in vitro translation.
Characterization of a novel domain ‘GATE’ in the ABC protein DrrA and its role in drug efflux by the DrrAB complex

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Han; Rahman, Sadia; Li, Wen

2015-03-27

A novel domain, GATE (Glycine-loop And Transducer Element), is identified in the ABC protein DrrA. This domain shows sequence and structural conservation among close homologs of DrrA as well as distantly-related ABC proteins. Among the highly conserved residues in this domain are three glycines, G215, G221 and G231, of which G215 was found to be critical for stable expression of the DrrAB complex. Other conserved residues, including E201, G221, K227 and G231, were found to be critical for the catalytic and transport functions of the DrrAB transporter. Structural analysis of both the previously published crystal structure of the DrrA homolog MalK and the modeled structure of DrrA showed that G215 makes close contacts with residues in and around the Walker A motif, suggesting that these interactions may be critical for maintaining the integrity of the ATP binding pocket as well as the complex. It is also shown that G215A or K227R mutation diminishes some of the atomic interactions essential for ATP catalysis and overall transport function. Therefore, based on both the biochemical and structural analyses, it is proposed that the GATE domain, located outside of the previously identified ATP binding and hydrolysis motifs, is an additional element involved in ATP catalysis. Highlights: • A novel domain ‘GATE’ is identified in the ABC protein DrrA. • GATE shows high sequence and structural conservation among diverse ABC proteins. • GATE is located outside of the previously studied ATP binding and hydrolysis motifs. • Conserved GATE residues are critical for stability of DrrAB and for ATP catalysis.
Myosin MyTH4-FERM structures highlight important principles of convergent evolution.

PubMed

Planelles-Herrero, Vicente José; Blanc, Florian; Sirigu, Serena; Sirkia, Helena; Clause, Jeffrey; Sourigues, Yannick; Johnsrud, Daniel O; Amigues, Beatrice; Cecchini, Marco; Gilbert, Susan P; Houdusse, Anne; Titus, Margaret A

2016-05-24

Myosins containing MyTH4-FERM (myosin tail homology 4-band 4.1, ezrin, radixin, moesin, or MF) domains in their tails are found in a wide range of phylogenetically divergent organisms, such as humans and the social amoeba Dictyostelium (Dd). Interestingly, evolutionarily distant MF myosins have similar roles in the extension of actin-filled membrane protrusions such as filopodia and bind to microtubules (MT), suggesting that the core functions of these MF myosins have been highly conserved over evolution. The structures of two DdMyo7 signature MF domains have been determined and comparison with mammalian MF structures reveals that characteristic features of MF domains are conserved. However, across millions of years of evolution conserved class-specific insertions are seen to alter the surfaces and the orientation of subdomains with respect to each other, likely resulting in new sites for binding partners. The MyTH4 domains of Myo10 and DdMyo7 bind to MT with micromolar affinity but, surprisingly, their MT binding sites are on opposite surfaces of the MyTH4 domain. The structural analysis in combination with comparison of diverse MF myosin sequences provides evidence that myosin tail domain features can be maintained without strict conservation of motifs. The results illustrate how tuning of existing features can give rise to new structures while preserving the general properties necessary for myosin tails. Thus, tinkering with the MF domain enables it to serve as a multifunctional platform for cooperative recruitment of various partners, allowing common properties such as autoinhibition of the motor and microtubule binding to arise through convergent evolution.
dBBQs: dataBase of Bacterial Quality scores.

PubMed

Wanchai, Visanu; Patumcharoenpol, Preecha; Nookaew, Intawat; Ussery, David

2017-12-28

It is well known that genome sequencing technologies are becoming significantly cheaper and faster. As a result, the exponential growth of sequencing data in public databases allows us to explore ever-growing collections of genome sequences. However, it is less well known that the majority of sequenced genomes available in public databases are not complete but are drafts of varying quality. We have calculated quality scores for around 100,000 bacterial genomes from all major genome repositories and put them in a fast and easy-to-use database. Prokaryotic genomic data from all sources were collected and combined to make a non-redundant set of bacterial genomes. The genome quality score for each was calculated from four different measurements: assembly quality, number of rRNA genes, number of tRNA genes, and the occurrence of conserved functional domains. The dataBase of Bacterial Quality scores (dBBQs) was designed to store and retrieve quality scores. It offers fast searching and download features, and the results can be used for further analysis. In addition, the search results are shown in an interactive JavaScript chart framework using DC.js. The analysis of quality scores across major public genome databases finds that around 68% of the genomes are of acceptable quality for many uses. dBBQs (available at http://arc-gem.uams.edu/dbbqs ) provides genome quality scores for all available prokaryotic genome sequences with a user-friendly Web interface. These scores can be used as cut-offs to get a high-quality set of genomes for testing bioinformatics tools or improving the analysis. Moreover, all data of the four measurements that were combined to make the quality score for each genome can potentially be used for further analysis. dBBQs will be updated regularly and is free to use for non-commercial purposes.
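As a rough illustration of how four sub-scores might be combined and then used as a cut-off, consider the sketch below. The abstract does not state how dBBQs combines its measurements, so the unweighted mean, the 0-to-1 scaling of each sub-score, the genome names and the 0.5 cut-off are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class GenomeScores:
    """Four hypothetical sub-scores, each assumed to be scaled to the range 0..1."""
    assembly: float
    rrna: float
    trna: float
    domains: float

    def combined(self) -> float:
        # Assumption: an unweighted mean; the real database may combine them differently.
        return (self.assembly + self.rrna + self.trna + self.domains) / 4


def filter_by_quality(genomes, cutoff):
    """Keep the names of genomes whose combined score is at least the cutoff."""
    return [name for name, scores in genomes.items() if scores.combined() >= cutoff]


# Hypothetical genomes with made-up sub-scores.
genomes = {
    "genome_A": GenomeScores(assembly=1.0, rrna=0.5, trna=1.0, domains=0.5),
    "genome_B": GenomeScores(assembly=0.25, rrna=0.25, trna=0.5, domains=0.0),
}

high_quality = filter_by_quality(genomes, cutoff=0.5)
print(high_quality)
```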
RESIS-II: An Updated Version of the Original Reservoir Sedimentation Survey Information System (RESIS) Database

USGS Publications Warehouse

Ackerman, Katherine V.; Mixon, David M.; Sundquist, Eric T.; Stallard, Robert F.; Schwarz, Gregory E.; Stewart, David W.

2009-01-01

The Reservoir Sedimentation Survey Information System (RESIS) database, originally compiled by the Soil Conservation Service (now the Natural Resources Conservation Service) in collaboration with the Texas Agricultural Experiment Station, is the most comprehensive compilation of data from reservoir sedimentation surveys throughout the conterminous United States (U.S.). The database is a cumulative historical archive that includes data from as early as 1755 and as late as 1993. The 1,823 reservoirs included in the database range in size from farm ponds to the largest U.S. reservoirs (such as Lake Mead). Results from 6,617 bathymetric surveys are available in the database. This Data Series provides an improved version of the original RESIS database, termed RESIS-II, and a report describing RESIS-II. The RESIS-II relational database is stored in Microsoft Access and includes more precise location coordinates for most of the reservoirs than the original database but excludes information on reservoir ownership. RESIS-II is anticipated to be a template for further improvements in the database.
Prediction of Ras-effector interactions using position energy matrices.

PubMed

Kiel, Christina; Serrano, Luis

2007-09-01

One of the more challenging problems in biology is to determine the cellular protein interaction network. Progress has been made in predicting protein-protein interactions based on structural information, assuming that structurally similar proteins interact in a similar way. In a previous publication, we determined a genome-wide Ras-effector interaction network based on homology models, with a high accuracy of predicting binding and non-binding domains. However, for a prediction on a genome-wide scale, homology modelling is a time-consuming process. Therefore, we here developed a faster method using position energy matrices: based on different Ras-effector X-ray template structures, all amino acids in the effector binding domain are sequentially mutated to all other amino acid residues and the effect on binding energy is calculated. Those pre-calculated matrices can then be used to score any Ras or effector sequence for binding. Based on position energy matrices, the sequences of putative Ras-binding domains can be scanned quickly to calculate an energy sum value. By calibrating energy sum values using quantitative experimental binding data, thresholds can be defined and thus non-binding domains can be excluded quickly. Sequences which have energy sum values above this threshold are considered to be potential binding domains, and could be further analysed using homology modelling. This prediction method could be applied to other protein families sharing conserved interaction types, in order to determine large-scale cellular protein interaction networks quickly. Thus, it could have an important impact on future in silico structural genomics approaches, in particular with regard to increasing structural proteomics efforts aiming to determine all possible domain folds and interaction types. All matrices are deposited in the ADAN database (http://adan-embl.ibmc.umh.es/). Supplementary data are available at Bioinformatics online.
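The scanning step described above, in which a pre-calculated position energy matrix is summed along a sequence and the sum is compared with a calibrated threshold, can be sketched as follows. This is a minimal illustration only: the three-position matrix, its values, the reduced alphabet and the threshold are hypothetical, and the "above threshold means potential binder" convention simply follows the wording of the abstract.

```python
def energy_sum(sequence, matrix):
    """Sum the per-position matrix entries for the residues in the sequence.

    `matrix` is a list with one dict per aligned position, mapping residue -> value.
    """
    if len(sequence) != len(matrix):
        raise ValueError("sequence length must match the number of matrix positions")
    return sum(matrix[i][residue] for i, residue in enumerate(sequence))


def is_potential_binder(sequence, matrix, threshold):
    """Flag a sequence as a potential binding domain if its energy sum exceeds the threshold."""
    return energy_sum(sequence, matrix) > threshold


# Hypothetical toy matrix over a reduced alphabet (A, K, E) and three positions.
toy_matrix = [
    {"A": 0.0, "K": 1.5, "E": -1.0},
    {"A": 0.5, "K": 0.0, "E": -0.5},
    {"A": 0.0, "K": 2.0, "E": -2.0},
]
toy_threshold = 1.0  # Assumed value; in the abstract, thresholds come from calibration.

print(energy_sum("KAK", toy_matrix), is_potential_binder("KAK", toy_matrix, toy_threshold))
print(energy_sum("EEE", toy_matrix), is_potential_binder("EEE", toy_matrix, toy_threshold))
```

Because scoring is a single table lookup per position, scanning many candidate sequences is cheap compared with building a homology model for each one, which is the speed advantage the abstract emphasizes.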
Identification and characterization of mobile genetic elements LINEs from Brassica genome.

PubMed

Nouroz, Faisal; Noreen, Shumaila; Khan, Muhammad Fiaz; Ahmed, Shehzad; Heslop-Harrison, J S Pat

2017-09-05

Among transposable elements (TEs), the LTR retrotransposons are the most abundant in plant genomes, followed by non-LTR retrotransposons, the latter being represented by LINEs and SINEs. Computational and molecular approaches were used for the characterization of Brassica LINEs, their diversity and phylogenetic relationships. Four autonomous and four non-autonomous LINE families were identified and characterized from Brassica. Most of the autonomous LINEs displayed two open reading frames, ORF1 and ORF2, where ORF1 is a gag protein domain, while ORF2 encodes an endonuclease (EN) and a reverse transcriptase (RT). Three of four families encoded an additional RNase H (RH) domain in the pol gene, common to 'R' and 'I' types of LINEs. The PCR analyses based on LINE RT fragments indicate their high diversity and widespread occurrence in the 40 Brassica cultivars tested. Database searches revealed homology to LINE sequences in the closely related genus Arabidopsis, indicating their origin from common ancestors predating their separation. The alignment of 58 LINE RT sequences from Brassica, Arabidopsis and other plants depicted 4 conserved domains (domains II-V) showing similarity to previously detected domains. Based on RT alignment of Brassica and 3 known LINEs from monocots, Brassicaceae LINEs clustered in a separate clade, further resolving 4 Brassica-Arabidopsis specific families in 2 sub-clades. High similarities were observed in RT sequences among the members of the same family, while low homology was detected among members across the families. The investigation led to the characterization of Brassica-specific LINE families and their diversity across Brassica species and their cultivars. Copyright © 2017 Elsevier B.V. All rights reserved.
ExDom: an integrated database for comparative analysis of the exon–intron structures of protein domains in eukaryotes

PubMed Central

Bhasi, Ashwini; Philip, Philge; Manikandan, Vinu; Senapathy, Periannan

2009-01-01

We have developed ExDom, a unique database for the comparative analysis of the exon–intron structures of 96 680 protein domains from seven eukaryotic organisms (Homo sapiens, Mus musculus, Bos taurus, Rattus norvegicus, Danio rerio, Gallus gallus and Arabidopsis thaliana). ExDom provides integrated access to exon-domain data through a sophisticated web interface which has the following analytical capabilities: (i) intergenomic and intragenomic comparative analysis of exon–intron structure of domains; (ii) color-coded graphical display of the domain architecture of proteins correlated with their corresponding exon-intron structures; (iii) graphical analysis of multiple sequence alignments of amino acid and coding nucleotide sequences of homologous protein domains from seven organisms; (iv) comparative graphical display of exon distributions within the tertiary structures of protein domains; and (v) visualization of exon–intron structures of alternative transcripts of a gene correlated to variations in the domain architecture of corresponding protein isoforms. These novel analytical features are highly suited for detailed investigations on the exon–intron structure of domains and make ExDom a powerful tool for exploring several key questions concerning the function, origin and evolution of genes and proteins. ExDom database is freely accessible at: http://66.170.16.154/ExDom/. PMID:18984624
The GP(Y/F) Domain of TF1 Integrase Multimerizes when Present in a Fragment, and Substitutions in This Domain Reduce Enzymatic Activity of the Full-length Protein

PubMed Central

Ebina, Hirotaka; Chatterjee, Atreyi Ghatak; Judson, Robert L.; Levin, Henry L.

2008-01-01

Integrases (INs) of retroviruses and long terminal repeat retrotransposons possess a C-terminal domain with DNA binding activity. Other than this binding activity, little is known about how the C-terminal domain contributes to integration. A stretch of conserved amino acids called the GP(Y/F) domain has been identified within the C-terminal IN domains of two distantly related families, the γ-retroviruses and the metavirus retrotransposons. To enhance understanding of the C-terminal domain, we examined the function of the GP(Y/F) domain in the IN of Tf1, a long terminal repeat retrotransposon of Schizosaccharomyces pombe. The activities of recombinant IN were measured with an assay that modeled the reverse of integration called disintegration. Although deletion of the entire C-terminal domain disrupted disintegration activity, an alanine substitution (P365A) in a conserved amino acid of the GP(Y/F) domain did not significantly reduce disintegration. When assayed for the ability to join two molecules of DNA in a reaction that modeled forward integration, the P365A substitution disrupted activity. UV cross-linking experiments detected DNA binding activity in the C-terminal domain and found that this activity was not reduced by substitutions in two conserved amino acids of the GP(Y/F) domain, G364A and P365A. Gel filtration and cross-linking of a 71-amino acid fragment containing the GP(Y/F) domain revealed a surprising ability to form dimers, trimers, and tetramers that was disrupted by the G364A and P365A substitutions. These results suggest that the GP(Y/F) residues may play roles in promoting multimerization and intermolecular strand joining. PMID:18397885
The GP(Y/F) domain of TF1 integrase multimerizes when present in a fragment, and substitutions in this domain reduce enzymatic activity of the full-length protein.

PubMed

Ebina, Hirotaka; Chatterjee, Atreyi Ghatak; Judson, Robert L; Levin, Henry L

2008-06-06

Integrases (INs) of retroviruses and long terminal repeat retrotransposons possess a C-terminal domain with DNA binding activity. Other than this binding activity, little is known about how the C-terminal domain contributes to integration. A stretch of conserved amino acids called the GP(Y/F) domain has been identified within the C-terminal IN domains of two distantly related families, the gamma-retroviruses and the metavirus retrotransposons. To enhance understanding of the C-terminal domain, we examined the function of the GP(Y/F) domain in the IN of Tf1, a long terminal repeat retrotransposon of Schizosaccharomyces pombe. The activities of recombinant IN were measured with an assay that modeled the reverse of integration called disintegration. Although deletion of the entire C-terminal domain disrupted disintegration activity, an alanine substitution (P365A) in a conserved amino acid of the GP(Y/F) domain did not significantly reduce disintegration. When assayed for the ability to join two molecules of DNA in a reaction that modeled forward integration, the P365A substitution disrupted activity. UV cross-linking experiments detected DNA binding activity in the C-terminal domain and found that this activity was not reduced by substitutions in two conserved amino acids of the GP(Y/F) domain, G364A and P365A. Gel filtration and cross-linking of a 71-amino acid fragment containing the GP(Y/F) domain revealed a surprising ability to form dimers, trimers, and tetramers that was disrupted by the G364A and P365A substitutions. These results suggest that the GP(Y/F) residues may play roles in promoting multimerization and intermolecular strand joining.
Conservation of species, volume, and belief in patients with Alzheimer's disease: the issue of domain specificity and conceptual impairment.

PubMed

Zaitchik, Deborah; Solomon, Gregg E A

2009-09-01

Two studies investigated whether patients with Alzheimer's disease (AD) suffer high-level and category-specific impairment in the conceptual domain of living things. In Experiment 1, AD patients and healthy young and healthy elderly controls took part in three tasks: the conservation of species, volume, and belief. All 3 tasks required tracking an object's identity in the face of irrelevant but salient transformations. Healthy young and elderly controls performed at or near ceiling on all tasks. AD patients were at or near ceiling on the volume and belief tasks, but only about half succeeded on the species task. Experiment 2 demonstrated that the results were not due to simple task demands. AD patients' failure to conserve species indicates that they are impaired in their theoretical understanding of living things, and their success on the volume and belief tasks suggests that the impairment is domain-specific. Two hypotheses are put forward to explain the phenomenon: The first, a category-specific account, holds that the intuitive theory of biology undergoes pervasive degradation; the second, a hybrid domain-general/domain-specific account, holds that impairment to domain-general processes such as executive function interacts with core cognition, the primitive elements that are the foundation of domain-specific knowledge.
Conservation of species, volume, and belief in patients with Alzheimer’s disease: the issue of domain-specificity and conceptual impairment

PubMed Central

Zaitchik, Deborah; Solomon, Gregg E. A.

2009-01-01

Two studies investigated whether patients with Alzheimer’s disease (AD) suffer high-level and category-specific impairment in the conceptual domain of living things. In Study 1, AD patients and healthy young and healthy elderly controls took part in three tasks: the Conservation of Species, Volume, and Belief. All 3 tasks required tracking an object’s identity in the face of irrelevant but salient transformations. Healthy young and elderly controls performed at or near ceiling on all tasks. AD patients were at or near ceiling on the Volume and Belief tasks, but only about half succeeded on the Species task. Study 2 demonstrated that the results were not due to simple task demands. AD patients’ failure to conserve species indicates that they are impaired in their theoretical understanding of living things, and their success on the Volume and Belief tasks suggests that the impairment is domain-specific. Two hypotheses are put forward to explain the phenomenon: the first, a category-specific account, holds that the intuitive theory of biology undergoes pervasive degradation; the second, a hybrid domain-general/domain-specific account, holds that impairment to domain-general processes such as executive function interacts with core cognition, the primitive elements that are the foundation of domain-specific knowledge. PMID:20043252
The crystal structure of the regulatory domain of the human sodium-driven chloride/bicarbonate exchanger.

PubMed

Alvadia, Carolina M; Sommer, Theis; Bjerregaard-Andersen, Kaare; Damkier, Helle Hasager; Montrasio, Michele; Aalkjaer, Christian; Morth, J Preben

2017-09-21

The sodium-driven chloride/bicarbonate exchanger (NDCBE) is essential for maintaining homeostatic pH in neurons. The crystal structure at 2.8 Å resolution of the regulatory N-terminal domain of human NDCBE represents the first crystal structure of an electroneutral sodium-bicarbonate cotransporter. The crystal structure forms an equivalent dimeric interface as observed for the cytoplasmic domain of Band 3, and thus establishes that the consensus motif VTVLP is the key minimal dimerization motif. The VTVLP motif is highly conserved and likely to be the physiologically relevant interface for all other members of the SLC4 family. A novel conserved Zn2+-binding motif present in the N-terminal domain of NDCBE is identified and characterized in vitro. Cellular studies confirm the Zn2+-dependent transport of two electroneutral bicarbonate transporters, NCBE and NBCn1. The Zn2+ site is mapped to a cluster of histidines close to the conserved ETARWLKFEE motif and likely plays a role in the regulation of this important motif. The combined structural and bioinformatics analysis provides a model that predicts with additional confidence the physiologically relevant interface between the cytoplasmic domain and the transmembrane domain.
The Three Domains of Conservation Genetics: Case Histories from Hawaiian Waters

PubMed Central

2016-01-01

The scientific field of conservation biology is dominated by 3 specialties: phylogenetics, ecology, and evolution. Under this triad, phylogenetics is oriented towards the past history of biodiversity, conserving the divergent branches in the tree of life. The ecological component is rooted in the present, maintaining the contemporary life support systems for biodiversity. Evolutionary conservation (as defined here) is concerned with preserving the raw materials for generating future biodiversity. All 3 domains can be documented with genetic case histories in the waters of the Hawaiian Archipelago, an isolated chain of volcanic islands with 2 types of biodiversity: colonists, and new species that arose from colonists. This review demonstrates that 1) phylogenetic studies have identified previously unknown branches in the tree of life that are endemic to Hawaiian waters; 2) population genetic surveys define isolated marine ecosystems as management units, and 3) phylogeographic analyses illustrate the pathways of colonization that can enhance future biodiversity. Conventional molecular markers have advanced all 3 domains in conservation biology over the last 3 decades, and recent advances in genomics are especially valuable for understanding the foundations of future evolutionary diversity. PMID:27001936
DOE Office of Scientific and Technical Information (OSTI.GOV)

Madu, Ikenna G.; Belouzard, Sandrine; Whittaker, Gary R., E-mail: grw7@cornell.ed

The S2 domain of the coronavirus spike (S) protein is known to be responsible for mediating membrane fusion. In addition to a well-recognized cleavage site at the S1-S2 boundary, a second proteolytic cleavage site has been identified in the severe acute respiratory syndrome coronavirus (SARS-CoV) S2 domain (R797). C-terminal to this S2 cleavage site is a conserved region flanked by cysteine residues C822 and C833. Here, we investigated the importance of this well conserved region for SARS-CoV S-mediated fusion activation. We show that the residues between C822-C833 are well conserved across all coronaviruses. Mutagenic analysis of SARS-CoV S, combined with cell-cell fusion and pseudotyped virion infectivity assays, showed a critical role for the core-conserved residues C822, D830, L831, and C833. Based on available predictive models, we propose that the conserved domain flanked by cysteines 822 and 833 forms a loop structure that interacts with components of the SARS-CoV S trimer to control the activation of membrane fusion.
Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

PubMed Central

Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

2012-01-01

Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings: A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086
A novel application of cultural consensus models to evaluate conservation education programs.

PubMed

Nekaris, K A I; McCabe, Sharon; Spaan, Denise; Ali, Muhammad Imron; Nijman, Vincent

2018-04-01

Conservation professionals recognize the need to evaluate education initiatives with a flexible approach that is culturally appropriate. Cultural-consensus theory (CCT) provides a framework for measuring the extent to which beliefs are communally held and has long been applied by social scientists. In a conservation-education context, we applied CCT and used free lists (i.e., a list of items on a topic stated in order of cultural importance) and domain analysis (analysis of how free lists go together within a cultural group) to evaluate a conservation education program in which we used a children's picture book to increase knowledge about and empathy for a critically endangered mammal, the Javan slow loris (Nycticebus javanicus). We extracted free lists of keywords generated by students (n = 580 in 18 schools) from essays they wrote before and after the education program. In 2 classroom sessions conducted approximately 18 weeks apart, we asked students to write an essay about their knowledge of the target species and then presented a book and several activities about slow loris ecology. Prior to the second session, we asked students to write a second essay. We generated free lists from both essays, quantified salience of terms used, and conducted minimal residuals factor analysis to determine presence of cultural domains surrounding slow lorises in each session. Students increased their use of words accurately associated with slow loris ecology and conservation from 43% in initial essays to 76% in final essays. Domain coherence increased from 22% to 47% across schools. Fifteen factors contributed to the domain slow loris. Between the first and second essays, factors that showed the greatest change were feeding ecology and slow loris as a forest protector, which increased 7-fold, and the humancentric factor, which decreased 5-fold. 
As demonstrated by knowledge retention and creation of unique stories and conservation opinions, children achieved all six levels of Bloom's taxonomy of learning domains. Free from the constraints of questionnaires and surveys, CCT methods provide a promising avenue to evaluate conservation education programs. © 2017 Society for Conservation Biology.
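The abstract above says that the salience of free-listed terms was quantified but does not name the measure. As a minimal sketch, assuming Smith's salience index (a common choice in free-list analysis, not confirmed as the authors' method), salience could be computed as follows. The function name and the toy lists are hypothetical.

```python
def smith_salience(free_lists):
    """Smith's salience index for each item across a set of free lists.

    For one list of length L, an item at 1-based rank r scores (L - r + 1) / L.
    Items absent from a list score 0. The index is the mean score over all lists.
    Assumes each item appears at most once per list.
    """
    totals = {}
    for items in free_lists:
        length = len(items)
        for rank, item in enumerate(items, start=1):
            totals[item] = totals.get(item, 0.0) + (length - rank + 1) / length
    n_lists = len(free_lists)
    return {item: total / n_lists for item, total in totals.items()}


# Hypothetical toy data: keywords listed by three students, in order of mention.
lists = [
    ["forest", "insects", "nocturnal"],
    ["nocturnal", "forest"],
    ["forest"],
]
salience = smith_salience(lists)
```

An item that is listed often and early scores close to 1, while one that is rare or listed late scores close to 0.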
Rice Cellulose SynthaseA8 Plant-Conserved Region Is a Coiled-Coil at the Catalytic Core Entrance

PubMed Central

Rushton, Phillip S.; Olek, Anna T.; Makowski, Lee; Badger, John

2017-01-01

The crystallographic structure of a rice (Oryza sativa) cellulose synthase, OsCesA8, plant-conserved region (P-CR), one of two unique domains in the catalytic domain of plant CesAs, was solved to 2.4 Å resolution. Two antiparallel α-helices form a coiled-coil domain linked by a large extended connector loop containing a conserved trio of aromatic residues. The P-CR structure was fit into a molecular envelope for the P-CR domain derived from small-angle X-ray scattering data. The P-CR structure and molecular envelope, combined with a homology-based chain trace of the CesA8 catalytic core, were modeled into a previously determined CesA8 small-angle X-ray scattering molecular envelope to produce a detailed topological model of the CesA8 catalytic domain. The predicted position for the P-CR domain from the molecular docking models places the P-CR connector loop into a hydrophobic pocket of the catalytic core, with the coiled-coil aligned near the entrance of the substrate UDP-glucose into the active site. In this configuration, the P-CR coiled-coil alone is unlikely to regulate substrate access to the active site, but it could interact with other domains of CesA, accessory proteins, or other CesA catalytic domains to control substrate delivery. PMID:27879387

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mei, Yang; Ramanathan, Arvind; Glover, Karen

BECN1 is essential for autophagy, a critical eukaryotic cellular homeostasis pathway. Here we delineate a highly conserved BECN1 domain located between previously characterized BH3 and coiled-coil domains and elucidate its structure and role in autophagy. The 2.0 angstrom sulfur-single-wavelength anomalous dispersion X-ray crystal structure of this domain demonstrates that its N-terminal half is unstructured while its C-terminal half is helical; hence, we name it the flexible helical domain (FHD). Circular dichroism spectroscopy, double electron-electron resonance electron paramagnetic resonance, and small-angle X-ray scattering (SAXS) analyses confirm that the FHD is partially disordered, even in the context of adjacent BECN1 domains. Molecular dynamic simulations fitted to SAXS data indicate that the FHD transiently samples more helical conformations. FHD helicity increases in 2,2,2-trifluoroethanol, suggesting it may become more helical upon binding. Lastly, cellular studies show that conserved FHD residues are required for starvation-induced autophagy. Thus, the FHD likely undergoes a binding-associated disorder-to-helix transition, and conserved residues critical for this interaction are essential for starvation-induced autophagy.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Mei, Yang; Ramanathan, Arvind; Glover, Karen

BECN1 is essential for autophagy, a critical eukaryotic cellular homeostasis pathway. Here in this study, we delineate a highly conserved BECN1 domain located between previously characterized BH3 and coiled-coil domains and elucidate its structure and role in autophagy. The 2.0 Å sulfur-single-wavelength anomalous dispersion X-ray crystal structure of this domain demonstrates that its N-terminal half is unstructured while its C-terminal half is helical; hence, we name it the flexible helical domain (FHD). Circular dichroism spectroscopy, double electron–electron resonance–electron paramagnetic resonance, and small-angle X-ray scattering (SAXS) analyses confirm that the FHD is partially disordered, even in the context of adjacent BECN1 domains. Molecular dynamic simulations fitted to SAXS data indicate that the FHD transiently samples more helical conformations. FHD helicity increases in 2,2,2-trifluoroethanol, suggesting it may become more helical upon binding. Finally, cellular studies show that conserved FHD residues are required for starvation-induced autophagy. Thus, the FHD likely undergoes a binding-associated disorder-to-helix transition, and conserved residues critical for this interaction are essential for starvation-induced autophagy.
The Enhancer of split complex arose prior to the diversification of schizophoran flies and is strongly conserved between Drosophila and stalk-eyed flies (Diopsidae)

PubMed Central

2011-01-01

Background In Drosophila, the Enhancer of split complex (E(spl)-C) comprises 11 bHLH and Bearded genes that function during Notch signaling to repress proneural identity in the developing peripheral nervous system. Comparison with other insects indicates that the basal state for Diptera is a single bHLH and Bearded homolog and that the expansion of the gene complex occurred in the lineage leading to Drosophila. However, comparative genomic data from other fly species that would elucidate the origin and sequence of gene duplication for the complex is lacking. Therefore, in order to examine the evolutionary history of the complex within Diptera, we reconstructed, using several fosmid clones, the entire E(spl)-complex in the stalk-eyed fly, Teleopsis dalmanni and collected additional homologs of E(spl)-C genes from searches of dipteran EST databases and the Glossina morsitans genome assembly. Results Comparison of the Teleopsis E(spl)-C gene organization with Drosophila indicates complete conservation in gene number and orientation between the species except that T. dalmanni contains a duplicated copy of E(spl)m5 that is not present in Drosophila. Phylogenetic analysis of E(spl)-complex bHLH and Bearded genes for several dipteran species clearly demonstrates that all members of the complex were present prior to the diversification of schizophoran flies. Comparison of upstream regulatory elements and 3' UTR domains between the species also reveals strong conservation for many of the genes and identifies several novel characteristics of E(spl)-C regulatory evolution including the discovery of a previously unidentified, highly conserved SPS+A domain between E(spl)mγ and E(spl)mβ. Conclusion Identifying the phylogenetic origin of E(spl)-C genes and their associated regulatory DNA is essential to understanding the functional significance of this well-studied gene complex. 
Results from this study provide numerous insights into the evolutionary history of the complex and will help refine the focus of studies examining the adaptive consequences of this gene expansion. PMID:22151427
FARE-CAFE: a database of functional and regulatory elements of cancer-associated fusion events.

PubMed

Korla, Praveen Kumar; Cheng, Jack; Huang, Chien-Hung; Tsai, Jeffrey J P; Liu, Yu-Hsuan; Kurubanjerdjit, Nilubon; Hsieh, Wen-Tsong; Chen, Huey-Yi; Ng, Ka-Lok

2015-01-01

Chromosomal translocation (CT) is of enormous clinical interest because this disorder is associated with various major solid tumors and leukemia. A tumor-specific fusion gene event may occur when a translocation joins two separate genes. Currently, various CT databases provide information about fusion genes and their genomic elements. However, no database of the roles of fusion genes, in terms of essential functional and regulatory elements in oncogenesis, is available. FARE-CAFE is a unique combination of CTs, fusion proteins, protein domains, domain-domain interactions, protein-protein interactions, transcription factors and microRNAs, with subsequent experimental information, which cannot be found in any other CT database. Genomic DNA information including, for example, manually collected exact locations of the first and second break points, sequences and karyotypes of fusion genes are included. FARE-CAFE will substantially facilitate the cancer biologist's mission of elucidating the pathogenesis of various types of cancer. This database will ultimately help to develop 'novel' therapeutic approaches. Database URL: http://ppi.bioinfo.asia.edu.tw/FARE-CAFE. © The Author(s) 2015. Published by Oxford University Press.
The study of co-citation analysis and knowledge structure on healthcare domain

NASA Astrophysics Data System (ADS)

Chu, Kuo-Chung; Liu, Wen-I.; Tsai, Ming-Yu

2012-11-01

With the prevalence of the Internet and digital archives, online e-journal databases make it easier for scholars to search the literature within a research domain or to cross-search an interdisciplinary field, so that key literature can be traced efficiently. This study builds a Web-based citation analysis system consisting of four modules: (1) a literature search module; (2) a statistics module; (3) an article analysis module; and (4) a co-citation analysis module. The system focuses on a PubMed Central dataset of 170,000 records. Within a research domain, a specific keyword can be searched in terms of authors, journals, and core issues. In addition, we use data mining techniques for co-citation analysis. The results help researchers gain an in-depth understanding of the domain knowledge. An automated system for co-citation analysis helps to reveal the changes, trends, and knowledge structure of a research domain. To the best of our knowledge, the proposed system differs from the analysis functions of existing online electronic retrieval databases. The proposed system may become a value-added database for the healthcare domain, and we hope it will be of use to researchers.
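The abstract above does not describe how its co-citation module is implemented. As a minimal sketch of the general technique, two documents are co-cited whenever a third paper cites both, and the co-citation count of a pair is the number of such citing papers. The function name and the toy reference lists below are hypothetical, not taken from the system described.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of documents is cited together.

    reference_lists maps a citing paper's ID to the IDs of the documents it cites.
    Returns a Counter keyed by (doc_a, doc_b) tuples with doc_a < doc_b.
    """
    counts = Counter()
    for cited in reference_lists.values():
        # De-duplicate and sort so each pair has a single canonical order.
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: three citing papers and their reference lists.
papers = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
}
counts = cocitation_counts(papers)
```

Pairs with high counts are candidates for belonging to the same research theme, which is the kind of signal a knowledge-structure analysis would cluster on.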
Fullerene data mining using bibliometrics and database tomography

PubMed

Kostoff; Braun; Schubert; Toothman; Humenik

2000-01-01

Database tomography (DT) is a textual database analysis system consisting of two major components: (1) algorithms for extracting multiword phrase frequencies and phrase proximities (physical closeness of the multiword technical phrases) from any type of large textual database, to augment (2) interpretative capabilities of the expert human analyst. DT was used to derive technical intelligence from a fullerenes database derived from the Science Citation Index and the Engineering Compendex. Phrase frequency analysis by the technical domain experts provided the pervasive technical themes of the fullerenes database, and phrase proximity analysis provided the relationships among the pervasive technical themes. Bibliometric analysis of the fullerenes literature supplemented the DT results with author/journal/institution publication and citation data. Comparisons of fullerenes results with past analyses of similarly structured near-earth space, chemistry, hypersonic/supersonic flow, aircraft, and ship hydrodynamics databases are made. One important finding is that many of the normalized bibliometric distribution functions are extremely consistent across these diverse technical domains and could reasonably be expected to apply to broader chemical topics than fullerenes that span multiple structural classes. Finally, lessons learned about integrating the technical domain experts with the data mining tools are presented.
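The two algorithmic steps named in the abstract above, multiword phrase frequency and phrase proximity, can be illustrated with a minimal sketch. This is not the DT implementation: the tokenizer, the fixed phrase length, and the window size are simplifying assumptions, and all function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a space-joined phrase."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def phrase_frequencies(text, n=2):
    """Count each n-word phrase in the text."""
    return Counter(ngrams(tokenize(text), n))


def phrase_proximity(text, theme, window=3, n=2):
    """Count n-word phrases that start within `window` tokens of the theme phrase."""
    tokens = tokenize(text)
    theme_tokens = tokenize(theme)
    theme_norm = " ".join(theme_tokens)
    k = len(theme_tokens)
    grams = ngrams(tokens, n)
    nearby = Counter()
    for i in range(len(tokens) - k + 1):
        if tokens[i:i + k] != theme_tokens:
            continue
        # Look at phrases starting up to `window` tokens before or after the theme.
        for j in range(max(0, i - window), min(len(grams), i + k + window)):
            if grams[j] != theme_norm:
                nearby[grams[j]] += 1
    return nearby


# Hypothetical sample text, for illustration only.
sample = ("Fullerene synthesis yields carbon nanotubes. "
          "Carbon nanotubes form during fullerene synthesis.")
freqs = phrase_frequencies(sample)
near = phrase_proximity(sample, "fullerene synthesis")
```

In this sketch the frequency counts play the role of pervasive themes, and the proximity counts suggest which themes are related; interpretation is left to the human analyst, as the abstract describes.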
The sequence, structure and evolutionary features of HOTAIR in mammals

PubMed Central

2011-01-01

Background An increasing number of long noncoding RNAs (lncRNAs) have been identified recently. Different from all the others that function in cis to regulate local gene expression, the newly identified HOTAIR is located between HoxC11 and HoxC12 in the human genome and regulates HoxD expression in multiple tissues. Like the well-characterised lncRNA Xist, HOTAIR binds to polycomb proteins to methylate histones at multiple HoxD loci, but unlike Xist, many details of its structure and function, as well as the trans regulation, remain unclear. Moreover, HOTAIR is involved in the aberrant regulation of gene expression in cancer. Results To identify conserved domains in HOTAIR and study the phylogenetic distribution of this lncRNA, we searched the genomes of 10 mammalian and 3 non-mammalian vertebrates for matches to its 6 exons and the two conserved domains within the 1800 bp exon6 using Infernal. There was just one high-scoring hit for each mammal, but many low-scoring hits were found in both mammals and non-mammalian vertebrates. These hits and their flanking genes in four placental mammals and platypus were examined to determine whether HOTAIR contained elements shared by other lncRNAs. Several of the hits were within unknown transcripts or ncRNAs, many were within introns of, or antisense to, protein-coding genes, and conservation of the flanking genes was observed only between human and chimpanzee. Phylogenetic analysis revealed discrete evolutionary dynamics for orthologous sequences of HOTAIR exons. Exon1 at the 5' end and a domain in exon6 near the 3' end, which contain domains that bind to multiple proteins, have evolved faster in primates than in other mammals. Structures were predicted for exon1, two domains of exon6 and the full HOTAIR sequence. The sequence and structure of two fragments, in exon1 and the domain B of exon6 respectively, were identified to robustly occur in predicted structures of exon1, domain B of exon6 and the full HOTAIR in mammals. 
Conclusions HOTAIR exists in mammals, has poorly conserved sequences and considerably conserved structures, and has evolved faster than nearby HoxC genes. Exons of HOTAIR show distinct evolutionary features, and a 239 bp domain in the 1804 bp exon6 is especially conserved. These features, together with the absence of some exons and sequences in mouse, rat and kangaroo, suggest ab initio generation of HOTAIR in marsupials. Structure prediction identifies two fragments in the 5' end exon1 and the 3' end domain B of exon6, with sequence and structure invariably occurring in various predicted structures of exon1, the domain B of exon6 and the full HOTAIR. PMID:21496275
When Stroop helps Piaget: An inter-task positive priming paradigm in 9-year-old children.

PubMed

Linzarini, A; Houdé, O; Borst, G

2015-11-01

To determine whether inhibitory control is domain general or domain specific in school children, we asked 40 9-year-old children to perform an inter-task priming paradigm in which they responded to Stroop items on the primes and to Piaget number conservation items on the probes. The children were more efficient in the inhibition of a misleading "length-equals-number" heuristic in the number conservation task if they had successfully inhibited a previous prepotent reading response in the Stroop task. This study provides evidence that the inhibitory control ability of school children generalizes to distinct cognitive domains, that is, verbal for the Stroop task and logico-mathematical for Piaget's number conservation task. Copyright © 2015 Elsevier Inc. All rights reserved.
Genome-wide analysis of regulatory proteases sequences identified through bioinformatics data mining in Taenia solium.

PubMed

Yan, Hong-Bin; Lou, Zhong-Zi; Li, Li; Brindley, Paul J; Zheng, Yadong; Luo, Xuenong; Hou, Junling; Guo, Aijiang; Jia, Wan-Zhong; Cai, Xuepeng

2014-06-04

Cysticercosis remains a major neglected tropical disease of humanity in many regions, especially in sub-Saharan Africa, Central America and elsewhere. Owing to the emerging drug resistance and the inability of current drugs to prevent re-infection, identification of novel vaccines and chemotherapeutic agents against Taenia solium and related helminth pathogens is a public health priority. The T. solium genome and the predicted proteome were reported recently, providing a wealth of information from which new interventional targets might be identified. In order to characterize and classify the entire repertoire of protease-encoding genes of T. solium, which play fundamental biological roles in all life processes, we analyzed the predicted proteins of this cestode through a combination of bioinformatics tools. Functional annotation was performed to yield insights into the signaling processes relevant to the complex developmental cycle of this tapeworm and to highlight a suite of the proteases as potential intervention targets. Within the genome of this helminth parasite, we identified 200 open reading frames encoding proteases from five clans, which correspond to 1.68% of the 11,902 protein-encoding genes predicted to be present in its genome. These proteases include calpains, cytosolic, mitochondrial signal peptidases, ubiquitylation related proteins, and others. Many not only show significant similarity to proteases in the Conserved Domain Database but have conserved active sites and catalytic domains. KEGG Automatic Annotation Server (KAAS) analysis indicated that ~60% of these proteases share strong sequence identities with proteins of the KEGG database, which are involved in human disease, metabolic pathways, genetic information processes, cellular processes, environmental information processes and organismal systems. Also, we identified signal peptides and transmembrane helices through comparative analysis with classes of important regulatory proteases. 
Phylogenetic analysis using Bayes approach provided support for inferring functional divergence among regulatory cysteine and serine proteases. Numerous putative proteases were identified for the first time in T. solium, and important regulatory proteases have been predicted. This comprehensive analysis not only complements the growing knowledge base of proteolytic enzymes, but also provides a platform from which to expand knowledge of cestode proteases and to explore their biochemistry and potential as intervention targets.
Nuclear localization and transactivation by Vitis CBF transcription factors are regulated by combinations of conserved amino acid domains.

PubMed

Carlow, Chevonne E; Faultless, J Trent; Lee, Christine; Siddiqua, Mahbuba; Edge, Alison; Nassuth, Annette

2017-09-01

The highly conserved CBF pathway is crucial in the regulation of plant responses to low temperatures. Extensive analysis of Arabidopsis CBF proteins revealed that their functions rely on several conserved amino acid domains although the exact function of each domain is disputed. The question was what functions similar domains have in CBFs from other, overwintering woody plants such as Vitis, which likely have a more involved regulation than the model plant Arabidopsis. A total of seven CBF genes were cloned and sequenced from V. riparia and the less frost tolerant V. vinifera. The deduced species-specific amino acid sequences differ in only a few amino acids, mostly in non-conserved regions. Amino acid sequence comparison and phylogenetic analysis showed two distinct groups of Vitis CBFs. One group contains CBF1, CBF2, CBF3 and CBF8 and the other group contains CBF4, CBF5 and CBF6. Transient transactivation assays showed that all Vitis CBFs except CBF5 activate via a CRT or DRE promoter element, whereby Vitis CBF3 and 4 prefer a CRT element. The hydrophobic domains in the C-terminal end of VrCBF6 were shown to be important for how well it activates. The putative nuclear localization domain of Vitis CBF1 was shown to be sufficient for nuclear localization, in contrast to previous reports for AtCBF1, and also important for transactivation. The latter highlights the value of careful analysis of domain functions instead of reliance on computer predictions and published data for other related proteins. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
The LOTUS domain is a conserved DEAD-box RNA helicase regulator essential for the recruitment of Vasa to the germ plasm and nuage

PubMed Central

Jeske, Mandy; Müller, Christoph W.; Ephrussi, Anne

2017-01-01

DEAD-box RNA helicases play important roles in a wide range of metabolic processes. Regulatory proteins can stimulate or block the activity of DEAD-box helicases. Here, we show that LOTUS (Limkain, Oskar, and Tudor containing proteins 5 and 7) domains present in the germline proteins Oskar, TDRD5 (Tudor domain-containing 5), and TDRD7 bind and stimulate the germline-specific DEAD-box RNA helicase Vasa. Our crystal structure of the LOTUS domain of Oskar in complex with the C-terminal RecA-like domain of Vasa reveals that the LOTUS domain occupies a surface on a DEAD-box helicase not implicated previously in the regulation of the enzyme's activity. We show that, in vivo, the localization of Drosophila Vasa to the nuage and germ plasm depends on its interaction with LOTUS domain proteins. The binding and stimulation of Vasa DEAD-box helicases by LOTUS domains are widely conserved. PMID:28536148
Functional analysis of conserved aromatic amino acids in the discoidin domain of Paenibacillus β-1,3-glucanase

PubMed Central

2009-01-01

The 190-kDa Paenibacillus β-1,3-glucanase (LamA) contains a catalytic module of the glycoside hydrolase family 16 (GH16) and several auxiliary domains. Of these, a discoidin domain (DS domain), present in both eukaryotic and prokaryotic proteins with a wide variety of functions, exists at the carboxyl-terminus. To better understand the bacterial DS domain in terms of its structure and function, this domain alone was expressed in Escherichia coli and characterized. The results indicate that the DS domain binds various polysaccharides and enhances the biological activity of the GH16 module on composite substrates. We also investigated the importance of several conserved aromatic residues in the domain's stability and substrate-binding affinity. Both were affected by mutations of these residues; however, the effect on protein stability was more notable. In particular, the forces contributed by a sandwiched triad (W1688, R1756, and W1729) were critical for the presumable β-sandwich fold. PMID:19930717
Landscape features, standards, and semantics in U.S. national topographic mapping databases

USGS Publications Warehouse

Varanka, Dalia

2009-01-01

The objective of this paper is to examine the contrast between local, field-surveyed topographical representation and feature representation in digital, centralized databases and to clarify their ontological implications. The semantics of these two approaches are contrasted by examining the categorization of features by subject domains inherent to national topographic mapping. When comparing five USGS topographic mapping domain and feature lists, results indicate that multiple semantic meanings and ontology rules were applied to the initial digital database, but were lost as databases became more centralized at national scales, and common semantics were replaced by technological terms.
The Conservation Efforts Database: Improving our knowledge of landscape conservation actions

USGS Publications Warehouse

Heller, Matthew M.; Welty, Justin; Wiechman, Lief A.

2017-01-01

The Conservation Efforts Database (CED) is a secure, cloud-based tool that can be used to document and track conservation actions across landscapes. A recently released factsheet describes this tool ahead of the rollout of CED version 2.0. The CED was developed by the U.S. Fish and Wildlife Service, the USGS, and the Great Northern Landscape Conservation Cooperative to support the 2015 Endangered Species Act status review for greater sage-grouse. Currently, the CED accepts policy-level data, such as Land Use Plans, and treatment level data, such as conifer removals and post-fire recovery efforts, as custom spatial and non-spatial records. In addition to a species assessment tool, the CED can also be used to summarize the extent of restoration efforts within a specific area or to strategically site conservation actions based on the location of other implemented actions. The CED can be an important tool, along with post-conservation monitoring, for implementing landscape-scale adaptive management.
Genomic positional conservation identifies topological anchor point RNAs linked to developmental loci.

PubMed

Amaral, Paulo P; Leonardi, Tommaso; Han, Namshik; Viré, Emmanuelle; Gascoigne, Dennis K; Arias-Carrasco, Raúl; Büscher, Magdalena; Pandolfini, Luca; Zhang, Anda; Pluchino, Stefano; Maracaja-Coutinho, Vinicius; Nakaya, Helder I; Hemberg, Martin; Shiekhattar, Ramin; Enright, Anton J; Kouzarides, Tony

2018-03-15

The mammalian genome is transcribed into large numbers of long noncoding RNAs (lncRNAs), but the definition of functional lncRNA groups has proven difficult, partly due to their low sequence conservation and lack of identified shared properties. Here we consider promoter conservation and positional conservation as indicators of functional commonality. We identify 665 conserved lncRNA promoters in mouse and human that are preserved in genomic position relative to orthologous coding genes. These positionally conserved lncRNA genes are primarily associated with developmental transcription factor loci with which they are coexpressed in a tissue-specific manner. Over half of positionally conserved RNAs in this set are linked to chromatin organization structures, overlapping binding sites for the CTCF chromatin organiser and located at chromatin loop anchor points and borders of topologically associating domains (TADs). We define these RNAs as topological anchor point RNAs (tapRNAs). Characterization of these noncoding RNAs and their associated coding genes shows that they are functionally connected: they regulate each other's expression and influence the metastatic phenotype of cancer cells in vitro in a similar fashion. Furthermore, we find that tapRNAs contain conserved sequence domains that are enriched in motifs for zinc finger domain-containing RNA-binding proteins and transcription factors, whose binding sites are found mutated in cancers. This work leverages positional conservation to identify lncRNAs with potential importance in genome organization, development and disease. The evidence that many developmental transcription factors are physically and functionally connected to lncRNAs represents an exciting stepping-stone to further our understanding of genome regulation.
Dissociation of Paramyxovirus Interferon Evasion Activities: Universal and Virus-Specific Requirements for Conserved V Protein Amino Acids in MDA5 Interference

PubMed Central

Ramachandran, Aparna; Horvath, Curt M.

2010-01-01

The V protein of the paramyxovirus subfamily Paramyxovirinae is an important virulence factor that can interfere with host innate immunity by inactivating the cytosolic pathogen recognition receptor MDA5. This interference is a result of a protein-protein interaction between the highly conserved carboxyl-terminal domain of the V protein and the helicase domain of MDA5. The V protein C-terminal domain (CTD) is an evolutionarily conserved 49- to 68-amino-acid region that coordinates two zinc atoms per protein chain. Site-directed mutagenesis of conserved residues in the V protein CTD has revealed both universal and virus-specific requirements for zinc coordination in MDA5 engagement and has also identified other conserved residues as critical for MDA5 interaction and interference. Mutation of these residues produces V proteins that are specifically defective for MDA5 interference and not impaired in targeting STAT1 for proteasomal degradation via the VDC ubiquitin ligase complex. Results demonstrate that mutation of conserved charged residues in the V proteins of Nipah virus, measles virus, and mumps virus also abolishes MDA5 interaction. These findings clearly define molecular determinants for MDA5 inhibition by the paramyxovirus V proteins. PMID:20719949
Conserved domains and SINE diversity during animal evolution.

PubMed

Luchetti, Andrea; Mantovani, Barbara

2013-10-01

Eukaryotic genomes harbour a number of mobile genetic elements (MGEs); moving from one genomic location to another, they are known to impact on the host genome. Short interspersed elements (SINEs) are well-represented, non-autonomous retroelements and they are likely the most diversified MGEs. In some instances, sequence domains conserved across unrelated SINEs have been identified; remarkably, one of these, called Nin, has been conserved since the Radiata-Bilateria splitting. Here we report on two new domains: Inv, derived from Nin, identified in insects and in deuterostomes, and Pln, restricted to polyneopteran insects. The identification of Inv and Pln sequences allowed us to retrieve new SINEs, two in insects and one in a hemichordate. The diverse structural combination of the different domains in different SINE families, during metazoan evolution, offers a clearer view of SINE diversity and their frequent de novo emergence through module exchange, possibly underlying the high evolutionary success of SINEs. © 2013 Elsevier Inc. All rights reserved.
The Prp19 WD40 Domain Contains a Conserved Protein Interaction Region Essential for its Function

PubMed Central

Vander Kooi, Craig W.; Ren, Liping; Xu, Ping; Ohi, Melanie D.; Gould, Kathleen L.; Chazin, Walter J.

2010-01-01

Prp19 is a member of the WD40-repeat family of E3 ubiquitin ligases and a conserved eukaryotic RNA splicing factor essential for activation and stabilization of the spliceosome. To understand the role of the WD40 repeat domain of Prp19, we have determined its structure using X-ray crystallography. The domain has a distorted seven-bladed WD40 architecture with significant asymmetry due to irregular packing of blades one and seven into the core of the WD40 domain. Structure-based mutagenesis identified a highly conserved surface centered around blade five that is required for the physical interaction between Prp19 and Cwc2, another essential splicing factor. This region is found to be required for Prp19 function and yeast viability. Experiments in vitro and in vivo demonstrate that two molecules of Cwc2 bind to the Prp19 tetramer. These coupled structural and functional studies provide a model for the functional architecture of Prp19. PMID:20462492
The structure of a conserved Piezo channel domain reveals a novel beta sandwich fold

PubMed Central

Kamajaya, Aron; Kaiser, Jens; Lee, Jonas; Reid, Michelle; Rees, Douglas C.

2014-01-01

Piezo has recently been identified as a family of eukaryotic mechanosensitive channels composed of subunits containing over 2000 amino acids, without recognizable sequence similarity to other channels. Here, we present the crystal structure of a large, conserved extramembrane domain located just before the last predicted transmembrane helix of C. elegans PIEZO, which adopts a novel beta sandwich fold. The structure was also determined of a point mutation located on a conserved surface at the position equivalent to the human PIEZO1 mutation found in Dehydrated Hereditary Stomatocytosis (DHS) patients (M2225R). While the point mutation does not change the overall domain structure, it does alter the surface electrostatic potential that may perturb interactions with a yet-to-be-identified ligand or protein. The lack of structural similarity between this domain and any previously characterized fold, including those of eukaryotic and bacterial channels, highlights the distinctive nature of the Piezo family of eukaryotic mechanosensitive channels. PMID:25242456
The structure of a conserved piezo channel domain reveals a topologically distinct β sandwich fold.

PubMed

Kamajaya, Aron; Kaiser, Jens T; Lee, Jonas; Reid, Michelle; Rees, Douglas C

2014-10-07

Piezo has recently been identified as a family of eukaryotic mechanosensitive channels composed of subunits containing over 2,000 amino acids, without recognizable sequence similarity to other channels. Here, we present the crystal structure of a large, conserved extramembrane domain located just before the last predicted transmembrane helix of C. elegans PIEZO, which adopts a topologically distinct β sandwich fold. The structure was also determined of a point mutation located on a conserved surface at the position equivalent to the human PIEZO1 mutation found in dehydrated hereditary stomatocytosis patients (M2225R). While the point mutation does not change the overall domain structure, it does alter the surface electrostatic potential that may perturb interactions with a yet-to-be-identified ligand or protein. The lack of structural similarity between this domain and any previously characterized fold, including those of eukaryotic and bacterial channels, highlights the distinctive nature of the Piezo family of eukaryotic mechanosensitive channels. Copyright © 2014 Elsevier Ltd. All rights reserved.

Using a Relational Database to Index Infectious Disease Information

PubMed Central

Brown, Jay A.

2010-01-01

Mapping medical knowledge into a relational database became possible with the availability of personal computers and user-friendly database software in the early 1990s. To create a database of medical knowledge, the domain expert works like a mapmaker to first outline the domain and then add the details, starting with the most prominent features. The resulting “intelligent database” can support the decisions of healthcare professionals. The intelligent database described in this article contains profiles of 275 infectious diseases. Users can query the database for all diseases matching one or more specific criteria (symptom, endemic region of the world, or epidemiological factor). Epidemiological factors include sources (patients, water, soil, or animals), routes of entry, and insect vectors. Medical and public health professionals could use such a database as a decision-support software tool. PMID:20623018
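The kind of multi-criteria query described in this abstract can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the two-table schema, the table and column names, and the placeholder rows are hypothetical and are not taken from the article's database.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE disease (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE finding (
            disease_id INTEGER NOT NULL REFERENCES disease(id),
            kind  TEXT NOT NULL,   -- 'symptom', 'region', or 'factor'
            value TEXT NOT NULL
        );
    """)
    # Placeholder rows for illustration only; not real clinical data.
    conn.executemany("INSERT INTO disease VALUES (?, ?)",
                     [(1, "Disease A"), (2, "Disease B"), (3, "Disease C")])
    conn.executemany("INSERT INTO finding VALUES (?, ?, ?)", [
        (1, "symptom", "fever"), (1, "region", "Region X"), (1, "factor", "water"),
        (2, "symptom", "fever"), (2, "region", "Region Y"),
        (3, "symptom", "rash"), (3, "region", "Region X"),
    ])
    return conn


def diseases_matching(conn, criteria):
    """Return names of diseases that match every (kind, value) criterion."""
    if not criteria:
        return []
    # One OR-ed condition per criterion; HAVING then requires all of them.
    clause = " OR ".join("(f.kind = ? AND f.value = ?)" for _ in criteria)
    params = [x for pair in criteria for x in pair]
    sql = f"""
        SELECT d.name FROM disease d
        JOIN finding f ON f.disease_id = d.id
        WHERE {clause}
        GROUP BY d.id
        HAVING COUNT(DISTINCT f.kind || '=' || f.value) = ?
        ORDER BY d.name
    """
    rows = conn.execute(sql, params + [len(set(criteria))]).fetchall()
    return [row[0] for row in rows]
```

Calling `diseases_matching(build_demo_db(), [("symptom", "fever"), ("region", "Region X")])` returns only the placeholder disease that has both findings. Storing findings as rows rather than columns is one possible design; it lets new criteria types be added without changing the schema.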
An expressed sequence tag (EST) data mining strategy succeeding in the discovery of new G-protein coupled receptors.

PubMed

Wittenberger, T; Schaller, H C; Hellebrand, S

2001-03-30

We have developed a comprehensive expressed sequence tag database search method and used it for the identification of new members of the G-protein coupled receptor superfamily. Our approach proved to be especially useful for the detection of expressed sequence tag sequences that do not encode conserved parts of a protein, making it an ideal tool for the identification of members of divergent protein families or of protein parts without conserved domain structures in the expressed sequence tag database. At least 14 of the expressed sequence tags found with this strategy are promising candidates for new putative G-protein coupled receptors. Here, we describe the sequence and expression analysis of five new members of this receptor superfamily, namely GPR84, GPR86, GPR87, GPR90 and GPR91. We also studied the genomic structure and chromosomal localization of the respective genes applying in silico methods. A cluster of six closely related G-protein coupled receptors was found on the human chromosome 3q24-3q25. It consists of four orphan receptors (GPR86, GPR87, GPR91, and H963), the purinergic receptor P2Y1, and the uridine 5'-diphosphoglucose receptor KIAA0001. It seems likely that these receptors evolved from a common ancestor and therefore might have related ligands. In conclusion, we describe a data mining procedure that proved to be useful for the identification and first characterization of new genes and is well applicable for other gene families. Copyright 2001 Academic Press.
Genome-wide analysis of basic helix-loop-helix (bHLH) transcription factors in Brachypodium distachyon.

PubMed

Niu, Xin; Guan, Yuxiang; Chen, Shoukun; Li, Haifeng

2017-08-15

As a superfamily of transcription factors (TFs), the basic helix-loop-helix (bHLH) proteins have been characterized functionally in many plants, with vital roles in the regulation of diverse biological processes including growth, development, and responses to various stresses. However, no systematic analysis of the bHLH TFs has been reported in Brachypodium distachyon, an emerging model plant in Poaceae. A total of 146 bHLH TFs were identified in the Brachypodium distachyon genome and classified into 24 subfamilies. BdbHLHs in the same subfamily share similar protein motifs and gene structures. Gene duplication events showed a close relationship to rice, maize and sorghum, and segmental duplications might play a key role in the expansion of this gene family. The amino acid sequences of the bHLH domains were highly conserved, especially Leu-27 and Leu-54. Based on the predicted binding activities, the BdbHLHs were divided into DNA-binding and non-DNA-binding types. According to the gene ontology (GO) analysis, BdbHLHs were predicted to function as homodimers or heterodimers. By integrating the available high-throughput data in public databases with results of quantitative RT-PCR, we found that the expression profiles of BdbHLHs differed, implying differentiated functions. In summary, 146 BdbHLHs were identified and their conserved domains, sequence features, phylogenetic relationship, chromosomal distribution, GO annotations, gene structures, gene duplication and expression profiles were investigated. Our findings lay a foundation for further evolutionary and functional elucidation of BdbHLH genes.
Parallel CE/SE Computations via Domain Decomposition

NASA Technical Reports Server (NTRS)

Himansu, Ananda; Jorgenson, Philip C. E.; Wang, Xiao-Yen; Chang, Sin-Chung

2000-01-01

This paper describes the parallelization strategy and achieved parallel efficiency of an explicit time-marching algorithm for solving conservation laws. The Space-Time Conservation Element and Solution Element (CE/SE) algorithm for solving the 2D and 3D Euler equations is parallelized with the aid of domain decomposition. The parallel efficiency of the resultant algorithm on a Silicon Graphics Origin 2000 parallel computer is checked.
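The general idea of parallelizing an explicit time-marching scheme by domain decomposition can be sketched as follows. This is not the CE/SE algorithm: as a stand-in, the sketch uses a first-order upwind scheme for 1D linear advection on a periodic grid, and the function names and the one-cell halo exchange are illustrative assumptions. Each subdomain needs only its left neighbour's last cell before it can advance independently, which is the step a parallel code would perform with message passing.

```python
def upwind_step(u, c):
    """One explicit upwind step for u_t + a*u_x = 0 (a > 0) on a periodic grid.

    c is the Courant number a*dt/dx; u[i - 1] wraps around for i == 0.
    """
    return [u[i] - c * (u[i] - u[i - 1]) for i in range(len(u))]


def split(u, parts):
    """Split u into `parts` contiguous subdomains of near-equal size."""
    n = len(u)
    bounds = [n * k // parts for k in range(parts + 1)]
    return [u[bounds[k]:bounds[k + 1]] for k in range(parts)]


def decomposed_step(blocks, c):
    """Advance every subdomain one step using a one-cell halo exchange."""
    # Communication phase: each block receives its left neighbour's last cell
    # (block 0 receives from the last block because the grid is periodic).
    halos = [blocks[k - 1][-1] for k in range(len(blocks))]
    # Computation phase: each block now updates independently of the others,
    # so this loop is the part that could run on separate processors.
    new_blocks = []
    for halo, block in zip(halos, blocks):
        padded = [halo] + block
        new_blocks.append([padded[i + 1] - c * (padded[i + 1] - padded[i])
                           for i in range(len(block))])
    return new_blocks
```

Because the decomposed update performs the same arithmetic on the same values as the serial update, concatenating the blocks after any number of steps reproduces the serial solution; comparing the two is a simple correctness check before measuring parallel efficiency.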
Identification of influenza A nucleoprotein body domain residues essential for viral RNA expression expose antiviral target.

PubMed

Davis, Alicia M; Ramirez, Jose; Newcomb, Laura L

2017-02-07

Influenza A virus is controlled with yearly vaccination while emerging global pandemics are kept at bay with antiviral medications. Unfortunately, influenza A viruses have developed resistance to approved influenza antivirals. Accordingly, there is an urgent need for novel antivirals to combat emerging influenza A viruses resistant to current treatments. Conserved viral proteins are ideal targets because conserved protein domains are present in most, if not all, influenza subtypes, and are presumed less prone to evolve viable resistant versions. The threat of an antiviral resistant influenza pandemic justifies our study to identify and characterize antiviral targets within influenza proteins that are highly conserved. Influenza A nucleoprotein (NP) is highly conserved and plays essential roles throughout the viral lifecycle, including viral RNA synthesis. Using the NP crystal structure, we targeted accessible amino acids for substitution. To characterize the NP proteins, reconstituted viral ribonucleoproteins (vRNPs) were expressed in 293T cells, RNA was isolated, and reverse transcription - quantitative PCR (RT-qPCR) was employed to assess viral RNA expressed from reconstituted vRNPs. Location was confirmed using cellular fractionation and western blot, along with observation of NP-GFP fusion proteins. Nucleic acid binding, oligomerization, and vRNP formation were each assessed with native gel electrophoresis. Here we report characterization of an accessible and conserved five amino acid region within the NP body domain that plays a redundant but essential role in viral RNA synthesis. Our data demonstrate substitutions in this domain did not alter NP localization, oligomerization, or ability to bind nucleic acids, yet resulted in a defect in viral RNA expression. To define this region further, single and double amino acid substitutions were constructed and investigated. All NP single substitutions were functional, suggesting redundancy, yet different combinations of two amino acid substitutions resulted in a significant defect in RNA expression, confirming these accessible amino acids in the NP body domain play an important role in viral RNA synthesis. The identified conserved and accessible NP body domain represents a viable antiviral target to counter influenza replication and this research will contribute to the well-informed design of novel therapies to combat emerging influenza viruses.
Conservation and the 4 Rs, which are rescue, rehabilitation, release, and research.

PubMed

Pyke, Graham H; Szabo, Judit K

2018-02-01

Vertebrate animals can be injured or threatened with injury through human activities, thus warranting their "rescue." Details of wildlife rescue, rehabilitation, release, and associated research (our 4 Rs) are often recorded in large databases, resulting in a wealth of available information. This information has huge research potential and can contribute to understanding of animal biology, anthropogenic impacts on wildlife, and species conservation. However, such databases have been little used, few studies have evaluated factors influencing success of rehabilitation and/or release, recommended actions to conserve threatened species have rarely arisen, and direct benefits for species conservation are yet to be demonstrated. We therefore recommend that additional research be based on data from rescue, rehabilitation, and release of animals that is broader in scope than previous research and would have community support. © 2017 Society for Conservation Biology.
Development of five digits is controlled by a bipartite long-range cis-regulator.

PubMed

Lettice, Laura A; Williamson, Iain; Devenney, Paul S; Kilanowski, Fiona; Dorin, Julia; Hill, Robert E

2014-04-01

Conservation within intergenic DNA often highlights regulatory elements that control gene expression from a long range. How conservation within a single element relates to regulatory information and how internal composition relates to function is unknown. Here, we examine the structural features of the highly conserved ZRS (also called MFCS1) cis-regulator responsible for the spatiotemporal control of Shh in the limb bud. By systematically dissecting the ZRS, both in transgenic assays and within the endogenous locus, we show that the ZRS is, in effect, composed of two distinct domains of activity: one domain directs spatiotemporal activity but functions predominantly from a short range, whereas a second domain is required to promote long-range activity. We show further that these two domains encode activities that are highly integrated and that the second domain is crucial in promoting the chromosomal conformational changes correlated with gene activity. During limb bud development, these activities encoded by the ZRS are interpreted differently by the fore limbs and the hind limbs; in the absence of the second domain there is no Shh activity in the fore limb, and in the hind limb low levels of Shh lead to a variant digit pattern ranging from two to four digits. Hence, in the embryo, the second domain stabilises the developmental programme providing a buffer for SHH morphogen activity and this ensures that five digits form in both sets of limbs.
Structure-function analysis of mouse Sry reveals dual essential roles of the C-terminal polyglutamine tract in sex determination.

PubMed

Zhao, Liang; Ng, Ee Ting; Davidson, Tara-Lynne; Longmuss, Enya; Urschitz, Johann; Elston, Marlee; Moisyadi, Stefan; Bowles, Josephine; Koopman, Peter

2014-08-12

The mammalian sex-determining factor SRY comprises a conserved high-mobility group (HMG) box DNA-binding domain and poorly conserved regions outside the HMG box. Mouse Sry is unusual in that it includes a C-terminal polyglutamine (polyQ) tract that is absent in nonrodent SRY proteins, and yet, paradoxically, is essential for male sex determination. To dissect the molecular functions of this domain, we generated a series of Sry mutants, and studied their biochemical properties in cell lines and transgenic mouse embryos. Sry protein lacking the polyQ domain was unstable, due to proteasomal degradation. Replacing this domain with irrelevant sequences stabilized the protein but failed to restore Sry's ability to up-regulate its key target gene SRY-box 9 (Sox9) and its sex-determining function in vivo. These functions were restored only when a VP16 transactivation domain was substituted. We conclude that the polyQ domain has important roles in protein stabilization and transcriptional activation, both of which are essential for male sex determination in mice. Our data disprove the hypothesis that the conserved HMG box domain is the only functional domain of Sry, and highlight an evolutionary paradox whereby mouse Sry has evolved a novel bifunctional module to activate Sox9 directly, whereas SRY proteins in other taxa, including humans, seem to lack this ability, presumably making them dependent on partner protein(s) to provide this function.
Characterization of a Gene Coding for the Complement System Component FB from Loxosceles laeta Spider Venom Glands.

PubMed

Myamoto, Daniela Tiemi; Pidde-Queiroz, Giselle; Gonçalves-de-Andrade, Rute Maria; Pedroso, Aurélio; van den Berg, Carmen W; Tambourgi, Denise V

2016-01-01

The human complement system is composed of more than 30 proteins and many of these have conserved domains that allow tracing the phylogenetic evolution. The complement system seems to be initiated with the appearance of C3 and factor B (FB), the only components found in some protostomes and cnidarians, suggesting that the alternative pathway is the most ancient. Here, we present the characterization of an arachnid homologue of the human complement component FB from the spider Loxosceles laeta. This homologue, named Lox-FB, was identified from a total RNA L. laeta spider venom gland library and was amplified using RACE-PCR techniques and specific primers. Analysis of the deduced amino acid sequence and the domain structure showed significant similarity to the vertebrate and invertebrate FB/C2 family proteins. Lox-FB has a classical domain organization composed of a complement control protein domain (CCP), a von Willebrand Factor domain (vWFA), and a serine protease domain (SP). The amino acids involved in Mg2+ metal ion dependent adhesion site (MIDAS) found in the vWFA domain in the vertebrate C2/FB proteins are well conserved; however, the classic catalytic triad present in the serine protease domain is not conserved in Lox-FB. Similarity and phylogenetic analyses indicated that Lox-FB shares a major identity (43%) and has a close evolutionary relationship with the third isoform of FB-like protein (FB-3) from the jumping spider Hasarius adansoni belonging to the family Salticidae.
Characterization of a Gene Coding for the Complement System Component FB from Loxosceles laeta Spider Venom Glands

PubMed Central

Myamoto, Daniela Tiemi; Pidde-Queiroz, Giselle; Gonçalves-de-Andrade, Rute Maria; Pedroso, Aurélio; van den Berg, Carmen W.; Tambourgi, Denise V.

2016-01-01

The human complement system is composed of more than 30 proteins and many of these have conserved domains that allow tracing the phylogenetic evolution. The complement system seems to be initiated with the appearance of C3 and factor B (FB), the only components found in some protostomes and cnidarians, suggesting that the alternative pathway is the most ancient. Here, we present the characterization of an arachnid homologue of the human complement component FB from the spider Loxosceles laeta. This homologue, named Lox-FB, was identified from a total RNA L. laeta spider venom gland library and was amplified using RACE-PCR techniques and specific primers. Analysis of the deduced amino acid sequence and the domain structure showed significant similarity to the vertebrate and invertebrate FB/C2 family proteins. Lox-FB has a classical domain organization composed of a complement control protein domain (CCP), a von Willebrand Factor domain (vWFA), and a serine protease domain (SP). The amino acids involved in Mg2+ metal ion dependent adhesion site (MIDAS) found in the vWFA domain in the vertebrate C2/FB proteins are well conserved; however, the classic catalytic triad present in the serine protease domain is not conserved in Lox-FB. Similarity and phylogenetic analyses indicated that Lox-FB shares a major identity (43%) and has a close evolutionary relationship with the third isoform of FB-like protein (FB-3) from the jumping spider Hasarius adansoni belonging to the family Salticidae. PMID:26771533
Ermelin, an endoplasmic reticulum transmembrane protein, contains the novel HELP domain conserved in eukaryotes.

PubMed

Suzuki, Akiko; Endo, Takeshi

2002-02-06

We have cloned a cDNA encoding a novel protein referred to as ermelin from mouse C2 skeletal muscle cells. This protein contained six hydrophobic amino acid stretches corresponding to transmembrane domains, two histidine-rich sequences, and a sequence homologous to the fusion peptides of certain fusion proteins. Ermelin also contained a novel modular sequence, designated as HELP domain, which was highly conserved among eukaryotes, from yeast to higher plants and animals. All these HELP domain-containing proteins, including mouse KE4, Drosophila Catsup, and Arabidopsis IAR1, possessed multipass transmembrane domains and histidine-rich sequences. Ermelin was predominantly expressed in brain and testis, and induced during neuronal differentiation of N1E-115 neuroblastoma cells but downregulated during myogenic differentiation of C2 cells. The mRNA was accumulated in hippocampus and cerebellum of brain and central areas of seminiferous tubules in testis. Epitope-tagging experiments located ermelin and KE4 to a network structure throughout the cytoplasm. Staining with the fluorescent dye DiOC(6)(3) identified this structure as the endoplasmic reticulum. These results suggest that at least some, if not all, of the HELP domain-containing proteins are multipass endoplasmic reticulum membrane proteins with functions conserved among eukaryotes.
A single amino-acid substitution in the Ets domain alters core DNA binding specificity of Ets1 to that of the related transcription factors Elf1 and E74.

PubMed

Bosselut, R; Levin, J; Adjadj, E; Ghysdael, J

1993-11-11

Ets proteins form a family of sequence-specific DNA-binding proteins that bind DNA through a conserved 85-amino-acid domain, the Ets domain, whose sequence is unrelated to any other characterized DNA-binding domain. Unlike all other known Ets proteins, which bind specific DNA sequences centered over either GGAA or GGAT core motifs, E74 and Elf1 selectively bind to GGAA core-containing sites. Elf1 and E74 differ from other Ets proteins in three residues located in an otherwise highly conserved region of the Ets domain, referred to as conserved region III (CRIII). We show that a restricted selectivity for GGAA core-containing sites could be conferred on Ets1 by changing a single lysine residue within CRIII to the threonine found in Elf1 and E74 at this position. Conversely, the reciprocal mutation in Elf1 confers on this protein the ability to bind to GGAT core-containing EBS. This, together with the fact that mutation of two invariant arginine residues in CRIII abolishes DNA binding, indicates that CRIII plays a key role in Ets domain recognition of the GGAA/T core motif and leads us to discuss a model of the Ets protein-core motif interaction.
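The core-motif rule stated in this abstract (most Ets proteins accept GGAA or GGAT cores, whereas Elf1 and E74 accept only GGAA) can be illustrated with a small sequence scan. This is a deliberately simplified sketch: the mode names and the function are hypothetical, only the forward strand is scanned, and real binding also depends on flanking sequence, which the sketch ignores.

```python
# Core motifs accepted under each hypothetical specificity mode.
CORES = {
    "GGAA_or_GGAT": ("GGAA", "GGAT"),  # most Ets proteins, per the abstract
    "GGAA_only": ("GGAA",),            # Elf1 and E74, per the abstract
}


def core_hits(seq, mode):
    """Return (position, core) pairs for 4-mers in seq accepted under `mode`.

    Positions are 0-based offsets on the forward strand only.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 3):
        core = seq[i:i + 4]
        if core in CORES[mode]:
            hits.append((i, core))
    return hits
```

Scanning the same sequence under both modes shows which candidate sites would be lost when specificity is restricted to the GGAA core, which is the effect the abstract reports for the single lysine-to-threonine change in Ets1.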
Structure-Based Sequence Alignment of the Transmembrane Domains of All Human GPCRs: Phylogenetic, Structural and Functional Implications

PubMed Central

Cvicek, Vaclav; Goddard, William A.; Abrol, Ravinder

2016-01-01

The understanding of G-protein coupled receptors (GPCRs) is undergoing a revolution due to increased information about their signaling and the experimental determination of structures for more than 25 receptors. The availability of at least one receptor structure for each of the GPCR classes, well separated in sequence space, enables an integrated superfamily-wide analysis to identify signatures involving the role of conserved residues, conserved contacts, and downstream signaling in the context of receptor structures. In this study, we align the transmembrane (TM) domains of all experimental GPCR structures to maximize the conserved inter-helical contacts. The resulting superfamily-wide GpcR Sequence-Structure (GRoSS) alignment of the TM domains for all human GPCR sequences is sufficient to generate a phylogenetic tree that correctly distinguishes all different GPCR classes, suggesting that the class-level differences in the GPCR superfamily are encoded at least partly in the TM domains. The inter-helical contacts conserved across all GPCR classes describe the evolutionarily conserved GPCR structural fold. The corresponding structural alignment of the inactive and active conformations, available for a few GPCRs, identifies activation hot-spot residues in the TM domains that get rewired upon activation. Many GPCR mutations, known to alter receptor signaling and cause disease, are located at these conserved contact and activation hot-spot residue positions. The GRoSS alignment places the chemosensory receptor subfamilies for bitter taste (TAS2R) and pheromones (Vomeronasal, VN1R) in the rhodopsin family, known to contain the chemosensory olfactory receptor subfamily. The GRoSS alignment also enables the quantification of the structural variability in the TM regions of experimental structures, useful for homology modeling and structure prediction of receptors. Furthermore, this alignment identifies structurally and functionally important residues in all human GPCRs. These residues can be used to make testable hypotheses about the structural basis of receptor function and about the molecular basis of disease-associated single nucleotide polymorphisms. PMID:27028541
Teaching Case: Adapting the Access Northwind Database to Support a Database Course

ERIC Educational Resources Information Center

Dyer, John N.; Rogers, Camille

2015-01-01

A common problem encountered when teaching database courses is that few large illustrative databases exist to support teaching and learning. Most database textbooks have small "toy" databases that are chapter objective specific, and thus do not support application over the complete domain of design, implementation and management concepts…
Comparison of S. cerevisiae F-BAR domain structures reveals a conserved inositol phosphate binding site

PubMed Central

Moravcevic, Katarina; Alvarado, Diego; Schmitz, Karl R.; Kenniston, Jon A.; Mendrola, Jeannine M.; Ferguson, Kathryn M.; Lemmon, Mark A.

2015-01-01

SUMMARY F-BAR domains control membrane interactions in endocytosis, cytokinesis, and cell signaling. Although generally thought to bind curved membranes containing negatively charged phospholipids, numerous functional studies argue that differences in lipid-binding selectivities of F-BAR domains are functionally important. Here, we compare membrane-binding properties of the S. cerevisiae F-BAR domains in vitro and in vivo. Whereas some F-BAR domains (such as Bzz1p and Hof1p F-BARs) bind equally well to all phospholipids, the F-BAR domain from the RhoGAP Rgd1p preferentially binds phosphoinositides. We determined X-ray crystal structures of F-BAR domains from Hof1p and Rgd1p, the latter bound to an inositol phosphate. The structures explain phospholipid-binding selectivity differences, and reveal an F-BAR phosphoinositide binding site that is fully conserved in a mammalian RhoGAP called Gmip, and is partly retained in certain other F-BAR domains. Our findings reveal previously unappreciated determinants of F-BAR domain lipid-binding specificity, and provide a basis for its prediction from sequence. PMID:25620000
The Three Domains of Conservation Genetics: Case Histories from Hawaiian Waters.

PubMed

Bowen, Brian W

2016-07-01

The scientific field of conservation biology is dominated by 3 specialties: phylogenetics, ecology, and evolution. Under this triad, phylogenetics is oriented towards the past history of biodiversity, conserving the divergent branches in the tree of life. The ecological component is rooted in the present, maintaining the contemporary life support systems for biodiversity. Evolutionary conservation (as defined here) is concerned with preserving the raw materials for generating future biodiversity. All 3 domains can be documented with genetic case histories in the waters of the Hawaiian Archipelago, an isolated chain of volcanic islands with 2 types of biodiversity: colonists, and new species that arose from colonists. This review demonstrates that 1) phylogenetic studies have identified previously unknown branches in the tree of life that are endemic to Hawaiian waters; 2) population genetic surveys define isolated marine ecosystems as management units, and 3) phylogeographic analyses illustrate the pathways of colonization that can enhance future biodiversity. Conventional molecular markers have advanced all 3 domains in conservation biology over the last 3 decades, and recent advances in genomics are especially valuable for understanding the foundations of future evolutionary diversity. © The American Genetic Association. 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Three Conservation Applications of Astronaut Photographs of Earth: Tidal Flat Loss (Japan), Elephant Impacts on Vegetation (Botswana), and Seagrass and Mangrove Monitoring (Australia)

NASA Technical Reports Server (NTRS)

Lulla, Kamlesh P.; Robinson, Julie A.; Kashiwagi, Minoru; Suzuki, Maggie; Nellis, M. Duane; Bussing, Charles E.; Lee Long, W. J.; McKenzie, Len J.

2000-01-01

NASA photographs taken from low Earth orbit can provide information relevant to conservation biology. This data source is now more accessible due to improvements in digitizing technology, Internet file transfer, and availability of image processing software. We present three examples of conservation-related projects that benefited from using orbital photographs. (1) A time series of photographs from the Space Shuttle showing wetland conversion in Japan was used as a tool for communicating about the impacts of tidal flat loss. Real-time communication with astronauts about a newsworthy event resulted in acquiring current imagery. These images and the availability of other high resolution digital images from NASA provided timely public information on the observed changes. (2) A Space Shuttle photograph of Chobe National Park in Botswana was digitally classified and analyzed to identify the locations of elephant-impacted woodland. Field validation later confirmed that areas identified on the image showed evidence of elephant impacts. (3) A summary map from intensive field surveys of seagrasses in Shoalwater Bay, Australia was used as reference data for a supervised classification of a digitized photograph taken from orbit. The classification was able to distinguish seagrasses, sediments and mangroves with accuracy approximating that in studies using other satellite remote sensing data. Orbital photographs are in the public domain and the database of nearly 400,000 photographs from the late 1960s to the present is available at a single searchable location on the Internet. These photographs can be used by conservation biologists for general information about the landscape and in quantitative applications.
In silico analysis of subtilisin from Glaciozyma antarctica PI12

NASA Astrophysics Data System (ADS)

Mustafha, Siti Mardhiah; Murad, Abdul Munir Abdul; Mahadi, Nor Muhammad; Kamaruddin, Shazilah; Bakar, Farah Diba Abu

2015-09-01

Subtilisin is a major industrial enzyme with a wide range of applications, especially in the detergent industry. In this study, a cDNA encoding subtilisin (GaSUBT) was extracted from the psychrophilic yeast Glaciozyma antarctica PI12, PCR amplified and sequenced. Various bioinformatics tools were used to characterize GaSUBT. The GaSUBT sequence comprises 1587 bp encoding 529 amino acids. The predicted molecular weight of the deduced protein is 55.34 kDa, with an isoelectric point of 6.25. GaSUBT was predicted to possess a signal peptide and a pro-peptide consisting of a peptidase inhibitor I9 sequence. Sequence alignment of the deduced amino acid sequence with other subtilisins in the NCBI database showed that the sequences surrounding the catalytic triad that forms the catalytic domain are well conserved.
The structure of the nucleoprotein binding domain of lyssavirus phosphoprotein reveals a structural relationship between the N-RNA binding domains of Rhabdoviridae and Paramyxoviridae.

PubMed

Delmas, Olivier; Assenberg, Rene; Grimes, Jonathan M; Bourhy, Hervé

2010-01-01

The phosphoprotein P of non-segmented negative-sense RNA viruses is an essential component of the replication and transcription complex and acts as a co-factor for the viral RNA-dependent RNA polymerase. P recruits the viral polymerase to the nucleoprotein-bound viral RNA (N-RNA) via an interaction between its C-terminal domain and the N-RNA complex. We have obtained the structure of the C-terminal domain of P of Mokola virus (MOKV), a lyssavirus that belongs to the Rhabdoviridae family and mapped at the amino acid level the crucial positions involved in interaction with N and in the formation of the viral replication complex. Comparison of the N-RNA binding domains of P solved to date suggests that the N-RNA binding domains are structurally conserved among paramyxoviruses and rhabdoviruses in spite of low sequence conservation. We also review the numerous other functions of this domain and more generally of the phosphoprotein.
Nonlinear (time domain) and linearized (time and frequency domain) solutions to the compressible Euler equations in conservation law form

NASA Technical Reports Server (NTRS)

Sreenivas, Kidambi; Whitfield, David L.

1995-01-01

Two linearized solvers (time and frequency domain) based on a high resolution numerical scheme are presented. The basic approach is to linearize the flux vector by expressing it as a sum of a mean and a perturbation. This allows the governing equations to be maintained in conservation law form. A key difference between the time and frequency domain computations is that the frequency domain computations require only one grid block irrespective of the interblade phase angle for which the flow is being computed. As a result of this and due to the fact that the governing equations for this case are steady, frequency domain computations are substantially faster than the corresponding time domain computations. The linearized equations are used to compute flows in turbomachinery blade rows (cascades) arising due to blade vibrations. Numerical solutions are compared to linear theory (where available) and to numerical solutions of the nonlinear Euler equations.
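The linearization described in this abstract can be written out as a short worked derivation. The sketch below is illustrative only: it assumes a one-dimensional conservation law, a steady mean flow, a first-order Taylor expansion of the flux, and a harmonic perturbation. The symbols Q, F, A, and ω are notation chosen here, not taken from the report.

```latex
% Conservation law form (1-D sketch; Q = state vector, F = flux vector)
\frac{\partial Q}{\partial t} + \frac{\partial F(Q)}{\partial x} = 0

% Split the state into a steady mean and a small perturbation
Q = \bar{Q} + Q', \qquad
F(Q) \approx F(\bar{Q}) + A\,Q', \qquad
A = \left.\frac{\partial F}{\partial Q}\right|_{\bar{Q}}

% The mean flow satisfies \partial F(\bar{Q}) / \partial x = 0, which leaves
% the linearized time-domain equation, still in conservation law form
\frac{\partial Q'}{\partial t} + \frac{\partial (A\,Q')}{\partial x} = 0

% Assuming a harmonic perturbation Q' = \mathrm{Re}\,(\hat{Q}\,e^{i\omega t})
% gives the frequency-domain equation, with no explicit time dependence
i\omega\,\hat{Q} + \frac{\partial (A\,\hat{Q})}{\partial x} = 0
```

Under these assumptions the frequency-domain equation contains no time derivative, which is consistent with the abstract's remark that the governing equations for that case are steady.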

PlantTribes: a gene and gene family resource for comparative genomics in plants

PubMed Central

Wall, P. Kerr; Leebens-Mack, Jim; Müller, Kai F.; Field, Dawn; Altman, Naomi S.; dePamphilis, Claude W.

2008-01-01

The PlantTribes database (http://fgp.huck.psu.edu/tribe.html) is a plant gene family database based on the inferred proteomes of five sequenced plant species: Arabidopsis thaliana, Carica papaya, Medicago truncatula, Oryza sativa and Populus trichocarpa. We used the graph-based clustering algorithm MCL [Van Dongen (Technical Report INS-R0010 2000) and Enright et al. (Nucleic Acids Res. 2002; 30: 1575–1584)] to classify all of these species’ protein-coding genes into putative gene families, called tribes, using three clustering stringencies (low, medium and high). For all tribes, we have generated protein and DNA alignments and maximum-likelihood phylogenetic trees. A parallel database of microarray experimental results is linked to the genes, which lets researchers identify groups of related genes and their expression patterns. Unified nomenclatures were developed, and tribes can be related to traditional gene families and conserved domain identifiers. SuperTribes, constructed through a second iteration of MCL clustering, connect distant, but potentially related gene clusters. The global classification of nearly 200 000 plant proteins was used as a scaffold for sorting ∼4 million additional cDNA sequences from over 200 plant species. All data and analyses are accessible through a flexible interface allowing users to explore the classification, to place query sequences within the classification, and to download results for further study. PMID:18073194
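The MCL clustering step named in this abstract alternates two matrix operations, expansion and inflation, until the flow matrix stops changing. The following is a minimal pure-Python sketch of that idea, not the PlantTribes pipeline: the function names, the toy similarity matrix, and the inflation value of 2.0 are assumptions made for illustration.

```python
def normalize_columns(matrix):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] for j in range(n)] for i in range(n)]


def mcl(similarity, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster nodes of a symmetric similarity matrix with a basic MCL loop."""
    n = len(similarity)
    # Self-loops keep every column non-empty before normalization.
    flow = normalize_columns(
        [[float(similarity[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(max_iter):
        # Expansion: square the matrix so flow spreads along two-step paths.
        expanded = [[sum(flow[i][k] * flow[k][j] for k in range(n))
                     for j in range(n)] for i in range(n)]
        # Inflation: raise entries to a power and renormalize, which
        # strengthens strong flows and weakens weak ones.
        updated = normalize_columns(
            [[value ** inflation for value in row] for row in expanded]
        )
        change = max(abs(updated[i][j] - flow[i][j])
                     for i in range(n) for j in range(n))
        flow = updated
        if change < tol:
            break
    # Each non-empty row lists the members of one cluster.
    clusters = {frozenset(j for j, v in enumerate(row) if v > 1e-6)
                for row in flow}
    return sorted(sorted(cluster) for cluster in clusters if cluster)


# Toy input (hypothetical): two groups of three mutually similar proteins.
toy_similarity = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
tribes = mcl(toy_similarity)
print(tribes)  # [[0, 1, 2], [3, 4, 5]]
```

In MCL, a higher inflation value generally yields finer-grained clusters, which is one way a tool can offer several clustering stringencies. The settings actually used for the low, medium and high stringencies are not given in the abstract.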
The aquatic animals' transcriptome resource for comparative functional analysis.

PubMed

Chou, Chih-Hung; Huang, Hsi-Yuan; Huang, Wei-Chih; Hsu, Sheng-Da; Hsiao, Chung-Der; Liu, Chia-Yu; Chen, Yu-Hung; Liu, Yu-Chen; Huang, Wei-Yun; Lee, Meng-Lin; Chen, Yi-Chang; Huang, Hsien-Da

2018-05-09

Aquatic animals have great economic and ecological importance. Among them, non-model organisms have been studied regarding eco-toxicity, stress biology, and environmental adaptation. Due to recent advances in next-generation sequencing techniques, large amounts of RNA-seq data for aquatic animals are publicly available. However, no comprehensive resource currently exists for the analysis, unification, and integration of these datasets. This study utilizes computational approaches to build a new resource of transcriptomic maps for aquatic animals. This aquatic animal transcriptome map database, dbATM, provides de novo transcriptome assembly, gene annotation and comparative analysis for more than twenty aquatic organisms without a draft genome. To improve assembly quality, three computational tools (Trinity, Oases and SOAPdenovo-Trans) were employed for individual transcriptome assembly, and CAP3 and CD-HIT-EST were then used to merge the three assembled transcriptomes. In addition, functional annotation analysis provides valuable clues to gene characteristics, including full-length transcript coding regions, conserved domains, gene ontology and KEGG pathways. Furthermore, the genes of all included aquatic animals are made available for comparative genomics tasks such as constructing homologous gene groups and BLAST databases and performing phylogenetic analysis. In conclusion, we establish a resource for non-model aquatic animals, which are of great economic and ecological importance, and provide transcriptomic information including functional annotation and comparative transcriptome analysis. The database is now publicly accessible at http://dbATM.mbc.nctu.edu.tw/.
Functional Genomics Analysis of Singapore Grouper Iridovirus: Complete Sequence Determination and Proteomic Analysis

PubMed Central

Song, Wen Jun; Qin, Qi Wei; Qiu, Jin; Huang, Can Hua; Wang, Fan; Hew, Choy Leong

2004-01-01

Here we report the complete genome sequence of Singapore grouper iridovirus (SGIV). Sequencing of the random shotgun and restriction endonuclease genomic libraries showed that the entire SGIV genome consists of 140,131 nucleotide bp. One hundred sixty-two open reading frames (ORFs) from the sense and antisense DNA strands, coding for lengths varying from 41 to 1,268 amino acids, were identified. Computer-assisted analyses of the deduced amino acid sequences revealed that 77 of the ORFs exhibited homologies to known virus genes, 23 of which matched functional iridovirus proteins. Forty-two putative conserved domains or signatures were detected in the National Center for Biotechnology Information CD-Search database and PROSITE database. An assortment of enzyme activities involved in DNA replication, transcription, nucleotide metabolism, cell signaling, etc., were identified. Viruses were cultured on a cell line derived from the embryonated egg of the grouper Epinephelus tauvina, isolated, and purified by sucrose gradient ultracentrifugation. The protein extract from the purified virions was analyzed by polyacrylamide gel electrophoresis followed by in-gel digestion of protein bands. Matrix-assisted laser desorption ionization-time of flight mass spectrometry and database searching led to identification of 26 proteins. Twenty of these represented novel or previously unidentified genes, which were further confirmed by reverse transcription-PCR (RT-PCR) and DNA sequencing of their respective RT-PCR products. PMID:15507645
Pleurochrysome: A Web Database of Pleurochrysis Transcripts and Orthologs Among Heterogeneous Algae

PubMed Central

Fujiwara, Shoko; Takatsuka, Yukiko; Hirokawa, Yasutaka; Tsuzuki, Mikio; Takano, Tomoyuki; Kobayashi, Masaaki; Suda, Kunihiro; Asamizu, Erika; Yokoyama, Koji; Shibata, Daisuke; Tabata, Satoshi; Yano, Kentaro

2016-01-01

Pleurochrysis is a coccolithophorid genus, which belongs to the Coccolithales in the Haptophyta. The genus has been used extensively for biological research, together with Emiliania in the Isochrysidales, to understand distinctive features between the two coccolithophorid-including orders. However, molecular biological research on Pleurochrysis such as elucidation of the molecular mechanism behind coccolith formation has not made great progress at least in part because of lack of comprehensive gene information. To provide such information to the research community, we built an open web database, the Pleurochrysome (http://bioinf.mind.meiji.ac.jp/phapt/), which currently stores 9,023 unique gene sequences (designated as UNIGENEs) assembled from expressed sequence tag sequences of P. haptonemofera as core information. The UNIGENEs were annotated with gene sequences sharing significant homology, conserved domains, Gene Ontology, KEGG Orthology, predicted subcellular localization, open reading frames and orthologous relationship with genes of 10 other algal species, a cyanobacterium and the yeast Saccharomyces cerevisiae. This sequence and annotation information can be easily accessed via several search functions. Besides fundamental functions such as BLAST and keyword searches, this database also offers search functions to explore orthologous genes in the 12 organisms and to seek novel genes. The Pleurochrysome will promote molecular biological and phylogenetic research on coccolithophorids and other haptophytes by helping scientists mine data from the primary transcriptome of P. haptonemofera. PMID:26746174
DoOPSearch: a web-based tool for finding and analysing common conserved motifs in the promoter regions of different chordate and plant genes

PubMed Central

Sebestyén, Endre; Nagy, Tibor; Suhai, Sándor; Barta, Endre

2009-01-01

Background The comparative genomic analysis of a large number of orthologous promoter regions of the chordate and plant genes from the DoOP databases shows thousands of conserved motifs. Most of these motifs differ from any known transcription factor binding site (TFBS). To identify common conserved motifs, we need a specific tool to be able to search amongst them. Since conserved motifs from the DoOP databases are linked to genes, the result of such a search can give a list of genes that are potentially regulated by the same transcription factor(s). Results We have developed a new tool called DoOPSearch for the analysis of the conserved motifs in the promoter regions of chordate or plant genes. We used the orthologous promoters of the DoOP database to extract thousands of conserved motifs from different taxonomic groups. The advantage of this approach is that different sets of conserved motifs might be found depending on how broad the taxonomic coverage of the underlying orthologous promoter sequence collection is (consider e.g. primates vs. mammals or Brassicaceae vs. Viridiplantae). The DoOPSearch tool allows the users to search these motif collections or the promoter regions of DoOP with user supplied query sequences or any of the conserved motifs from the DoOP database. To find overrepresented gene ontologies, the gene lists obtained can be analysed further using a modified version of the GeneMerge program. Conclusion We present here a comparative genomics based promoter analysis tool. Our system is based on a unique collection of conserved promoter motifs characteristic of different taxonomic groups. We offer both a command line and a web-based tool for searching in these motif collections using user specified queries. These can be either short promoter sequences or consensus sequences of known transcription factor binding sites. 
The GeneMerge analysis of the search results allows the user to identify statistically overrepresented Gene Ontology terms that might provide a clue on the function of the motifs and genes. PMID:19534755
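Searching promoter regions with a consensus sequence, as this abstract describes, can be illustrated with a small IUPAC-aware matcher. This is a hedged sketch and not the DoOPSearch implementation: the function names, the example promoters, and the query `TGACGY` are hypothetical, and real tools typically also score mismatches and use position weight matrices.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(consensus, promoters):
    """Map each gene to (strand, forward-strand start) hits of the consensus."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    # The lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(" + body + "))")
    width = len(consensus)
    hits = {}
    for gene, seq in promoters.items():
        seq = seq.upper()
        found = [("+", m.start()) for m in pattern.finditer(seq)]
        # Scan the reverse strand and convert back to forward coordinates.
        for m in pattern.finditer(reverse_complement(seq)):
            found.append(("-", len(seq) - m.start() - width))
        if found:
            hits[gene] = found
    return hits


# Hypothetical promoter fragments keyed by made-up gene names.
example_promoters = {
    "geneA": "TTGACGTCATT",
    "geneB": "AAAAAAAA",
    "geneC": "CCTGACGTAAG",
}
motif_hits = find_motif("TGACGY", example_promoters)
print(sorted(motif_hits))  # ['geneA', 'geneC']
```

The genes that share a hit form the kind of gene list that could then be passed to an enrichment analysis of Gene Ontology terms.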
Conformational Flexibility Enables the Function of a BECN1 Region Essential for Starvation-Mediated Autophagy

DOE PAGES

Mei, Yang; Ramanathan, Arvind; Glover, Karen; ...

2016-03-03

BECN1 is essential for autophagy, a critical eukaryotic cellular homeostasis pathway. In this study, we delineate a highly conserved BECN1 domain located between the previously characterized BH3 and coiled-coil domains and elucidate its structure and role in autophagy. The 2.0 Å sulfur-single-wavelength anomalous dispersion X-ray crystal structure of this domain demonstrates that its N-terminal half is unstructured while its C-terminal half is helical; hence, we name it the flexible helical domain (FHD). Circular dichroism spectroscopy, double electron–electron resonance–electron paramagnetic resonance, and small-angle X-ray scattering (SAXS) analyses confirm that the FHD is partially disordered, even in the context of adjacent BECN1 domains. Molecular dynamic simulations fitted to SAXS data indicate that the FHD transiently samples more helical conformations. FHD helicity increases in 2,2,2-trifluoroethanol, suggesting it may become more helical upon binding. Finally, cellular studies show that conserved FHD residues are required for starvation-induced autophagy. Thus, the FHD likely undergoes a binding-associated disorder-to-helix transition, and conserved residues critical for this interaction are essential for starvation-induced autophagy.
Transcriptome analysis in cotton boll weevil (Anthonomus grandis) and RNA interference in insect pests.

PubMed

Firmino, Alexandre Augusto Pereira; Fonseca, Fernando Campos de Assis; de Macedo, Leonardo Lima Pepino; Coelho, Roberta Ramos; Antonino de Souza, José Dijair; Togawa, Roberto Coiti; Silva-Junior, Orzenil Bonfim; Pappas, Georgios Joannis; da Silva, Maria Cristina Mattar; Engler, Gilbert; Grossi-de-Sa, Maria Fatima

2013-01-01

Cotton plants are attacked by several insect pests. In Brazil, the cotton boll weevil, Anthonomus grandis, is the most important cotton pest. The use of insecticidal proteins and gene silencing by RNA interference (RNAi) are promising strategies for insect control that have been applied in the last few years. For this insect, little molecular information is available in databases. Using 454-pyrosequencing methodology, the transcriptome of all developmental stages of A. grandis was analyzed. The transcriptome analysis resulted in more than 500,000 reads and a data set of 20,841 high-quality contigs. After sequence assembly and annotation, around 10,600 contigs had at least one BLAST hit against the NCBI non-redundant protein database, and 65.7% were similar to Tribolium castaneum sequences. A comparison of A. grandis, Drosophila melanogaster and Bombyx mori protein family data showed higher similarity to dipteran than to lepidopteran sequences. Several contigs of genes encoding proteins involved in the RNAi mechanism were found. PAZ domain sequences extracted from the transcriptome showed high similarity and conservation of the most important functional and structural motifs when compared to PAZ domains from 5 species. Two SID-like contigs were phylogenetically analyzed and grouped with T. castaneum SID-like proteins. No RdRP gene was found. A contig matching chitin synthase 1 was mined from the transcriptome. dsRNA microinjection of a chitin synthase gene into A. grandis female adults resulted in normal oviposition of unviable eggs and malformed live larvae that were unable to develop on an artificial diet. This is the first study that characterizes the transcriptome of the coleopteran A. grandis. A new and representative transcriptome database for this insect pest is now available. All data support the state of the art of the RNAi mechanism in insects. PMID:24386449
A Large Complement of the Predicted Arabidopsis ARM Repeat Proteins Are Members of the U-Box E3 Ubiquitin Ligase Family

PubMed Central

Mudgil, Yashwanti; Shiu, Shin-Han; Stone, Sophia L.; Salt, Jennifer N.; Goring, Daphne R.

2004-01-01

The Arabidopsis genome was searched to identify predicted proteins containing armadillo (ARM) repeats, a motif known to mediate protein-protein interactions in a number of different animal proteins. Using domain database predictions and models generated in this study, 108 Arabidopsis proteins were identified that contained a minimum of two ARM repeats with the majority of proteins containing four to eight ARM repeats. Clustering analysis showed that the 108 predicted Arabidopsis ARM repeat proteins could be divided into multiple groups with wide differences in their domain compositions and organizations. Interestingly, 41 of the 108 Arabidopsis ARM repeat proteins contained a U-box, a motif present in a family of E3 ligases, and these proteins represented the largest class of Arabidopsis ARM repeat proteins. In 14 of these U-box/ARM repeat proteins, there was also a novel conserved domain identified in the N-terminal region. Based on the phylogenetic tree, representative U-box/ARM repeat proteins were selected for further study. RNA-blot analyses revealed that these U-box/ARM proteins are expressed in a variety of tissues in Arabidopsis. In addition, the selected U-box/ARM proteins were found to be functional E3 ubiquitin ligases. Thus, these U-box/ARM proteins represent a new family of E3 ligases in Arabidopsis. PMID:14657406
Combining protein sequence, structure, and dynamics: A novel approach for functional evolution analysis of PAS domain superfamily.

PubMed

Dong, Zheng; Zhou, Hongyu; Tao, Peng

2018-02-01

PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore functional evolutionary relationship among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from RCSB Protein Data Bank of the PAS domain superfamily belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The result showed that the proteins with same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant difference in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.
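Selecting sequence-conserved residues from an alignment, as mentioned in this abstract, is often done with a per-column variability score. The sketch below uses Shannon entropy as one common choice. The abstract does not state which measure the authors used, so the scoring function, the thresholds, and the toy alignment are assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column.

    Gap characters ('-') are ignored; an all-gap column returns None.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return None
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conserved_positions(alignment, max_entropy=0.0, max_gap_fraction=0.0):
    """Return 0-based column indices whose entropy and gap content are low."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    positions = []
    for i in range(length):
        column = [seq[i] for seq in alignment]
        gap_fraction = column.count("-") / len(column)
        entropy = column_entropy(column)
        if entropy is None or gap_fraction > max_gap_fraction:
            continue
        if entropy <= max_entropy:
            positions.append(i)
    return positions


# Toy alignment (hypothetical sequences, not taken from the study).
toy_alignment = ["MKTAY", "MKSAY", "MRTAY"]
print(conserved_positions(toy_alignment))  # [0, 3, 4]
```

Raising `max_entropy` admits columns with conservative variation. The resulting positions could then be compared with per-residue fluctuations from a dynamics model, which is the kind of comparison the abstract reports.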
Evolution of the Structure and Chromosomal Distribution of Histidine Biosynthetic Genes

NASA Astrophysics Data System (ADS)

Fani, Renato; Mori, Elena; Tamburini, Elena; Lazcano, Antonio

1998-10-01

A database of more than 100 histidine biosynthetic genes from different organisms belonging to the three primary domains has been analyzed, including those found in the now completely sequenced genomes of Haemophilus influenzae, Mycoplasma genitalium, Synechocystis sp., Methanococcus jannaschii, and Saccharomyces cerevisiae. The ubiquity of his genes suggests that it is a highly conserved pathway that was probably already present in the last common ancestor of all extant life. The chromosomal distribution of the his genes shows that the enterobacterial histidine operon structure is not the only possible organization, and that there is a diversity of gene arrays for the his pathway. Analysis of the available sequences shows that gene fusions (like those involved in the origin of the Escherichia coli and Salmonella typhimurium hisIE and hisB gene structures) are not universal. In contrast, the elongation event that led to the extant hisA gene from two homologous ancestral modules, as well as the subsequent paralogous duplication that originated hisF, appear to be irreversible and are conserved in all known organisms. The available evidence supports the hypothesis that histidine biosynthesis was assembled by a gene recruitment process.
Role of conserved cysteine residues in Herbaspirillum seropedicae NifA activity.

PubMed

Oliveira, Marco A S; Baura, Valter A; Aquino, Bruno; Huergo, Luciano F; Kadowaki, Marco A S; Chubatsu, Leda S; Souza, Emanuel M; Dixon, Ray; Pedrosa, Fábio O; Wassem, Roseli; Monteiro, Rose A

2009-01-01

Herbaspirillum seropedicae is an endophytic diazotrophic bacterium that associates with economically important crops. NifA protein, the transcriptional activator of nif genes in H. seropedicae, binds to nif promoters and, together with RNA polymerase-sigma(54) holoenzyme, catalyzes the formation of open complexes to allow transcription initiation. The activity of H. seropedicae NifA is controlled by ammonium and oxygen levels, but the mechanisms of such control are unknown. Oxygen sensitivity is attributed to a conserved motif of cysteine residues in NifA that spans the central AAA+ domain and the interdomain linker that connects the AAA+ domain to the C-terminal DNA binding domain. Here we mutagenized this conserved motif of cysteines and assayed the activity of mutant proteins in vivo. We also purified the mutant variants of NifA and tested their capacity to bind to the nifB promoter region. Chimeric proteins between H. seropedicae NifA, an oxygen-sensitive protein, and Azotobacter vinelandii NifA, an oxygen-tolerant protein, were constructed and showed that the oxygen response is conferred by the central AAA+ and C-terminal DNA binding domains of H. seropedicae NifA. We conclude that the conserved cysteine motif is essential for NifA activity, although single cysteine-to-serine mutants are still competent at binding DNA.
Genome-wide identification and phylogenetic analysis of the AP2/ERF gene superfamily in sweet orange (Citrus sinensis).

PubMed

Ito, T M; Polido, P B; Rampim, M C; Kaschuk, G; Souza, S G H

2014-09-26

Sweet orange (Citrus sinensis) plays an important role in the economy of more than 140 countries, but it is grown in areas with intermittent stressful soil and climatic conditions. The stress tolerance could be addressed by manipulating the ethylene response factor (ERF) transcription factors because they orchestrate plant responses to environmental stress. We performed an in silico study on the ERFs in the expressed sequence tag database of C. sinensis to identify potential genes that regulate plant responses to stress. We identified 108 putative genes encoding protein sequences of the AP2/ERF superfamily distributed within 10 groups of amino acid sequences. Ninety-one genes were assembled from the ERF family containing only one AP2/ERF domain, 13 genes were assembled from the AP2 family containing two AP2/ERF domains, and four other genes were assembled from the RAV family containing one AP2/ERF domain and a B3 domain. Some conserved domains of the ERF family genes were disrupted into a few segments by introns. This irregular distribution of genes in the AP2/ERF superfamily in different plant species could be a result of genomic losses or duplication events in a common ancestor. The in silico gene expression revealed that 67% of AP2/ERF genes are expressed in tissues with usual plant development, and 14% were expressed in stressed tissues. Because the AP2/ERF superfamily is expressed in an orchestrated way, it is possible that the manipulation of only one gene may result in changes in the whole plant function, which could result in more tolerant crops.
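The family assignment described above depends only on which domains a protein carries, so it can be written as a small rule. The sketch below assumes each protein has already been annotated with a list of domain names; the labels "AP2/ERF" and "B3" and the input lists are hypothetical stand-ins, sized to match the counts in the abstract.

```python
from collections import Counter

def classify_ap2_erf(domains):
    """Assign a family from a protein's list of annotated domain names."""
    n_ap2 = domains.count("AP2/ERF")
    has_b3 = "B3" in domains
    if n_ap2 == 1 and has_b3:
        return "RAV"           # one AP2/ERF domain plus a B3 domain
    if n_ap2 == 2:
        return "AP2"           # two AP2/ERF domains
    if n_ap2 == 1:
        return "ERF"           # a single AP2/ERF domain
    return "unclassified"

# Hypothetical annotations: 91 ERF-like, 13 AP2-like, 4 RAV-like proteins.
annotations = (
    [["AP2/ERF"]] * 91
    + [["AP2/ERF", "AP2/ERF"]] * 13
    + [["AP2/ERF", "B3"]] * 4
)
counts = Counter(classify_ap2_erf(d) for d in annotations)
print(counts, sum(counts.values()))
```

The three family counts add up to the 108 putative genes reported in the abstract.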
MOCASSIN-prot: A multi-objective clustering approach for protein similarity networks

USDA-ARS's Scientific Manuscript database

Motivation: Proteins often include multiple conserved domains. Various evolutionary events including duplication and loss of domains, domain shuffling, as well as sequence divergence contribute to generating complexities in protein structures, and consequently, in their functions. The evolutionary h...
Landuse and agricultural management practice web-service (LAMPS) for agroecosystem modeling and conservation planning

USDA-ARS's Scientific Manuscript database

Agroecosystem models and conservation planning tools require spatially and temporally explicit input data about agricultural management operations. The USDA Natural Resources Conservation Service is developing a Land Management and Operation Database (LMOD) which contains potential model input, howe...
J domain independent functions of J proteins.

PubMed

Ajit Tamadaddi, Chetana; Sahi, Chandan

2016-07-01

Heat shock proteins of 40 kDa (Hsp40s), also called J proteins, are obligate partners of Hsp70s. Via their highly conserved and functionally critical J domain, J proteins interact and modulate the activity of their Hsp70 partners. Mutations in the critical residues in the J domain often result in the null phenotype for the J protein in question. However, as more J proteins have been characterized, it is becoming increasingly clear that a significant number of J proteins do not "completely" rely on their J domains to carry out their cellular functions, as previously thought. In some cases, regions outside the highly conserved J domain have become more important making the J domain dispensable for some, if not for all functions of a J protein. This has profound effects on the evolution of such J proteins. Here we present selected examples of J proteins that perform J domain independent functions and discuss this in the context of evolution of J proteins with dispensable J domains and J-like proteins in eukaryotes.
Structure and regulatory role of the C-terminal winged helix domain of the archaeal minichromosome maintenance complex

PubMed Central

Wiedemann, Christoph; Szambowska, Anna; Häfner, Sabine; Ohlenschläger, Oliver; Gührs, Karl-Heinz; Görlach, Matthias

2015-01-01

The minichromosome maintenance complex (MCM) represents the replicative DNA helicase both in eukaryotes and archaea. Here, we describe the solution structure of the C-terminal domains of the archaeal MCMs of Sulfolobus solfataricus (Sso) and Methanothermobacter thermautotrophicus (Mth). Those domains consist of a structurally conserved truncated winged helix (WH) domain lacking the two typical ‘wings’ of canonical WH domains. A less conserved N-terminal extension links this WH module to the MCM AAA+ domain forming the ATPase center. In the Sso MCM this linker contains a short α-helical element. Using Sso MCM mutants, including chimeric constructs containing Mth C-terminal domain elements, we show that the ATPase and helicase activity of the Sso MCM is significantly modulated by the short α-helical linker element and by N-terminal residues of the first α-helix of the truncated WH module. Finally, based on our structural and functional data, we present a docking-derived model of the Sso MCM, which implies an allosteric control of the ATPase center by the C-terminal domain. PMID:25712103
PHYSICO2: an UNIX based standalone procedure for computation of physicochemical, window-dependent and substitution based evolutionary properties of protein sequences along with automated block preparation tool, version 2

PubMed Central

Banerjee, Shyamashree; Gupta, Parth Sarthi Sen; Nayek, Arnab; Das, Sunit; Sur, Vishma Pratap; Seth, Pratyay; Islam, Rifat Nawaz Ul; Bandyopadhyay, Amal K

2015-01-01

Automated genome sequencing is enriching the sequence databases very quickly. To keep a balance between the entry of sequences into the databases and their analysis, efficient software is required. To this end, PHYSICO2, compared to the earlier PHYSICO and other public domain tools, is most efficient in that it i] extracts physicochemical, window-dependent and homologous-position-based-substitution (PWS) properties including positional and BLOCK-specific diversity and conservation, ii] provides users with optional flexibility in setting relevant input parameters, iii] helps users to prepare a BLOCK-FASTA file using the Automated Block Preparation Tool of the program, iv] performs fast, accurate and user-friendly analyses and v] redirects itemized outputs in Excel format along with detailed methodology. The program package contains documentation describing application of the methods. Overall, the program acts as an efficient PWS analyzer and finds application in sequence bioinformatics. Availability: PHYSICO2 is freely available at http://sourceforge.net/projects/physico2/ along with its documentation at https://sourceforge.net/projects/physico2/files/Documentation.pdf/download for all users. PMID:26339154
TFIID TAF6-TAF9 Complex Formation Involves the HEAT Repeat-containing C-terminal Domain of TAF6 and Is Modulated by TAF5 Protein

PubMed Central

Scheer, Elisabeth; Delbac, Frédéric; Tora, Laszlo; Moras, Dino; Romier, Christophe

2012-01-01

The general transcription factor TFIID specifically recognizes the core promoter of genes transcribed by eukaryotic RNA polymerase II, nucleating the assembly of the preinitiation complex at the transcription start site. However, TFIID assembly and function remain poorly understood in molecular terms. Histone fold motifs have been shown to be extremely important for the heterodimerization of many TFIID subunits. However, these subunits display several evolutionarily conserved noncanonical features when compared with histones, including additional regions whose role is unknown. Here we show that the conserved additional C-terminal region of TFIID subunit TAF6 can be divided into two domains: a small middle domain (TAF6M) and a large C-terminal domain (TAF6C). Our crystal structure of the TAF6C domain from Antonospora locustae at 1.9 Å resolution reveals the presence of five conserved HEAT repeats. Based on these data, we designed several mutants that were introduced into full-length human TAF6. Surprisingly, the mutants affect the interaction between TAF6 and TAF9, suggesting that the formation of the complex between these two TFIID subunits does not depend only on their histone fold motifs. In addition, the same mutants affect even more strongly the interaction between TAF6 and TAF9 in the context of a TAF5-TAF6-TAF9 complex. Expression of these mutants in HeLa cells reveals that most of them are unstable, suggesting their poor incorporation within endogenous TFIID. Taken together, our results suggest that the conserved additional domains in histone fold-containing subunits of TFIID and of co-activator SAGA are important for the assembly of these complexes. PMID:22696218
Comparative analysis of the L, M, and S RNA segments of Crimean-Congo haemorrhagic fever virus isolates from southern Africa.

PubMed

Goedhals, Dominique; Bester, Phillip A; Paweska, Janusz T; Swanepoel, Robert; Burt, Felicity J

2015-05-01

Crimean-Congo haemorrhagic fever virus (CCHFV) is a member of the Bunyaviridae family with a tripartite, negative sense RNA genome. This study used predictive software to analyse the L (large), M (medium), and S (small) segments of 14 southern African CCHFV isolates. The OTU-like cysteine protease domain and the RdRp domain of the L segment are highly conserved among southern African CCHFV isolates. The M segment encodes the structural glycoproteins, G(N) and G(C), and the non-structural glycoproteins, which are post-translationally cleaved at highly conserved furin and subtilase SKI-1 cleavage sites. All of the sites previously identified were shown to be conserved among southern African CCHFV isolates. The heavily O-glycosylated N-terminal variable mucin-like domain of the M segment shows the highest sequence variability of the CCHFV proteins. Five transmembrane domains are predicted in the M segment polyprotein, resulting in three regions internal to and three regions external to the membrane across the G(N), NS(M) and G(C) glycoproteins. The corroboration of conserved genome domains and sequence identity among geographically diverse isolates may assist in the identification of protein function and pathogenic mechanisms, as well as the identification of potential targets for antiviral therapy and vaccine design. As detailed functional studies are lacking for many of the CCHFV proteins, identification of functional domains by prediction of protein structure, and identification of amino acid level similarity to functionally characterised proteins of related viruses or viruses with similar pathogenic mechanisms, are necessary steps for selection of areas for further study. © 2015 Wiley Periodicals, Inc.
MyTH4-FERM myosins have an ancient and conserved role in filopod formation

PubMed Central

Goodson, Holly V.; Arthur, Ashley L.; Luxton, G. W. Gant; Houdusse, Anne; Titus, Margaret A.

2016-01-01

The formation of filopodia in Metazoa and Amoebozoa requires the activity of myosin 10 (Myo10) in mammalian cells and of Dictyostelium unconventional myosin 7 (DdMyo7) in the social amoeba Dictyostelium. However, the exact roles of these MyTH4-FERM myosins (myosin tail homology 4-band 4.1, ezrin, radixin, moesin; MF) in the initiation and elongation of filopodia are not well defined and may reflect conserved functions among phylogenetically diverse MF myosins. Phylogenetic analysis of MF myosin domains suggests that a single ancestral MF myosin existed with a structure similar to DdMyo7, which has two MF domains, and that subsequent duplications in the metazoan lineage produced its functional homolog Myo10. The essential functional features of the DdMyo7 myosin were identified using quantitative live-cell imaging to characterize the ability of various mutants to rescue filopod formation in myo7-null cells. The two MF domains were found to function redundantly in filopod formation with the C-terminal FERM domain regulating both the number of filopodia and their elongation velocity. DdMyo7 mutants consisting solely of the motor plus a single MyTH4 domain were found to be capable of rescuing the formation of filopodia, establishing the minimal elements necessary for the function of this myosin. Interestingly, a chimeric myosin with the Myo10 MF domain fused to the DdMyo7 motor also was capable of rescuing filopod formation in the myo7-null mutant, supporting fundamental functional conservation between these two distant myosins. Together, these findings reveal that MF myosins have an ancient and conserved role in filopod formation. PMID:27911821
CCProf: exploring conformational change profile of proteins

PubMed Central

Chang, Che-Wei; Chou, Chai-Wei; Chang, Darby Tien-Hao

2016-01-01

In many biological processes, proteins have important interactions with various molecules such as proteins, ions or ligands. Many proteins undergo conformational changes upon these interactions, where regions with large conformational changes are critical to the interactions. This work presents the CCProf platform, which provides the conformational changes of entire proteins, here named the conformational change profile (CCP). CCProf aims to be a platform where users can study potential causes of novel conformational changes. It provides 10 biological features, including conformational change, potential binding target site, secondary structure, conservation, disorder propensity, hydropathy propensity, sequence domain, structural domain, phosphorylation site and catalytic site. All this information is integrated into a well-aligned view, so that researchers can visually capture important relationships between different biological features. CCProf contains 986 187 protein structure pairs for 3123 proteins. In addition, CCProf provides a 3D view in which users can see the protein structures before and after conformational changes as well as the binding targets that induce conformational changes. All information (e.g. CCP, binding targets and protein structures) shown in CCProf, including intermediate data, is available for download to expedite further analyses. Database URL: http://zoro.ee.ncku.edu.tw/ccprof/ PMID:27016699
Costimulatory receptors in jawed vertebrates: Conserved CD28, odd CTLA4 and multiple BTLAs

USGS Publications Warehouse

Bernard, D.; Hansen, J.D.; Du, Pasquier L.; Lefranc, M.-P.; Benmansour, A.; Boudinot, P.

2007-01-01

The CD28 family of costimulatory receptors comprises molecules with a single V-type extracellular Ig domain, a transmembrane region, and an intracytoplasmic region with signaling motifs. CD28 and cytotoxic T lymphocyte antigen-4 (CTLA4) homologs have been recently identified in rainbow trout. Other sequences similar to mammalian CD28 family members have now been identified using teleost, Xenopus and chicken databases. CD28 and CTLA4 homologs were found in all vertebrate classes, whereas inducible costimulatory signal (ICOS) was restricted to tetrapods, and programmed cell death-1 (PD1) was limited to mammals and chicken. Multiple B and T Lymphocyte Attenuator (BTLA) sequences were found in teleosts, but not in Xenopus or in avian genomes. The intron/exon structure of btlas was different from that of cd28 and other members of the family. The Ig domain encoded in all the btla genes has features of the C-type structure, which suggests that BTLA does not belong to the CD28 family. The genomic localization of these genes in vertebrate genomes supports the split between the BTLA and CD28 families. © 2006 Elsevier Ltd. All rights reserved.
Analysis and Functional Annotation of an Expressed Sequence Tag Collection for Tropical Crop Sugarcane

PubMed Central

Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo

2003-01-01

To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979
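The percentages quoted above can be checked against the raw counts with a few lines of arithmetic. The sketch below uses only figures from the abstract; the variable names are illustrative.

```python
transcripts = 43_141     # assembled putative transcripts
full_length = 14_409     # assemblies with at least one full-length clone
unique_genes = 33_620    # unique genes estimated after reassembly

full_length_share = full_length / transcripts
implied_redundancy = 1 - unique_genes / transcripts
genes_at_22_percent = transcripts * (1 - 0.22)

print(f"full-length share: {full_length_share:.1%}")
print(f"implied redundancy: {implied_redundancy:.1%}")
print(f"unique genes at exactly 22%: {genes_at_22_percent:.0f}")
```

The full-length share and the implied redundancy both agree with the rounded 33% and 22% in the abstract. Applying exactly 22% gives a gene count slightly above 33,620, which is presumably a rounding effect in the reported percentage.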
Structural and functional determinants of conserved lipid interaction domains of inward rectifying Kir6.2 channels.

PubMed

Cukras, Catherine A; Jeliazkova, Iana; Nichols, Colin G

2002-06-01

All members of the inward rectifier K(+) (Kir) channel family are activated by phosphoinositides and other amphiphilic lipids. To further elucidate the mechanistic basis, we examined the membrane association of Kir6.2 fragments of K(ATP) channels, and the effects of site-directed mutations of these fragments and full-length Kir6.2 on membrane association and K(ATP) channel activity, respectively. GFP-tagged Kir6.2 COOH terminus and GFP-tagged pleckstrin homology domain from phospholipase C delta1 both associate with isolated membranes, and association of each is specifically reduced by muscarinic m1 receptor-mediated phospholipid depletion. Kir COOH termini are predicted to contain multiple beta-strands and a conserved alpha-helix (residues approximately 306-311 in Kir6.2). Systematic mutagenesis of D307-F315 reveals a critical role of E308, I309, W311 and F315, consistent with residues lying on one side of an alpha-helix. Together with systematic mutation of conserved charges, the results define critical determinants of a conserved domain that underlies phospholipid interaction in Kir channels.
Proteomic Identification of Monoclonal Antibodies from Serum

PubMed Central

2015-01-01

Characterizing the in vivo dynamics of the polyclonal antibody repertoire in serum, such as that which might arise in response to stimulation with an antigen, is difficult due to the presence of many highly similar immunoglobulin proteins, each specified by distinct B lymphocytes. These challenges have precluded the use of conventional mass spectrometry for antibody identification based on peptide mass spectral matches to a genomic reference database. Recently, progress has been made using bottom-up analysis of serum antibodies by nanoflow liquid chromatography/high-resolution tandem mass spectrometry combined with a sample-specific antibody sequence database generated by high-throughput sequencing of individual B cell immunoglobulin variable domains (V genes). Here, we describe how intrinsic features of antibody primary structure, most notably the interspersed segments of variable and conserved amino acid sequences, generate recurring patterns in the corresponding peptide mass spectra of V gene peptides, greatly complicating the assignment of correct sequences to mass spectral data. We show that the standard method of decoy-based error modeling fails to account for the error introduced by these highly similar sequences, leading to a significant underestimation of the false discovery rate. Because of these effects, antibody-derived peptide mass spectra require increased stringency in their interpretation. The use of filters based on the mean precursor ion mass accuracy of peptide-spectrum matches is shown to be particularly effective in distinguishing between “true” and “false” identifications. These findings highlight important caveats associated with the use of standard database search and error-modeling methods with nonstandard data sets and custom sequence databases. PMID:24684310
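For readers unfamiliar with the two ideas named above, the sketch below shows a generic target-decoy FDR estimate alongside a filter on mean precursor mass error. It is an assumption-laden illustration, not the authors' pipeline: the `PSM` record, the 1.5 ppm threshold, the use of the mean absolute error, and all values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PSM:
    peptide: str
    score: float            # search-engine score, higher is better
    is_decoy: bool          # True if matched to a decoy (e.g. reversed) sequence
    observed_mz: float
    theoretical_mz: float

def ppm_error(psm):
    """Precursor mass error in parts per million."""
    return (psm.observed_mz - psm.theoretical_mz) / psm.theoretical_mz * 1e6

def decoy_fdr(psms, min_score):
    """Estimate FDR as decoy hits / target hits among PSMs at or above min_score."""
    kept = [p for p in psms if p.score >= min_score]
    decoys = sum(p.is_decoy for p in kept)
    targets = len(kept) - decoys
    return decoys / targets if targets else 0.0

def passes_mass_filter(psms_for_peptide, max_mean_abs_ppm=1.5):
    """Keep a peptide only if its mean absolute precursor error is small."""
    errors = [abs(ppm_error(p)) for p in psms_for_peptide]
    return sum(errors) / len(errors) <= max_mean_abs_ppm

# Made-up matches for illustration only.
psms = [
    PSM("PEPTIDEA", 60.0, False, 500.0005, 500.0),
    PSM("PEPTIDEA", 55.0, False, 500.0003, 500.0),
    PSM("PEPTIDEB", 52.0, False, 500.0025, 500.0),
    PSM("PEPTIDEC", 51.0, False, 500.0004, 500.0),
    PSM("REVPEPTA", 50.5, True, 500.0020, 500.0),
    PSM("PEPTIDED", 20.0, False, 500.0001, 500.0),
]
print(decoy_fdr(psms, min_score=50.0))
print(passes_mass_filter([p for p in psms if p.peptide == "PEPTIDEB"]))
```

The decoy estimate treats every target sequence as equally likely to attract a false match. The abstract argues that this assumption breaks down for highly similar V gene peptides, which is why an orthogonal criterion such as mass accuracy is applied on top of it.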
Development of five digits is controlled by a bipartite long-range cis-regulator

PubMed Central

Lettice, Laura A.; Williamson, Iain; Devenney, Paul S.; Kilanowski, Fiona; Dorin, Julia; Hill, Robert E.

2014-01-01

Conservation within intergenic DNA often highlights regulatory elements that control gene expression from a long range. How conservation within a single element relates to regulatory information and how internal composition relates to function is unknown. Here, we examine the structural features of the highly conserved ZRS (also called MFCS1) cis-regulator responsible for the spatiotemporal control of Shh in the limb bud. By systematically dissecting the ZRS, both in transgenic assays and within the endogenous locus, we show that the ZRS is, in effect, composed of two distinct domains of activity: one domain directs spatiotemporal activity but functions predominantly from a short range, whereas a second domain is required to promote long-range activity. We show further that these two domains encode activities that are highly integrated and that the second domain is crucial in promoting the chromosomal conformational changes correlated with gene activity. During limb bud development, these activities encoded by the ZRS are interpreted differently by the fore limbs and the hind limbs; in the absence of the second domain there is no Shh activity in the fore limb, and in the hind limb low levels of Shh lead to a variant digit pattern ranging from two to four digits. Hence, in the embryo, the second domain stabilises the developmental programme providing a buffer for SHH morphogen activity and this ensures that five digits form in both sets of limbs. PMID:24715461
ELMO Domains, Evolutionary and Functional Characterization of a Novel GTPase-activating Protein (GAP) Domain for Arf Protein Family GTPases

PubMed Central

East, Michael P.; Bowzard, J. Bradford; Dacks, Joel B.; Kahn, Richard A.

2012-01-01

The human family of ELMO domain-containing proteins (ELMODs) consists of six members and is defined by the presence of the ELMO domain. Within this family are two subclassifications of proteins, based on primary sequence conservation, protein size, and domain architecture, deemed ELMOD and ELMO. In this study, we used homology searching and phylogenetics to identify ELMOD family homologs in genomes from across eukaryotic diversity. This demonstrated not only that the protein family is ancient but also that ELMOs are potentially restricted to the supergroup Opisthokonta (Metazoa and Fungi), whereas proteins with the ELMOD organization are found in diverse eukaryotes and thus were likely the form present in the last eukaryotic common ancestor. The segregation of the ELMO clade from the larger ELMOD group is consistent with their contrasting functions as unconventional Rac1 guanine nucleotide exchange factors and the Arf family GTPase-activating proteins, respectively. We used unbiased, phylogenetic sorting and sequence alignments to identify the most highly conserved residues within the ELMO domain to identify a putative GAP domain within the ELMODs. Three independent but complementary assays were used to provide an initial characterization of this domain. We identified a highly conserved arginine residue critical for both the biochemical and cellular GAP activity of ELMODs. We also provide initial evidence of the function of human ELMOD1 as an Arf family GAP at the Golgi. These findings provide the basis for the future study of the ELMOD family of proteins and a new avenue for the study of Arf family GTPases. PMID:23014990
Effects of agricultural conservation practices on N loads in the Mississippi-Atchafalaya River Basin

USDA-ARS's Scientific Manuscript database

A modeling framework consisting of a farm-scale model, Agricultural Policy Environmental Extender (APEX); a watershed-scale model, Soil and Water Assessment Tool (SWAT); and databases was used in the Conservation Effects Assessment Project to quantify the environmental benefits of conservation practi...
Functional evidence for the critical amino-terminal conserved domain and key amino acids of Arabidopsis 4-HYDROXY-3-METHYLBUT-2-ENYL DIPHOSPHATE REDUCTASE.

PubMed

Hsieh, Wei-Yu; Sung, Tzu-Ying; Wang, Hsin-Tzu; Hsieh, Ming-Hsiun

2014-09-01

The plant 4-HYDROXY-3-METHYLBUT-2-ENYL DIPHOSPHATE REDUCTASE (HDR) catalyzes the last step of the methylerythritol phosphate pathway to synthesize isopentenyl diphosphate and its allyl isomer dimethylallyl diphosphate, which are common precursors for the synthesis of plastid isoprenoids. The Arabidopsis (Arabidopsis thaliana) genomic HDR transgene-induced gene-silencing lines are albino, variegated, or pale green, confirming that HDR is essential for plants. We used Escherichia coli isoprenoid synthesis H (Protein Data Bank code 3F7T) as a template for homology modeling to identify key amino acids of Arabidopsis HDR. The predicted model reveals that cysteine (Cys)-122, Cys-213, and Cys-350 are involved in iron-sulfur cluster formation and that histidine (His)-152, His-241, glutamate (Glu)-242, Glu-243, threonine (Thr)-244, Thr-312, serine-379, and asparagine-381 are related to substrate binding or catalysis. Glu-242 and Thr-244 are conserved only in cyanobacteria, green algae, and land plants, whereas the other key amino acids are absolutely conserved from bacteria to plants. We used site-directed mutagenesis and complementation assays to confirm that these amino acids, except His-152 and His-241, were critical for Arabidopsis HDR function. Furthermore, the Arabidopsis HDR contains an extra amino-terminal domain following the transit peptide that is highly conserved from cyanobacteria and green algae to land plants but is absent from other bacteria. We demonstrated that the amino-terminal conserved domain was essential for Arabidopsis and cyanobacterial HDR function. Further analysis of conserved amino acids in the amino-terminal conserved domain revealed that the tyrosine-72 residue was critical for Arabidopsis HDR. These results suggest that the structure and reaction mechanism of HDR have evolved to become specific to oxygen-evolving photosynthetic organisms and that HDR probably evolved independently in cyanobacteria versus other prokaryotes. © 2014 American Society of Plant Biologists. All Rights Reserved.
Indication criteria for total hip or knee arthroplasty in osteoarthritis: a state-of-the-science overview.

PubMed

Gademan, Maaike G J; Hofstede, Stefanie N; Vliet Vlieland, Thea P M; Nelissen, Rob G H H; Marang-van de Mheen, Perla J

2016-11-09

This systematic review gives an overview of guidelines and original publications as well as the evidence on which the currently proposed indication criteria are based. Until now such a state-of-the-science overview was lacking. Websites of orthopaedic and arthritis organizations (English/Dutch language) were independently searched by two authors for THA/TKA guidelines for OA. Furthermore, a systematic search strategy in several databases through August 2014 was performed. Quality of the guidelines was assessed with the AGREE II instrument, which consists of 6 domains (maximum summed score of 6 indicating high quality). Also, the level of evidence of all included studies was assessed. We found 6 guidelines and 18 papers, out of 3065 references. The quality of the guidelines summed across 6 domains ranged from 0.46 to 4.78. In total, 12 THA, 10 TKA and 2 THA/TKA indication sets were found. Four studies stated that no evidence-based indication criteria are available. Indication criteria concerning THA/TKA consisted of the following domains: pain (in respectively 11 and 10 sets), function (12 and 7 sets), radiological changes (10 and 9 sets), failed conservative therapy (8 and 4 sets) and other indications (6 and 7 sets). Specific cut-off values or ranges were often not stated and the level of evidence was low. The indication criteria for THA/TKA are based on limited evidence. Empirical research is needed, especially regarding domain specific cut-off values or ranges at which the best postoperative outcomes are achieved for patients, taking into account the limited lifespan of a prosthesis.
Chimeric Saccharomyces cerevisiae Msh6 protein with an Msh3 mispair-binding domain combines properties of both proteins.

PubMed

Shell, Scarlet S; Putnam, Christopher D; Kolodner, Richard D

2007-06-26

Msh2-Msh3 and Msh2-Msh6 are two partially redundant mispair-recognition complexes that initiate mismatch repair in eukaryotes. Crystal structures of the prokaryotic homolog MutS suggest the mechanism by which Msh6 interacts with mispairs because key mispair-contacting residues are conserved in these two proteins. Because Msh3 lacks these conserved residues, we constructed a series of mutants to investigate the requirements for mispair interaction by Msh3. We found that a chimeric protein in which the mispair-binding domain (MBD) of Msh6 was replaced by the equivalent domain of Msh3 was functional for mismatch repair. This chimera possessed the mispair-binding specificity of Msh3 and revealed that communication between the MBD and the ATPase domain is conserved between Msh2-Msh3 and Msh2-Msh6. Further, the chimeric protein retained Msh6-like properties with respect to genetic interactions with the MutL homologs and an Msh2 MBD deletion mutant, indicating that Msh3-like behaviors beyond mispair specificity are not features controlled by the MBD.
Choice of population database for forensic DNA profile analysis.

PubMed

Steele, Christopher D; Balding, David J

2014-12-01

When evaluating the weight of evidence (WoE) for an individual to be a contributor to a DNA sample, an allele frequency database is required. The allele frequencies are needed to inform about genotype probabilities for unknown contributors of DNA to the sample. Typically databases are available from several populations, and a common practice is to evaluate the WoE using each available database for each unknown contributor. Often the most conservative WoE (most favourable to the defence) is the one reported to the court. However the number of human populations that could be considered is essentially unlimited and the number of contributors to a sample can be large, making it impractical to perform every possible WoE calculation, particularly for complex crime scene profiles. We propose instead the use of only the database that best matches the ancestry of the queried contributor, together with a substantial FST adjustment. To investigate the degree of conservativeness of this approach, we performed extensive simulations of one- and two-contributor crime scene profiles, in the latter case with, and without, the profile of the second contributor available for the analysis. The genotypes were simulated using five population databases, which were also available for the analysis, and evaluations of WoE using our heuristic rule were compared with several alternative calculations using different databases. Using FST=0.03, we found that our heuristic gave WoE more favourable to the defence than alternative calculations in well over 99% of the comparisons we considered; on average the difference in WoE was just under 0.2 bans (orders of magnitude) per locus. The degree of conservativeness of the heuristic rule can be adjusted through the FST value. We propose the use of this heuristic for DNA profile WoE calculations, due to its ease of implementation, and efficient use of the evidence while allowing a flexible degree of conservativeness. Copyright © 2014. 
Published by Elsevier Ireland Ltd.
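The role of the FST adjustment described above can be illustrated with a small worked example. The abstract does not state which formula the authors used, so this sketch assumes the widely used Balding-Nichols correction and the simplest case, a single-contributor profile that the queried individual matches at every locus. The function names and the example allele frequencies are hypothetical. WoE is expressed in bans, i.e. log10 of the likelihood ratio.

```python
import math

def match_probability(p_a, p_b=None, fst=0.03):
    """Probability that an unknown contributor has the same genotype as the
    queried individual, under the Balding-Nichols FST correction (assumed here).

    p_a, p_b: database allele frequencies; p_b=None means a homozygous genotype.
    """
    denom = (1 + fst) * (1 + 2 * fst)
    if p_b is None:  # homozygote AA
        return (2 * fst + (1 - fst) * p_a) * (3 * fst + (1 - fst) * p_a) / denom
    # heterozygote AB
    return 2 * (fst + (1 - fst) * p_a) * (fst + (1 - fst) * p_b) / denom

def woe_bans(locus_genotypes, fst=0.03):
    """Weight of evidence in bans: sum over loci of log10(1 / match probability).

    locus_genotypes: iterable of (p_a,) for homozygotes or (p_a, p_b) for
    heterozygotes, one tuple per locus.
    """
    return sum(-math.log10(match_probability(*g, fst=fst)) for g in locus_genotypes)

# Hypothetical two-locus profile: one heterozygous locus, one homozygous locus.
profile = [(0.1, 0.2), (0.05,)]
unadjusted = woe_bans(profile, fst=0.0)
adjusted = woe_bans(profile, fst=0.03)
```

With fst=0 the expressions reduce to the Hardy-Weinberg values 2·p_a·p_b and p_a². Raising fst increases the match probability for rare alleles and therefore lowers the WoE, which is the sense in which a larger FST value is more favourable to the defence. Mixed profiles with two or more contributors, as simulated in the paper, need a fuller likelihood calculation than this sketch provides.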
Putting people on the map through an approach that integrates social data in conservation planning.

PubMed

Stephanson, Sheri L; Mascia, Michael B

2014-10-01

Conservation planning is integral to strategic and effective operations of conservation organizations. Drawing upon biological sciences, conservation planning has historically made limited use of social data. We offer an approach for integrating data on social well-being into conservation planning that captures and places into context the spatial patterns and trends in human needs and capacities. This hierarchical approach provides a nested framework for characterizing and mapping data on social well-being in 5 domains: economic well-being, health, political empowerment, education, and culture. These 5 domains each have multiple attributes; each attribute may be characterized by one or more indicators. Through existing or novel data that display spatial and temporal heterogeneity in social well-being, conservation scientists, planners, and decision makers may measure, benchmark, map, and integrate these data within conservation planning processes. Selecting indicators and integrating these data into conservation planning is an iterative, participatory process tailored to the local context and planning goals. Social well-being data complement biophysical and threat-oriented social data within conservation planning processes to inform decisions regarding where and how to conserve biodiversity, provide a structure for exploring socioecological relationships, and to foster adaptive management. Building upon existing conservation planning methods and insights from multiple disciplines, this approach to putting people on the map can readily merge with current planning practices to facilitate more rigorous decision making. © 2014 Society for Conservation Biology.
Searching Across the International Space Station Databases

NASA Technical Reports Server (NTRS)

Maluf, David A.; McDermott, William J.; Smith, Ernest E.; Bell, David G.; Gurram, Mohana

2007-01-01

Data access in the enterprise generally requires us to combine data from different sources and different formats. It is thus advantageous to focus on the intersection of the knowledge across sources and domains; keeping irrelevant knowledge around only serves to make the integration more unwieldy and more complicated than necessary. A context search over multiple domains is proposed in this paper to use context-sensitive queries to support disciplined manipulation of domain knowledge resources. The objective of a context search is to provide the capability for interrogating many domain knowledge resources, which are largely semantically disjoint. The search formally supports the tasks of selecting, combining, extending, specializing, and modifying components from a diverse set of domains. This paper demonstrates a new paradigm in composition of information for enterprise applications. In particular, it discusses an approach to achieving data integration across multiple sources, in a manner that does not require heavy investment in database and middleware maintenance. This lean approach to integration leads to cost-effectiveness and scalability of data integration with an underlying schemaless object-relational database management system. This highly scalable, information-on-demand system framework, called NX-Search, is an implementation of an information system built on NETMARK. NETMARK is a flexible, high-throughput open database integration framework for managing, storing, and searching unstructured or semi-structured arbitrary XML and HTML, used widely at the National Aeronautics and Space Administration (NASA) and in industry.
Mechanism of mRNA-STAR domain interaction: Molecular dynamics simulations of Mammalian Quaking STAR protein.

PubMed

Sharma, Monika; Anirudh, C R

2017-10-03

STAR proteins are evolutionarily conserved mRNA-binding proteins that post-transcriptionally regulate gene expression at all stages of RNA metabolism. These proteins possess a conserved STAR domain that recognizes identical RNA regulatory elements, YUAAY. Recently reported crystal structures show that the STAR domain is composed of an N-terminal QUA1, a K-homology domain (KH) and a C-terminal QUA2, and that mRNA binding is mediated by the KH-QUA2 domain. Here, we present simulation studies done to investigate binding of mRNA to a STAR protein, the mammalian Quaking protein (QKI). We carried out conventional MD simulations of the STAR domain in the presence and absence of mRNA, and studied the impact of mRNA on the stability, dynamics and underlying allosteric mechanism of the STAR domain. Our unbiased simulation results show that the presence of mRNA stabilizes the overall STAR domain by reducing the structural deviations, correlating the 'within-domain' motions, and maintaining the native contacts information. The absence of mRNA not only influenced the essential modes of motion of the STAR domain, but also affected the connectivity of networks within the STAR domain. We further explored the dissociation of mRNA from the STAR domain using umbrella sampling simulations, and the results suggest that mRNA binding to the STAR domain occurs in multiple steps: first conformational selection of mRNA backbone conformations, followed by an induced-fit mechanism as nucleobases interact with the STAR domain.
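The YUAAY element mentioned in this abstract can be located in a sequence with a short pattern search. The sketch below is not from the paper; it assumes only that Y is the standard IUPAC code for a pyrimidine (C or U in RNA). The function name `find_yuaay` and the example sequence are hypothetical.

```python
import re

# Y is the IUPAC code for a pyrimidine (C or U in RNA).
# Wrapping the pattern in a lookahead lets overlapping sites be reported too.
YUAAY = re.compile(r"(?=([CU]UAA[CU]))")

def find_yuaay(rna):
    """Return (0-based start, matched 5-mer) for every YUAAY site in a sequence.

    DNA-style input is accepted: the sequence is upper-cased and T is read as U.
    """
    rna = rna.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in YUAAY.finditer(rna)]

sites = find_yuaay("GGACUAACAAUUAAUCC")
```

For the example sequence, `sites` holds two hits, `CUAAC` at position 3 and `UUAAU` at position 10. A match only marks a candidate element; whether a STAR domain binds it depends on the structural context that the simulations address.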
A Proteome-wide Domain-centric Perspective on Protein Phosphorylation *

PubMed Central

Palmeri, Antonio; Ausiello, Gabriele; Ferrè, Fabrizio; Helmer-Citterich, Manuela; Gherardini, Pier Federico

2014-01-01

Phosphorylation is a widespread post-translational modification that modulates the function of a large number of proteins. Here we show that a significant proportion of all the domains in the human proteome is significantly enriched or depleted in phosphorylation events. A substantial improvement in phosphosites prediction is achieved by leveraging this observation, which has not been tapped by existing methods. Phosphorylation sites are often not shared between multiple occurrences of the same domain in the proteome, even when the phosphoacceptor residue is conserved. This is partly because of different functional constraints acting on the same domain in different protein contexts. Moreover, by augmenting domain alignments with structural information, we were able to provide direct evidence that phosphosites in protein-protein interfaces need not be positionally conserved, likely because they can modulate interactions simply by sitting in the same general surface area. PMID:24830415

Selecting soluble/foldable protein domains through single-gene or genomic ORF filtering: structure of the head domain of Burkholderia pseudomallei antigen BPSL2063.

PubMed

Gourlay, Louise J; Peano, Clelia; Deantonio, Cecilia; Perletti, Lucia; Pietrelli, Alessandro; Villa, Riccardo; Matterazzo, Elena; Lassaux, Patricia; Santoro, Claudio; Puccio, Simone; Sblattero, Daniele; Bolognesi, Martino

2015-11-01

The 1.8 Å resolution crystal structure of a conserved domain of the potential Burkholderia pseudomallei antigen and trimeric autotransporter BPSL2063 is presented as a structural vaccinology target for melioidosis vaccine development. Since BPSL2063 (1090 amino acids) hosts only one conserved domain, and the expression/purification of the full-length protein proved to be problematic, a domain-filtering library was generated using β-lactamase as a reporter gene to select further BPSL2063 domains. As a result, two domains (D1 and D2) were identified and produced in soluble form in Escherichia coli. Furthermore, as a general tool, a genomic open reading frame-filtering library from the B. pseudomallei genome was also constructed to facilitate the selection of domain boundaries from the entire ORFeome. Such an approach allowed the selection of three potential protein antigens that were also produced in soluble form. The results imply the further development of ORF-filtering methods as a tool in protein-based research to improve the selection and production of soluble proteins or domains for downstream applications such as X-ray crystallography.
Comparison of Saccharomyces cerevisiae F-BAR domain structures reveals a conserved inositol phosphate binding site.

PubMed

Moravcevic, Katarina; Alvarado, Diego; Schmitz, Karl R; Kenniston, Jon A; Mendrola, Jeannine M; Ferguson, Kathryn M; Lemmon, Mark A

2015-02-03

F-BAR domains control membrane interactions in endocytosis, cytokinesis, and cell signaling. Although they are generally thought to bind curved membranes containing negatively charged phospholipids, numerous functional studies argue that differences in lipid-binding selectivities of F-BAR domains are functionally important. Here, we compare membrane-binding properties of the Saccharomyces cerevisiae F-BAR domains in vitro and in vivo. Whereas some F-BAR domains (such as Bzz1p and Hof1p F-BARs) bind equally well to all phospholipids, the F-BAR domain from the RhoGAP Rgd1p preferentially binds phosphoinositides. We determined X-ray crystal structures of F-BAR domains from Hof1p and Rgd1p, the latter bound to an inositol phosphate. The structures explain phospholipid-binding selectivity differences and reveal an F-BAR phosphoinositide binding site that is fully conserved in a mammalian RhoGAP called Gmip and is partly retained in certain other F-BAR domains. Our findings reveal previously unappreciated determinants of F-BAR domain lipid-binding specificity and provide a basis for its prediction from sequence. Copyright © 2015 Elsevier Ltd. All rights reserved.
Comparison of Saccharomyces cerevisiae F-BAR Domain Structures Reveals a Conserved Inositol Phosphate Binding Site

DOE PAGES

Moravcevic, Katarina; Alvarado, Diego; Schmitz, Karl R.; ...

2015-01-22

F-BAR domains control membrane interactions in endocytosis, cytokinesis, and cell signaling. Although they are generally thought to bind curved membranes containing negatively charged phospholipids, numerous functional studies argue that differences in lipid-binding selectivities of F-BAR domains are functionally important. Here in this paper, we compare membrane-binding properties of the Saccharomyces cerevisiae F-BAR domains in vitro and in vivo. Whereas some F-BAR domains (such as Bzz1p and Hof1p F-BARs) bind equally well to all phospholipids, the F-BAR domain from the RhoGAP Rgd1p preferentially binds phosphoinositides. We determined X-ray crystal structures of F-BAR domains from Hof1p and Rgd1p, the latter bound to an inositol phosphate. The structures explain phospholipid-binding selectivity differences and reveal an F-BAR phosphoinositide binding site that is fully conserved in a mammalian RhoGAP called Gmip and is partly retained in certain other F-BAR domains. In conclusion, our findings reveal previously unappreciated determinants of F-BAR domain lipid-binding specificity and provide a basis for its prediction from sequence.
The complete mitochondrial genome of Lota lota (Gadiformes: Gadidae) from the Burqin River in China.

PubMed

Lu, Zhichuang; Zhang, Nan; Song, Na; Gao, Tianxiang

2016-05-01

In this study, the complete mitochondrial genome (mitogenome) sequence of Lota lota has been determined by long polymerase chain reaction and primer walking methods. The mitogenome is a circular molecule of 16,519 bp in length and contains 37 mitochondrial genes, including 13 protein-coding genes, 2 ribosomal RNA (rRNA) genes and 22 transfer RNA (tRNA) genes, as well as a control region, as in other bony fishes. Within the control region, we identified the termination-associated sequence domain (TAS), the central conserved sequence block domains (CSB-F and CSB-D), and the conserved sequence block domains (CSB-1, CSB-2 and CSB-3).
The history of the CATH structural classification of protein domains.

PubMed

Sillitoe, Ian; Dawson, Natalie; Thornton, Janet; Orengo, Christine

2015-12-01

This article presents a historical review of the protein structure classification database CATH. Together with the SCOP database, CATH remains comprehensive and reasonably up-to-date with the now more than 100,000 protein structures in the PDB. We review the expansion of the CATH and SCOP resources to capture predicted domain structures in the genome sequence data and to provide information on the likely functions of proteins mediated by their constituent domains. The establishment of comprehensive function annotation resources has also meant that domain families can be functionally annotated allowing insights into functional divergence and evolution within protein families. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
DoOP: Databases of Orthologous Promoters, collections of clusters of orthologous upstream sequences from chordates and plants

PubMed Central

Barta, Endre; Sebestyén, Endre; Pálfy, Tamás B.; Tóth, Gábor; Ortutay, Csaba P.; Patthy, László

2005-01-01

DoOP (http://doop.abc.hu/) is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. Up to 3000 bp DNA segments upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21 061 chordate clusters from 284 different species and 7548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species and it is also suitable to see the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP with either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically. PMID:15608291
DoOP: Databases of Orthologous Promoters, collections of clusters of orthologous upstream sequences from chordates and plants.

PubMed

Barta, Endre; Sebestyén, Endre; Pálfy, Tamás B; Tóth, Gábor; Ortutay, Csaba P; Patthy, László

2005-01-01

DoOP (http://doop.abc.hu/) is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. Up to 3000 bp DNA segments upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21,061 chordate clusters from 284 different species and 7548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species and it is also suitable to see the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP with either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically.
The Nuclear Protein Database (NPD): sub-nuclear localisation and functional annotation of the nuclear proteome

PubMed Central

Dellaire, G.; Farrall, R.; Bickmore, W.A.

2003-01-01

The Nuclear Protein Database (NPD) is a curated database that contains information on more than 1300 vertebrate proteins that are thought, or are known, to localise to the cell nucleus. Each entry is annotated with information on predicted protein size and isoelectric point, as well as any repeats, motifs or domains within the protein sequence. In addition, information on the sub-nuclear localisation of each protein is provided and the biological and molecular functions are described using Gene Ontology (GO) terms. The database is searchable by keyword, protein name, sub-nuclear compartment and protein domain/motif. Links to other databases are provided (e.g. Entrez, SWISS-PROT, OMIM, PubMed, PubMed Central). Thus, NPD provides a gateway through which the nuclear proteome may be explored. The database can be accessed at http://npd.hgu.mrc.ac.uk and is updated monthly. PMID:12520015
77 FR 12234 - Changes in Hydric Soils Database Selection Criteria

Federal Register 2010, 2011, 2012, 2013, 2014

2012-02-29

... Conservation Service [Docket No. NRCS-2011-0026] Changes in Hydric Soils Database Selection Criteria AGENCY... Changes to the National Soil Information System (NASIS) Database Selection Criteria for Hydric Soils of the United States. SUMMARY: The National Technical Committee for Hydric Soils (NTCHS) has updated the...
Identification and Analysis of Novel Amino-Acid Sequence Repeats in Bacillus anthracis str. Ames Proteome Using Computational Tools

PubMed Central

Hemalatha, G. R.; Rao, D. Satyanarayana; Guruprasad, L.

2007-01-01

We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region comprising less than 55-amino-acid residues that occur more than once in the protein sequence and sometimes present in tandem. A “domain” corresponds to a conserved region with greater than 55-amino-acid residues and may be present as single or multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. A repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure. PMID:17538688
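The length rule in this abstract (a repeat is shorter than 55 residues, a domain longer than 55) can be checked against the listed regions. The sketch below is only a consistency check built from the lengths quoted above, not the authors' detection pipeline. It ignores the additional requirement that a repeat occur more than once in a sequence, and the names `REGIONS` and `classify` are hypothetical.

```python
# (name, length in residues) exactly as listed in the abstract
REGIONS = [
    ("PxV", 57), ("FxF", 122), ("YEFF", 111), ("IMxxH", 109), ("VxxT", 103),
    ("ExW", 84), ("NTGFIG", 104), ("NxGK", 36), ("VYV", 95), ("KEWE", 75),
    ("AFL", 59), ("RIDVK", 53), ("AGQF", 41), ("GSAL", 42),
]

THRESHOLD = 55  # residues; the abstract leaves a length of exactly 55 undefined

def classify(length, threshold=THRESHOLD):
    """Label a conserved region by length alone."""
    if length < threshold:
        return "repeat"
    if length > threshold:
        return "domain"
    return "undefined"

# Tally how many listed regions fall into each class.
counts = {}
for name, length in REGIONS:
    kind = classify(length)
    counts[kind] = counts.get(kind, 0) + 1
```

Applied to the fourteen regions, the rule gives 10 domains and 4 repeats (NxGK, RIDVK, AGQF and GSAL), which agrees with the "four repeats and ten domains" stated in the first sentence of the abstract.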
Complexity of type-specific 56 kDa antigen CD4 T-cell epitopes of Orientia tsutsugamushi strains causing scrub typhus in India

PubMed Central

Dasch, Gregory A.

2018-01-01

Orientia tsutsugamushi (Ots) is an obligate, intracellular, mite-transmitted human pathogen which causes scrub typhus. Understanding the diversity of Ots antigens is essential for designing specific diagnostic assays and efficient vaccines. The protective immunodominant type-specific 56 kDa antigen (TSA) of Ots varies locally and across its geographic distribution. TSA contains four hypervariable domains. We bioinformatically analyzed 345 partial sequences of TSA available from India, most of which contain only the three variable domains (VDI-III) and three spacer conserved domains (SVDI, SVDII/III, SVDIII). The total number (152) of antigenic types (amino acid variants) varied from 14–36 in the six domains of TSA that we studied. Notably, 55% (787/1435) of the predicted CD4 T-cell epitopes (TCEs) from all the six domains had high binding affinities (HBA) to at least one of the prevalent Indian human leukocyte antigen (HLA) alleles. A surprisingly high proportion (61%) of such TCEs were from spacer domains; indeed 100% of the CD4 TCEs in the SVDI were HBA. TSA sequences from India had more antigenic types (AT) than TSA from Korea. Overall, >90% of predicted CD4 TCEs from spacer domains were predicted to have HBA against one or more prevalent HLA types from Indian, Korean, Asia-Pacific region or global population data sets, while only <50% of CD4 TCEs in variable domains exhibited such HBA. The phylogenetically and immunologically important amino acids in the conserved spacer domains were identified. Our results suggest that the conserved spacer domains are predicted to be functionally more important than previously appreciated in immune responses to Ots infections. Changes occurring at the TCE level of TSA may contribute to the wide range of pathogenicity of Ots in humans and mouse models. CD4 T-cell functional experiments are needed to assess the immunological significance of these HBA spacer domains and their role in clearance of Ots from Indian patients. 
PMID:29698425
Alternative dimerization interfaces in the glucocorticoid receptor-α ligand binding domain.

PubMed

Bianchetti, Laurent; Wassmer, Bianca; Defosset, Audrey; Smertina, Anna; Tiberti, Marion L; Stote, Roland H; Dejaegere, Annick

2018-04-30

Nuclear hormone receptors (NRs) constitute a large family of multi-domain ligand-activated transcription factors. Dimerization is essential for their regulation, and both DNA binding domain (DBD) and ligand binding domain (LBD) are implicated in dimerization. Intriguingly, the glucocorticoid receptor-α (GRα) presents a DBD dimeric architecture similar to that of the homologous estrogen receptor-α (ERα), but an atypical dimeric architecture for the LBD. The physiological relevance of the proposed GRα LBD dimer is a subject of debate. We analyzed all GRα LBD homodimers observed in crystals using an energetic analysis based on the PISA and on the MM/PBSA methods and a sequence conservation analysis, using the ERα LBD dimer as a reference point. Several dimeric assemblies were observed for GRα LBD. The assembly generally taken to be physiologically relevant showed weak binding free energy and no significant residue conservation at the contact interface, while an alternative homodimer mediated by both helix 9 and C-terminal residues showed significant binding free energy and residue conservation. However, none of the GRα LBD assemblies found in crystals are as stable or conserved as the canonical ERα LBD dimer. GRα C-terminal sequence (F-domain) forms a steric obstacle to the canonical dimer assembly in all available structures. Our analysis calls for a re-examination of the currently accepted GRα homodimer structure and experimental investigations of the alternative architectures. This work questions the validity of the currently accepted architecture. This has implications for interpreting physiological data and for therapeutic design pertaining to glucocorticoid research. Copyright © 2018. Published by Elsevier B.V.
A Comparison of Selected Bibliographic Database Subject Overlap for Agricultural Information

ERIC Educational Resources Information Center

Ritchie, Stephanie M.; Young, Lauren M.; Sigman, Jessica

2018-01-01

Agricultural researchers and science librarians must understand which research literature databases provide the most comprehensive coverage of agricultural subjects to support their inquiries. Once the domain of a few specialized databases, agricultural research literature is now covered by broad, multidisciplinary databases. The purpose of this…
Phylogenetic Analysis and Classification of the Fungal bHLH Domain

PubMed Central

Sailsbery, Joshua K.; Atchley, William R.; Dean, Ralph A.

2012-01-01

The basic Helix-Loop-Helix (bHLH) domain is an essential highly conserved DNA-binding domain found in many transcription factors in all eukaryotic organisms. The bHLH domain has been well studied in the Animal and Plant Kingdoms but has yet to be characterized within Fungi. Herein, we obtained and evaluated the phylogenetic relationship of 490 fungal-specific bHLH containing proteins from 55 whole genome projects composed of 49 Ascomycota and 6 Basidiomycota organisms. We identified 12 major groupings within Fungi (F1–F12); identifying conserved motifs and functions specific to each group. Several classification models were built to distinguish the 12 groups and elucidate the most discerning sites in the domain. Performance testing on these models, for correct group classification, resulted in a maximum sensitivity and specificity of 98.5% and 99.8%, respectively. We identified 12 highly discerning sites and incorporated those into a set of rules (simplified model) to classify sequences into the correct group. Conservation of amino acid sites and phylogenetic analyses established that like plant bHLH proteins, fungal bHLH–containing proteins are most closely related to animal Group B. The models used in these analyses were incorporated into a software package, the source code for which is available at www.fungalgenomics.ncsu.edu. PMID:22114358
DOE Office of Scientific and Technical Information (OSTI.GOV)

Helander, Sara; Montecchio, Meri; Lemak, Alexander

Highlights: • We describe the structure of a novel fold in FKBP25 and HectD. • The new fold is named the Basic Tilted Helix Bundle (BTHB) domain. • A conserved basic surface patch is presented, suggesting a functional role. - Abstract: In this paper, we describe the structure of an N-terminal domain motif in nuclear-localized FKBP25 (residues 1–73), a member of the FKBP family, together with the structure of a sequence-related subdomain of the E3 ubiquitin ligase HectD1 that we show belongs to the same fold. This motif adopts a compact 5-helix bundle which we name the Basic Tilted Helix Bundle (BTHB) domain. A positively charged surface patch, structurally centered around the tilted helix H4, is present in both FKBP25 and HectD1 and is conserved in both proteins, suggesting a conserved functional role. We provide detailed comparative analysis of the structures of the two proteins and their sequence similarities, and analysis of the interaction of the proposed FKBP25 binding protein YY1. We suggest that the basic motif in BTHB is involved in the observed DNA binding of FKBP25, and that the function of this domain can be affected by regulatory YY1 binding and/or interactions with adjacent domains.
A Bioinformatic Strategy for the Detection, Classification and Analysis of Bacterial Autotransporters

PubMed Central

Celik, Nermin; Webb, Chaille T.; Leyton, Denisse L.; Holt, Kathryn E.; Heinz, Eva; Gorrell, Rebecca; Kwok, Terry; Naderer, Thomas; Strugnell, Richard A.; Speed, Terence P.; Teasdale, Rohan D.; Likić, Vladimir A.; Lithgow, Trevor

2012-01-01

Autotransporters are secreted proteins that are assembled into the outer membrane of bacterial cells. The passenger domains of autotransporters are crucial for bacterial pathogenesis, with some remaining attached to the bacterial surface while others are released by proteolysis. An enigma remains as to whether autotransporters should be considered a class of secretion system, or simply a class of substrate with peculiar requirements for their secretion. We sought to establish a sensitive search protocol that could identify and characterize diverse autotransporters from bacterial genome sequence data. The new sequence analysis pipeline identified more than 1500 autotransporter sequences from diverse bacteria, including numerous species of Chlamydiales and Fusobacteria as well as all classes of Proteobacteria. Interrogation of the proteins revealed that there are numerous classes of passenger domains beyond the known proteases, adhesins and esterases. In addition, the barrel-domain, a characteristic feature of autotransporters, was found to be composed of seven conserved sequence segments that can be arranged in multiple ways in the tertiary structure of the assembled autotransporter. One of these conserved motifs overlays the targeting information required for autotransporters to reach the outer membrane. Another conserved and diagnostic motif maps to the linker region between the passenger domain and barrel-domain, indicating it as an important feature in the assembly of autotransporters. PMID:22905239
Molecular scaffold analysis of natural products databases in the public domain.

PubMed

Yongye, Austin B; Waddell, Jacob; Medina-Franco, José L

2012-11-01

Natural products represent important sources of bioactive compounds in drug discovery efforts. In this work, we compiled five natural products databases available in the public domain and performed a comprehensive chemoinformatic analysis focused on the content and diversity of the scaffolds with an overview of the diversity based on molecular fingerprints. The natural products databases were compared with each other and with a set of molecules obtained from in-house combinatorial libraries, and with a general screening commercial library. It was found that publicly available natural products databases have different scaffold diversity. In contrast to the common concept that larger libraries have the largest scaffold diversity, the largest natural products collection analyzed in this work was not the most diverse. The general screening library showed, overall, the highest scaffold diversity. However, considering the most frequent scaffolds, the general reference library was the least diverse. In general, natural products databases in the public domain showed low molecule overlap. In addition to benzene and acyclic compounds, flavones, coumarins, and flavanones were identified as the most frequent molecular scaffolds across the different natural products collections. The results of this work have direct implications in the computational and experimental screening of natural product databases for drug discovery. © 2012 John Wiley & Sons A/S.
Animal-specific C-terminal domain links myeloblastosis oncoprotein (Myb) to an ancient repressor complex

PubMed Central

Andrejka, Laura; Wen, Hong; Ashton, Jonathan; Grant, Megan; Iori, Kevin; Wang, Amy; Manak, J. Robert; Lipsick, Joseph S.

2011-01-01

Members of the Myb oncoprotein and E2F-Rb tumor suppressor protein families are present within the same highly conserved multiprotein transcriptional repressor complex, named either as Myb and synthetic multivuval class B (Myb-MuvB) or as Drosophila Rb E2F and Myb-interacting proteins (dREAM). We now report that the animal-specific C terminus of Drosophila Myb but not the more highly conserved N-terminal DNA-binding domain is necessary and sufficient for (i) adult viability, (ii) proper localization to chromosomes in vivo, (iii) regulation of gene expression in vivo, and (iv) interaction with the highly conserved core of the MuvB/dREAM transcriptional repressor complex. In addition, we have identified a conserved peptide motif that is required for this interaction. Our results imply that an ancient function of Myb in regulating G2/M genes in both plants and animals appears to have been transferred from the DNA-binding domain to the animal-specific C-terminal domain. Increased expression of B-MYB/MYBL2, the human ortholog of Drosophila Myb, correlates with poor prognosis in human patients with breast cancer. Therefore, our results imply that the specific interaction of the C terminus of Myb with the MuvB/dREAM core complex may provide an attractive target for the development of cancer therapeutics. PMID:21969598
Effects of random initial conditions on the dynamical scaling behaviors of a fixed-energy Manna sandpile model in one dimension

NASA Astrophysics Data System (ADS)

Kwon, Sungchul; Kim, Jin Min

2015-01-01

For a fixed-energy (FE) Manna sandpile model in one dimension, we investigate the effects of random initial conditions on the dynamical scaling behavior of an order parameter. In the FE Manna model, the density ρ of total particles is conserved, and an absorbing phase transition occurs at ρc as ρ varies. In this work, we show that, for a given ρ , random initial distributions of particles lead to the domain structure in which domains with particle densities higher and lower than ρc alternate with each other. In the domain structure, the dominant length scale is the average domain length, which increases via the coalescence of adjacent domains. At ρc, the domain structure slows down the decay of an order parameter and also causes anomalous finite-size effects, i.e., power-law decay followed by an exponential one before the quasisteady state. As a result, the interplay of particle conservation and random initial conditions causes the domain structure, which is the origin of the anomalous dynamical scaling behaviors for random initial conditions.
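The dynamics summarized in this abstract can be illustrated with a minimal simulation sketch. It assumes one common variant of the stochastic Manna rule (a site holding two or more particles is active and sends two particles, each to an independently chosen nearest neighbour), periodic boundaries, and parallel updates; the function names, the threshold of two, and the parameters are illustrative assumptions, not details taken from the paper. The order parameter is taken to be the fraction of active sites.

```python
import random


def manna_step(z, rng):
    """One parallel update of a 1D fixed-energy Manna sandpile.

    Assumed rule: every site with >= 2 particles sends two particles,
    each to an independently chosen nearest neighbour (periodic ring).
    """
    n = len(z)
    new = list(z)
    for i, height in enumerate(z):
        if height >= 2:
            new[i] -= 2
            for _ in range(2):
                j = (i + rng.choice((-1, 1))) % n
                new[j] += 1
    return new


def active_density(z):
    """Order parameter: fraction of sites that can topple."""
    return sum(1 for height in z if height >= 2) / len(z)


def simulate(n_sites, density, steps, seed=0):
    """Drop particles at random sites (random initial condition), then relax."""
    rng = random.Random(seed)
    z = [0] * n_sites
    for _ in range(round(density * n_sites)):
        z[rng.randrange(n_sites)] += 1
    history = []
    for _ in range(steps):
        history.append(active_density(z))
        z = manna_step(z, rng)
    return z, history
```

Because each toppling only moves particles, the total particle number (and hence the density ρ) stays fixed, which is the conservation law the abstract refers to; plotting `history` against time for several values of `density` would show the decay of the order parameter.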
How the creative use of analogies can shape medical practice.

PubMed

Prasad, G V Ramesh

2015-06-01

Analogical reasoning is central to medical progress, and is either creative or conservative. According to Hofmann et al., conservative analogy relates concepts from old technology to new technologies with emphasis on preservation of comprehension and conduct. Creative analogy however brings new understanding to new technology, brings similarities existing in the source domain to a target domain where they previously had no bearing, and imports something entirely different from the content of the analogy itself. I defend the claim that while conservative analogies are useful by virtue of being comfortable to use from familiarity and experience, and are more easily accepted by society, they only lead to incremental advances in medicine. However, creative analogies are more exciting and productive because they generate previously unexpected associations across widely separated domains, emphasize relations over physical similarities, and structure over superficiality. I use kidney transplantation and anti-rejection medication development as an exemplar of analogical reasoning used to improve medical practice. Anti-rejection medication has not helped highly sensitized patients because of their propensity to rejecting most organs. I outline how conservative analogical reasoning led to anti-rejection medication development, but creative analogical reasoning helped highly sensitized and blood type incompatible patients through domino transplants, by which they obtain a kidney to which they are not sensitized. Creative analogical reasoning is more likely than conservative analogical reasoning to lead to revolutionary progress. While these analogies overlap and creative analogies eventually become conservative, progress is best facilitated by combining conservative and creative analogical reasoning. © 2015 John Wiley & Sons, Ltd.

Mapping small molecule binding data to structural domains

PubMed Central

2012-01-01

Background: Large-scale bioactivity/SAR Open Data has recently become available, and this has allowed new analyses and approaches to be developed to help address the productivity and translational gaps of current drug discovery. One of the current limitations of these data is the relative sparsity of reported interactions per protein target, and complexities in establishing clear relationships between bioactivity and targets using bioinformatics tools. We detail in this paper the indexing of targets by the structural domains that bind (or are likely to bind) the ligand within a full-length protein. Specifically, we present a simple heuristic to map small molecule binding to Pfam domains. This profiling can be applied to all proteins within a genome to give some indications of the potential pharmacological modulation and regulation of all proteins. Results: In this implementation of our heuristic, ligand binding to protein targets from the ChEMBL database was mapped to structural domains as defined by profiles contained within the Pfam-A database. Our mapping suggests that the majority of assay targets within the current version of the ChEMBL database bind ligands through a small number of highly prevalent domains, and conversely the majority of Pfam domains sampled by our data play no currently established role in ligand binding. Validation studies, carried out firstly against UniProt entries with expert binding-site annotation and secondly against entries in the wwPDB repository of crystallographic protein structures, demonstrate that our simple heuristic maps ligand binding to the correct domain in about 90 percent of all assessed cases. Using the mappings obtained with our heuristic, we have assembled ligand sets associated with each Pfam domain. Conclusions: Small molecule binding has been mapped to Pfam-A domains of protein targets in the ChEMBL bioactivity database. The result of this mapping is an enriched annotation of small molecule bioactivity data and a grouping of activity classes following the Pfam-A specifications of protein domains. This is valuable for data-focused approaches in drug discovery, for example when extrapolating potential targets of a small molecule with known activity against one or few targets, or in the assessment of a potential target for drug discovery or screening studies. PMID:23282026
Molecular Characterization of the Skate Peripherin/rds Gene: Relationship to Its Orthologues and Paralogues

PubMed Central

Li, Chibo; Ding, Xi-Qin; O’Brien, John; Al-Ubaidi, Muayyad R.

2010-01-01

PURPOSE: A great deal of information about functionally significant domains of a protein may be obtained by comparison of primary sequences of gene homologues over a broad phylogenetic base. This study was designed to identify evolutionarily conserved domains of the photoreceptor disc membrane protein peripherin/rds by analysis of the homologue in a primitive vertebrate, the skate. METHODS: A skate retinal cDNA library was screened using a mouse peripherin/rds clone. The 5′ and 3′ untranslated regions of the skate peripherin/rds (srds) cDNA were isolated by the rapid amplification of cDNA ends (RACE) approach. The gene structure was characterized by PCR amplification and sequencing of genomic fragments. Northern and Western blot analyses were used to identify srds transcript and protein, respectively. RESULTS: A new homologue of peripherin/rds was identified from the skate retinal cDNA library. SRDS is a glycoprotein with a predicted molecular mass of 40.2 kDa. The srds gene consists of two exons and one small intron and transcribes into a single 6-kb message. Phylogenetic analysis places SRDS at the base of peripherin/rds family and near the division of that group and the branch leading to rds-like and rom-1 genes. SRDS protein is 54.5% identical with peripherin/rds across species. Identity is significantly higher (73%) in the intradiscal domains. Sequence comparison revealed the conservation of all residues that have been shown, on mutation, to associate with retinitis pigmentosa and showed conservation of most residues associated with macular dystrophies. Comparison with ROM-1 and other rds-like proteins revealed the presence of a highly conserved domain in the large intradiscal loop. CONCLUSIONS: Srds represents the skate orthologue of mammalian peripherin/rds genes. Conservation of most of the residues associated with human retinal diseases indicates that these residues serve important functional roles. The high degree of conservation of a short stretch within the large intradiscal loop also suggests an important function for this domain. PMID:12766040
The Structure of RalF, an ADP-Ribosylation Factor Guanine Nucleotide Exchange Factor from Legionella pneumophila, Reveals the Presence of a Cap over the Active Site

DOE Office of Scientific and Technical Information (OSTI.GOV)

Amor,J.; Swails, J.; Zhu, X.

2005-01-01

The Legionella pneumophila protein RalF is secreted into host cytosol via the Dot/Icm type IV transporter where it acts to recruit ADP-ribosylation factor (Arf) to pathogen-containing phagosomes in the establishment of a replicative organelle. The presence in RalF of the Sec7 domain, present in all Arf guanine nucleotide exchange factors, has suggested that recruitment of Arf is an early step in pathogenesis. We have determined the crystal structure of RalF and of the isolated Sec7 domain and found that RalF is made up of two domains. The Sec7 domain is homologous to mammalian Sec7 domains. The C-terminal domain forms a cap over the active site in the Sec7 domain and contains a conserved folding motif, previously observed in adaptor subunits of vesicle coat complexes. The importance of the capping domain and of the glutamate in the 'glutamic finger,' conserved in all Sec7 domains, to RalF functions was examined using three different assays. These data highlight the functional importance of domains other than Sec7 in Arf guanine nucleotide exchange factors to biological activities and suggest novel mechanisms of regulation of those activities.
Capsicum annuum dehydrin, an osmotic-stress gene in hot pepper plants.

PubMed

Chung, Eunsook; Kim, Soo-Yong; Yi, So Young; Choi, Doil

2003-06-30

Osmotic stress-related genes were selected from an EST database constructed from 7 cDNA libraries from different tissues of the hot pepper. A full-length cDNA of Capsicum annuum dehydrin (Cadhn), a late embryogenesis abundant (lea) gene, was selected from the 5' single pass sequenced cDNA clones and sequenced. The deduced polypeptide has 87% identity with potato dehydrin C17, but very little identity with the dehydrin genes of other organisms. It contains a serine-tract (S-segment) and 3 conserved lysine-rich domains (K-segments). Southern blot analysis showed that 2 copies are present in the hot pepper genome. Cadhn was induced by osmotic stress in leaf tissues as well as by the application of abscisic acid. The RNA was most abundant in green fruit. The expression of several osmotic stress-related genes was examined and Cadhn proved to be the most abundantly expressed of these in response to osmotic stress.
Extending CATH: increasing coverage of the protein structure universe and linking structure with function

PubMed Central

Cuff, Alison L.; Sillitoe, Ian; Lewis, Tony; Clegg, Andrew B.; Rentzsch, Robert; Furnham, Nicholas; Pellegrini-Calace, Marialuisa; Jones, David; Thornton, Janet; Orengo, Christine A.

2011-01-01

CATH version 3.3 (class, architecture, topology, homology) contains 128 688 domains, 2386 homologous superfamilies and 1233 fold groups, and reflects a major focus on classifying structural genomics (SG) structures and transmembrane proteins, both of which are likely to add structural novelty to the database and therefore increase the coverage of protein fold space within CATH. For CATH version 3.4 we have significantly improved the presentation of sequence information and associated functional information for CATH superfamilies. The CATH superfamily pages now reflect both the functional and structural diversity within the superfamily and include structural alignments of close and distant relatives within the superfamily, annotated with functional information and details of conserved residues. A significantly more efficient search function for CATH has been established by implementing the search server Solr (http://lucene.apache.org/solr/). The CATH v3.4 webpages have been built using the Catalyst web framework. PMID:21097779
Interaction of a putative BH3 domain of clusterin with anti-apoptotic Bcl-2 family proteins as revealed by NMR spectroscopy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lee, Dong-Hwa; Ha, Ji-Hyang; Kim, Yul

Highlights: Identification of a conserved BH3 motif in C-terminal coiled coil region of nCLU. The nCLU BH3 domain binds to BH3 peptide-binding grooves in both Bcl-XL and Bcl-2. A conserved binding mechanism of nCLU BH3 and the other pro-apoptotic BH3 peptides with Bcl-XL. The absolutely conserved Leu323 and Asp328 of nCLU BH3 domain are critical for binding to Bcl-XL. Molecular understanding of the pro-apoptotic function of nCLU as a novel BH3-only protein. Abstract: Clusterin (CLU) is a multifunctional glycoprotein that is overexpressed in prostate and breast cancers. Although CLU is known to be involved in the regulation of apoptosis and cell survival, the precise molecular mechanism underlying the pro-apoptotic function of nuclear CLU (nCLU) remains unclear. In this study, we identified a conserved BH3 motif in C-terminal coiled coil (CC2) region of nCLU by sequence analysis and characterized the molecular interaction of the putative nCLU BH3 domain with anti-apoptotic Bcl-2 family proteins by nuclear magnetic resonance (NMR) spectroscopy. The chemical shift perturbation data demonstrated that the nCLU BH3 domain binds to pro-apoptotic BH3 peptide-binding grooves in both Bcl-XL and Bcl-2. A structural model of the Bcl-XL/nCLU BH3 peptide complex reveals that the binding mode is remarkably similar to those of other Bcl-XL/BH3 peptide complexes. In addition, mutational analysis confirmed that Leu323 and Asp328 of nCLU BH3 domain, absolutely conserved in the BH3 motifs of BH3-only protein family, are critical for binding to Bcl-XL. Taken altogether, our results suggest a molecular basis for the pro-apoptotic function of nCLU by elucidating the residue specific interactions of the BH3 motif in nCLU with anti-apoptotic Bcl-2 family proteins.
Conserved intron positions in FGFR genes reflect the modular structure of FGFR and reveal stepwise addition of domains to an already complex ancestral FGFR.

PubMed

Rebscher, Nicole; Deichmann, Christina; Sudhop, Stefanie; Fritzenwanker, Jens Holger; Green, Stephen; Hassel, Monika

2009-10-01

We have analyzed the evolution of fibroblast growth factor receptor (FGFR) tyrosine kinase genes throughout a wide range of animal phyla. No evidence for an FGFR gene was found in Porifera, but we tentatively identified an FGFR gene in the placozoan Trichoplax adhaerens. The gene encodes a protein with three immunoglobulin-like domains, a single-pass transmembrane, and a split tyrosine kinase domain. By superimposing intron positions of 20 FGFR genes from Placozoa, Cnidaria, Protostomia, and Deuterostomia over the respective protein domain structure, we identified ten ancestral introns and three conserved intron groups. Our analysis shows (1) that the position of ancestral introns correlates to the modular structure of FGFRs, (2) that the acidic domain very likely evolved in the last common ancestor of triploblasts, (3) that splicing of IgIII was enabled by a triploblast-specific insertion, and (4) that IgI is subject to substantial loss or duplication particularly in quickly evolving genomes. Moreover, intron positions in the catalytic domain of FGFRs map to the borders of protein subdomains highly conserved in other serine/threonine kinases. Nevertheless, these introns were introduced in metazoan receptor tyrosine kinases exclusively. Our data support the view that protein evolution dating back to the Cambrian explosion took place in such a short time window that only subtle changes in the domain structure are detectable in extant representatives of animal phyla. We propose that the first multidomain FGFR originated in the last common ancestor of Placozoa, Cnidaria, and Bilateria. Additional domains were introduced mainly in the ancestor of triploblasts and in the Ecdysozoa.
Online interactive U.S. Reservoir Sedimentation Survey Database

USGS Publications Warehouse

Gray, J.B.; Bernard, J.M.; Schwarz, G.E.; Stewart, D.W.; Ray, K.T.

2009-01-01

In April 2009, the U.S. Geological Survey and the Natural Resources Conservation Service (prior to 1994, the Soil Conservation Service) created the Reservoir Sedimentation Survey Database (RESSED) and Web site, the most comprehensive compilation of data from reservoir bathymetric and dry basin surveys in the United States. RESSED data can be useful for a number of purposes, including calculating changes in reservoir storage characteristics, quantifying rates of sediment delivery to reservoirs, and estimating erosion rates in a reservoir's watershed.
Educating Astronauts About Conservation Biology

NASA Technical Reports Server (NTRS)

Robinson, Julie A.

2001-01-01

This article reviews the training of astronauts in the interdisciplinary work of conservation biology. The primary responsibility of the conservation biologist at NASA is directing and supporting the photography of the Earth and maintaining the complete database of the photographs. In order to perform this work, the astronauts who take the pictures must be educated in ecological issues.
Virus-like particles as universal influenza vaccines

PubMed Central

Kang, Sang-Moo; Kim, Min-Chul; Compans, Richard W

2012-01-01

Current influenza vaccines are primarily targeted to induce immunity to the influenza virus strain-specific hemagglutinin antigen and are not effective in controlling outbreaks of new pandemic viruses. An approach for developing universal vaccines is to present highly conserved antigenic epitopes in an immunogenic conformation such as virus-like particles (VLPs) together with an adjuvant to enhance the vaccine immunogenicity. In this review, the authors focus on conserved antigenic targets and molecular adjuvants that were presented in VLPs. Conserved antigenic targets that include the hemagglutinin stalk domain, the external domain of influenza M2 and neuraminidase are discussed in addition to molecular adjuvants that are engineered to be incorporated into VLPs in a membrane-anchored form. PMID:23002980
Isolation of nucleotide binding site-leucine rich repeat and kinase resistance gene analogues from sugarcane (Saccharum spp.).

PubMed

Glynn, Neil C; Comstock, Jack C; Sood, Sushma G; Dang, Phat M; Chaparro, Jose X

2008-01-01

Resistance gene analogues (RGAs) have been isolated from many crops and offer potential in breeding for disease resistance through marker-assisted selection, either as closely linked or as perfect markers. Many R-gene sequences contain kinase domains, and indeed kinase genes have been reported as being proximal to R-genes, making kinase analogues an additionally promising target. The first step towards utilizing RGAs as markers for disease resistance is isolation and characterization of the sequences. Sugarcane clone US01-1158 was identified as resistant to yellow leaf caused by the sugarcane yellow leaf virus (SCYLV) and moderately resistant to rust caused by Puccinia melanocephala Sydow & Sydow. Degenerate primers that had previously proved useful for isolating RGAs and kinase analogues in wheat and soybean were used to amplify DNA from sugarcane (Saccharum spp.) clone US-01-1158. Sequences generated from 1512 positive clones were assembled into 134 contigs of between two and 105 sequences. Comparison of the contig consensuses with the NCBI sequence database using BLASTx showed that 20 had sequence homology to nucleotide binding site and leucine rich repeat (NBS-LRR) RGAs, and eight to kinase genes. Alignment of the deduced amino acid sequences with similar sequences from the NCBI database allowed the identification of several conserved domains. The alignment and resulting phenetic tree showed that many of the sequences had greater similarity to sequences from other species than to one another. The use of degenerate primers is a useful method for isolating novel sugarcane RGA and kinase gene analogues. Further studies are needed to evaluate the role of these genes in disease resistance.
Sequencing and de novo assembly of visceral mass transcriptome of the critically endangered land snail Satsuma myomphala: Annotation and SSR discovery.

PubMed

Kang, Se Won; Patnaik, Bharat Bhusan; Hwang, Hee-Ju; Park, So Young; Chung, Jong Min; Song, Dae Kwon; Patnaik, Hongray Howrelia; Lee, Jae Bong; Kim, Changmu; Kim, Soonok; Park, Hong Seog; Park, Seung-Hwan; Park, Young-Su; Han, Yeon Soo; Lee, Jun Sang; Lee, Yong Seok

2017-03-01

Satsuma myomphala is critically endangered through loss of natural habitats, predation by natural enemies, and indiscriminate collection. It is a protected species in Korea but lacks genomic resources for an understanding of varied functional processes attributable to evolutionary success under natural habitats. For assessing the genetic information of S. myomphala, we performed, for the first time, de novo transcriptome sequencing and functional annotation of expressed sequences using the Illumina Next-Generation Sequencing (NGS) platform and bioinformatics analysis. We identified 103,774 unigenes of which 37,959, 12,890, and 17,699 were annotated in the PANM (Protostome DB), Unigene, and COG (Clusters of Orthologous Groups) databases, respectively. In addition, 14,451 unigenes were predicted under Gene Ontology functional categories, with 4581 assigned to a single category. Furthermore, 3369 sequences with 646 having Enzyme Commission (EC) numbers were mapped to 122 pathways in the Kyoto Encyclopedia of Genes and Genomes Pathway database. The prominent protein domains included the Zinc finger (C2H2-like), Reverse Transcriptase, Thioredoxin-like fold, and RNA recognition motif domain. Many unigenes with homology to immunity, defense, and reproduction-related genes were screened in the transcriptome. We also detected 3120 putative simple sequence repeats (SSRs) encompassing dinucleotide to hexanucleotide repeat motifs from >1kb unigene sequences. A list of PCR primers for SSR loci has been identified to study the genetic polymorphisms. The transcriptome data represents a valuable resource for further investigations on the species genome structure and biology. The unigene information and microsatellites would provide an indispensable tool for conservation of the species in natural and adaptive environments. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
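The SSR screen described in this abstract (dinucleotide to hexanucleotide repeat motifs) can be sketched with a regular-expression scan. This is only an illustration and not the tool the authors used: the minimum repeat counts in `MIN_REPEATS` are assumed values, since the abstract does not state its thresholds, and the function names are hypothetical.

```python
import re

# Assumed minimum repeat counts per motif length (not given in the abstract).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for k, min_rep in sorted(min_repeats.items()):
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)
```

Filtering with `is_primitive` keeps a run such as poly-A from being reported as a dinucleotide or trinucleotide repeat. The scan finds perfect repeats only; compound or interrupted microsatellites would need additional logic.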
Climate-induced change of environmentally defined floristic domains: A conservation based vulnerability framework

Treesearch

Debbie Jewitt; Barend F.N. Erasmus; Peter S. Goodman; Timothy G. O' Connor; William W. Hargrove; Damian M. Maddalena; Ed. T.F. Witkowski

2015-01-01

Global climate change is having marked influences on species distributions, phenology and ecosystem composition and raises questions as to the effectiveness of current conservation strategies. Conservation planning has only recently begun to adequately account for dynamic threats such as climate change. We propose a method to incorporate climate-dynamic environmental...
Evolutionary and biophysical relationships among the papillomavirus E2 proteins.

PubMed

Blakaj, Dukagjin M; Fernandez-Fuentes, Narcis; Chen, Zigui; Hegde, Rashmi; Fiser, Andras; Burk, Robert D; Brenowitz, Michael

2009-01-01

Infection by human papillomavirus (HPV) may result in clinical conditions ranging from benign warts to invasive cancer. The HPV E2 protein represses oncoprotein transcription and is required for viral replication. HPV E2 binds to palindromic DNA sequences of highly conserved four base pair sequences flanking an identical length variable 'spacer'. E2 proteins directly contact the conserved but not the spacer DNA. Variation in naturally occurring spacer sequences results in differential protein affinity that is dependent on their sensitivity to the spacer DNA's unique conformational and/or dynamic properties. This article explores the biophysical character of this core viral protein with the goal of identifying characteristics that are associated with risk of virally caused malignancy. The amino acid sequence, 3D structure and electrostatic features of the E2 protein DNA binding domain are highly conserved; specific interactions with DNA binding sites have also been conserved. In contrast, the E2 protein's transactivation domain does not have extensive surfaces of highly conserved residues. Rather, regions of high conservation are localized to small surface patches. Implications to cancer biology are discussed.
The chordate proteome history database.

PubMed

Levasseur, Anthony; Paganini, Julien; Dainat, Jacques; Thompson, Julie D; Poch, Olivier; Pontarotti, Pierre; Gouret, Philippe

2012-01-01

The chordate proteome history database (http://ioda.univ-provence.fr) comprises some 20,000 evolutionary analyses of proteins from chordate species. Our main objective was to characterize and study the evolutionary histories of the chordate proteome, and in particular to detect genomic events and automatic functional searches. Firstly, phylogenetic analyses based on high quality multiple sequence alignments and a robust phylogenetic pipeline were performed for the whole protein and for each individual domain. Novel approaches were developed to identify orthologs/paralogs, and predict gene duplication/gain/loss events and the occurrence of new protein architectures (domain gains, losses and shuffling). These important genetic events were localized on the phylogenetic trees and on the genomic sequence. Secondly, the phylogenetic trees were enhanced by the creation of phylogroups, whereby groups of orthologous sequences created using OrthoMCL were corrected based on the phylogenetic trees; gene family size and gene gain/loss in a given lineage could be deduced from the phylogroups. For each ortholog group obtained from the phylogenetic or the phylogroup analysis, functional information and expression data can be retrieved. Database searches can be performed easily using biological objects: protein identifier, keyword or domain, but can also be based on events, e.g., domain exchange events can be retrieved. To our knowledge, this is the first database that links group clustering, phylogeny and automatic functional searches along with the detection of important events occurring during genome evolution, such as the appearance of a new domain architecture.
Kinetoplast DNA minicircles of phloem-restricted Phytomonas associated with wilt diseases of coconut and oil palms have a two-domain structure.

PubMed

Dollet, M; Sturm, N R; Ahomadegbe, J C; Campbell, D A

2001-11-27

We report the cloning and sequencing of the first minicircle from a phloem-restricted, pathogenic Phytomonas sp. (Hart 1) isolated from a coconut palm with hartrot disease. The minicircle possessed a two-domain structure of two conserved regions, each containing three conserved sequence blocks (CSB). Based on the sequence around CSB 3 from Hart 1, PCR primers were designed to allow specific amplification of Phytomonas minicircles. This primer pair demonstrated specificity for at least six groups of plant trypanosomatids and did not amplify from insect trypanosomatids. The PCR results were consistent with a two-domain structure for other plant trypanosomatids.
Functional analysis of propeptide as an intramolecular chaperone for in vivo folding of subtilisin nattokinase.

PubMed

Jia, Yan; Liu, Hui; Bao, Wei; Weng, Meizhi; Chen, Wei; Cai, Yongjun; Zheng, Zhongliang; Zou, Guolin

2010-12-01

Here, we show that during in vivo folding of the precursor, the propeptide of subtilisin nattokinase functions as an intramolecular chaperone (IMC) that organises the in vivo folding of the subtilisin domain. Two residues belonging to β-strands formed by conserved regions of the IMC are crucial for the folding of the subtilisin domain through direct interactions. An identical protease can fold into different conformations in vivo due to the action of a mutated IMC, resulting in different kinetic parameters. Some interfacial changes involving conserved regions, even those induced by the subtilisin domain, blocked subtilisin folding and altered its conformation. Insight into the interaction between the subtilisin and IMC domains is provided by a three-dimensional structural model. Copyright © 2010 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
The conserved RNA recognition motif and C3H1 domain of the Not4 ubiquitin ligase regulate in vivo ligase function.

PubMed

Chen, Hongfeng; Sirupangi, Tirupataiah; Wu, Zhao-Hui; Johnson, Daniel L; Laribee, R Nicholas

2018-05-25

The Ccr4-Not complex controls RNA polymerase II (Pol II) dependent gene expression and proteasome function. The Not4 ubiquitin ligase is a Ccr4-Not subunit that has both a RING domain and a conserved RNA recognition motif and C3H1 domain (referred to as the RRM-C domain) with unknown function. We demonstrate that while individual Not4 RING or RRM-C mutants fail to replicate the proteasomal defects found in Not4 deficient cells, mutation of both exhibits a Not4 loss of function phenotype. Transcriptome analysis revealed that the Not4 RRM-C affects a specific subset of Pol II-regulated genes, including those involved in transcription elongation, cyclin-dependent kinase regulated nutrient responses, and ribosomal biogenesis. The Not4 RING, RRM-C, or RING/RRM-C mutations cause a generalized increase in Pol II binding at a subset of these genes, yet their impact on gene expression does not always correlate with Pol II recruitment which suggests Not4 regulates their expression through additional mechanisms. Intriguingly, we find that while the Not4 RRM-C is dispensable for Ccr4-Not association with RNA Pol II, the Not4 RING domain is required for these interactions. Collectively, these data elucidate previously unknown roles for the conserved Not4 RRM-C and RING domains in regulating Ccr4-Not dependent functions in vivo.
Chimeric Saccharomyces cerevisiae Msh6 protein with an Msh3 mispair-binding domain combines properties of both proteins

PubMed Central

Shell, Scarlet S.; Putnam, Christopher D.; Kolodner, Richard D.

2007-01-01

Msh2–Msh3 and Msh2–Msh6 are two partially redundant mispair-recognition complexes that initiate mismatch repair in eukaryotes. Crystal structures of the prokaryotic homolog MutS suggest the mechanism by which Msh6 interacts with mispairs because key mispair-contacting residues are conserved in these two proteins. Because Msh3 lacks these conserved residues, we constructed a series of mutants to investigate the requirements for mispair interaction by Msh3. We found that a chimeric protein in which the mispair-binding domain (MBD) of Msh6 was replaced by the equivalent domain of Msh3 was functional for mismatch repair. This chimera possessed the mispair-binding specificity of Msh3 and revealed that communication between the MBD and the ATPase domain is conserved between Msh2–Msh3 and Msh2–Msh6. Further, the chimeric protein retained Msh6-like properties with respect to genetic interactions with the MutL homologs and an Msh2 MBD deletion mutant, indicating that Msh3-like behaviors beyond mispair specificity are not features controlled by the MBD. PMID:17573527
The Evolutionary Pattern of Glycosylation Sites in Influenza Virus (H5N1) Hemagglutinin and Neuraminidase

PubMed Central

Chen, Wentian; Zhong, Yaogang; Qin, Yannan; Sun, Shisheng; Li, Zheng

2012-01-01

Two glycoproteins, hemagglutinin (HA) and neuraminidase (NA), on the surface of influenza viruses play crucial roles in transfaunation, membrane fusion and the release of progeny virions. To explore the distribution of N-glycosylation sites (glycosites) in these two glycoproteins, we collected and aligned the amino acid sequences of all the HA and NA subtypes. Two glycosites were located at HA0 cleavage sites and fusion peptides and were strikingly conserved in all HA subtypes, while the remaining glycosites were unique to their subtypes. Two to four conserved glycosites were found in the stalk domain of NA, but these are affected by the deletion of specific stalk domain sequences. Another highly conserved glycosite appeared at the top center of the tetrameric globular domain, while the other glycosites were distributed around the globular domain. Here we present a detailed investigation of the distribution and the evolutionary pattern of the glycosites in the envelope glycoproteins of influenza viruses (IVs), and further focus on the H5N1 virus and conclude that the glycosites in H5N1 have become more complicated in HA and less influential in NA in the last five years. PMID:23133677
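The glycosite survey in this record depends on locating N-glycosylation sequons in protein sequences. The following is a minimal sketch, assuming the standard N-X-S/T sequon rule (X is any residue except proline). The function name `find_sequons` and the toy sequence are hypothetical and are not taken from the paper.

```python
import re

# Potential N-glycosylation sequon: N, then any residue except P, then S or T.
# The lookahead keeps matches one residue wide, so overlapping sequons
# (for example in "NNSS") are both reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "MKNGTAANPSLLNNSS"  # toy sequence, not a real hemagglutinin
    print(find_sequons(toy))
```

A sequon match marks only a potential glycosite. Whether a site is actually glycosylated depends on structural context that a pattern scan cannot capture.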

DNA methylation in amphioxus: from ancestral functions to new roles in vertebrates.

PubMed

Albalat, Ricard; Martí-Solans, Josep; Cañestro, Cristian

2012-03-01

In vertebrates, DNA methylation is an epigenetic mechanism that modulates gene transcription, and plays crucial roles during development, cell fate maintenance, germ cell pluripotency and inheritable genome imprinting. DNA methylation might also play a role as a genome defense mechanism against the mutational activity derived from transposon mobility. In contrast to the heavily methylated genomes in vertebrates, most genomes in invertebrates are poorly or just moderately methylated, and the function of DNA methylation remains unclear. Here, we review the DNA methylation system in the cephalochordate amphioxus, which belongs to the most basally divergent group of our own phylum, the chordates. First, surveys of the amphioxus genome database reveal the presence of the DNA methylation machinery, DNA methyltransferases and methyl-CpG-binding domain proteins. Second, comparative genomics and analyses of conserved synteny between amphioxus and vertebrates provide robust evidence that the DNA methylation machinery of amphioxus represents the ancestral toolkit of chordates, and that its expansion in vertebrates was originated by the two rounds of whole-genome duplication that occurred in stem vertebrates. Third, in silico analysis of CpGo/e ratios throughout the amphioxus genome suggests a bimodal distribution of DNA methylation, consistent with a mosaic pattern comprising domains of methylated DNA interspersed with domains of unmethylated DNA, similar to the situation described in ascidians, but radically different to the globally methylated vertebrate genomes. Finally, we discuss potential roles of the DNA methylation system in amphioxus in the context of chordate genome evolution and the origin of vertebrates.
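The CpG o/e analysis mentioned in this record can be illustrated with a short calculation. The sketch below assumes one commonly used normalization, (number of CpG × sequence length) / (number of C × number of G). The review's exact formula is not stated in the abstract, and the function name `cpg_obs_exp` is hypothetical.

```python
def cpg_obs_exp(seq: str) -> float:
    """CpG observed/expected ratio: (#CpG * L) / (#C * #G).

    Returns 0.0 when the sequence has no C or no G, because the
    expected count is then zero and the ratio is undefined.
    """
    s = seq.upper()
    n_c, n_g = s.count("C"), s.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    n_cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return n_cpg * len(s) / (n_c * n_g)


if __name__ == "__main__":
    print(cpg_obs_exp("ACGTCGCG"))  # toy sequence
```

Computed over many genes or genomic windows, a histogram of these ratios is what would reveal a bimodal distribution. Ratios well below 1 are usually read as CpG depletion in historically methylated DNA.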
NR2E3 mutations in enhanced S-cone sensitivity syndrome (ESCS), Goldmann-Favre syndrome (GFS), clumped pigmentary retinal degeneration (CPRD), and retinitis pigmentosa (RP).

PubMed

Schorderet, Daniel F; Escher, Pascal

2009-11-01

NR2E3, also called photoreceptor-specific nuclear receptor (PNR), is a transcription factor of the nuclear hormone receptor superfamily whose expression is uniquely restricted to photoreceptors. There, its physiological activity is essential for proper rod and cone photoreceptor development and maintenance. Thirty-two different mutations in NR2E3 have been identified in either homozygous or compound heterozygous state in the recessively inherited enhanced S-cone sensitivity syndrome (ESCS), Goldmann-Favre syndrome (GFS), and clumped pigmentary retinal degeneration (CPRD). The clinical phenotype common to all these patients is night blindness, rudimental or absent rod function, and hyperfunction of the "blue" S-cones. A single p.G56R mutation is inherited in a dominant manner and causes retinitis pigmentosa (RP). We have established a new locus-specific database for NR2E3 (www.LOVD.nl/eye), containing all reported mutations, polymorphisms, and unclassified sequence variants, including novel ones. A high proportion of mutations are located in the evolutionarily-conserved DNA-binding domains (DBDs) and ligand-binding domains (LBDs) of NR2E3. Based on homology modeling of these NR2E3 domains, we propose a structural localization of mutated residues. The high variability of clinical phenotypes observed in patients affected by NR2E3-linked retinal degenerations may be caused by different disease mechanisms, including absence of DNA-binding, altered interactions with transcriptional coregulators, and differential activity of modifier genes.
EvoDB: a database of evolutionary rate profiles, associated protein domains and phylogenetic trees for PFAM-A

PubMed Central

Ndhlovu, Andrew; Durand, Pierre M.; Hazelhurst, Scott

2015-01-01

The evolutionary rate at codon sites across protein-coding nucleotide sequences represents a valuable tier of information for aligning sequences, inferring homology and constructing phylogenetic profiles. However, a comprehensive resource for cataloguing the evolutionary rate at codon sites and their corresponding nucleotide and protein domain sequence alignments has not been developed. To address this gap in knowledge, EvoDB (an Evolutionary rates DataBase) was compiled. Nucleotide sequences and their corresponding protein domain data including the associated seed alignments from the PFAM-A (protein family) database were used to estimate evolutionary rate (ω = dN/dS) profiles at codon sites for each entry. EvoDB contains 98.83% of the gapped nucleotide sequence alignments and 97.1% of the evolutionary rate profiles for the corresponding information in PFAM-A. As the identification of codon sites under positive selection and their position in a sequence profile is usually the most sought after information for molecular evolutionary biologists, evolutionary rate profiles were determined under the M2a model using the CODEML algorithm in the PAML (Phylogenetic Analysis by Maximum Likelihood) suite of software. Validation of nucleotide sequences against amino acid data was implemented to ensure high data quality. EvoDB is a catalogue of the evolutionary rate profiles and provides the corresponding phylogenetic trees, PFAM-A alignments and annotated accession identifier data. In addition, the database can be explored and queried using known evolutionary rate profiles to identify domains under similar evolutionary constraints and pressures. EvoDB is a resource for evolutionary, phylogenetic studies and presents a tier of information untapped by current databases. Database URL: http://www.bioinf.wits.ac.za/software/fire/evodb PMID:26140928
Characterization of DNA polymerase X from Thermus thermophilus HB8 reveals the POLXc and PHP domains are both required for 3'-5' exonuclease activity.

PubMed

Nakane, Shuhei; Nakagawa, Noriko; Kuramitsu, Seiki; Masui, Ryoji

2009-04-01

The X-family DNA polymerases (PolXs) comprise a highly conserved DNA polymerase family found in all kingdoms. Mammalian PolXs are known to be involved in several DNA-processing pathways including repair, but the cellular functions of bacterial PolXs are less known. Many bacterial PolXs have a polymerase and histidinol phosphatase (PHP) domain at their C-termini in addition to a PolX core (POLXc) domain, and possess 3'-5' exonuclease activity. Although both domains are highly conserved in bacteria, their molecular functions, especially for a PHP domain, are unknown. We found Thermus thermophilus HB8 PolX (ttPolX) has Mg(2+)/Mn(2+)-dependent DNA/RNA polymerase, Mn(2+)-dependent 3'-5' exonuclease and DNA-binding activities. We identified the domains of ttPolX by limited proteolysis and characterized their biochemical activities. The POLXc domain was responsible for the polymerase and DNA-binding activities but exonuclease activity was not detected for either domain. However, the POLXc and PHP domains interacted with each other and a mixture of the two domains had Mn(2+)-dependent 3'-5' exonuclease activity. Moreover, site-directed mutagenesis revealed catalytically important residues in the PHP domain for the 3'-5' exonuclease activity. Our findings provide a molecular insight into the functional domain organization of bacterial PolXs, especially the requirement of the PHP domain for 3'-5' exonuclease activity.
Understanding the productive author who published papers in medicine using National Health Insurance Database: A systematic review and meta-analysis.

PubMed

Chien, Tsair-Wei; Chang, Yu; Wang, Hsien-Yi

2018-02-01

Many researchers have used the National Health Insurance database to publish medical papers, which are often retrospective, population-based cohort studies. However, the authors' research domains and academic characteristics are still unclear. By searching the PubMed database (Pubmed.com), we used the keywords [Taiwan] and [National Health Insurance Research Database], then downloaded 2913 articles published from 1995 to 2017. Social network analysis (SNA), Gini coefficient, and Google Maps were applied to gather these data for visualizing: the most productive author; the pattern of coauthor collaboration teams; and the author's research domain denoted by abstract keywords and Pubmed MESH (medical subject heading) terms. Utilizing the 2913 papers from Taiwan's National Health Insurance database, we chose the top 10 research teams shown on Google Maps and analyzed one author (Dr. Kao) who published 149 papers in the database in 2015. In the past 15 years, we found Dr. Kao had 2987 connections with other coauthors from 13 research teams. The cooccurrence abstract keywords with the highest frequency are cohort study and National Health Insurance Research Database. The most coexistent MESH terms are tomography, X-ray computed, and positron-emission tomography. The strength of the author research distinct domain is very low (Gini < 0.40). SNA incorporated with Google Maps and Gini coefficient provides insight into the relationships between entities. The results obtained in this study can be applied for a comprehensive understanding of other productive authors in the field of academics.
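The abstract reports a Gini coefficient but does not say how it was computed. As a rough sketch, assuming the standard mean-absolute-difference definition, the coefficient for a list of counts (for example, papers per research topic) could be obtained as follows. The function name `gini` and the sample counts are hypothetical.

```python
def gini(values: list[float]) -> float:
    """Gini coefficient: sum of |xi - xj| over all ordered pairs / (2 * n * sum).

    0 means the values are perfectly equal; values approaching 1 mean the
    total is concentrated in very few entries.
    """
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    diff_sum = sum(abs(a - b) for a in values for b in values)
    return diff_sum / (2 * n * total)


if __name__ == "__main__":
    topic_counts = [12, 9, 11, 10]  # made-up counts, not data from the study
    print(round(gini(topic_counts), 3))
```

The double loop is quadratic in the number of entries, which is acceptable for a few hundred keywords. A sorted, linear-time formulation would be preferable for larger inputs.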
Comprehensive analysis of orthologous protein domains using the HOPS database.

PubMed

Storm, Christian E V; Sonnhammer, Erik L L

2003-10-01

One of the most reliable methods for protein function annotation is to transfer experimentally known functions from orthologous proteins in other organisms. Most methods for identifying orthologs operate on a subset of organisms with a completely sequenced genome, and treat proteins as single-domain units. However, it is well known that proteins are often made up of several independent domains, and there is a wealth of protein sequences from genomes that are not completely sequenced. A comprehensive set of protein domain families is found in the Pfam database. We wanted to apply orthology detection to Pfam families, but first some issues needed to be addressed. First, orthology detection becomes impractical and unreliable when too many species are included. Second, shorter domains contain less information. It is therefore important to assess the quality of the orthology assignment and avoid very short domains altogether. We present a database of orthologous protein domains in Pfam called HOPS: Hierarchical grouping of Orthologous and Paralogous Sequences. Orthology is inferred in a hierarchic system of phylogenetic subgroups using ortholog bootstrapping. To avoid the frequent errors stemming from horizontally transferred genes in bacteria, the analysis is presently limited to eukaryotic genes. The results are accessible in the graphical browser NIFAS, a Java tool originally developed for analyzing phylogenetic relations within Pfam families. The method was tested on a set of curated orthologs with experimentally verified function. In comparison to tree reconciliation with a complete species tree, our approach finds significantly more orthologs in the test set. Examples for investigating gene fusions and domain recombination using HOPS are given.
The non-conserved region of MRP is involved in the virulence of Streptococcus suis serotype 2

PubMed Central

Li, Quan; Fu, Yang; Ma, Caifeng; He, Yanan; Yu, Yanfei; Du, Dechao; Yao, Huochun; Lu, Chengping; Zhang, Wei

2017-01-01

Muramidase-released protein (MRP) of Streptococcus suis serotype 2 (SS2) is an important epidemic virulence marker with an unclear role in bacterial infection. To investigate the biologic functions of MRP, 3 mutants named Δmrp, Δmrp domain 1 (Δmrp-d1), and Δmrp domain 2 (Δmrp-d2) were constructed to assess the phenotypic changes between the parental strain and the mutant strains. The results indicated that MRP domain 1 (MRP-D1, the non-conserved region of MRP from a virulent strain, a.a. 242–596) played a critical role in adherence of SS2 to host cells, compared with MRP domain 1* (MRP-D1*, the non-conserved region of MRP from a low virulent strain, a.a. 239–598) or MRP domain 2 (MRP-D2, the conserved region of MRP, a.a. 848–1222). We found that MRP-D1 but not MRP-D2, could bind specifically to fibronectin (FN), factor H (FH), fibrinogen (FG), and immunoglobulin G (IgG). Additionally, we confirmed that mrp-d1 mutation significantly inhibited bacteremia and brain invasion in a mouse infection model. The mrp-d1 mutation also attenuated the intracellular survival of SS2 in RAW246.7 macrophages, shortened the growth ability in pig blood and decreased the virulence of SS2 in BALB/c mice. Furthermore, antiserum against MRP-D1 was found to dramatically impede SS2 survival in pig blood. Finally, immunization with recombinant MRP-D1 efficiently enhanced murine viability after SS2 challenge, indicating its potential use in vaccination strategies. Collectively, these results indicated that MRP-D1 is involved in SS2 virulence and eloquently demonstrate the function of MRP in pathogenesis of infection. PMID:28362221
The non-conserved region of MRP is involved in the virulence of Streptococcus suis serotype 2.

PubMed

Li, Quan; Fu, Yang; Ma, Caifeng; He, Yanan; Yu, Yanfei; Du, Dechao; Yao, Huochun; Lu, Chengping; Zhang, Wei

2017-10-03

Muramidase-released protein (MRP) of Streptococcus suis serotype 2 (SS2) is an important epidemic virulence marker with an unclear role in bacterial infection. To investigate the biologic functions of MRP, 3 mutants named Δmrp, Δmrp domain 1 (Δmrp-d1), and Δmrp domain 2 (Δmrp-d2) were constructed to assess the phenotypic changes between the parental strain and the mutant strains. The results indicated that MRP domain 1 (MRP-D1, the non-conserved region of MRP from a virulent strain, a.a. 242-596) played a critical role in adherence of SS2 to host cells, compared with MRP domain 1* (MRP-D1*, the non-conserved region of MRP from a low virulent strain, a.a. 239-598) or MRP domain 2 (MRP-D2, the conserved region of MRP, a.a. 848-1222). We found that MRP-D1 but not MRP-D2, could bind specifically to fibronectin (FN), factor H (FH), fibrinogen (FG), and immunoglobulin G (IgG). Additionally, we confirmed that mrp-d1 mutation significantly inhibited bacteremia and brain invasion in a mouse infection model. The mrp-d1 mutation also attenuated the intracellular survival of SS2 in RAW246.7 macrophages, shortened the growth ability in pig blood and decreased the virulence of SS2 in BALB/c mice. Furthermore, antiserum against MRP-D1 was found to dramatically impede SS2 survival in pig blood. Finally, immunization with recombinant MRP-D1 efficiently enhanced murine viability after SS2 challenge, indicating its potential use in vaccination strategies. Collectively, these results indicated that MRP-D1 is involved in SS2 virulence and eloquently demonstrate the function of MRP in pathogenesis of infection.
A thermophilic mini-chaperonin contains a conserved polypeptide-binding surface: combined crystallographic and NMR studies of the GroEL Apical Domain with implications for substrate interactions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hua, Q. X. H.; Dementieva, I. S. D.; Walsh, M. A. W.

2001-02-23

A homologue of the Escherichia coli GroEL apical domain was obtained from the thermophilic eubacterium Thermus thermophilus. The domains share 70% sequence identity (101 out of 145 residues). The thermal stability of the T. thermophilus apical domain (Tm > 100 °C as evaluated by circular dichroism) is at least 35 °C greater than that of the E. coli apical domain (Tm = 65 °C). The crystal structure of a selenomethionine-substituted apical domain from T. thermophilus was determined to a resolution of 1.78 Å using multiwavelength-anomalous-diffraction phasing. The structure is similar to that of the E. coli apical domain (root-mean-square deviation 0.45 Å based on main-chain atoms). The thermophilic structure contains seven additional salt bridges of which four contain charge-stabilized hydrogen bonds. Only one of the additional salt bridges would face the 'Anfinsen cage' in GroEL. High temperatures were exploited to map sites of interactions between the apical domain and molten globules. NMR footprints of apical domain-protein complexes were obtained at elevated temperature using ¹⁵N-¹H correlation spectra of ¹⁵N-labeled apical domain. Footprints employing two polypeptides unrelated in sequence or structure (an insulin monomer and the SRY high-mobility-group box, each partially unfolded at 50 °C) are essentially the same and consistent with the peptide-binding surface previously defined in E. coli GroEL and its apical domain-peptide complexes. An additional part of this surface comprising a short N-terminal α-helix is observed. The extended footprint rationalizes mutagenesis studies of intact GroEL in which point mutations affecting substrate binding were found outside the 'classical' peptide-binding site. Our results demonstrate structural conservation of the apical domain among GroEL homologues and conservation of an extended non-polar surface recognizing diverse polypeptides.
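The identity figure quoted in this record (101 of 145 residues, about 70%) is a count of matching positions in a pairwise alignment. The sketch below shows that calculation. It assumes that columns containing a gap in either sequence are excluded from the denominator, which is only one of several conventions and is not specified in the abstract. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap character.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Toy alignment, not the GroEL apical domain sequences.
    print(percent_identity("ACDE-G", "ACDFHG"))
    # The ratio reported in the abstract, for comparison:
    print(round(100 * 101 / 145, 1))
```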
Comprehensive global amino acid sequence analysis of PB1F2 protein of influenza A H5N1 viruses and the influenza A virus subtypes responsible for the 20th-century pandemics.

PubMed

Pasricha, Gunisha; Mishra, Akhilesh C; Chakrabarti, Alok K

2013-07-01

PB1F2 is the 11th protein of influenza A virus translated from +1 alternate reading frame of PB1 gene. Since the discovery, varying sizes and functions of the PB1F2 protein of influenza A viruses have been reported. Selection of PB1 gene segment in the pandemics, variable size and pleiotropic effect of PB1F2 intrigued us to analyze amino acid sequences of this protein in various influenza A viruses. Amino acid sequences for PB1F2 protein of influenza A H5N1, H1N1, H2N2, and H3N2 subtypes were obtained from Influenza Research Database. Multiple sequence alignments of the PB1F2 protein sequences of the aforementioned subtypes were used to determine the size, variable and conserved domains and to perform mutational analysis. Analysis showed that 96·4% of the H5N1 influenza viruses harbored full-length PB1F2 protein. Except for the 2009 pandemic H1N1 virus, all the subtypes of the 20th-century pandemic influenza viruses contained full-length PB1F2 protein. Through the years, PB1F2 protein of the H1N1 and H3N2 viruses has undergone much variation. PB1F2 protein sequences of H5N1 viruses showed both human- and avian host-specific conserved domains. Global database of PB1F2 protein revealed that N66S mutation was present only in 3·8% of the H5N1 strains. We found a novel mutation, N84S in the PB1F2 protein of 9·35% of the highly pathogenic avian influenza H5N1 influenza viruses. Varying sizes and mutations of the PB1F2 protein in different influenza A virus subtypes with pandemic potential were obtained. There was genetic divergence of the protein in various hosts which highlighted the host-specific evolution of the virus. However, studies are required to correlate this sequence variability with the virulence and pathogenicity. © 2012 John Wiley & Sons Ltd.
Protein Information Resource: a community resource for expert annotation of protein data

PubMed Central

Barker, Winona C.; Garavelli, John S.; Hou, Zhenglin; Huang, Hongzhan; Ledley, Robert S.; McGarvey, Peter B.; Mewes, Hans-Werner; Orcutt, Bruce C.; Pfeiffer, Friedhelm; Tsugita, Akira; Vinayaka, C. R.; Xiao, Chunlin; Yeh, Lai-Su L.; Wu, Cathy

2001-01-01

The Protein Information Resource, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the most comprehensive and expertly annotated protein sequence database in the public domain, the PIR-International Protein Sequence Database. To provide timely and high quality annotation and promote database interoperability, the PIR-International employs rule-based and classification-driven procedures based on controlled vocabulary and standard nomenclature and includes status tags to distinguish experimentally determined from predicted protein features. The database contains about 200 000 non-redundant protein sequences, which are classified into families and superfamilies and their domains and motifs identified. Entries are extensively cross-referenced to other sequence, classification, genome, structure and activity databases. The PIR web site features search engines that use sequence similarity and database annotation to facilitate the analysis and functional identification of proteins. The PIR-International databases and search tools are accessible on the PIR web site at http://pir.georgetown.edu/ and at the MIPS web site at http://www.mips.biochem.mpg.de. The PIR-International Protein Sequence Database and other files are also available by FTP. PMID:11125041
Forest Conservation Opportunity Areas - Conservative Model (ECO_RES.COA_FORREST66)

EPA Pesticide Factsheets

This layer designates areas with potential for forest conservation. These are areas of natural or semi-natural forest land cover patches that are at least 395 meters away from roads and away from patch edges. OAs were modeled by creating distance grids using the National Land Cover Database and the Census Bureau's TIGER road files.
A conserved π-cation and an electrostatic bridge are essential for 11R-lipoxygenase catalysis and structural stability.

PubMed

Eek, Priit; Piht, Mari-Ann; Rätsep, Margus; Freiberg, Arvi; Järving, Ivar; Samel, Nigulas

2015-10-01

Lipoxygenases (LOXs) are lipid-peroxidizing enzymes that consist of a regulatory calcium- and membrane-binding PLAT (polycystin-1, lipoxygenase, α-toxin) domain and a catalytic domain. In a previous study, the crystal structure of an 11R-LOX revealed a conserved π-cation bridge connecting these two domains which could mediate the regulatory effect of the PLAT domain to the active site. Here we analyzed the role of residues Trp107 and Lys172 that constitute the π-cation bridge in 11R-LOX along with Arg106 and Asp173 (a potential salt bridge), which could also contribute to the inter-domain communication. According to our kinetic assays and protein unfolding experiments conducted using differential scanning fluorimetry and circular dichroism spectroscopy, mutants with a disrupted link display diminished catalytic activity alongside reduced stability of the protein fold. The results demonstrate that both these bridges contribute to the two-domain interface, and are important for proper enzyme activation. Copyright © 2015 Elsevier B.V. All rights reserved.
Structural and Biochemical Studies of ALIX/AIP1 and Its Role in Retrovirus Budding

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fisher, R.; Chung, H.; Zhai, Q.

2007-01-01

ALIX/AIP1 functions in enveloped virus budding, endosomal protein sorting, and many other cellular processes. Retroviruses, including HIV-1, SIV, and EIAV, bind and recruit ALIX through YPXnL late-domain motifs (X = any residue; n = 1-3). Crystal structures reveal that human ALIX is composed of an N-terminal Bro1 domain and a central domain that is composed of two extended three-helix bundles that form elongated arms that fold back into a 'V'. The structures also reveal conformational flexibility in the arms that suggests that the V domain may act as a flexible hinge in response to ligand binding. YPXnL late domains bind in a conserved hydrophobic pocket on the second arm near the apex of the V, whereas CHMP4/ESCRT-III proteins bind a conserved hydrophobic patch on the Bro1 domain, and both interactions are required for virus budding. ALIX therefore serves as a flexible, extended scaffold that connects retroviral Gag proteins to ESCRT-III and other cellular-budding machinery.
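The YPXnL late-domain definition given in this record (X = any residue; n = 1-3) maps directly onto a regular expression. The following is an illustrative sketch for scanning a protein sequence for candidate motifs. The function name `find_ypxnl` and the toy sequences are hypothetical, and a sequence match alone does not establish ALIX binding.

```python
import re

# Y-P, then 1 to 3 residues of any kind, then L.
# The lookahead wrapper allows overlapping candidates to be reported, and
# the lazy quantifier prefers the shortest spacer at each start position.
LATE_DOMAIN = re.compile(r"(?=(YP[A-Z]{1,3}?L))")


def find_ypxnl(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched motif) for each YPXnL candidate."""
    return [(m.start() + 1, m.group(1)) for m in LATE_DOMAIN.finditer(protein.upper())]


if __name__ == "__main__":
    for toy in ("AAYPDLAA", "GGYPLTSLGG", "YPL"):  # toy sequences only
        print(toy, find_ypxnl(toy))
```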
The X-ray Crystallographic Structure and Activity Analysis of a Pseudomonas-Specific Subfamily of the HAD Enzyme Superfamily Evidences a Novel Biochemical Function

DOE Office of Scientific and Technical Information (OSTI.GOV)

Peisach, E.; Wang, L.; Burroughs, A.

2008-01-01

The haloacid dehalogenase (HAD) superfamily is a large family of proteins dominated by phosphotransferases. Thirty-three sequence families within the HAD superfamily (HADSF) have been identified to assist in function assignment. One such family includes the enzyme phosphoacetaldehyde hydrolase (phosphonatase). Phosphonatase possesses the conserved Rossmannoid core domain and a C1-type cap domain. Other members of this family do not possess a cap domain and because the cap domain of phosphonatase plays an important role in active site desolvation and catalysis, the function of the capless family members must be unique. A representative of the capless subfamily, PSPTO_2114, from the plant pathogen Pseudomonas syringae, was targeted for catalytic activity and structure analyses. The X-ray structure of PSPTO_2114 reveals a capless homodimer that conserves some but not all of the intersubunit contacts contributed by the core domains of the phosphonatase homodimer. The region of PSPTO_2114 that corresponds to the catalytic scaffold of phosphonatase (and other HAD phosphotransferases) positions amino acid residues that are ill suited for Mg2+ cofactor binding and mediation of phosphoryl group transfer between donor and acceptor substrates. The absence of phosphotransferase activity in PSPTO_2114 was confirmed by kinetic assays. To explore PSPTO_2114 function, the conservation of sequence motifs extending outside of the HADSF catalytic scaffold was examined. The stringently conserved residues among PSPTO_2114 homologs were mapped onto the PSPTO_2114 three-dimensional structure to identify a surface region unique to the family members that do not possess a cap domain. The hypothesis that this region is used in protein-protein recognition is explored to define, for the first time, HADSF proteins which have acquired a function other than that of a catalyst. Proteins 2008.
LINKIN, a new transmembrane protein necessary for cell adhesion

PubMed Central

Kato, Mihoko; Chou, Tsui-Fen; Yu, Collin Z; DeModena, John; Sternberg, Paul W

2014-01-01

In epithelial collective migration, leader and follower cells migrate while maintaining cell–cell adhesion and tissue polarity. We have identified a conserved protein and interactors required for maintaining cell adhesion during a simple collective migration in the developing C. elegans male gonad. LINKIN is a previously uncharacterized, transmembrane protein conserved throughout Metazoa. We identified seven atypical FG–GAP domains in the extracellular domain, which potentially folds into a β-propeller structure resembling the α-integrin ligand-binding domain. C. elegans LNKN-1 localizes to the plasma membrane of all gonadal cells, with apical and lateral bias. We identified the LINKIN interactors RUVBL1, RUVBL2, and α-tubulin by using SILAC mass spectrometry on human HEK 293T cells and testing candidates for lnkn-1-like function in C. elegans male gonad. We propose that LINKIN promotes adhesion between neighboring cells through its extracellular domain and regulates microtubule dynamics through RUVBL proteins at its intracellular domain. DOI: http://dx.doi.org/10.7554/eLife.04449.001 PMID:25437307
Genomic structure of the human D-site binding protein (DBP) gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shutler, G.; Glassco, T.; Kang, Xiaolin

1996-06-15

The human gene for the D-Site Binding Protein (DBP) has been sequenced and characterized. This gene is a member of the b/ZIP family of transcription factors and is one of three genes forming the PAR sub-family. DBP has been implicated in the diurnal regulation of a variety of liver-specific genes. Examination of the genomic structure of DBP reveals that the gene is divided into four exons and is contained within a relatively compact region of approximately 6 kb. These exons appear to correspond to functional divisions of the DBP protein. Exon 1 contains a long 5' UTR, and conservation between the rat and the human genes of the presence of small open reading frames within this region suggests that it may play a role in translational control. Exon 2 contains a limited region of similarity to the other PAR domain genes, which may be part of a potential activation domain. Exon 3 contains the PAR domain and differs by only 1 of 71 amino acids between rat and human. Exon 4, containing both the basic and the leucine zipper domains, is likewise highly conserved. The overall degree of homology between the rat and the human cDNA sequences is 82% for the nucleic acid sequence and 92% for the protein sequence. Comparison of the rat and human proximal promoters reveals extensive sequence conservation, with two previously characterized DNA binding sites being conserved at the functional and sequence levels. 31 refs., 4 figs.
Functional studies of the Ciona intestinalis myogenic regulatory factor reveal conserved features of chordate myogenesis.

PubMed

Izzi, Stephanie A; Colantuono, Bonnie J; Sullivan, Kelly; Khare, Parul; Meedel, Thomas H

2013-04-15

Ci-MRF is the sole myogenic regulatory factor (MRF) of the ascidian Ciona intestinalis, an invertebrate chordate. In order to investigate its properties we developed a simple in vivo assay based on misexpressing Ci-MRF in the notochord of Ciona embryos. We used this assay to examine the roles of three structural motifs that are conserved among MRFs: an alanine-threonine (Ala-Thr) dipeptide of the basic domain that is known in vertebrates as the myogenic code, a cysteine/histidine-rich (C/H) domain found just N-terminal to the basic domain, and a carboxy-terminal amphipathic α-helix referred to as Helix III. We show that the Ala-Thr dipeptide is necessary for normal Ci-MRF function, and that while eliminating the C/H domain or Helix III individually has no demonstrable effect on Ci-MRF, simultaneous loss of both motifs significantly reduces its activity. Our studies also indicate that direct interaction between CiMRF and an essential E-box of Ciona Troponin I is required for the expression of this muscle-specific gene and that multiple classes of MRF-regulated genes exist in Ciona. These findings are consistent with substantial conservation of MRF-directed myogenesis in chordates and demonstrate for the first time that the Ala/Thr dipeptide of the basic domain of an invertebrate MRF behaves as a myogenic code. Copyright © 2013 Elsevier Inc. All rights reserved.
The cytoplasmic end of transmembrane domain 3 regulates the activity of the Saccharomyces cerevisiae G-protein-coupled alpha-factor receptor.

PubMed Central

Parrish, William; Eilers, Markus; Ying, Weiwen; Konopka, James B

2002-01-01

The binding of alpha-factor to its receptor (Ste2p) activates a G-protein-signaling pathway leading to conjugation of MATa cells of the budding yeast S. cerevisiae. We conducted a genetic screen to identify constitutively activating mutations in the N-terminal region of the alpha-factor receptor that includes transmembrane domains 1-5. This approach identified 12 unique constitutively activating mutations, the strongest of which affected polar residues at the cytoplasmic ends of transmembrane domains 2 and 3 (Asn84 and Gln149, respectively) that are conserved in the alpha-factor receptors of divergent yeast species. Targeted mutagenesis, in combination with molecular modeling studies, suggested that Gln149 is oriented toward the core of the transmembrane helix bundle where it may be involved in mediating an interaction with Asn84. These residues appear to play specific roles in maintaining the inactive conformation of the protein since a variety of mutations at either position cause constitutive receptor signaling. Interestingly, the activity of many mammalian G-protein-coupled receptors is also regulated by conserved polar residues (the E/DRY motif) at the cytoplasmic end of transmembrane domain 3. Altogether, the results of this study suggest a conserved role for the cytoplasmic end of transmembrane domain 3 in regulating the activity of divergent G-protein-coupled receptors. PMID:11861550
Maintenance of an Intact Human Immunodeficiency Virus Type 1 vpr Gene following Mother-to-Infant Transmission

PubMed Central

Yedavalli, Venkat R. K.; Chappey, Colombe; Ahmad, Nafees

1998-01-01

The vpr sequences from six human immunodeficiency virus type 1 (HIV-1)-infected mother-infant pairs following perinatal transmission were analyzed. We found that 153 of the 166 clones analyzed from uncultured peripheral blood mononuclear cell DNA samples showed a 92.17% frequency of intact vpr open reading frames. There was a low degree of heterogeneity of vpr genes within mothers, within infants, and between epidemiologically linked mother-infant pairs. The distances between vpr sequences were greater in epidemiologically unlinked individuals than in epidemiologically linked mother-infant pairs. Moreover, the infants’ sequences displayed patterns similar to those seen in their mothers. The functional domains essential for Vpr activity, including virion incorporation, nuclear import, and cell cycle arrest and differentiation were highly conserved in most of the sequences. Phylogenetic analyses of 166 mother-infant pairs and 195 other available vpr sequences from HIV databases formed distinct clusters for each mother-infant pair and for other vpr sequences and grouped the six mother-infant pairs’ sequences with subtype B sequences. A high degree of conservation of intact and functional vpr supports the notion that vpr plays an important role in HIV-1 infection and replication in mother-infant isolates that are involved in perinatal transmission. PMID:9658150

Comparative molecular dynamics studies of heterozygous open reading frames of DNA polymerase eta (η) in pathogenic yeast Candida albicans

NASA Astrophysics Data System (ADS)

Satpati, Suresh; Manohar, Kodavati; Acharya, Narottam; Dixit, Anshuman

2017-01-01

Genomic instability in Candida albicans is believed to play a crucial role in fungal pathogenesis. DNA polymerases contribute significantly to the stability of any genome. Although the Candida Genome Database predicts the presence of S. cerevisiae DNA polymerase orthologs, functional and structural characterizations of Candida DNA polymerases are still unexplored. DNA polymerase eta (Polη) is unique as it promotes efficient bypass of cyclobutane pyrimidine dimers. Interestingly, C. albicans is heterozygous in carrying two Polη genes and the nucleotide substitutions were found only in the ORFs. As allelic differences often result in functional differences of the encoded proteins, comparative analyses of structural models and molecular dynamic simulations were performed to characterize these orthologs of DNA Polη. Overall structures of both the ORFs remain conserved except for subtle differences in the palm and PAD domains. The complementation analysis showed that both the ORFs equally suppressed UV sensitivity of a yeast rad30 deletion strain. Our study has predicted two novel molecular interactions, a highly conserved molecular tetrad of salt bridges and a series of π-π interactions spanning from thumb to PAD. This study suggests these ORFs as the homologues of yeast Polη, and due to its heterogeneity in C. albicans they may play a significant role in pathogenicity.
Identification and characterization of a gene encoding for a nucleotidase from Phaseolus vulgaris.

PubMed

Cabello-Díaz, Juan Miguel; Gálvez-Valdivieso, Gregorio; Caballo, Cristina; Lambert, Rocío; Quiles, Francisco Antonio; Pineda, Manuel; Piedras, Pedro

2015-08-01

Nucleotidases are phosphatases that catalyze the removal of phosphate from nucleotides, compounds with an important role in plant metabolism. A phosphatase enzyme with high affinity for nucleoside monophosphates, previously identified and purified in embryonic axes from French bean, has been analyzed by MALDI TOF/TOF and two internal peptides have been obtained. The information from these peptide sequences has been used to search the genome database, and only one candidate gene that encodes the phosphatase was identified (PvNTD1). The putative protein contains the conserved domains (motifs I-IV) of the haloacid dehalogenase-like hydrolase superfamily. The residues involved in the catalytic activity are also conserved. A recombinant protein overexpressed in Escherichia coli has shown molybdate-resistant phosphatase activity with nucleoside monophosphates as substrate, confirming that the identified gene encodes the phosphatase with high affinity for nucleotides purified in French bean embryonic axes. The activity of the purified protein was inhibited by adenosine. The expression of the PvNTD1 gene was induced at the specific moment of radicle protrusion in embryonic axes. The gene was also highly expressed in young leaves whereas the level of expression in mature tissues was minimal. Copyright © 2015 The Authors. Published by Elsevier GmbH. All rights reserved.
Protein domain organisation: adding order.

PubMed

Kummerfeld, Sarah K; Teichmann, Sarah A

2009-01-29

Domains are the building blocks of proteins. During evolution, they have been duplicated, fused and recombined, to produce proteins with novel structures and functions. Structural and genome-scale studies have shown that pairs or groups of domains observed together in a protein are almost always found in only one N to C terminal order and are the result of a single recombination event that has been propagated by duplication of the multi-domain unit. Previous studies of domain organisation have used graph theory to represent the co-occurrence of domains within proteins. We build on this approach by adding directionality to the graphs and connecting nodes based on their relative order in the protein. Most of the time, the linear order of domains is conserved. However, using the directed graph representation we have identified non-linear features of domain organization that are over-represented in genomes. Recognising these patterns and unravelling how they have arisen may allow us to understand the functional relationships between domains and understand how the protein repertoire has evolved. We identify groups of domains that are not linearly conserved, but instead have been shuffled during evolution so that they occur in multiple different orders. We consider 192 genomes across all three kingdoms of life and use domain and protein annotation to understand their functional significance. To identify these features and assess their statistical significance, we represent the linear order of domains in proteins as a directed graph and apply graph theoretical methods. We describe two higher-order patterns of domain organisation: clusters and bi-directionally associated domain pairs and explore their functional importance and phylogenetic conservation. Taking into account the order of domains, we have derived a novel picture of global protein organization. 
We found that all genomes have a higher than expected degree of clustering and more domain pairs in forward and reverse orientation in different proteins relative to random graphs with identical degree distributions. While these features were statistically over-represented, they are still fairly rare. Looking in detail at the proteins involved, we found strong functional relationships within each cluster. In addition, the domains tended to be involved in protein-protein interaction and are able to function as independent structural units. A particularly striking example was the human Jak-STAT signalling pathway which makes use of a set of domains in a range of orders and orientations to provide nuanced signaling functionality. This illustrated the importance of functional and structural constraints (or lack thereof) on domain organisation.
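The directed-graph representation described in this abstract can be illustrated with a minimal sketch. It assumes each protein is given as a list of domain names in N-to-C-terminal order; the protein identifiers, the toy architectures, and the function names `build_domain_graph` and `bidirectional_pairs` are hypothetical and are not taken from the study.

```python
from collections import defaultdict

def build_domain_graph(proteins):
    """Count directed edges (a, b) for each pair of adjacent domains, a before b (N to C)."""
    edges = defaultdict(int)
    for domains in proteins.values():
        for a, b in zip(domains, domains[1:]):
            edges[(a, b)] += 1
    return dict(edges)

def bidirectional_pairs(edges):
    """Return domain pairs observed in both orders (a->b and b->a), ignoring self-pairs."""
    return sorted({tuple(sorted(e)) for e in edges
                   if e[0] != e[1] and (e[1], e[0]) in edges})

# Hypothetical toy architectures; domain names are illustrative only.
proteins = {
    "P1": ["SH2", "Kinase"],
    "P2": ["Kinase", "SH2"],
    "P3": ["SH3", "SH2", "Kinase"],
}
edges = build_domain_graph(proteins)
pairs = bidirectional_pairs(edges)
print(pairs)
```

A pair reported here corresponds to what the abstract calls a bi-directionally associated domain pair. Assessing over-representation would additionally require comparison with random graphs of identical degree distribution, which this sketch does not attempt.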
Unraveling patterns of site-to-site synonymous rates variation and associated gene properties of protein domains and families.

PubMed

Dimitrieva, Slavica; Anisimova, Maria

2014-01-01

In protein-coding genes, synonymous mutations are often thought not to affect fitness and therefore are not subject to natural selection. Yet increasingly, cases of non-neutral evolution at certain synonymous sites were reported over the last decade. To evaluate the extent and the nature of site-specific selection on synonymous codons, we computed the site-to-site synonymous rate variation (SRV) and identified gene properties that make SRV more likely in a large database of protein-coding gene families and protein domains. To our knowledge, this is the first study that explores the determinants and patterns of the SRV in real data. We show that the SRV is widespread in the evolution of protein-coding sequences, putting in doubt the validity of the synonymous rate as a standard neutral proxy. While protein domains rarely undergo adaptive evolution, the SRV appears to play an important role in optimizing the domain function at the level of DNA. In contrast, protein families are more likely to evolve by positive selection, but are less likely to exhibit SRV. Stronger SRV was detected in genes with stronger codon bias and tRNA reusage, those coding for proteins with a larger number of interactions or forming a larger number of structures, located in intracellular components and those involved in typically conserved complex processes and functions. Genes with extreme SRV show higher expression levels in nearly all tissues. This indicates that codon bias in a gene, which often correlates with gene expression, may often be a site-specific phenomenon regulating the speed of translation along the sequence, consistent with the co-translational folding hypothesis. Strikingly, genes with SRV were strongly overrepresented for metabolic pathways and those associated with several genetic diseases, particularly cancers and diabetes.
PrionScan: an online database of predicted prion domains in complete proteomes.

PubMed

Espinosa Angarica, Vladimir; Angulo, Alfonso; Giner, Arturo; Losilla, Guillermo; Ventura, Salvador; Sancho, Javier

2014-02-05

Prions are a particular type of amyloids related to a large variety of important processes in cells, but also responsible for serious diseases in mammals and humans. The number of experimentally characterized prions is still low and corresponds to a handful of examples in microorganisms and mammals. Prion aggregation is mediated by specific protein domains with a remarkable compositional bias towards glutamine/asparagine and against charged residues and prolines. These compositional features have been used to predict new prion proteins in the genomes of different organisms. Despite these efforts, there are only a few available data sources containing prion predictions at a genomic scale. Here we present PrionScan, a new database of predicted prion-like domains in complete proteomes. We have previously developed a predictive methodology to identify and score prionogenic stretches in protein sequences. In the present work, we exploit this approach to scan all the protein sequences in public databases and compile a repository containing relevant information on proteins bearing prion-like domains. The database is updated regularly alongside UniprotKB and in its present version contains approximately 28000 predictions in proteins from different functional categories in more than 3200 organisms from all the taxonomic subdivisions. PrionScan can be used in two different ways: database query and analysis of protein sequences submitted by the users. In the first mode, simple queries retrieve a detailed description of the properties of a defined protein. Queries can also be combined to generate more complex and specific search patterns. In the second mode, users can submit and analyze their own sequences. It is expected that this database will provide relevant insights on prion functions and regulation from a genome-wide perspective, allowing researchers to perform cross-species prion biology studies.
Our database might also be useful for guiding experimentalists in the identification of new candidates for further experimental characterization.
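The compositional bias mentioned in this abstract (enrichment in glutamine/asparagine, depletion of charged residues and prolines) can be illustrated with a simple sliding-window scan. This is only a sketch of the general idea and is not the PrionScan scoring method; the function name `qn_rich_windows`, the window length, and both thresholds are assumptions chosen for illustration.

```python
def qn_rich_windows(seq, window=60, min_qn=0.3, max_disfavored=0.05):
    """Return (start, qn_fraction) for windows rich in Q/N and poor in D, E, K, R and P.

    The window size and thresholds are illustrative assumptions, not published values.
    """
    hits = []
    for start in range(len(seq) - window + 1):
        w = seq[start:start + window]
        qn = sum(w.count(c) for c in "QN") / window
        disfavored = sum(w.count(c) for c in "DEKRP") / window
        if qn >= min_qn and disfavored <= max_disfavored:
            hits.append((start, qn))
    return hits

# Hypothetical toy sequence: a Q/N-rich stretch flanked by charged residues.
toy = "DEKR" * 5 + "QNQNSGYQNQ" * 2 + "DEKR" * 5
hits = qn_rich_windows(toy, window=20, min_qn=0.5, max_disfavored=0.0)
print(hits)
```

With the strict `max_disfavored=0.0` used in the toy call, only the window that exactly covers the Q/N-rich stretch is reported; looser thresholds would also report overlapping neighbouring windows.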
Actin-related proteins regulate the RSC chromatin remodeler by weakening intramolecular interactions of the Sth1 ATPase.

PubMed

Turegun, Bengi; Baker, Richard W; Leschziner, Andres E; Dominguez, Roberto

2018-01-01

The catalytic subunits of SWI/SNF-family and INO80-family chromatin remodelers bind actin and actin-related proteins (Arps) through an N-terminal helicase/SANT-associated (HSA) domain. Between the HSA and ATPase domains lies a conserved post-HSA (pHSA) domain. The HSA domain of Sth1, the catalytic subunit of the yeast SWI/SNF-family remodeler RSC, recruits the Rtt102-Arp7/9 heterotrimer. Rtt102-Arp7/9 regulates RSC function, but the mechanism is unclear. We show that the pHSA domain interacts directly with another conserved region of the catalytic subunit, protrusion-1. Rtt102-Arp7/9 binding to the HSA domain weakens this interaction and promotes the formation of stable, monodisperse complexes with DNA and nucleosomes. A crystal structure of Rtt102-Arp7/9 shows that ATP binds to Arp7 but not Arp9. However, Arp7 does not hydrolyze ATP. Together, the results suggest that Rtt102 and ATP stabilize a conformation of Arp7/9 that potentiates binding to the HSA domain, which releases intramolecular interactions within Sth1 and controls DNA and nucleosome binding.
A conserved inter-domain communication mechanism regulates the ATPase activity of the AAA-protein Drg1.

PubMed

Prattes, Michael; Loibl, Mathias; Zisser, Gertrude; Luschnig, Daniel; Kappel, Lisa; Rössler, Ingrid; Grassegger, Manuela; Hromic, Altijana; Krieger, Elmar; Gruber, Karl; Pertschy, Brigitte; Bergler, Helmut

2017-03-17

AAA-ATPases fulfil essential roles in different cellular pathways and often act in the form of hexameric complexes. Interaction with pathway-specific substrate and adaptor proteins recruits them to their targets and modulates their catalytic activity. This substrate-dependent regulation of ATP hydrolysis in the AAA-domains is mediated by a non-catalytic N-terminal domain. The exact mechanisms that transmit the signal from the N-domain and coordinate the individual AAA-domains in the hexameric complex are still the topic of intensive research. Here, we present the characterization of a novel mutant variant of the eukaryotic AAA-ATPase Drg1 that shows dysregulation of ATPase activity and altered interaction with Rlp24, its substrate in ribosome biogenesis. This defective regulation is the consequence of amino acid exchanges at the interface between the regulatory N-domain and the adjacent D1 AAA-domain. The effects caused by these mutations strongly resemble those of pathological mutations of the AAA-ATPase p97 which cause the hereditary proteinopathy IBMPFD (inclusion body myopathy associated with Paget's disease of the bone and frontotemporal dementia). Our results therefore suggest well-conserved mechanisms of regulation between structurally, but not functionally, related members of the AAA-family.
A low-complexity region in the YTH domain protein Mmi1 enhances RNA binding.

PubMed

Stowell, James A W; Wagstaff, Jane L; Hill, Chris H; Yu, Minmin; McLaughlin, Stephen H; Freund, Stefan M V; Passmore, Lori A

2018-06-15

Mmi1 is an essential RNA-binding protein in the fission yeast Schizosaccharomyces pombe that eliminates meiotic transcripts during normal vegetative growth. Mmi1 contains a YTH domain that binds specific RNA sequences, targeting mRNAs for degradation. The YTH domain of Mmi1 uses a noncanonical RNA-binding surface that includes contacts outside the conserved fold. Here, we report that an N-terminal extension that is proximal to the YTH domain enhances RNA binding. Using X-ray crystallography, NMR, and biophysical methods, we show that this low-complexity region becomes more ordered upon RNA binding. This enhances the affinity of the interaction of the Mmi1 YTH domain with specific RNAs by reducing the dissociation rate of the Mmi1-RNA complex. We propose that the low-complexity region influences RNA binding indirectly by reducing dynamic motions of the RNA-binding groove and stabilizing a conformation of the YTH domain that binds to RNA with high affinity. Taken together, our work reveals how a low-complexity region proximal to a conserved folded domain can adopt an ordered structure to aid nucleic acid binding. © 2018 Stowell et al.
Protein domains of unknown function are essential in bacteria.

PubMed

Goodacre, Norman F; Gerloff, Dietlind L; Uetz, Peter

2013-12-31

More than 20% of all protein domains are currently annotated as "domains of unknown function" (DUFs). About 2,700 DUFs are found in bacteria compared with just over 1,500 in eukaryotes. Over 800 DUFs are shared between bacteria and eukaryotes, and about 300 of these are also present in archaea. A total of 2,786 bacterial Pfam domains even occur in animals, including 320 DUFs. Evolutionary conservation suggests that many of these DUFs are important. Here we show that 355 essential proteins in 16 model bacterial species contain 238 DUFs, most of which represent single-domain proteins, clearly establishing the biological essentiality of DUFs. We suggest that experimental research should focus on conserved and essential DUFs (eDUFs) for functional analysis given their important function and wide taxonomic distribution, including bacterial pathogens. The functional units of proteins are domains. Typically, each domain has a distinct structure and function. Genomes encode thousands of domains, and many of the domains have no known function (domains of unknown function [DUFs]). They are often ignored as of little relevance, given that many of them are found in only a few genomes. Here we show that many DUFs are essential DUFs (eDUFs) based on their presence in essential proteins. We also show that eDUFs are often essential even if they are found in relatively few genomes. However, in general, more common DUFs are more often essential than rare DUFs.
ERp57 interacts with conserved cysteine residues in the MHC class I peptide-binding groove.

PubMed

Antoniou, Antony N; Santos, Susana G; Campbell, Elaine C; Lynch, Sarah; Arosa, Fernando A; Powis, Simon J

2007-05-15

The oxidoreductase ERp57 is a component of the major histocompatibility complex (MHC) class I peptide-loading complex. ERp57 can interact directly with MHC class I molecules; however, little is known about which of the cysteine residues within the MHC class I molecule are relevant to this interaction. MHC class I molecules possess conserved disulfide bonds between cysteines 101-164 and 203-259 in the peptide-binding and alpha3 domains, respectively. By studying a series of mutants of these conserved residues, we demonstrate that ERp57 predominantly associates with cysteine residues in the peptide-binding domain, thus indicating that ERp57 has direct access to the peptide-binding groove of MHC class I molecules during assembly.
Phylogenetic analysis, subcellular localization, and expression patterns of RPD3/HDA1 family histone deacetylases in plants

PubMed Central

Alinsug, Malona V; Yu, Chun-Wei; Wu, Keqiang

2009-01-01

Background: Although histone deacetylases from model organisms have been previously identified, there is no clear basis for the classification of histone deacetylases under the RPD3/HDA1 superfamily, particularly in plants. Thus, this study aims to reconstruct a phylogenetic tree to determine evolutionary relationships between RPD3/HDA1 histone deacetylases from six different plants representing dicots with Arabidopsis thaliana, Populus trichocarpa, and Pinus taeda, monocots with Oryza sativa and Zea mays, and the lower plants with Physcomitrella patens. Results: Sixty-two histone deacetylases of the RPD3/HDA1 family from the six plant species were phylogenetically analyzed to determine corresponding orthologues. Three clusters were formed separating Class I, Class II, and Class IV. We have confirmed lower and higher plant orthologues for AtHDA8 and AtHDA14, classifying both genes as Class II histone deacetylases in addition to AtHDA5, AtHDA15, and AtHDA18. Since Class II histone deacetylases in other eukaryotes have been known to undergo nucleocytoplasmic transport, it remains unknown whether such functional regulation also happens in plants. Thus, bioinformatics studies using different programs and databases were conducted to predict their corresponding localization sites, nuclear export signal, nuclear localization signal, as well as expression patterns. We also found new conserved domains in most of the RPD3/HDA1 histone deacetylases which were similarly conserved in their corresponding orthologues. Assessing gene expression patterns using Genevestigator, it appears that RPD3/HDA1 histone deacetylases are expressed throughout the plant parts and developmental stages of the plant. Conclusion: The RPD3/HDA1 histone deacetylase family in plants is divided into three distinct groups, namely Class I, Class II, and Class IV, suggesting functional diversification. Class II comprises not only AtHDA5, AtHDA15, and AtHDA18 but also includes AtHDA8 and AtHDA14.
New conserved domains have also been identified in most of the RPD3/HDA1 family indicating further versatile roles other than histone deacetylation. PMID:19327164
Application of cytochrome b DNA sequences for the authentication of endangered snake species.

PubMed

Wong, Ka-Lok; Wang, Jun; But, Paul Pui-Hay; Shaw, Pang-Chui

2004-01-06

In order to enforce the conservation program and curb the illegal trading and consumption of endangered snake species, the value of cytochrome b sequences in the authentication of snake species was evaluated. As an illustration, DNA was extracted, and selected cytochrome b DNA sequences were amplified and sequenced, from six snakes commonly consumed in Hong Kong. Combining these with publicly available sequences, a cytochrome b database containing 90 species of snakes was constructed. In this database, sequence homology between snakes ranged from 70.68 to 95.11%. On the other hand, intraspecific variation of three tested snakes was 0-0.98%. Using the database, we were able to determine the identity of six meat samples confiscated by the Agriculture, Fisheries and Conservation Department, HKSAR.
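The matching step implied by this abstract, comparing a query sequence against a reference database and reporting the closest species, can be sketched as follows. The sketch assumes the sequences have already been aligned to equal length. The species labels, the toy sequences, and the function names `percent_identity` and `best_match` are hypothetical and do not come from the study.

```python
def percent_identity(a, b):
    """Percentage of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def best_match(query, reference):
    """Return (name, identity) of the reference sequence most similar to the query."""
    return max(((name, percent_identity(query, seq)) for name, seq in reference.items()),
               key=lambda item: item[1])

# Hypothetical toy reference database; real cytochrome b fragments are far longer.
reference = {"species_A": "ATGACCAACC", "species_B": "ATGTCTAATC"}
query = "ATGACCAACT"
name, identity = best_match(query, reference)
print(name, identity)
```

Because the abstract reports intraspecific variation below 1% and interspecific homology of at most about 95%, a real assignment would presumably also check that the best identity falls within the intraspecific range before calling a species.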
SIMS: addressing the problem of heterogeneity in databases

NASA Astrophysics Data System (ADS)

Arens, Yigal

1997-02-01

The heterogeneity of remotely accessible databases -- with respect to contents, query language, semantics, organization, etc. -- presents serious obstacles to convenient querying. The SIMS (single interface to multiple sources) system addresses this global integration problem. It does so by defining a single language for describing the domain about which information is stored in the databases and using this language as the query language. Each database to which SIMS is to provide access is modeled using this language. The model describes a database's contents, organization, and other relevant features. SIMS uses these models, together with a planning system drawing on techniques from artificial intelligence, to decompose a given user's high-level query into a series of queries against the databases and other data manipulation steps. The retrieval plan is constructed so as to minimize data movement over the network and maximize parallelism to increase execution speed. SIMS can recover from network failures during plan execution by obtaining data from alternate sources, when possible. SIMS has been demonstrated in the domains of medical informatics and logistics, using real databases.
VOZ; isolation and characterization of novel vascular plant transcription factors with a one-zinc finger from Arabidopsis thaliana.

PubMed

Mitsuda, Nobutaka; Hisabori, Toru; Takeyasu, Kunio; Sato, Masa H

2004-07-01

A 38-bp pollen-specific cis-acting region of the AVP1 gene is involved in the expression of the Arabidopsis thaliana V-PPase during pollen development. Here, we report the isolation and structural characterization of AtVOZ1 and AtVOZ2, novel transcription factors that bind to the 38-bp cis-acting region of the A. thaliana V-PPase gene, AVP1. AtVOZ1 and AtVOZ2 show 53% amino acid sequence similarity. Homologs of AtVOZ1 and AtVOZ2 are found in various vascular plants as well as a moss, Physcomitrella patens. Promoter-beta-glucuronidase reporter analysis shows that AtVOZ1 is specifically expressed in the phloem tissue and AtVOZ2 is strongly expressed in the root. In vivo transient effector-reporter analysis in A. thaliana suspension-cultured cells demonstrates that AtVOZ1 and AtVOZ2 function as transcriptional activators in the Arabidopsis cell. Two conserved regions termed Domain-A and Domain-B were identified from an alignment of AtVOZ proteins and their homologs of O. sativa and P. patens. AtVOZ2 binds as a dimer to the specific palindromic sequence, GCGTNx7ACGC, with Domain-B, which is comprised of a functional novel zinc coordinating motif and a conserved basic region. Domain-B is shown to function as both the DNA-binding and the dimerization domains of AtVOZ2. Given its highly conserved nature among all identified VOZ proteins, we conclude that Domain-B is responsible for the DNA binding and dimerization of all VOZ-family proteins and designate it as the VOZ-domain.
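The palindromic binding site GCGTNx7ACGC can be searched for with a regular expression. The sketch below reads "Nx7" as seven arbitrary nucleotides, which is an interpretation of the notation rather than something the abstract spells out. The function names and the toy promoter string are hypothetical.

```python
import re

# GCGT, then seven arbitrary bases (assumed reading of "Nx7"), then ACGC.
# The lookahead lets overlapping sites be reported as well.
VOZ_SITE = re.compile(r"(?=(GCGT[ACGT]{7}ACGC))")

def find_voz_sites(dna):
    """Return (0-based start, matched site) for every occurrence in the given strand."""
    dna = dna.upper()
    return [(m.start(), m.group(1)) for m in VOZ_SITE.finditer(dna)]

def reverse_complement(dna):
    """Reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

# Hypothetical toy promoter fragment containing one site.
promoter = "ttaGCGTaaaaaaaACGCtt"
sites = find_voz_sites(promoter)
print(sites)
```

Since the reverse complement of GCGT is ACGC, the two half-sites are palindromic, so scanning one strand is enough to find sites on both.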
The CDM Superfamily Protein MBC Directs Myoblast Fusion through a Mechanism That Requires Phosphatidylinositol 3,4,5-Triphosphate Binding but Is Independent of Direct Interaction with DCrk

PubMed Central

Balagopalan, Lakshmi; Chen, Mei-Hui; Geisbrecht, Erika R.; Abmayr, Susan M.

2006-01-01

myoblast city (mbc), a member of the CDM superfamily, is essential in the Drosophila melanogaster embryo for fusion of myoblasts into multinucleate fibers. Using germ line clones in which both maternal and zygotic contributions were eliminated and rescue of the zygotic loss-of-function phenotype, we established that mbc is required in the fusion-competent subset of myoblasts. Along with its close orthologs Dock180 and CED-5, MBC has an SH3 domain at its N terminus, conserved internal domains termed DHR1 and DHR2 (or “Docker”), and C-terminal proline-rich domains that associate with the adapter protein DCrk. The importance of these domains has been evaluated by the ability of MBC mutations and deletions to rescue the mbc loss-of-function muscle phenotype. We demonstrate that the SH3 and Docker domains are essential. Moreover, ethyl methanesulfonate-induced mutations that change amino acids within the MBC Docker domain to residues that are conserved in other CDM family members nevertheless eliminate MBC function in the embryo, which suggests that these sites may mediate interactions specific to Drosophila MBC. A functional requirement for the conserved DHR1 domain, which binds to phosphatidylinositol 3,4,5-triphosphate, implicates phosphoinositide signaling in myoblast fusion. Finally, the proline-rich C-terminal sites mediate strong interactions with DCrk, as expected. These sites are not required for MBC to rescue the muscle loss-of-function phenotype, however, which suggests that MBC's role in myoblast fusion can be carried out independently of direct DCrk binding. PMID:17030600
Domain alternation and active site remodeling are conserved structural features of ubiquitin E1.

PubMed

Lv, Zongyang; Yuan, Lingmin; Atkison, James H; Aldana-Masangkay, Grace; Chen, Yuan; Olsen, Shaun K

2017-07-21

E1 enzymes for ubiquitin (Ub) and Ub-like modifiers (Ubls) harbor two catalytic activities that are required for Ub/Ubl activation: adenylation and thioester bond formation. Structural studies of the E1 for the Ubl small ubiquitin-like modifier (SUMO) revealed a single active site that is transformed by a conformational switch that toggles its competency for catalysis of these two distinct chemical reactions. Although the mechanisms of adenylation and thioester bond formation revealed by SUMO E1 structures are thought to be conserved in Ub E1, there is currently a lack of structural data supporting this hypothesis. Here, we present a structure of Schizosaccharomyces pombe Uba1 in which the second catalytic cysteine half-domain (SCCH domain) harboring the catalytic cysteine has undergone a 106° rotation that results in a completely different network of intramolecular interactions between the SCCH and adenylation domains and translocation of the catalytic cysteine 12 Å closer to the Ub C terminus compared with previous Uba1 structures. SCCH domain alternation is accompanied by conformational changes within the Uba1 adenylation domains that effectively disassemble the adenylation active site. Importantly, the structural and biochemical data suggest that domain alternation and remodeling of the adenylation active site are interconnected and are intrinsic structural features of Uba1 and that the overall structural basis for adenylation and thioester bond formation exhibited by SUMO E1 is indeed conserved in Ub E1. Finally, the mechanistic insights provided by the novel conformational snapshot of Uba1 presented in this study may guide efforts to develop small molecule inhibitors of this critically important enzyme that is an active target for anticancer therapeutics. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
SynechoNET: integrated protein-protein interaction database of a model cyanobacterium Synechocystis sp. PCC 6803.

PubMed

Kim, Woo-Yeon; Kang, Sungsoo; Kim, Byoung-Chul; Oh, Jeehyun; Cho, Seongwoong; Bhak, Jong; Choi, Jong-Soon

2008-01-01

Cyanobacteria are model organisms for studying photosynthesis, carbon and nitrogen assimilation, evolution of plant plastids, and adaptability to environmental stresses. Despite many studies on cyanobacteria, there is no web-based database of their regulatory and signaling protein-protein interaction networks to date. We report a database and website SynechoNET that provides predicted protein-protein interactions. SynechoNET shows cyanobacterial domain-domain interactions as well as their protein-level interactions using the model cyanobacterium, Synechocystis sp. PCC 6803. It predicts the protein-protein interactions using public interaction databases that contain mutually complementary and redundant data. Furthermore, SynechoNET provides information on transmembrane topology, signal peptide, and domain structure in order to support the analysis of regulatory membrane proteins. Such biological information can be queried and visualized in user-friendly web interfaces that include the interactive network viewer and search pages by keyword and functional category. SynechoNET is an integrated protein-protein interaction database designed to analyze regulatory membrane proteins in cyanobacteria. It provides a platform for biologists to extend the genomic data of cyanobacteria by predicting interaction partners, membrane association, and membrane topology of Synechocystis proteins. SynechoNET is freely available at http://synechocystis.org/ or directly at http://bioportal.kobic.kr/SynechoNET/.
Flavivirus and Filovirus EvoPrinters: New alignment tools for the comparative analysis of viral evolution.

PubMed

Brody, Thomas; Yavatkar, Amarendra S; Park, Dong Sun; Kuzin, Alexander; Ross, Jermaine; Odenwald, Ward F

2017-06-01

Flavivirus and Filovirus infections are serious epidemic threats to human populations. Multi-genome comparative analysis of these evolving pathogens affords a view of their essential, conserved sequence elements as well as progressive evolutionary changes. While phylogenetic analysis has yielded important insights, the growing number of available genomic sequences makes comparisons between hundreds of viral strains challenging. We report here a new approach for the comparative analysis of these hemorrhagic fever viruses that can superimpose an unlimited number of one-on-one alignments to identify important features within genomes of interest. We have adapted EvoPrinter alignment algorithms for the rapid comparative analysis of Flavivirus or Filovirus sequences including Zika and Ebola strains. The user can input a full genome or partial viral sequence and then view either individual comparisons or generate color-coded readouts that superimpose hundreds of one-on-one alignments to identify unique or shared identity SNPs that reveal ancestral relationships between strains. The user can also opt to select a database genome in order to access a library of pre-aligned genomes of either 1,094 Flaviviruses or 460 Filoviruses for rapid comparative analysis with all database entries or a select subset. Using EvoPrinter search and alignment programs, we show the following: 1) superimposing alignment data from many related strains identifies lineage identity SNPs, which enable the assessment of sublineage complexity within viral outbreaks; 2) whole-genome SNP profile screens uncover novel Dengue2 and Zika recombinant strains and their parental lineages; 3) differential SNP profiling identifies host cell A-to-I hyper-editing within Ebola and Marburg viruses, and 4) hundreds of superimposed one-on-one Ebola genome alignments highlight ultra-conserved regulatory sequences, invariant amino acid codons and evolutionarily variable protein-encoding domains within a single genome. 
EvoPrinter allows for the assessment of lineage complexity within Flavivirus or Filovirus outbreaks, identification of recombinant strains, highlights sequences that have undergone host cell A-to-I editing, and identifies unique input and database SNPs within highly conserved sequences. EvoPrinter's ability to superimpose alignment data from hundreds of strains onto a single genome has allowed us to identify unique Zika virus sublineages that are currently spreading in South, Central and North America, the Caribbean, and in China. This new set of integrated alignment programs should serve as a useful addition to existing tools for the comparative analysis of these viruses.
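The idea of superimposing many one-on-one alignments onto a single genome can be illustrated with a toy sketch. This is not the EvoPrinter algorithm: it assumes the pairwise alignments have already been computed and projected onto reference coordinates (no gaps in the reference), and the function name `superimpose` and the sequences are hypothetical.

```python
def superimpose(reference, aligned_strains):
    """Return the reference with bases shared by every strain in uppercase
    and bases that differ in at least one strain in lowercase."""
    for strain in aligned_strains:
        if len(strain) != len(reference):
            raise ValueError("strains must be projected onto reference coordinates")
    out = []
    for i, base in enumerate(reference):
        conserved = all(s[i].upper() == base.upper() for s in aligned_strains)
        out.append(base.upper() if conserved else base.lower())
    return "".join(out)


# Made-up sequences, for illustration only
reference = "ATGGCGTACCTGA"
strains = [
    "ATGGCGTACCTGA",  # identical to the reference
    "ATGGCATACCTGA",  # differs at index 5
    "ATGGCGTACTTGA",  # differs at index 9
]
readout = superimpose(reference, strains)
print(readout)
```

Uppercase runs in the readout mark positions invariant across all compared strains; lowercase positions are candidate SNP sites.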
DOE Office of Scientific and Technical Information (OSTI.GOV)

Rushton, Phillip S.; Olek, Anna T.; Makowski, Lee

The crystallographic structure of a rice (Oryza sativa) cellulose synthase, OsCesA8, plant-conserved region (P-CR), one of two unique domains in the catalytic domain of plant CesAs, was solved to 2.4 Å resolution. Two antiparallel α-helices form a coiled-coil domain linked by a large extended connector loop containing a conserved trio of aromatic residues. The P-CR structure was fit into a molecular envelope for the P-CR domain derived from small-angle X-ray scattering data. The P-CR structure and molecular envelope, combined with a homology-based chain trace of the CesA8 catalytic core, were modeled into a previously determined CesA8 small-angle X-ray scattering molecular envelope to produce a detailed topological model of the CesA8 catalytic domain. The predicted position for the P-CR domain from the molecular docking models places the P-CR connector loop into a hydrophobic pocket of the catalytic core, with the coiled-coil aligned near the entrance of the substrate UDP-glucose into the active site. In this configuration, the P-CR coiled-coil alone is unlikely to regulate substrate access to the active site, but it could interact with other domains of CesA, accessory proteins, or other CesA catalytic domains to control substrate delivery.
Monitoring Wildlife Interactions with Their Environment: An Interdisciplinary Approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Charles-Smith, Lauren E.; Domínguez, Ignacio X.; Fornaro, Robert J.

In a rapidly changing world, wildlife ecologists strive to correctly model and predict complex relationships between animals and their environment, which facilitates management decisions impacting public policy to conserve and protect delicate ecosystems. Recent advances in monitoring systems span scientific domains, including animal and weather monitoring devices and landscape classification mapping techniques. The current challenge is how to combine and use detailed output from various sources to address questions spanning multiple disciplines. The WolfScout wildlife and weather tracking system is a software tool capable of filling this niche. WolfScout automates integration of the latest technological advances in wildlife GPS collars, weather stations, drought conditions, severe weather reports, and animal demographic information. The WolfScout database stores a variety of classified landscape maps including natural and manmade features. Additionally, WolfScout’s spatial database management system allows users to calculate distances between animals’ location and landscape characteristics, which are linked to the best approximation of environmental conditions at the animal’s location during the interaction. Through a secure website, data are exported in formats compatible with multiple software programs including R and ArcGIS. The WolfScout design promotes interoperability in data, between researchers, and software applications while standardizing analyses of animal interactions with their environment.

Genome-Wide Identification of Arabidopsis Coiled-Coil Proteins and Establishment of the ARABI-COIL Database

PubMed Central

Rose, Annkatrin; Manikantan, Sankaraganesh; Schraegle, Shannon J.; Maloy, Michael A.; Stahlberg, Eric A.; Meier, Iris

2004-01-01

Increasing evidence demonstrates the importance of long coiled-coil proteins for the spatial organization of cellular processes. Although several protein classes with long coiled-coil domains have been studied in animals and yeast, our knowledge about plant long coiled-coil proteins is very limited. The repeat nature of the coiled-coil sequence motif often prevents the simple identification of homologs of animal coiled-coil proteins by generic sequence similarity searches. As a consequence, counterparts of many animal proteins with long coiled-coil domains, like lamins, golgins, or microtubule organization center components, have not been identified yet in plants. Here, all Arabidopsis proteins predicted to contain long stretches of coiled-coil domains were identified by applying the algorithm MultiCoil to a genome-wide screen. A searchable protein database, ARABI-COIL (http://www.coiled-coil.org/arabidopsis), was established that integrates information on number, size, and position of predicted coiled-coil domains with subcellular localization signals, transmembrane domains, and available functional annotations. ARABI-COIL serves as a tool to sort and browse Arabidopsis long coiled-coil proteins to facilitate the identification and selection of candidate proteins of potential interest for specific research areas. Using the database, candidate proteins were identified for Arabidopsis membrane-bound, nuclear, and organellar long coiled-coil proteins. PMID:15020757
Toward More Accurate Iris Recognition Using Cross-Spectral Matching.

PubMed

Nalla, Pattabhi Ramaiah; Kumar, Ajay

2017-01-01

Iris recognition systems are increasingly deployed for large-scale applications such as national ID programs, which continue to acquire millions of iris images to establish identity among billions. However, with the availability of a variety of iris sensors that are deployed for iris imaging under different illumination/environment, significant performance degradation is expected while matching such iris images acquired under two different domains (either sensor-specific or wavelength-specific). This paper develops a domain adaptation framework to address this problem and introduces a new algorithm using a Markov random fields model to significantly improve cross-domain iris recognition. The proposed domain adaptation framework, based on naive Bayes nearest neighbor classification, uses a real-valued feature representation that is capable of learning domain knowledge. Our approach to estimating corresponding visible iris patterns from the synthesis of iris patches in the near-infrared iris images achieves superior results for cross-spectral iris recognition. In this paper, a new class of bi-spectral iris recognition system that can simultaneously acquire visible and near-infrared images with pixel-to-pixel correspondences is proposed and evaluated. This paper presents experimental results from three publicly available databases (the PolyU cross-spectral iris image database, IIITD CLI, and the UND database) and achieves superior results for cross-sensor and cross-spectral iris matching.
Exploring the dark foldable proteome by considering hydrophobic amino acids topology

PubMed Central

Bitard-Feildel, Tristan; Callebaut, Isabelle

2017-01-01

The protein universe corresponds to the set of all proteins found in all organisms. A way to explore it is by taking into account the domain content of the proteins. However, parts of some sequences and many entire sequences remain un-annotated despite a converging number of domain families. The un-annotated part of the protein universe is referred to as the dark proteome and remains poorly characterized. In this study, we quantify the amount of foldable domains within the dark proteome by using the hydrophobic cluster analysis methodology. These un-annotated foldable domains were grouped using a combination of remote homology searches and domain annotations, leading to the definition of different levels of darkness. The dark foldable domains were analyzed to understand what makes them different from domains stored in databases and thus difficult to annotate. The un-annotated domains of the dark proteome universe display specific features relative to database domains: shorter length, non-canonical content and particular topology in hydrophobic residues, higher propensity for disorder, and a higher energy. These features make them hard to relate to known families. Based on these observations, we emphasize that domain annotation methodologies can still be improved to fully apprehend and decipher the molecular evolution of the protein universe. PMID:28134276
Inferring the Brassica rapa Interactome Using Protein–Protein Interaction Data from Arabidopsis thaliana

PubMed Central

Yang, Jianhua; Osman, Kim; Iqbal, Mudassar; Stekel, Dov J.; Luo, Zewei; Armstrong, Susan J.; Franklin, F. Chris H.

2013-01-01

Following successful completion of the Brassica rapa sequencing project, the next step is to investigate functions of individual genes/proteins. For Arabidopsis thaliana, large amounts of protein–protein interaction (PPI) data are available from the major PPI databases (DBs). It is known that Brassica crop species are closely related to A. thaliana. This provides an opportunity to infer the B. rapa interactome using PPI data available from A. thaliana. In this paper, we present an inferred B. rapa interactome that is based on the A. thaliana PPI data from two resources: (i) A. thaliana PPI data from three major DBs, BioGRID, IntAct, and TAIR; and (ii) ortholog-based A. thaliana PPI predictions. Linking between B. rapa and A. thaliana was accomplished in three complementary ways: (i) ortholog predictions, (ii) identification of gene duplication based on synteny and collinearity, and (iii) BLAST sequence similarity search. A complementary approach was also applied, which used known/predicted domain–domain interaction data. Specifically, since the two species are closely related, we used PPI data from A. thaliana to predict interacting domains that might be conserved between the two species. The predicted interactome was investigated for the component that contains known A. thaliana meiotic proteins to demonstrate its usability. PMID:23293649
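The ortholog-based transfer of interactions described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes an ortholog table is already available, and the function name `infer_interactome` and all identifiers are hypothetical.

```python
from itertools import product


def infer_interactome(ppis, orthologs):
    """Map each source-species interaction (a, b) to every pair formed by
    the target-species orthologs of a and b. Returns unordered pairs."""
    inferred = set()
    for a, b in ppis:
        for x, y in product(orthologs.get(a, ()), orthologs.get(b, ())):
            inferred.add(frozenset((x, y)))
    return inferred


# Hypothetical identifiers, for illustration only
ath_ppis = [("AtA", "AtB"), ("AtB", "AtC")]
ath_to_bra = {"AtA": ["BrA1", "BrA2"], "AtB": ["BrB1"]}  # AtC has no ortholog

predicted = infer_interactome(ath_ppis, ath_to_bra)
for pair in sorted(tuple(sorted(p)) for p in predicted):
    print(pair)
```

Interactions whose partners lack an ortholog are dropped, and one-to-many ortholog relationships expand into several predicted pairs, which is one reason an inferred interactome can be larger than its source.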
Cloning and characterization of carboxyl terminus of heat shock cognate 70-interacting protein gene from the silkworm, Bombyx mori.

PubMed

Ohsawa, Takeshi; Fujimoto, Shota; Tsunakawa, Akane; Shibano, Yuka; Kawasaki, Hideki; Iwanaga, Masashi

2016-11-01

Carboxyl terminus of heat shock cognate 70-interacting protein (CHIP) is an evolutionarily conserved E3 ubiquitin ligase across different eukaryotic species and is known to play a key role in protein quality control. CHIP has two distinct functional domains, an N-terminal tetratricopeptide repeat (TPR) and a C-terminal U-box domain, which are required for the ubiquitination of numerous labile client proteins that are chaperoned by heat shock proteins (HSPs) and heat shock cognate proteins (HSCs). During our screen for CHIP-like proteins in the Bombyx mori databases, we found a novel silkworm gene, Bombyx mori CHIP. Phylogenetic analysis showed that BmCHIP belongs to Lepidopteran lineages. Quantitative reverse transcription-PCR analysis indicated that BmCHIP was relatively highly expressed in the gonad and fat body. A pull-down experiment and auto-ubiquitination assay showed that BmCHIP interacted with BmHSC70 and had E3 ligase activity. Additionally, immunohistochemical analysis revealed that BmCHIP was partially co-localized with ubiquitin in BmN4 cells. These data support that BmCHIP plays an important role in the ubiquitin proteasome system as an E3 ubiquitin ligase in B. mori. Copyright © 2016 Elsevier Inc. All rights reserved.
Molecular characterization of DnaJ 5 homologs in silkworm Bombyx mori and its expression during egg diapause.

PubMed

Sirigineedi, Sasibhushan; Vijayagowri, Esvaran; Murthy, Geetha N; Rao, Guruprasada; Ponnuvel, Kangayam M

2014-12-01

A comparison of the cDNA sequences (1 056 bp) of Bombyx mori DnaJ 5 homolog with B. mori genome revealed that unlike in other Hsps, it has an intron of 234 bp. The DnaJ 5 homolog contains 351 amino acids, of which 70 contain the conserved DnaJ domain at the N-terminal end. This homolog of B. mori has all desirable functional domains similar to other insects, and the 13 different DnaJ homologs identified in B. mori genome were distributed on different chromosomes. The expressed sequence tag database analysis of Hsp40 gene expression revealed higher expression in wing disc followed by diapause-induced eggs. Microarray analysis revealed higher expression of DnaJ 5 homolog at 18th h after oviposition in diapause-induced eggs. Further validation of DnaJ 5 expression through qPCR in diapause-induced and nondiapause eggs at different time intervals revealed higher expression in diapause eggs at 18 and 24 h after oviposition, which coincided with the expression of Hsp70 as the Hsp 40 is its co-chaperone. This study thus provides an outline of the genome organization of Hsp40 gene, and its role in egg diapause induction in B. mori. © 2013 Institute of Zoology, Chinese Academy of Sciences.
Fuzzy queries above relational database

NASA Astrophysics Data System (ADS)

Smolka, Pavel; Bradac, Vladimir

2017-11-01

The aim of this paper is to introduce the possibility of implementing fuzzy queries in relational databases. The issue is described using a model that identifies the part of the problem domain suited to a fuzzy approach. The model is demonstrated on a database of wines, with a focus on searching it. The construction of the database complies with the law of the Czech Republic.
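One way a fuzzy query could sit on top of a relational database is sketched below. This is an assumption-laden illustration, not the model from the paper: the table layout, the membership thresholds, and the terms "dry" and "cheap" are all hypothetical. It registers trapezoidal membership functions with SQLite and combines them with the minimum operator as the fuzzy AND.

```python
import sqlite3


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 1 on [b, c], 0 outside (a, d), linear between."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wines (name TEXT, sugar_g_l REAL, price REAL)")
conn.executemany(
    "INSERT INTO wines VALUES (?, ?, ?)",
    [("Wine A", 3.0, 150.0), ("Wine B", 6.0, 200.0), ("Wine C", 20.0, 120.0)],
)

# Hypothetical linguistic terms exposed to SQL as scalar functions
conn.create_function("dry", 1, lambda s: trapezoid(s, -1.0, 0.0, 4.0, 9.0))
conn.create_function("cheap", 1, lambda p: trapezoid(p, -1.0, 0.0, 180.0, 280.0))

# Fuzzy AND = minimum of the two membership degrees
rows = conn.execute(
    """
    SELECT name, score FROM (
        SELECT name, MIN(dry(sugar_g_l), cheap(price)) AS score FROM wines
    )
    WHERE score > 0
    ORDER BY score DESC
    """
).fetchall()
for name, score in rows:
    print(name, round(score, 2))
```

Instead of a yes/no filter, each row gets a degree of match, so the result can be ranked and a borderline wine is returned with a lower score rather than excluded.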
Functional characterization of the non-catalytic ectodomains of the nucleotide pyrophosphatase/phosphodiesterase NPP1.

PubMed Central

Gijsbers, Rik; Ceulemans, Hugo; Bollen, Mathieu

2003-01-01

The ubiquitous nucleotide pyrophosphatases/phosphodiesterases NPP1-3 consist of a short intracellular N-terminal domain, a single transmembrane domain and a large extracellular part, comprising two somatomedin-B-like domains, a catalytic domain and a poorly defined C-terminal domain. We show here that the C-terminal domain of NPP1-3 is structurally related to a family of DNA/RNA non-specific endonucleases. However, none of the residues that are essential for catalysis by the endonucleases are conserved in NPP1-NPP3, suggesting that the nuclease-like domain of NPP1-3 does not represent a second catalytic domain. Truncation analysis revealed that the nuclease-like domain of NPP1 is required for protein stability, for the targeting of NPP1 to the plasma membrane and for the expression of catalytic activity. We also demonstrate that 16 conserved cysteines in the somatomedin-B-like domains of NPP1, in concert with two flanking cysteines, mediate the dimerization of NPP1. The K173Q polymorphism of NPP1, which maps to the second somatomedin-B-like domain and has been associated with the aetiology of insulin resistance, did not affect the dimerization or catalytic activity of NPP1, and did not endow NPP1 with an affinity for the insulin receptor. Our data suggest that the non-catalytic ectodomains contribute to the subunit structure, stability and function of NPP1-3. PMID:12533192
Interface conditions for domain decomposition with radical grid refinement

NASA Technical Reports Server (NTRS)

Scroggs, Jeffrey S.

1991-01-01

Interface conditions for coupling the domains in a physically motivated domain decomposition method are discussed. The domain decomposition is based on an asymptotic-induced method for the numerical solution of hyperbolic conservation laws with small viscosity. The method consists of multiple stages. The first stage is to obtain a first approximation using a first-order method, such as the Godunov scheme. Subsequent stages of the method involve solving internal-layer problems via a domain decomposition. The method is derived and justified via singular perturbation techniques.
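The first stage named in the abstract, a first-order Godunov approximation, can be sketched for a simple model problem. The example below is an illustration only and does not reproduce the report's method: it assumes the inviscid Burgers equation with periodic boundaries as the conservation law, and the grid size, time step, and initial data are arbitrary choices.

```python
def godunov_flux(ul, ur):
    """Exact Godunov flux for Burgers' equation, f(u) = u**2 / 2."""
    if ul <= ur:
        # Rarefaction: take the minimum of f over [ul, ur]
        if ul > 0:
            return 0.5 * ul * ul
        if ur < 0:
            return 0.5 * ur * ur
        return 0.0
    # Shock: take the maximum of f over [ur, ul]
    return max(0.5 * ul * ul, 0.5 * ur * ur)


def godunov_step(u, dt, dx):
    """Advance cell averages by one time step with periodic boundaries."""
    n = len(u)
    # flux[i] is the flux through the left face of cell i
    flux = [godunov_flux(u[i - 1], u[i]) for i in range(n)]
    return [u[i] - dt / dx * (flux[(i + 1) % n] - flux[i]) for i in range(n)]


n = 50
dx = 1.0 / n
dt = 0.5 * dx  # CFL number 0.5 for |u| <= 1
u0 = [1.0 if i < n // 2 else 0.0 for i in range(n)]  # step initial data

u = list(u0)
for _ in range(20):
    u = godunov_step(u, dt, dx)

print(sum(u0) * dx, sum(u) * dx)  # total mass before and after
```

The scheme is conservative, so the total mass is unchanged up to rounding, but it smears sharp layers over several cells; that loss of resolution is the kind of shortcoming the later internal-layer stages are meant to address.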
The CW domain, a structural module shared amongst vertebrates, vertebrate-infecting parasites and higher plants.

PubMed

Perry, Jason; Zhao, Yunde

2003-11-01

A previously undetected domain, named CW for its conserved cysteine and tryptophan residues, appears to be a four-cysteine zinc-finger motif found exclusively in vertebrates, vertebrate-infecting parasites and higher plants. Of the twelve distinct nuclear protein families that comprise the CW domain-containing superfamily, only the microrchidia (MORC) family has begun to be characterized. However, several families contain other domains, suggesting a relationship between the CW domain and either chromatin methylation status or early embryonic development.
A structural role for the PHP domain in E. coli DNA polymerase III.

PubMed

Barros, Tiago; Guenther, Joel; Kelch, Brian; Anaya, Jordan; Prabhakar, Arjun; O'Donnell, Mike; Kuriyan, John; Lamers, Meindert H

2013-05-14

In addition to the core catalytic machinery, bacterial replicative DNA polymerases contain a Polymerase and Histidinol Phosphatase (PHP) domain whose function is not entirely understood. The PHP domains of some bacterial replicases are active metal-dependent nucleases that may play a role in proofreading. In E. coli DNA polymerase III, however, the PHP domain has lost several metal-coordinating residues and is likely to be catalytically inactive. Genomic searches show that the loss of metal-coordinating residues in polymerase PHP domains is likely to have coevolved with the presence of a separate proofreading exonuclease that works with the polymerase. Although the E. coli Pol III PHP domain has lost metal-coordinating residues, the structure of the domain has been conserved to a remarkable degree when compared to that of metal-binding PHP domains. This is demonstrated by our ability to restore metal binding with only three point mutations, as confirmed by the metal-bound crystal structure of this mutant determined at 2.9 Å resolution. We also show that Pol III, a large multi-domain protein, unfolds cooperatively and that mutations in the degenerate metal-binding site of the PHP domain decrease the overall stability of Pol III and reduce its activity. While the presence of a PHP domain in replicative bacterial polymerases is strictly conserved, its ability to coordinate metals and to perform proofreading exonuclease activity is not, suggesting additional non-enzymatic roles for the domain. Our results show that the PHP domain is a major structural element in Pol III and its integrity modulates both the stability and activity of the polymerase.
Nucleoplasmin-like domain of FKBP39 from Drosophila melanogaster forms a tetramer with partly disordered tentacle-like C-terminal segments

PubMed Central

Kozłowska, Małgorzata; Tarczewska, Aneta; Jakób, Michał; Bystranowska, Dominika; Taube, Michał; Kozak, Maciej; Czarnocki-Cieciura, Mariusz; Dziembowski, Andrzej; Orłowski, Marek; Tkocz, Katarzyna; Ożyhar, Andrzej

2017-01-01

Nucleoplasmins are a nuclear chaperone family defined by the presence of a highly conserved N-terminal core domain. X-ray crystallographic studies of isolated nucleoplasmin core domains revealed a β-propeller structure consisting of a set of five monomers that together form a stable pentamer. Recent studies on isolated N-terminal domains from Drosophila 39-kDa FK506-binding protein (FKBP39) and from other chromatin-associated proteins showed analogous, nucleoplasmin-like (NPL) pentameric structures. Here, we report that the NPL domain of the full-length FKBP39 does not form pentameric complexes. Multi-angle light scattering (MALS) and sedimentation equilibrium ultracentrifugation (SE AUC) analyses of the molecular mass of the full-length protein indicated that FKBP39 forms homotetrameric complexes. Molecular models reconstructed from small-angle X-ray scattering (SAXS) revealed that the NPL domain forms a stable, tetrameric core and that FK506-binding domains are linked to it by intrinsically disordered, flexible chains that form tentacle-like segments. Analyses of full-length FKBP39 and its isolated NPL domain suggested that the distal regions of the polypeptide chain influence and determine the quaternary conformation of the nucleoplasmin-like protein. These results provide new insights regarding the conserved structure of nucleoplasmin core domains and provide a potential explanation for the importance of the tetrameric structural organization of full-length nucleoplasmins. PMID:28074868
The evolutionarily conserved interaction between LC3 and p62 selectively mediates autophagy-dependent degradation of mutant huntingtin.

PubMed

Tung, Ying-Tsen; Hsu, Wen-Ming; Lee, Hsinyu; Huang, Wei-Pang; Liao, Yung-Feng

2010-07-01

Mammalian p62/sequestosome-1 protein binds to both LC3, the mammalian homologue of yeast Atg8, and polyubiquitinated cargo proteins destined to undergo autophagy-mediated degradation. We previously identified a cargo receptor-binding domain in Atg8 that is essential for its interaction with the cargo receptor Atg19 in selective autophagic processes in yeast. We thus sought to determine whether this interaction is evolutionarily conserved from yeast to mammals. Using an amino acid replacement approach, we demonstrate that cells expressing mutant LC3 (LC3-K30D, LC3-K51A, or LC3-L53A) all exhibit defective lipidation of LC3, a disrupted LC3-p62 interaction, and impaired autophagic degradation of p62, suggesting that the p62-binding site of LC3 is localized within an evolutionarily conserved domain. Importantly, whereas cells expressing these LC3 mutants exhibited overall autophagic activity comparable to that of cells expressing wild-type LC3, autophagy-mediated clearance of the aggregation-prone mutant Huntingtin was defective in the mutant-expressing cells. Together, these results suggest that p62 directly binds to the evolutionarily conserved cargo receptor-binding domain of Atg8/LC3 and selectively mediates the clearance of mutant Huntingtin.
Genome-wide identification and analysis of basic helix-loop-helix domains in dog, Canis lupus familiaris.

PubMed

Wang, Xu-Hua; Wang, Yong; Liu, A-Ke; Liu, Xiao-Ting; Zhou, Yang; Yao, Qin; Chen, Ke-Ping

2015-04-01

The basic helix-loop-helix (bHLH) domain is a highly conserved amino acid motif that defines a group of DNA-binding transcription factors. bHLH proteins play essential regulatory roles in a variety of biological processes in animals, plants, and fungi. The domestic dog, Canis lupus familiaris, is a good model organism for genetic, physiological, and behavioral studies. In this study, we identified 115 putative bHLH genes in the dog genome. Based on a phylogenetic analysis, 51, 26, 14, 4, 12, and 4 dog bHLH genes were assigned to six separate groups (A-F); four bHLH genes were categorized as "orphans". Within-group evolutionary relationships inferred from the phylogenetic analysis were consistent with positional conservation, other conserved domains flanking the bHLH motif, and highly conserved intron/exon patterns in other vertebrates. Our analytical results confirmed the GenBank annotations of 89 dog bHLH proteins and provided information that could be used to update the annotations of the remaining 26 dog bHLH proteins. These data will provide good references for further studies on the structures and regulatory functions of bHLH proteins in the growth and development of dogs, which may help in understanding the mechanisms that underlie the physical and behavioral differences between dogs and wolves.
Oligomerisation status and evolutionary conservation of interfaces of protein structural domain superfamilies.

PubMed

Sukhwal, Anshul; Sowdhamini, Ramanathan

2013-07-01

Protein-protein interactions are important in carrying out many biological processes and functions. These interactions may be either permanent or of temporary nature. Several studies have employed tools like solvent accessibility and graph theory to identify these interactions, but still more studies need to be performed to quantify and validate them. Although we now have many databases available with predicted and experimental results on protein-protein interactions, we still do not have many databases which focus on providing structural details of the interacting complexes, their oligomerisation state and homologues. In this work, protein-protein interactions have been thoroughly investigated within the structural regime and quantified for their strength using calculated pseudoenergies. The PPCheck server, an in-house webserver, has been used for calculating the pseudoenergies like van der Waals, hydrogen bonds and electrostatic energy based on distances between atoms of amino acids from two interacting proteins. PPCheck can be visited at . Based on statistical data, as obtained by studying established protein-protein interacting complexes from earlier studies, we came to a conclusion that an average protein-protein interface consisted of about 51 to 150 amino acid residues and the generalized energy per residue ranged from -2 kJ mol(-1) to -6 kJ mol(-1). We found that some of the proteins have an exceptionally higher number of amino acids at the interface and it was purely because of their elaborate interface or extended topology i.e. some of their secondary structure regions or loops were either inter-mixing or running parallel to one another or they were taking part in domain swapping. 
Residue networks were prepared for all the amino acids of the interacting proteins involved in different types of interactions (like van der Waals, hydrogen-bonding, electrostatic or intramolecular interactions) and were analysed between the query domain-interacting partner pair and its remote homologue-interacting partner pair. We found that, in exceptional cases, homologous proteins belonging to the same superfamily, but with remote sequence similarity, can share similar interfaces.
Conservation of an Intact vif Gene of Human Immunodeficiency Virus Type 1 during Maternal-Fetal Transmission

PubMed Central

Yedavalli, Venkat R. K.; Chappey, Colombe; Matala, Erik; Ahmad, Nafees

1998-01-01

The human immunodeficiency virus type 1 (HIV-1) vif gene is conserved among most lentiviruses, suggesting that vif is important for natural infection. To determine whether an intact vif gene is positively selected during mother-to-infant transmission, we analyzed vif sequences from five infected mother-infant pairs following perinatal transmission. The coding potential of the vif open reading frame directly derived from uncultured peripheral blood mononuclear cell DNA was maintained in most of the 78,912 bp sequenced. We found that 123 of the 137 clones analyzed showed an 89.8% frequency of intact vif open reading frames. There was a low degree of heterogeneity of vif genes within mothers, within infants, and between epidemiologically linked mother-infant pairs. The distances between vif sequences were greater in epidemiologically unlinked individuals than in epidemiologically linked mother-infant pairs. Furthermore, the epidemiologically linked mother-infant pair vif sequences displayed similar patterns that were not seen in vif sequences from epidemiologically unlinked individuals. The functional domains, including the two cysteines at positions 114 and 133, a serine phosphorylation site at position 144, and the C-terminal basic amino acids essential for vif protein function, were highly conserved in most of the sequences. Phylogenetic analyses of 137 mother-infant pair vif sequences and 187 other available vif sequences from HIV-1 databases revealed distinct clusters for vif sequences from each mother-infant pair and for other vif sequences. Taken together, these findings suggest that vif plays an important role in HIV-1 infection and replication in mothers and their perinatally infected infants. PMID:9445004
Molecular Cloning and Characterization of Taurocyamine Kinase from Clonorchis sinensis: A Candidate Chemotherapeutic Target

PubMed Central

Tokuhiro, Shinji; Nagataki, Mitsuru; Jarilla, Blanca R.; Nomura, Haruka; Kim, Tae Im; Hong, Sung-Jong; Agatsuma, Takeshi

2013-01-01

Background Adult Clonorchis sinensis lives in the bile duct and causes endemic clonorchiasis in East Asian countries. Phosphagen kinases (PK) constitute a highly conserved family of enzymes, which play a role in ATP buffering in cells, and are potential targets for chemotherapeutic agents, since variants of PK are found only in invertebrate animals, including helminthic parasites. This work is conducted to characterize a PK from C. sinensis and to address further investigation for future drug development. Methodology/Principal findings A cDNA clone encoding a putative polypeptide of 717 amino acids was retrieved from a C. sinensis transcriptome. This polypeptide was homologous to taurocyamine kinase (TK) of the invertebrate animals and consisted of two contiguous domains. The C. sinensis TK (CsTK) gene was found to consist of 13 exons intercalated with 12 introns. This suggested an evolutionary pathway originating from an arginine kinase gene group, and distinguished annelid TK from the general CK phylogenetic group. CsTK was found not to have a homologous counterpart in sequence analysis of its mammalian hosts from public databases. Individual domains of CsTK, as well as the whole two-domain enzyme, showed enzymatic activity and specificity toward taurocyamine substrate. Of the CsTK residues, R58, I60 and Y84 of domain 1, and H60, I63 and Y87 of domain 2 were found to participate in binding taurocyamine. CsTK expression was distributed in locomotive and reproductive organs of adult C. sinensis. Developmentally, CsTK was stably expressed in both the adult and metacercariae stages. Recombinant CsTK protein was found to have low sensitivity and specificity toward C. sinensis and platyhelminth-infected human sera on ELISA. Conclusion CsTK is a promising anti-C. sinensis drug target since the enzyme is found only in C. sinensis and has a substrate specificity for taurocyamine, which is different from its mammalian counterpart, creatine. PMID:24278491
Involvement of zebrafish RIG-I in NF-κB and IFN signaling pathways: insights into functional conservation of RIG-I in antiviral innate immunity.

PubMed

Nie, Li; Zhang, Ying-sheng; Dong, Wei-ren; Xiang, Li-xin; Shao, Jian-zhong

2015-01-01

The retinoic acid-inducible gene I (RIG-I) is a critical sensor for host recognition of RNA virus infection and initiation of antiviral signaling pathways in mammals. However, data on the occurrence and functions of this molecule in lower vertebrates are limited. In this study, we characterized an RIG-I homolog (DrRIG-I) from zebrafish. Structurally, this DrRIG-I shares a number of conserved functional domains/motifs with its mammalian counterparts, namely, caspase activation and recruitment domain, DExD/H box, a helicase domain, and a C-terminal domain. Functionally, stimulation with DrRIG-I CARD in zebrafish embryos significantly activated the NF-κB and IFN signaling pathways, leading to the expression of TNF-α, IL-8 and IFN-induced Mx, ISG15, and viperin. However, knockdown of TRIM25 (a pivotal activator for RIG-I receptors) significantly suppressed the induced activation of IFN signaling. Results suggested the functional conservation of RIG-I receptors in the NF-κB and IFN signaling pathways between teleosts and mammals, providing a perspective into the evolutionary history of RIG-I-mediated antiviral innate immunity. Copyright © 2014 Elsevier Ltd. All rights reserved.
A Laminin G-EGF-Laminin G Module in Neurexin IV Is Essential for the Apico-Lateral Localization of Contactin and Organization of Septate Junctions

PubMed Central

Banerjee, Swati; Paik, Raehum; Mino, Rosa E.; Blauth, Kevin; Fisher, Elizabeth S.; Madden, Victoria J.; Fanning, Alan S.; Bhat, Manzoor A.

2011-01-01

Septate junctions (SJs) display a unique ultrastructural morphology with ladder-like electron densities that are conserved through evolution. Genetic and molecular analyses have identified a highly conserved core complex of SJ proteins consisting of three cell adhesion molecules Neurexin IV, Contactin, and Neuroglian, which interact with the cytoskeletal FERM domain protein Coracle. How these individual proteins interact to form the septal arrays that create the paracellular barrier is poorly understood. Here, we show that point mutations that map to specific domains of neurexin IV lead to formation of fewer septae and disorganization of SJs. Consistent with these observations, our in vivo domain deletion analyses identified the first Laminin G-EGF-Laminin G module in the extracellular region of Neurexin IV as necessary for the localization of and association with Contactin. Neurexin IV protein that is devoid of its cytoplasmic region is able to create septae, but fails to form a full complement of SJs. These data provide the first in vivo evidence that specific domains in Neurexin IV are required for protein-protein interactions and organization of SJs. Given the molecular conservation of SJ proteins across species, our studies may provide insights into how vertebrate axo-glial SJs are organized in myelinated axons. PMID:22022470

FishTraits: a database of ecological and life-history traits of freshwater fishes of the United States

USGS Publications Warehouse

Angermeier, Paul L.; Frimpong, Emmanuel A.

2011-01-01

The need for integrated and widely accessible sources of species traits data to facilitate studies of ecology, conservation, and management has motivated development of traits databases for various taxa. In spite of the increasing number of traits-based analyses of freshwater fishes in the United States, no consolidated database of traits of this group exists publicly, and much useful information on these species is documented only in obscure sources. The largely inaccessible and unconsolidated traits information makes large-scale analysis involving many fishes and/or traits particularly challenging. We have compiled a database of >100 traits for 809 (731 native and 78 nonnative) fish species found in freshwaters of the conterminous United States, including 37 native families and 145 native genera. The database, named FishTraits, contains information on four major categories of traits: (1) trophic ecology; (2) body size, reproductive ecology, and life history; (3) habitat preferences; and (4) salinity and temperature tolerances. Information on geographic distribution and conservation status was also compiled. The database enhances many opportunities for conducting research on fish species traits and constitutes the first step toward establishing a central repository for a continually expanding set of traits of North American fishes.
Concomitant prediction of function and fold at the domain level with GO-based profiles.

PubMed

Lopez, Daniel; Pazos, Florencio

2013-01-01

Predicting the function of newly sequenced proteins is crucial due to the pace at which these raw sequences are being obtained. Almost all resources for predicting protein function assign functional terms to whole chains, and do not distinguish which particular domain is responsible for the allocated function. This is not a limitation of the methodologies themselves; rather, the databases of functional annotations that these methods use for transferring functional terms to new proteins annotate on a whole-chain basis. Nevertheless, domains are the basic evolutionary and often functional units of proteins. In many cases, the domains of a protein chain have distinct molecular functions, independent from each other. For that reason, resources with functional annotations at the domain level, as well as methodologies for predicting function for individual domains adapted to these resources, are required. We present a methodology for predicting the molecular function of individual domains, based on a previously developed database of functional annotations at the domain level. The approach, which we show outperforms a standard method based on sequence searches in assigning function, concomitantly predicts the structural fold of the domains and can give hints on the functionally important residues associated with the predicted function.
The prokaryotic antecedents of the ubiquitin-signaling system and the early evolution of ubiquitin-like β-grasp domains

PubMed Central

Iyer, Lakshminarayan M; Burroughs, A Maxwell; Aravind, L

2006-01-01

Background Ubiquitin (Ub)-mediated signaling is one of the hallmarks of all eukaryotes. Prokaryotic homologs of Ub (ThiS and MoaD) and E1 ligases have been studied in relation to sulfur incorporation reactions in thiamine and molybdenum/tungsten cofactor biosynthesis. However, there is no evidence for entire protein modification systems with Ub-like proteins and deconjugation by deubiquitinating enzymes in prokaryotes. Hence, the evolutionary assembly of the eukaryotic Ub-signaling apparatus remains unclear. Results We systematically analyzed prokaryotic Ub-related β-grasp fold proteins using sensitive sequence profile searches and structural analysis. Consequently, we identified novel Ub-related proteins beyond the characterized ThiS, MoaD, TGS, and YukD domains. To understand their functional associations, we sought and recovered several conserved gene neighborhoods and domain architectures. These included novel associations involving diverse sulfur metabolism proteins, siderophore biosynthesis and the gene encoding the transfer mRNA binding protein SmpB, as well as domain fusions between Ub-like domains and PIN-domain related RNAses. Most strikingly, we found conserved gene neighborhoods in phylogenetically diverse bacteria combining genes for JAB domains (the primary de-ubiquitinating isopeptidases of the proteasomal complex), along with E1-like adenylating enzymes and different Ub-related proteins. Further sequence analysis of other conserved genes in these neighborhoods revealed several Ub-conjugating enzyme/E2-ligase related proteins. Genes for an Ub-like protein and a JAB domain peptidase were also found in the tail assembly gene cluster of certain caudate bacteriophages. Conclusion These observations imply that members of the Ub family had already formed strong functional associations with E1-like proteins, UBC/E2-related proteins, and JAB peptidases in the bacteria. Several of these Ub-like proteins and the associated protein families are likely to function together in signaling systems just as in eukaryotes. PMID:16859499
Crystal structure, biochemical and genetic characterization of yeast and E. cuniculi TAF(II)5 N-terminal domain: implications for TFIID assembly.

PubMed

Romier, Christophe; James, Nicole; Birck, Catherine; Cavarelli, Jean; Vivarès, Christian; Collart, Martine A; Moras, Dino

2007-05-18

General transcription factor TFIID plays an essential role in transcription initiation by RNA polymerase II at numerous promoters. However, understanding of the assembly and a full structural characterization of this large 15 subunit complex is lacking. TFIID subunit TAF(II)5 has been shown to be present twice in this complex and to be critical for the function and assembly of TFIID. In particular, the TAF(II)5 N-terminal domain is required for its incorporation within TFIID, and immuno-labelling experiments carried out by electron microscopy at low resolution have suggested that this domain might homodimerize, possibly explaining the three-lobed architecture of TFIID. However, the resolution at which the electron microscopy (EM) analyses were conducted is not sufficient to determine whether homodimerization occurs or whether a more intricate assembly implying other subunits is required. Here we report the X-ray structures of the fully evolutionarily conserved C-terminal sub-domain of the TAF(II)5 N terminus, from yeast and the mammalian parasite Encephalitozoon cuniculi. This sub-domain displays a novel fold with specific surfaces having conserved physico-chemical properties that can form protein-protein interactions. Although a crystallographic dimer implying one of these surfaces is present in one of the crystal forms, several biochemical analyses show that this sub-domain is monomeric in solution, even at various salt conditions and in the presence of different divalent cations. Consequently, the N-terminal sub-domain of the TAF(II)5 N terminus, which is homologous to a dimerization motif but has not been fully conserved during evolution, was studied by analytical ultracentrifugation and yeast genetics. Our results show that this sub-domain dimerizes at very high concentration but is required neither for yeast viability nor for incorporation of two TAF(II)5 molecules within TFIID and for the assembly of this complex. Altogether, although our results do not argue in favour of a homodimerization of the TAF(II)5 N-terminal domain, our structural analyses suggest a role for this domain in assembly of TFIID and its related complexes SAGA, STAGA, TFTC and PCAF.
CoSMoS: Conserved Sequence Motif Search in the proteome

PubMed Central

Liu, Xiao I; Korde, Neeraj; Jakob, Ursula; Leichert, Lars I

2006-01-01

Background With the ever-increasing number of gene sequences in the public databases, generating and analyzing multiple sequence alignments becomes increasingly time consuming. Nevertheless it is a task performed on a regular basis by researchers in many labs. Results We have now created a database called CoSMoS to find the occurrences and at the same time evaluate the significance of sequence motifs and amino acids encoded in the whole genome of the model organism Escherichia coli K12. We provide a precomputed set of multiple sequence alignments for each individual E. coli protein with all of its homologues in the RefSeq database. The alignments themselves, information about the occurrence of sequence motifs together with information on the conservation of each of the more than 1.3 million amino acids encoded in the E. coli genome can be accessed via the web interface of CoSMoS. Conclusion CoSMoS is a valuable tool to identify highly conserved sequence motifs, to find regions suitable for mutational studies in functional analyses and to predict important structural features in E. coli proteins. PMID:16433915
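The abstract above mentions conservation information for each amino acid position in a precomputed multiple sequence alignment. As a rough illustration of what such a per-position score can look like, here is a minimal Python sketch. It is not the CoSMoS method, which the abstract does not describe; it assumes a normalized Shannon-entropy score, and the function name `column_conservation` and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def column_conservation(alignment):
    """Score each alignment column from 0 (variable) to 1 (fully conserved).

    Assumption: conservation is 1 minus the Shannon entropy of the residues
    in the column, normalized by log2(20) for the 20 standard amino acids.
    Gap characters ('-') are ignored; an all-gap column scores 0.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_entropy = log2(20)
    scores = []
    for column in zip(*alignment):  # iterate over alignment columns
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * log2(n / total) for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Hypothetical toy alignment of three aligned sequences.
toy_alignment = ["MKCA", "MKCG", "MRC-"]
scores = column_conservation(toy_alignment)
```

In this toy input the first and third columns are invariant and receive the maximum score, while the mixed columns score lower. Real tools typically also weight sequences by similarity and account for residue substitution patterns, which this sketch leaves out.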
OST-HTH: a novel predicted RNA-binding domain

PubMed Central

2010-01-01

Background The mechanism by which the arthropod Oskar and vertebrate TDRD5/TDRD7 proteins nucleate or organize structurally related ribonucleoprotein (RNP) complexes, the polar granule and nuage, is poorly understood. Using sequence profile searches we identify a novel domain in these proteins that is widely conserved across eukaryotes and bacteria. Results Using contextual information from domain architectures, sequence-structure superpositions and available functional information we predict that this domain is likely to adopt the winged helix-turn-helix fold and bind RNA with a potential specificity for dsRNA. We show that in eukaryotes this domain is often combined in the same polypeptide with protein-protein- or lipid- interaction domains that might play a role in anchoring these proteins to specific cytoskeletal structures. Conclusions Thus, proteins with this domain might have a key role in the recognition and localization of dsRNA, including miRNAs, rasiRNAs and piRNAs hybridized to their targets. In other cases, this domain is fused to ubiquitin-binding, E3 ligase and ubiquitin-like domains indicating a previously under-appreciated role for ubiquitination in regulating the assembly and stability of nuage-like RNP complexes. Both bacteria and eukaryotes encode a conserved family of proteins that combines this predicted RNA-binding domain with a previously uncharacterized domain (DUF88). We present evidence that it is an RNAse belonging to the superfamily that includes the 5'->3' nucleases, PIN and NYN domains and might be recruited to degrade certain RNAs. Reviewers This article was reviewed by Sandor Pongor and Arcady Mushegian. PMID:20302647
Crystal structure studies of NADP+ dependent isocitrate dehydrogenase from Thermus thermophilus exhibiting a novel terminal domain

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kumar, S.M.; Pampa, K.J.; Manjula, M.

2014-06-20

Highlights: • We determined the structure of isocitrate dehydrogenase with citrate and cofactor. • The structure reveals a unique novel terminal domain involved in dimerization. • Clasp domain shows significant difference, and catalytic residues are conserved. • Oligomerization of the enzyme is quantized with subunit-subunit interactions. • Novel domain of this enzyme is classified as subfamily of the type IV. - Abstract: NADP+ dependent isocitrate dehydrogenase (IDH) is an enzyme catalyzing oxidative decarboxylation of isocitrate into oxalosuccinate (intermediate) and finally the product α-ketoglutarate. The crystal structure of the Thermus thermophilus isocitrate dehydrogenase (TtIDH) ternary complex with citrate and cofactor NADP+ was determined using the X-ray diffraction method to a resolution of 1.80 Å. The overall fold of this protein was resolved into a large domain, a small domain and a clasp domain. The monomeric structure reveals a novel terminal domain involved in dimerization, a domain that is unique when compared to other IDHs. The small domain and clasp domain also show significant differences when compared to other IDHs of the same subfamily. The structure of TtIDH reveals the absence of a helix at the clasp domain, which is mainly involved in oligomerization in other IDHs. Also, helices/beta sheets are absent in the small domain, when compared to other IDHs of the same subfamily. The overall TtIDH structure exhibits a closed conformation, and the catalytic triad residues Tyr144-Asp248-Lys191 are conserved. Oligomerization of the protein is quantized using interface area and subunit–subunit interactions between protomers. Overall, the TtIDH structure with its novel terminal domain may be categorized as a first structure of subfamily of type IV.
Structural mapping of the coiled-coil domain of a bacterial condensin and comparative analyses across all domains of life suggest conserved features of SMC proteins.

PubMed

Waldman, Vincent M; Stanage, Tyler H; Mims, Alexandra; Norden, Ian S; Oakley, Martha G

2015-06-01

The structural maintenance of chromosomes (SMC) proteins form the cores of multisubunit complexes that are required for the segregation and global organization of chromosomes in all domains of life. These proteins share a common domain structure in which N- and C- terminal regions pack against one another to form a globular ATPase domain. This "head" domain is connected to a central, globular, "hinge" or dimerization domain by a long, antiparallel coiled coil. To date, most efforts for structural characterization of SMC proteins have focused on the globular domains. Recently, however, we developed a method to map interstrand interactions in the 50-nm coiled-coil domain of MukB, the divergent SMC protein found in γ-proteobacteria. Here, we apply that technique to map the structure of the Bacillus subtilis SMC (BsSMC) coiled-coil domain. We find that, in contrast to the relatively complicated coiled-coil domain of MukB, the BsSMC domain is nearly continuous, with only two detectable coiled-coil interruptions. Near the middle of the domain is a break in coiled-coil structure in which there are three more residues on the C-terminal strand than on the N-terminal strand. Close to the head domain, there is a second break with a significantly longer insertion on the same strand. These results provide an experience base that allows an informed interpretation of the output of coiled-coil prediction algorithms for this family of proteins. A comparison of such predictions suggests that these coiled-coil deviations are highly conserved across SMC types in a wide variety of organisms, including humans. © 2015 Wiley Periodicals, Inc.
Conservation of tubulin-binding sequences in TRPV1 throughout evolution.

PubMed

Sardar, Puspendu; Kumar, Abhishek; Bhandari, Anita; Goswami, Chandan

2012-01-01

Transient Receptor Potential Vanilloid subtype 1 (TRPV1), commonly known as the capsaicin receptor, can detect multiple stimuli, including noxious compounds, low pH, temperature, and electromagnetic waves at different ranges. In addition, this receptor is involved in multiple physiological and sensory processes. Therefore, functions of TRPV1 also have direct influences on adaptation and further evolution. The availability of various eukaryotic genomic sequences in the public domain facilitates the study of the molecular evolution of the TRPV1 protein and the respective conservation of certain domains, motifs and interacting regions that are functionally important. Using statistical and bioinformatics tools, our analysis reveals that TRPV1 evolved ∼420 million years ago (MYA). Our analysis reveals that specific regions, domains and motifs of TRPV1 have gone through different selection pressures and thus have different levels of conservation. We found that, among all of these, the TRP box is the most conserved and thus has functional significance. Our results also indicate that the tubulin binding sequences (TBS) have evolutionary significance, as these sequence stretches are more conserved than many other essential regions of TRPV1. The overall distribution of positively charged residues within the TBS motifs is conserved throughout evolution. In silico analysis reveals that TBS-1 and TBS-2 of TRPV1 can form helical structures and may play an important role in TRPV1 function. Our analysis identifies the regions of TRPV1 that are important for the structure-function relationship. This analysis indicates that tubulin binding sequence-1 (TBS-1) near the TRP box forms a potential helix and that the tubulin interactions with TRPV1 via TBS-1 have evolutionary significance. This interaction may be required for proper channel function and regulation and may also have significance in the context of Taxol®-induced neuropathy.
In vivo functional mapping of the conserved protein domains within murine Themis1.

PubMed

Zvezdova, Ekaterina; Lee, Jan; El-Khoury, Dalal; Barr, Valarie; Akpan, Itoro; Samelson, Lawrence; Love, Paul E

2014-09-01

Thymocyte development requires the coordinated input of signals that originate from numerous cell surface molecules. Although the majority of thymocyte signal-initiating receptors are lineage-specific, most trigger 'ubiquitous' downstream signaling pathways. T-lineage-specific receptors are coupled to these signaling pathways by lymphocyte-restricted adapter molecules. We and others recently identified a new putative adapter protein, Themis1, whose expression is largely restricted to the T lineage. Mice lacking Themis1 exhibit a severe block in thymocyte development and a striking paucity of mature T cells revealing a critical role for Themis1 in T-cell maturation. Themis1 orthologs contain three conserved domains: a proline-rich region (PRR) that binds to the ubiquitous cytosolic adapter Grb2, a nuclear localization sequence (NLS), and two copies of a novel cysteine-containing globular (CABIT) domain. In the present study, we evaluated the functional importance of each of these motifs by retroviral reconstitution of Themis1(-/-) progenitor cells. The results demonstrate an essential requirement for the PRR and NLS motifs but not the conserved CABIT cysteines for Themis1 function.
Recombinant Influenza Virus Carrying the Conserved Domain of Respiratory Syncytial Virus (RSV) G Protein Confers Protection against RSV without inflammatory disease

PubMed Central

Lee, Yu-Na; Hwang, Hye Suk; Kim, Min-Chul; Lee, Young-Tae; Cho, Min-Kyoung; Kwon, Young-Man; Lee, Jong Seok; Plemper, Richard K.; Kang, Sang-Moo

2014-01-01

Respiratory syncytial virus (RSV) is one of the most important causes of viral lower respiratory tract disease in humans. There is no licensed RSV vaccine. Here, we generated recombinant influenza viruses (PR8/RSV.HA-G) carrying the chimeric constructs of hemagglutinin (HA) and central conserved-domains of the RSV G protein. PR8/RSV.HA-G virus showed lower pathogenicity without compromising immunogenicity in mice. Single intranasal inoculation of mice with PR8/RSV.HA-G induced IgG2a isotype dominant antibodies and RSV neutralizing activity. Mice with single intranasal inoculation of PR8/RSV.HA-G were protected against RSV infection as evidenced by significant reduction of lung viral loads to a detection limit upon RSV challenge. PR8/RSV.HA-G inoculation of mice did not induce pulmonary eosinophilia and inflammation upon RSV infection. These findings support the concept that recombinant influenza viruses carrying the RSV G conserved-domain can be developed as a promising RSV vaccine candidate without pulmonary disease. PMID:25553517
The Conserved Foot Domain of RNA Pol II Associates with Proteins Involved in Transcriptional Initiation and/or Early Elongation

PubMed Central

García-López, M. Carmen; Pelechano, Vicent; Mirón-García, M. Carmen; Garrido-Godino, Ana I.; García, Alicia; Calvo, Olga; Werner, Michel; Pérez-Ortín, José E.; Navarro, Francisco

2011-01-01

RNA polymerase (pol) II establishes many protein–protein interactions with transcriptional regulators to coordinate different steps of transcription. Although some of these interactions have been well described, little is known about the existence of RNA pol II regions involved in contact with transcriptional regulators. We hypothesize that conserved regions on the surface of RNA pol II contact transcriptional regulators. We identified such an RNA pol II conserved region that includes the majority of the “foot” domain and identified interactions of this region with Mvp1, a protein required for sorting proteins to the vacuole, and Spo14, a phospholipase D. Deletion of MVP1 and SPO14 affects the transcription of their target genes and increases phosphorylation of Ser5 in the carboxy-terminal domain (CTD). Genetic, phenotypic, and functional analyses point to a role for these proteins in transcriptional initiation and/or early elongation, consistent with their genetic interactions with CEG1, a guanylyltransferase subunit of the Saccharomyces cerevisiae capping enzyme. PMID:21954159
Effects of soil water holding capacity on evapotranspiration and irrigation scheduling

USDA-ARS's Scientific Manuscript database

The USDA Natural Resources Conservation Service (NRCS), through the National Cooperative Soil Survey, developed three soil geographic databases that are appropriate for acquiring soil information at the national, regional, and local scales. These relational databases include the National Soil Geogra...
SH3 interactome conserves general function over specific form

PubMed Central

Xin, Xiaofeng; Gfeller, David; Cheng, Jackie; Tonikian, Raffi; Sun, Lin; Guo, Ailan; Lopez, Lianet; Pavlenco, Alevtina; Akintobi, Adenrele; Zhang, Yingnan; Rual, Jean-François; Currell, Bridget; Seshagiri, Somasekar; Hao, Tong; Yang, Xinping; Shen, Yun A; Salehi-Ashtiani, Kourosh; Li, Jingjing; Cheng, Aaron T; Bouamalay, Dryden; Lugari, Adrien; Hill, David E; Grimes, Mark L; Drubin, David G; Grant, Barth D; Vidal, Marc; Boone, Charles; Sidhu, Sachdev S; Bader, Gary D

2013-01-01

Src homology 3 (SH3) domains bind peptides to mediate protein–protein interactions that assemble and regulate dynamic biological processes. We surveyed the repertoire of SH3 binding specificity using peptide phage display in a metazoan, the worm Caenorhabditis elegans, and discovered that it structurally mirrors that of the budding yeast Saccharomyces cerevisiae. We then mapped the worm SH3 interactome using stringent yeast two-hybrid and compared it with the equivalent map for yeast. We found that the worm SH3 interactome resembles the analogous yeast network because it is significantly enriched for proteins with roles in endocytosis. Nevertheless, orthologous SH3 domain-mediated interactions are highly rewired. Our results suggest a model of network evolution where general function of the SH3 domain network is conserved over its specific form. PMID:23549480
Consistent Temperature Coupling with Thermal Fluctuations of Smooth Particle Hydrodynamics and Molecular Dynamics

PubMed Central

Ganzenmüller, Georg C.; Hiermaier, Stefan; Steinhauser, Martin O.

2012-01-01

We propose a thermodynamically consistent and energy-conserving temperature coupling scheme between the atomistic and the continuum domain. The coupling scheme links the two domains using the DPDE (Dissipative Particle Dynamics at constant Energy) thermostat and is designed to handle strong temperature gradients across the atomistic/continuum domain interface. The fundamentally different definitions of temperature in the continuum and atomistic domain – internal energy and heat capacity versus particle velocity – are accounted for in a straightforward and conceptually intuitive way by the DPDE thermostat. We verify the here-proposed scheme using a fluid, which is simultaneously represented as a continuum using Smooth Particle Hydrodynamics, and as an atomistically resolved liquid using Molecular Dynamics. In the case of equilibrium contact between both domains, we show that the correct microscopic equilibrium properties of the atomistic fluid are obtained. As an example of a strong non-equilibrium situation, we consider the propagation of a steady shock-wave from the continuum domain into the atomistic domain, and show that the coupling scheme conserves both energy and shock-wave dynamics. To demonstrate the applicability of our scheme to real systems, we consider shock loading of a phospholipid bilayer immersed in water in a multi-scale simulation, an interesting topic of biological relevance. PMID:23300586
Structures of Bacterial Biosynthetic Arginine Decarboxylases

DOE Office of Scientific and Technical Information (OSTI.GOV)

F Forouhar; S Lew; J Seetharaman

2011-12-31

Biosynthetic arginine decarboxylase (ADC; also known as SpeA) plays an important role in the biosynthesis of polyamines from arginine in bacteria and plants. SpeA is a pyridoxal-5'-phosphate (PLP)-dependent enzyme and shares weak sequence homology with several other PLP-dependent decarboxylases. Here, the crystal structure of PLP-bound SpeA from Campylobacter jejuni is reported at 3.0 Å resolution and that of Escherichia coli SpeA in complex with a sulfate ion is reported at 3.1 Å resolution. The structure of the SpeA monomer contains two large domains, an N-terminal TIM-barrel domain followed by a β-sandwich domain, as well as two smaller helical domains. The TIM-barrel and β-sandwich domains share structural homology with several other PLP-dependent decarboxylases, even though the sequence conservation among these enzymes is less than 25%. A similar tetramer is observed for both C. jejuni and E. coli SpeA, composed of two dimers of tightly associated monomers. The active site of SpeA is located at the interface of this dimer and is formed by residues from the TIM-barrel domain of one monomer and a highly conserved loop in the β-sandwich domain of the other monomer. The PLP cofactor is recognized by hydrogen-bonding, π-stacking and van der Waals interactions.
The Aspartate-Less Receiver (ALR) Domains: Distribution, Structure and Function

PubMed Central

Weiner, Joshua J.; Han, Lanlan; Peterson, Francis C.; Volkman, Brian F.; Silvaggi, Nicholas R.; Ulijasz, Andrew T.

2015-01-01

Two-component signaling systems are ubiquitous in bacteria, Archaea and plants and play important roles in sensing and responding to environmental stimuli. To propagate a signaling response the typical system employs a sensory histidine kinase that phosphorylates a Receiver (REC) domain on a conserved aspartate (Asp) residue. Although it is known that some REC domains are missing this Asp residue, it remains unclear as to how many of these divergent REC domains exist, what their functional roles are and how they are regulated in the absence of the conserved Asp. Here we have compiled all deposited REC domains missing their phosphorylatable Asp residue, renamed here as the Aspartate-Less Receiver (ALR) domains. Our data show that ALRs are surprisingly common and are enriched for when attached to more rare effector outputs. Analysis of our informatics and the available ALR atomic structures, combined with structural, biochemical and genetic data of the ALR archetype RitR from Streptococcus pneumoniae presented here suggest that ALRs have reorganized their active pockets to instead take on a constitutive regulatory role or accommodate input signals other than Asp phosphorylation, while largely retaining the canonical post-phosphorylation mechanisms and dimeric interface. This work defines ALRs as an atypical REC subclass and provides insights into shared mechanisms of activation between ALR and REC domains. PMID:25875291
FARE-CAFE: a database of functional and regulatory elements of cancer-associated fusion events

PubMed Central

Korla, Praveen Kumar; Cheng, Jack; Huang, Chien-Hung; Tsai, Jeffrey J. P.; Liu, Yu-Hsuan; Kurubanjerdjit, Nilubon; Hsieh, Wen-Tsong; Chen, Huey-Yi; Ng, Ka-Lok

2015-01-01

Chromosomal translocation (CT) is of enormous clinical interest because this disorder is associated with various major solid tumors and leukemia. A tumor-specific fusion gene event may occur when a translocation joins two separate genes. Currently, various CT databases provide information about fusion genes and their genomic elements. However, no database of the roles of fusion genes, in terms of essential functional and regulatory elements in oncogenesis, is available. FARE-CAFE is a unique combination of CTs, fusion proteins, protein domains, domain–domain interactions, protein–protein interactions, transcription factors and microRNAs, with subsequent experimental information, which cannot be found in any other CT database. Genomic DNA information including, for example, manually collected exact locations of the first and second break points, sequences and karyotypes of fusion genes are included. FARE-CAFE will substantially facilitate the cancer biologist’s mission of elucidating the pathogenesis of various types of cancer. This database will ultimately help to develop ‘novel’ therapeutic approaches. Database URL: http://ppi.bioinfo.asia.edu.tw/FARE-CAFE PMID:26384373
DOE Office of Scientific and Technical Information (OSTI.GOV)

Marcianò, G.; Huang, D. T., E-mail: d.huang@beatson.gla.ac.uk

The Spt16–SSRP1 heterodimer is a histone chaperone that plays an important role in regulating chromatin assembly. Here, a crystal structure of the N-terminal domain of human Spt16 is presented and it is shown that this domain may contribute to histone binding. The histone chaperone FACT plays an important role in facilitating nucleosome assembly and disassembly during transcription. FACT is a heterodimeric complex consisting of Spt16 and SSRP1. The N-terminal domain of Spt16 resembles an inactive aminopeptidase. How this domain contributes to the histone chaperone activity of FACT remains elusive. Here, the crystal structure of the N-terminal domain (NTD) of human Spt16 is reported at a resolution of 1.84 Å. The structure adopts an aminopeptidase-like fold similar to those of the Saccharomyces cerevisiae and Schizosaccharomyces pombe Spt16 NTDs. Isothermal titration calorimetry analyses show that human Spt16 NTD binds histones H3/H4 with low-micromolar affinity, suggesting that Spt16 NTD may contribute to histone binding in the FACT complex. Surface-residue conservation and electrostatic analysis reveal a conserved acidic patch that may be involved in histone binding.
Plant homologs of mammalian MBT-domain protein-regulated KDM1 histone lysine demethylases do not interact with plant Tudor/PWWP/MBT-domain proteins

PubMed Central

Sadiq, Irfan; Keren, Ido; Citovsky, Vitaly

2016-01-01

Histone lysine demethylases of the LSD1/KDM1 family play important roles in epigenetic regulation of eukaryotic chromatin, and they are conserved between plants and animals. Mammalian LSD1 is thought to be targeted to its substrates, i.e., methylated histones, by an MBT-domain protein SFMBT1 that represents a component of the LSD1-based repressor complex and binds methylated histones. Because MBT-domain proteins are conserved between different organisms, from animals to plants, we examined whether the KDM1-type histone lysine demethylases KDM1C and FLD of Arabidopsis interact with the Arabidopsis Tudor/PWWP/MBT-domain SFMBT1-like proteins SL1, SL2, SL3, and SL4. No such interaction was detected using the bimolecular fluorescence complementation assay in living plant cells. Thus, plants most likely direct their KDM1 chromatin-modifying enzymes to methylated histones of the target chromatin by a mechanism different from that employed by the mammalian cells. PMID:26826387

Plant homologs of mammalian MBT-domain protein-regulated KDM1 histone lysine demethylases do not interact with plant Tudor/PWWP/MBT-domain proteins.

PubMed

Sadiq, Irfan; Keren, Ido; Citovsky, Vitaly

2016-02-19

Histone lysine demethylases of the LSD1/KDM1 family play important roles in epigenetic regulation of eukaryotic chromatin, and they are conserved between plants and animals. Mammalian LSD1 is thought to be targeted to its substrates, i.e., methylated histones, by an MBT-domain protein SFMBT1 that represents a component of the LSD1-based repressor complex and binds methylated histones. Because MBT-domain proteins are conserved between different organisms, from animals to plants, we examined whether the KDM1-type histone lysine demethylases KDM1C and FLD of Arabidopsis interact with the Arabidopsis Tudor/PWWP/MBT-domain SFMBT1-like proteins SL1, SL2, SL3, and SL4. No such interaction was detected using the bimolecular fluorescence complementation assay in living plant cells. Thus, plants most likely direct their KDM1 chromatin-modifying enzymes to methylated histones of the target chromatin by a mechanism different from that employed by the mammalian cells. Copyright © 2016 Elsevier Inc. All rights reserved.
The Walker B motif in avian FANCM is required to limit sister chromatid exchanges but is dispensable for DNA crosslink repair

PubMed Central

Rosado, Ivan V.; Niedzwiedz, Wojciech; Alpi, Arno F.; Patel, Ketan J.

2009-01-01

FANCM, the most highly conserved component of the Fanconi Anaemia (FA) pathway can resolve recombination intermediates and remodel synthetic replication forks. However, it is not known if these activities are relevant to how this conserved protein activates the FA pathway and promotes DNA crosslink repair. Here we use chicken DT40 cells to systematically dissect the function of the helicase and nuclease domains of FANCM. Our studies reveal that these domains contribute distinct roles in the tolerance of crosslinker, UV light and camptothecin-induced DNA damage. Although the complete helicase domain is critical for crosslink repair, a predicted inactivating mutation of the Walker B box domain has no impact on FA pathway associated functions. However, this mutation does result in elevated sister chromatid exchanges (SCE). Furthermore, our genetic dissection indicates that FANCM functions with the Blm helicase to suppress spontaneous SCE events. Overall our results lead us to reappraise the role of helicase domain associated activities of FANCM with respect to the activation of the FA pathway, crosslink repair and in the resolution of recombination intermediates. PMID:19465393
Crystal structure of P58(IPK) TPR fragment reveals the mechanism for its molecular chaperone activity in UPR

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tao, Jiahui; Petrova, Kseniya; Ron, David

2010-05-25

P58(IPK) might function as an endoplasmic reticulum molecular chaperone to maintain protein folding homeostasis during unfolded protein responses. P58(IPK) contains nine tetratricopeptide repeat (TPR) motifs and a C-terminal J-domain within its primary sequence. To investigate the mechanism by which P58(IPK) functions to promote protein folding within the endoplasmic reticulum, we have determined the crystal structure of P58(IPK) TPR fragment to 2.5 Å resolution by the SAD method. The crystal structure of P58(IPK) revealed three domains (I-III) with similar folds and each domain contains three TPR motifs. An ELISA assay indicated that P58(IPK) acts as a molecular chaperone by interacting with misfolded proteins such as luciferase and rhodanese. The P58(IPK) structure reveals a conserved hydrophobic patch located in domain I that might be involved in binding the misfolded polypeptides. Structure-based mutagenesis for the conserved hydrophobic residues located in domain I significantly reduced the molecular chaperone activity of P58(IPK).
Functional diversity of potassium channel voltage-sensing domains.

PubMed

Islas, León D

2016-01-01

Voltage-gated potassium channels or Kv's are membrane proteins with fundamental physiological roles. They are composed of 2 main functional protein domains, the pore domain, which regulates ion permeation, and the voltage-sensing domain, which is in charge of sensing voltage and undergoing a conformational change that is later transduced into pore opening. The voltage-sensing domain or VSD is a highly conserved structural motif found in all voltage-gated ion channels and can also exist as an independent feature, giving rise to voltage sensitive enzymes and also sustaining proton fluxes in proton-permeable channels. In spite of the structural conservation of VSDs in potassium channels, there are several differences in the details of VSD function found across variants of Kvs. These differences are mainly reflected in variations in the electrostatic energy needed to open different potassium channels. In turn, the differences in detailed VSD functioning among voltage-gated potassium channels might have physiological consequences that have not been explored and which might reflect evolutionary adaptations to the different roles played by Kv channels in cell physiology.
Functional diversity of potassium channel voltage-sensing domains

PubMed Central

Islas, León D.

2016-01-01

Voltage-gated potassium channels or Kv's are membrane proteins with fundamental physiological roles. They are composed of 2 main functional protein domains, the pore domain, which regulates ion permeation, and the voltage-sensing domain, which is in charge of sensing voltage and undergoing a conformational change that is later transduced into pore opening. The voltage-sensing domain or VSD is a highly conserved structural motif found in all voltage-gated ion channels and can also exist as an independent feature, giving rise to voltage sensitive enzymes and also sustaining proton fluxes in proton-permeable channels. In spite of the structural conservation of VSDs in potassium channels, there are several differences in the details of VSD function found across variants of Kvs. These differences are mainly reflected in variations in the electrostatic energy needed to open different potassium channels. In turn, the differences in detailed VSD functioning among voltage-gated potassium channels might have physiological consequences that have not been explored and which might reflect evolutionary adaptations to the different roles played by Kv channels in cell physiology. PMID:26794852
The conserved glycine residues in the transmembrane domain of the Semliki Forest virus fusion protein are not required for assembly and fusion

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liao Maofu; Kielian, Margaret

2005-02-05

The alphavirus Semliki Forest virus (SFV) infects cells via a low pH-triggered fusion reaction mediated by the viral E1 protein. Both the E1 fusion peptide and transmembrane (TM) domain are essential for membrane fusion, but the functional requirements for the TM domain are poorly understood. Here we explored the role of the five TM domain glycine residues, including the highly conserved glycine pair at E1 residues 415/416. SFV mutants with alanine substitutions for individual or all five glycine residues (5G/A) showed growth kinetics and fusion pH dependence similar to those of wild-type SFV. Mutants with increasing substitution of glycine residues showed an increasingly more stringent requirement for cholesterol during fusion. The 5G/A mutant showed decreased fusion kinetics and extent in fluorescent lipid mixing assays. TM domain glycine residues thus are not required for efficient SFV fusion or assembly but can cause subtle effects on the properties of membrane fusion.
Structural basis of filopodia formation induced by the IRSp53/MIM homology domain of human IRSp53

PubMed Central

Millard, Thomas H; Bompard, Guillaume; Heung, Man Yeung; Dafforn, Timothy R; Scott, David J; Machesky, Laura M; Fütterer, Klaus

2005-01-01

The scaffolding protein insulin receptor tyrosine kinase substrate p53 (IRSp53), a ubiquitous regulator of the actin cytoskeleton, mediates filopodia formation under the control of Rho-family GTPases. IRSp53 comprises a central SH3 domain, which binds to proline-rich regions of a wide range of actin regulators, and a conserved N-terminal IRSp53/MIM homology domain (IMD) that harbours F-actin-bundling activity. Here, we present the crystal structure of this novel actin-bundling domain revealing a coiled-coil domain that self-associates into a 180 Å-long zeppelin-shaped dimer. Sedimentation velocity experiments confirm the presence of a single molecular species of twice the molecular weight of the monomer in solution. Mutagenesis of conserved basic residues at the extreme ends of the dimer abrogated actin bundling in vitro and filopodia formation in vivo, demonstrating that IMD-mediated actin bundling is required for IRSp53-induced filopodia formation. This study promotes an expanded view of IRSp53 as an actin regulator that integrates scaffolding and effector functions. PMID:15635447
EBNA-2 of herpesvirus papio diverges significantly from the type A and type B EBNA-2 proteins of Epstein-Barr virus but retains an efficient transactivation domain with a conserved hydrophobic motif.

PubMed Central

Ling, P D; Ryon, J J; Hayward, S D

1993-01-01

EBNA-2 contributes to the establishment of Epstein-Barr virus (EBV) latency in B cells and to the resultant alterations in B-cell growth pattern by up-regulating expression from specific viral and cellular promoters. We have taken a comparative approach toward characterizing functional domains within EBNA-2. To this end, we have cloned and sequenced the EBNA-2 gene from the closely related baboon virus herpesvirus papio (HVP). All human EBV isolates have either a type A or type B EBNA-2 gene. However, the HVP EBNA-2 gene falls into neither the type A category nor the type B category, suggesting that the separation into these two subtypes may have been a recent evolutionary event. Comparison of the predicted amino acid sequences indicates 37% amino acid identity with EBV type A EBNA-2 and 35% amino acid identity with type B EBNA-2. To define the domains of EBNA-2 required for transcriptional activation, the DNA binding domain of the GAL4 protein was fused to overlapping segments of EBV EBNA-2. This approach identified a 40-amino-acid (40-aa) EBNA-2 activation domain located between aa 437 and 477. Transactivation ability was completely lost when the amino-terminal boundary of this domain was moved to aa 441, indicating that the motif at aa 437 to 440, Pro-Ile-Leu-Phe, contains residues critical for function. The aa 437 boundary identified in these experiments coincides precisely with a block of conserved sequences in HVP EBNA-2, and the comparable carboxy-terminal region of HVP EBNA-2 also functioned as a strong transcriptional activation domain when fused to the Gal4(1-147) protein. The EBV and HVP EBNA-2 activation domains share a mixed proline-rich, negatively charged character with a striking conservation of positionally equivalent hydrophobic residues. The importance of the individual amino acids making up the Pro-Ile-Leu-Phe motif was examined by mutagenesis. 
Any alteration of these residues was found to reduce transactivation efficiency, with changes at the Pro-437 and Phe-440 positions producing the most deleterious effects. Activation of the EBV latency C promoter by EBNA-2 was shown to be dependent on the presence of the carboxy-terminal activation domain. However, this requirement was generic, rather than specific, since the EBNA-2 activation domain could be replaced with those from the herpes simplex virus (HSV) VP16 protein or the EBV Rta protein. Potential karyophilic signals within EBNA-2 were examined by introducing oligonucleotides encoding positively charged amino acid groupings that might serve in this capacity into a cytoplasmic test protein, HSV delta IE175, and by examining the intracellular localization of the resulting proteins. This assay identified a strong nuclear localization signal between EBV amino acids (aa) 478 to 485, which was conserved in HVP, and a weaker noncanonical signal between EBV aa 341 to 355, which was not conserved in HVP. PMID:8388484
EBNA-2 of herpesvirus papio diverges significantly from the type A and type B EBNA-2 proteins of Epstein-Barr virus but retains an efficient transactivation domain with a conserved hydrophobic motif.

PubMed

Ling, P D; Ryon, J J; Hayward, S D

1993-06-01

EBNA-2 contributes to the establishment of Epstein-Barr virus (EBV) latency in B cells and to the resultant alterations in B-cell growth pattern by up-regulating expression from specific viral and cellular promoters. We have taken a comparative approach toward characterizing functional domains within EBNA-2. To this end, we have cloned and sequenced the EBNA-2 gene from the closely related baboon virus herpesvirus papio (HVP). All human EBV isolates have either a type A or type B EBNA-2 gene. However, the HVP EBNA-2 gene falls into neither the type A category nor the type B category, suggesting that the separation into these two subtypes may have been a recent evolutionary event. Comparison of the predicted amino acid sequences indicates 37% amino acid identity with EBV type A EBNA-2 and 35% amino acid identity with type B EBNA-2. To define the domains of EBNA-2 required for transcriptional activation, the DNA binding domain of the GAL4 protein was fused to overlapping segments of EBV EBNA-2. This approach identified a 40-amino-acid (40-aa) EBNA-2 activation domain located between aa 437 and 477. Transactivation ability was completely lost when the amino-terminal boundary of this domain was moved to aa 441, indicating that the motif at aa 437 to 440, Pro-Ile-Leu-Phe, contains residues critical for function. The aa 437 boundary identified in these experiments coincides precisely with a block of conserved sequences in HVP EBNA-2, and the comparable carboxy-terminal region of HVP EBNA-2 also functioned as a strong transcriptional activation domain when fused to the Gal4(1-147) protein. The EBV and HVP EBNA-2 activation domains share a mixed proline-rich, negatively charged character with a striking conservation of positionally equivalent hydrophobic residues. The importance of the individual amino acids making up the Pro-Ile-Leu-Phe motif was examined by mutagenesis. 
Any alteration of these residues was found to reduce transactivation efficiency, with changes at the Pro-437 and Phe-440 positions producing the most deleterious effects. Activation of the EBV latency C promoter by EBNA-2 was shown to be dependent on the presence of the carboxy-terminal activation domain. However, this requirement was generic, rather than specific, since the EBNA-2 activation domain could be replaced with those from the herpes simplex virus (HSV) VP16 protein or the EBV Rta protein. Potential karyophilic signals within EBNA-2 were examined by introducing oligonucleotides encoding positively charged amino acid groupings that might serve in this capacity into a cytoplasmic test protein, HSV delta IE175, and by examining the intracellular localization of the resulting proteins. This assay identified a strong nuclear localization signal between EBV amino acids (aa) 478 to 485, which was conserved in HVP, and a weaker noncanonical signal between EBV aa 341 to 355, which was not conserved in HVP.
EvoDB: a database of evolutionary rate profiles, associated protein domains and phylogenetic trees for PFAM-A.

PubMed

Ndhlovu, Andrew; Durand, Pierre M; Hazelhurst, Scott

2015-01-01

The evolutionary rate at codon sites across protein-coding nucleotide sequences represents a valuable tier of information for aligning sequences, inferring homology and constructing phylogenetic profiles. However, a comprehensive resource for cataloguing the evolutionary rate at codon sites and their corresponding nucleotide and protein domain sequence alignments has not been developed. To address this gap in knowledge, EvoDB (an Evolutionary rates DataBase) was compiled. Nucleotide sequences and their corresponding protein domain data including the associated seed alignments from the PFAM-A (protein family) database were used to estimate evolutionary rate (ω = dN/dS) profiles at codon sites for each entry. EvoDB contains 98.83% of the gapped nucleotide sequence alignments and 97.1% of the evolutionary rate profiles for the corresponding information in PFAM-A. As the identification of codon sites under positive selection and their position in a sequence profile is usually the most sought after information for molecular evolutionary biologists, evolutionary rate profiles were determined under the M2a model using the CODEML algorithm in the PAML (Phylogenetic Analysis by Maximum Likelihood) suite of software. Validation of nucleotide sequences against amino acid data was implemented to ensure high data quality. EvoDB is a catalogue of the evolutionary rate profiles and provides the corresponding phylogenetic trees, PFAM-A alignments and annotated accession identifier data. In addition, the database can be explored and queried using known evolutionary rate profiles to identify domains under similar evolutionary constraints and pressures. EvoDB is a resource for evolutionary, phylogenetic studies and presents a tier of information untapped by current databases. © The Author(s) 2015. Published by Oxford University Press.
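As a brief illustration of the ratio ω = dN/dS mentioned in the EvoDB abstract, the following minimal Python sketch shows how a per-site ω value is conventionally read (ω > 1 suggesting positive selection, ω < 1 purifying selection, ω near 1 approximate neutrality). It does not reproduce the CODEML M2a estimation described above, which is a likelihood-based procedure; the function names, the tolerance and the numeric inputs are hypothetical and chosen only for illustration.

```python
def omega(dn: float, ds: float) -> float:
    """Return the evolutionary rate ratio omega = dN/dS for one codon site."""
    if ds <= 0:
        # The ratio is undefined when no synonymous change is estimated.
        raise ValueError("dS must be positive to compute dN/dS")
    return dn / ds


def classify_site(w: float, tol: float = 0.05) -> str:
    """Interpret omega; `tol` is an arbitrary illustrative band around 1."""
    if w > 1 + tol:
        return "positive selection"
    if w < 1 - tol:
        return "purifying selection"
    return "approximately neutral"


# Hypothetical (dN, dS) pairs for three codon sites.
sites = [(0.2, 0.1), (0.01, 0.1), (0.1, 0.1)]
labels = [classify_site(omega(dn, ds)) for dn, ds in sites]
print(labels)
```

In practice a point estimate of ω would be accompanied by a statistical test before a site is called positively selected; the fixed tolerance here only stands in for that step.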
Multiple graph regularized protein domain ranking.

PubMed

Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin

2012-11-19

Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.
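The alternating scheme described in this abstract can be sketched in a few lines of Python. This is a simplified illustration and not the published MultiG-Rank implementation. The objective used here (a weighted sum of graph smoothness terms, plus a fit term weighted by `alpha` and a squared penalty on the graph weights weighted by `beta`) is an assumption, as are all function names, parameter values and the toy graphs. Ranking scores are updated with the graph weights fixed, then graph weights are updated with the scores fixed.

```python
def solve_ranking(graphs, mu, y, alpha, sweeps=200):
    """Gauss-Seidel solve of (L + alpha*I) f = alpha*y, where L is the
    Laplacian of the mu-weighted combination of the similarity graphs."""
    n = len(y)
    W = [[sum(m * g[i][j] for m, g in zip(mu, graphs)) for j in range(n)]
         for i in range(n)]
    f = list(y)
    for _ in range(sweeps):
        for i in range(n):
            deg = sum(W[i][j] for j in range(n) if j != i)
            nb = sum(W[i][j] * f[j] for j in range(n) if j != i)
            f[i] = (alpha * y[i] + nb) / (deg + alpha)
    return f


def smoothness(W, f):
    """f^T L f for a symmetric graph W: 0.5 * sum_ij W_ij (f_i - f_j)^2."""
    n = len(f)
    return 0.5 * sum(W[i][j] * (f[i] - f[j]) ** 2
                     for i in range(n) for j in range(n))


def project_to_simplex(v):
    """Euclidean projection onto {mu : mu >= 0, sum(mu) = 1}."""
    u = sorted(v, reverse=True)
    running, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        running += u_k
        t = (running - 1.0) / k
        if u_k - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]


def multi_graph_rank(graphs, y, alpha=1.0, beta=1.0, n_iter=10):
    """Alternate between updating scores f and graph weights mu."""
    mu = [1.0 / len(graphs)] * len(graphs)
    for _ in range(n_iter):
        f = solve_ranking(graphs, mu, y, alpha)
        costs = [smoothness(W, f) for W in graphs]
        # Minimizing sum(mu*cost) + beta*||mu||^2 over the simplex equals
        # projecting -cost/(2*beta) onto the simplex.
        mu = project_to_simplex([-c / (2.0 * beta) for c in costs])
    return f, mu


# Two hypothetical similarity graphs over four domains; domain 0 is the query.
chain = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
scores, weights = multi_graph_rank([chain, star], y=[1.0, 0.0, 0.0, 0.0])
print(scores, weights)
```

Graphs on which the current scores vary less receive larger weights, which is one simple way to avoid committing to a single graph model in advance.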
Multiple graph regularized protein domain ranking

PubMed Central

2012-01-01

Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. Results: To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. Conclusion: The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications. PMID:23157331
Evolution of SH2 domains and phosphotyrosine signalling networks

PubMed Central

Liu, Bernard A.; Nash, Piers D.

2012-01-01

Src homology 2 (SH2) domains mediate selective protein–protein interactions with tyrosine phosphorylated proteins, and in doing so define specificity of phosphotyrosine (pTyr) signalling networks. SH2 domains and protein-tyrosine phosphatases expand alongside protein-tyrosine kinases (PTKs) to coordinate cellular and organismal complexity in the evolution of the unikont branch of the eukaryotes. Examination of conserved families of PTKs and SH2 domain proteins provides fiduciary marks that trace the evolutionary landscape for the development of complex cellular systems in the proto-metazoan and metazoan lineages. The evolutionary provenance of conserved SH2 and PTK families reveals the mechanisms by which diversity is achieved through adaptations in tissue-specific gene transcription, altered ligand binding, insertions of linear motifs and the gain or loss of domains following gene duplication. We discuss mechanisms by which pTyr-mediated signalling networks evolve through the development of novel and expanded families of SH2 domain proteins and the elaboration of connections between pTyr-signalling proteins. These changes underlie the variety of general and specific signalling networks that give rise to tissue-specific functions and increasingly complex developmental programmes. Examination of SH2 domains from an evolutionary perspective provides insight into the process by which evolutionary expansion and modification of molecular protein interaction domain proteins permits the development of novel protein-interaction networks and accommodates adaptation of signalling networks. PMID:22889907
A Steric-inhibition model for regulation of nucleotide exchange via the Dock180 family of GEFs.

PubMed

Lu, Mingjian; Kinchen, Jason M; Rossman, Kent L; Grimsley, Cynthia; Hall, Matthew; Sondek, John; Hengartner, Michael O; Yajnik, Vijay; Ravichandran, Kodi S

2005-02-22

CDM (CED-5, Dock180, Myoblast city) family members have been recently identified as novel, evolutionarily conserved guanine nucleotide exchange factors (GEFs) for Rho-family GTPases. They regulate multiple processes, including embryonic development, cell migration, apoptotic-cell engulfment, tumor invasion, and HIV-1 infection, in diverse model systems. However, the mechanism(s) of regulation of CDM proteins has not been well understood. Here, our studies on the prototype member Dock180 reveal a steric-inhibition model for regulating the Dock180 family of GEFs. At basal state, the N-terminal SH3 domain of Dock180 binds to the distant catalytic Docker domain and negatively regulates the function of Dock180. Further studies revealed that the SH3:Docker interaction sterically blocks Rac access to the Docker domain. Interestingly, ELMO binding to the SH3 domain of Dock180 disrupted the SH3:Docker interaction, facilitated Rac access to the Docker domain, and contributed to the GEF activity of the Dock180/ELMO complex. Additional genetic rescue studies in C. elegans suggested that the regulation of the Docker-domain-mediated GEF activity by the SH3 domain and its adjoining region is evolutionarily conserved. This steric-inhibition model may be a general mechanism for regulating multiple SH3-domain-containing Dock180 family members and may have implications for a variety of biological processes.
Emergency preparedness for the accidental release of radionuclides from the Uljin Nuclear Power Plant in Korea.

PubMed

Park, Soon-Ung; Lee, In-Hye; Joo, Seung Jin; Ju, Jae-Won

2017-12-01

Site-specific radionuclide dispersion databases were archived for the emergency response to hypothetical releases of 137Cs from the Uljin nuclear power plant in Korea. These databases were obtained with a horizontal resolution of 1.5 km in the local domain centered on the power plant site by simulations of the Lagrangian Particle Dispersion Model (LPDM) with the Unified Model (UM)-Local Data Assimilation Prediction System (LDAPS). The Eulerian Dispersion Model-East Asia (EDM-EA) with the UM-Global Data Assimilation Prediction System (UM-GDAPS) meteorological models was used to obtain dispersion databases in the regional domain. The LPDM model was run for a year at 5-day intervals, yielding 72 synoptic time-scale cases in a year. For each case, hourly mean near-surface concentrations, hourly mean column-integrated concentrations and hourly total depositions for 5 consecutive days were archived by the LPDM model in the local domain and by the EDM-EA model in the regional domain of Asia. Among the 72 synoptic cases in a year, the worst synoptic case, which showed the highest mean surface concentration averaged over 5 days in the LPDM model domain, was chosen to illustrate emergency preparedness for a hypothetical accident at the site. The results simulated by the LPDM model with the 137Cs emission rate of the Fukushima nuclear power plant accident for the first 5-day period were found to be able to provide prerequisite information for the emergency response to the early phase of the accident, whereas those of the EDM-EA model could provide information required for the environmental impact assessment of the accident in the regional domain. The archived site-specific database of 72 synoptic cases in a year has great potential to be used as prognostic information for emergency preparedness in the early phase of an accident. Copyright © 2017 Elsevier Ltd. All rights reserved.
Structure of the N-terminal domain of the protein Expansion: an ‘Expansion’ to the Smad MH2 fold

DOE Office of Scientific and Technical Information (OSTI.GOV)

Beich-Frandsen, Mads; Aragón, Eric; Llimargas, Marta

2015-04-01

Expansion is a modular protein that is conserved in protostomes. The first structure of the N-terminal domain of Expansion has been determined at 1.6 Å resolution and the new Nα-MH2 domain was found to belong to the Smad/FHA superfamily of structures. Gene-expression changes observed in Drosophila embryos after inducing the transcription factor Tramtrack led to the identification of the protein Expansion. Expansion contains an N-terminal domain similar in sequence to the MH2 domain characteristic of Smad proteins, which are the central mediators of the effects of the TGF-β signalling pathway. Apart from Smads and Expansion, no other type of protein belonging to the known kingdoms of life contains MH2 domains. To compare the Expansion and Smad MH2 domains, the crystal structure of the Expansion domain was determined at 1.6 Å resolution, the first structure of a non-Smad MH2 domain to be characterized to date. The structure displays the main features of the canonical MH2 fold with two main differences: the addition of an α-helical region and the remodelling of a protein-interaction site that is conserved in the MH2 domain of Smads. Owing to these differences, the new domain was referred to as Nα-MH2. Despite the presence of the Nα-MH2 domain, Expansion does not participate in TGF-β signalling; instead, it is required for other activities specific to the protostome phyla. Based on the structural similarities to the MH2 fold, it is proposed that the Nα-MH2 domain should be classified as a new member of the Smad/FHA superfamily.
A structural role for the PHP domain in E. coli DNA polymerase III

PubMed Central

2013-01-01

Background In addition to the core catalytic machinery, bacterial replicative DNA polymerases contain a Polymerase and Histidinol Phosphatase (PHP) domain whose function is not entirely understood. The PHP domains of some bacterial replicases are active metal-dependent nucleases that may play a role in proofreading. In E. coli DNA polymerase III, however, the PHP domain has lost several metal-coordinating residues and is likely to be catalytically inactive. Results Genomic searches show that the loss of metal-coordinating residues in polymerase PHP domains is likely to have coevolved with the presence of a separate proofreading exonuclease that works with the polymerase. Although the E. coli Pol III PHP domain has lost metal-coordinating residues, the structure of the domain has been conserved to a remarkable degree when compared to that of metal-binding PHP domains. This is demonstrated by our ability to restore metal binding with only three point mutations, as confirmed by the metal-bound crystal structure of this mutant determined at 2.9 Å resolution. We also show that Pol III, a large multi-domain protein, unfolds cooperatively and that mutations in the degenerate metal-binding site of the PHP domain decrease the overall stability of Pol III and reduce its activity. Conclusions While the presence of a PHP domain in replicative bacterial polymerases is strictly conserved, its ability to coordinate metals and to perform proofreading exonuclease activity is not, suggesting additional non-enzymatic roles for the domain. Our results show that the PHP domain is a major structural element in Pol III and its integrity modulates both the stability and activity of the polymerase. PMID:23672456
An Interdisciplinary Conservation Module for Condition Survey on Cultural Heritages with a 3d Information System

NASA Astrophysics Data System (ADS)

Pedelì, C.

2013-07-01

In order to make the most of outsourced digital documents based on new technologies (e.g., 3D laser scanners, photogrammetry, etc.), a new approach was followed and a new ad hoc information system was implemented. The resulting product allows the final user to reuse and manage the digital documents, providing graphic tools and an integrated, dedicated database to manage the entire documentation and conservation process, from the condition assessment to the conservation/restoration work. The system is organised into two main modules: Archaeology and Conservation. This paper focuses on the features and advantages of the second one. In particular, it emphasizes its logical organisation, the possibility of easy mapping using a very precise 3D metric platform, and the benefit of the integrated relational database, which makes it possible to organise, compare, keep and manage different kinds of information at different levels. The Conservation module can manage over time the conservation process of a site, monument, object or excavation, as well as conservation work in progress. An alternative approach, called OVO by the author of this paper, forces the surveyor to observe and describe the entity by decomposing it into functional components, materials and construction techniques. Some integrated tools, such as the "ICOMOS-ISCS Illustrated glossary … ", help the user to describe pathologies with a unified approach and terminology. The conservation project phase is also strongly supported, to envision future interventions and costs. A final section is devoted to recording the conservation/restoration work already done or in progress. All information areas of the conservation module are interconnected, allowing the system a complete interchange of graphic and alphanumeric data. The conservation module itself is connected to the archaeological one to create an interdisciplinary daily tool.
Characterization of the DMAE-modified juvenile excretory-secretory protein Juv-p120 of Litomosoides sigmodontis.

PubMed

Wagner, Ulrike; Hirzmann, Jörg; Hintz, Martin; Beck, Ewald; Geyer, Rudolf; Hobom, Gerd; Taubert, Anja; Zahner, Horst

2011-04-01

Juv-p120 is an excretory-secretory 160 kDa glycoprotein of juvenile female Litomosoides sigmodontis and exhibits features typical of mucins. 50% of its molecular mass is attributed to posttranslational modifications with the unusual substituent dimethylaminoethanol (DMAE). In this respect Juv-p120 corresponds to the surface proteins of the microfilarial sheath, Shp3 and Shp3a. The secreted protein consists of 697 amino acids, organized in two different domains of repeat elements separated by a stretch of polar residues. The N-terminal domain shows fourteen P/S/T/F-rich repeat elements highly modified with phospho-DMAE substituted O-glycans conferring a negative charge to the protein. The C-terminal domain is extremely rich in glutamine (35%) and leucine (25%) in less organized repeats and may play a role in oligomerization of Juv-p120 monomers. A protein family with a similar Q/L-rich region and conserved core promoter region was identified in Brugia malayi by homology screening and in Wuchereria bancrofti and Loa loa by database similarity search. One of the Q/L-rich proteins in each genus has an extended S/T-rich region and due to this feature is supposed to be a putative Juv-p120 ortholog. The corresponding modification of Juv-p120 and the microfilarial sheath surface antigens Shp3/3a explains the appearance of anti-sheath antibodies before the release of microfilariae. The function of Juv-p120 is unknown. Copyright © 2011 Elsevier B.V. All rights reserved.
A calmodulin binding protein from Arabidopsis is induced by ethylene and contains a DNA-binding motif

NASA Technical Reports Server (NTRS)

Reddy, A. S.; Reddy, V. S.; Golovkin, M.

2000-01-01

Calmodulin (CaM), a key calcium sensor in all eukaryotes, regulates diverse cellular processes by interacting with other proteins. To isolate CaM binding proteins involved in ethylene signal transduction, we screened an expression library prepared from ethylene-treated Arabidopsis seedlings with 35S-labeled CaM. A cDNA clone, EICBP (Ethylene-Induced CaM Binding Protein), encoding a protein that interacts with activated CaM was isolated in this screening. The CaM binding domain in EICBP was mapped to the C-terminus of the protein. These results indicate that calcium, through CaM, could regulate the activity of EICBP. The EICBP is expressed in different tissues and its expression in seedlings is induced by ethylene. The EICBP contains, in addition to a CaM binding domain, several features that are typical of transcription factors. These include a DNA-binding domain at the N terminus, an acidic region at the C terminus, and nuclear localization signals. In database searches a partial cDNA (CG-1) encoding a DNA-binding motif from parsley and an ethylene up-regulated partial cDNA from tomato (ER66) showed significant similarity to EICBP. In addition, five hypothetical proteins in the Arabidopsis genome also showed a very high sequence similarity with EICBP, indicating that there are several EICBP-related proteins in Arabidopsis. The structural features of EICBP are conserved in all EICBP-related proteins in Arabidopsis, suggesting that they may constitute a new family of DNA binding proteins and are likely to be involved in modulating gene expression in the presence of ethylene.

Crystallographic Studies of Intermediate Filament Proteins.

PubMed

Guzenko, Dmytro; Chernyatina, Anastasia A; Strelkov, Sergei V

Intermediate filaments (IFs), together with microtubules and actin microfilaments, are the three main cytoskeletal components in metazoan cells. IFs are formed by a distinct protein family, which is made up of 70 members in humans. Most IF proteins are tissue- or organelle-specific, which includes lamins, the IF proteins of the nucleus. The building block of IFs is an elongated dimer, which consists of a central α-helical 'rod' domain flanked by flexible N- and C-terminal domains. The conserved rod domain is the 'signature feature' of the IF family. Bioinformatics analysis reveals that the rod domain of all IF proteins contains three α-helical segments of largely conserved length, interconnected by linkers. Moreover, there is a conserved pattern of hydrophobic repeats within each segment, which includes heptads and hendecads. This defines the presence of both left-handed and almost parallel coiled-coil regions along the rod length. Using X-ray crystallography on multiple overlapping fragments of IF proteins, the atomic structure of the nearly complete rod domain has been determined. Here, we discuss some specific challenges of this procedure, such as crystallization and diffraction data phasing by molecular replacement. Further insights into the structure of the coiled coil and the terminal domains have been obtained using electron paramagnetic resonance measurements on the full-length protein, with spin labels attached at specific positions. This atomic resolution information, as well as further interesting findings, such as the variation of the coiled-coil stability along the rod length, provide clues towards interpreting the data on IF assembly, collected by a range of methods. However, a full description of this process at the molecular level is not yet at hand.
The primary structure of stinging nettle (Urtica dioica) agglutinin. A two-domain member of the hevein family.

PubMed

Beintema, J J; Peumans, W J

1992-03-09

The primary structure of stinging nettle (Urtica dioica) agglutinin has been determined by sequence analysis of peptides obtained from three overlapping proteolytic digests. The sequence of 80 residues consists of two hevein-like domains with the same spacing of half-cystine residues and several other conserved residues as observed earlier in other proteins with hevein-like domains. The hinge region between the two domains is four residues longer than those between the four domains in cereal lectins like wheat germ agglutinin.
Evolution of the PWWP-domain encoding genes in the plant and animal lineages

PubMed Central

2012-01-01

Background Conserved domains are recognized as the building blocks of eukaryotic proteins. Domains showing a tendency to occur in diverse combinations (‘promiscuous’ domains) are involved in versatile architectures in proteins with different functions. Current models, based on global-level analyses of domain combinations in multiple genomes, have suggested that the propensity of some domains to associate with other domains in high-level architectures increases with organismal complexity. Alternative models using domain-based phylogenetic trees propose that domains have become promiscuous independently in different lineages through convergent evolution and are, thus, random with no functional or structural preferences. Here we test whether complex protein architectures have occurred by accretion from simpler systems and whether the appearance of multidomain combinations parallels organismal complexity. As a model, we analyze the modular evolution of the PWWP domain and ask whether its appearance in combinations with other domains into multidomain architectures is linked with the occurrence of more complex life-forms. Whether high-level combinations of domains are conserved and transmitted as stable units (cassettes) through evolution is examined in the genomes of plant or metazoan species selected for their established position in the evolution of the respective lineages. Results Using the domain-tree approach, we analyze the evolutionary origins and distribution patterns of the promiscuous PWWP domain to understand the principles of its modular evolution and its existence in combination with other domains in higher-level protein architectures. We found that as a single module the PWWP domain occurs only in proteins with a limited, mainly, species-specific distribution. Earlier, it was suggested that domain promiscuity is a fast-changing (volatile) feature shaped by natural selection and that only a few domains retain their promiscuity status throughout evolution. 
In contrast, our data show that most of the multidomain PWWP combinations in extant multicellular organisms (humans or land plants) are present in their unicellular ancestral relatives suggesting they have been transmitted through evolution as conserved linear arrangements (‘cassettes’). Among the most interesting biologically relevant results is the finding that the genes of the two plant Trithorax family subgroups (ATX1/2 and ATX3/4/5) have different phylogenetic origins. The two subgroups occur together in the earliest land plants Physcomitrella patens and Selaginella moellendorffii. Conclusion Gain/loss of a single PWWP domain is observed throughout evolution reflecting dynamic lineage- or species-specific events. In contrast, higher-level protein architectures involving the PWWP domain have survived as stable arrangements driven by evolutionary descent. The association of PWWP domains with the DNA methyltransferases in O. tauri and in the metazoan lineage seems to have occurred independently consistent with convergent evolution. Our results do not support models wherein more complex protein architectures involving the PWWP domain occur with the appearance of more evolutionarily advanced life forms. PMID:22734652
Constructing the principles: Method and metaphysics in the progress of theoretical physics

NASA Astrophysics Data System (ADS)

Glass, Lawrence C.

This thesis presents a new framework for the philosophy of physics focused on methodological differences found in the practice of modern theoretical physics. The starting point for this investigation is the longstanding debate over scientific realism. Some philosophers have argued that it is the aim of science to produce an accurate description of the world including explanations for observable phenomena. These scientific realists hold that our best confirmed theories are approximately true and that the entities they propose actually populate the world, whether or not they have been observed. Others have argued that science achieves only frameworks for the prediction and manipulation of observable phenomena. These anti-realists argue that truth is a misleading concept when applied to empirical knowledge. Instead, focus should be on the empirical adequacy of scientific theories. This thesis argues that the fundamental distinction at issue, a division between true scientific theories and ones which are empirically adequate, is best explored in terms of methodological differences. In analogy with the realism debate, there are at least two methodological strategies. Rather than focusing on scientific theories as wholes, this thesis takes as units of analysis physical principles which are systematic empirical generalizations. The first possible strategy, the conservative, takes the assumption that the empirical adequacy of a theory in one domain serves as good evidence for such adequacy in other domains. This then motivates the application of the principle to new domains. The second strategy, the innovative, assumes that empirical adequacy in one domain does not justify the expectation of adequacy in other domains. New principles are offered as explanations in the new domain. The final part of the thesis is the application of this framework to two examples. 
In the first, Lorentz's use of the aether is reconstructed in terms of the conservative strategy with respect to the principles of Galilean relativity. A comparison between the conservative strategy as an application of the conservative strategy and TeVeS as one of the innovative constitutes the second example.
LenVarDB: database of length-variant protein domains.

PubMed

Mutt, Eshita; Mathew, Oommen K; Sowdhamini, Ramanathan

2014-01-01

Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions or insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions-deletions (indels) in protein sequences. LenVarDB is a database that provides an early overview of observed length variations that were set among 731 protein families and after examining >2 million sequences. Indels are followed up to observe if they are close to the active site such that they can affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.
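The idea of locating length-variant (indel) regions from an alignment can be illustrated with a minimal Python sketch. This is not LenVarDB's actual pipeline: the function name `indel_regions`, the use of `-` as the gap character, and the pairwise (reference versus homologue) setup are assumptions made only for illustration.

```python
def indel_regions(ref_aln, hom_aln, gap="-"):
    """Return (start_col, end_col, kind) runs for two aligned sequences.

    Hypothetical helper: 'insertion' means the homologue has residues where
    the reference has gaps; 'deletion' means the opposite. Columns are
    0-based alignment columns, end inclusive.
    """
    if len(ref_aln) != len(hom_aln):
        raise ValueError("aligned sequences must have equal length")
    regions = []
    run_kind, run_start = None, 0
    for col, (r, h) in enumerate(zip(ref_aln, hom_aln)):
        if r == gap and h != gap:
            kind = "insertion"
        elif h == gap and r != gap:
            kind = "deletion"
        else:
            kind = None  # aligned residues, or a gap in both rows
        if kind != run_kind:
            # Close the run that just ended, then start tracking the new one.
            if run_kind is not None:
                regions.append((run_start, col - 1, run_kind))
            run_kind, run_start = kind, col
    if run_kind is not None:
        regions.append((run_start, len(ref_aln) - 1, run_kind))
    return regions


if __name__ == "__main__":
    print(indel_regions("MK--LVA", "MKGGL-A"))
```

Runs reported this way could then be mapped onto structure coordinates to check whether an indel lies near a site of interest, which is the kind of follow-up the abstract describes.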
Advanced Traffic Management Systems (ATMS) research analysis database system

DOT National Transportation Integrated Search

2001-06-01

The ATMS Research Analysis Database Systems (ARADS) consists of a Traffic Software Data Dictionary (TSDD) and a Traffic Software Object Model (TSOM) for application to microscopic traffic simulation and signal optimization domains. The purpose of thi...
The identification of complete domains within protein sequences using accurate E-values for semi-global alignment

PubMed Central

Kann, Maricel G.; Sheetlin, Sergey L.; Park, Yonil; Bryant, Stephen H.; Spouge, John L.

2007-01-01

The sequencing of complete genomes has created a pressing need for automated annotation of gene function. Because domains are the basic units of protein function and evolution, a gene can be annotated from a domain database by aligning domains to the corresponding protein sequence. Ideally, complete domains are aligned to protein subsequences, in a ‘semi-global alignment’. Local alignment, which aligns pieces of domains to subsequences, is common in high-throughput annotation applications, however. It is a mature technique, with the heuristics and accurate E-values required for screening large databases and evaluating the screening results. Hidden Markov models (HMMs) provide an alternative theoretical framework for semi-global alignment, but their use is limited because they lack heuristic acceleration and accurate E-values. Our new tool, GLOBAL, overcomes some limitations of previous semi-global HMMs: it has accurate E-values and the possibility of the heuristic acceleration required for high-throughput applications. Moreover, according to a standard of truth based on protein structure, two semi-global HMM alignment tools (GLOBAL and HMMer) had comparable performance in identifying complete domains, but distinctly outperformed two tools based on local alignment. When searching for complete protein domains, therefore, GLOBAL avoids disadvantages commonly associated with HMMs, yet maintains their superior retrieval performance. PMID:17596268
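The distinction between semi-global and local alignment can be made concrete with a small dynamic-programming sketch in Python. It is not the GLOBAL tool or an HMM: it is a plain score-based recurrence with assumed match, mismatch and linear gap scores, and the function name `semi_global_score` is hypothetical. The whole domain must be aligned, while residues of the protein before and after the aligned region are free.

```python
def semi_global_score(domain, protein, match=1, mismatch=-1, gap=-1):
    """Best score for aligning the complete domain to any protein subsequence.

    Returns (score, end), where end is the number of protein residues
    consumed when the best alignment finishes. Scoring values are
    illustrative assumptions, not those of any published tool.
    """
    m, n = len(domain), len(protein)
    # Row 0 is all zeros: the alignment may start anywhere in the protein.
    prev = [0] * (n + 1)
    for i in range(1, m + 1):
        # Column 0: domain residues aligned against nothing cost gaps.
        curr = [i * gap] + [0] * n
        for j in range(1, n + 1):
            s = match if domain[i - 1] == protein[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + s,   # align the two residues
                          prev[j] + gap,     # gap in the protein
                          curr[j - 1] + gap) # gap in the domain
        prev = curr
    # The last row may end anywhere: trailing protein residues are free.
    best_end = max(range(n + 1), key=lambda j: prev[j])
    return prev[best_end], best_end


if __name__ == "__main__":
    print(semi_global_score("ACD", "XXACDXX"))
```

A local alignment would additionally allow the score to restart at zero inside the domain, which is why it can report only a piece of a domain rather than the complete unit.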
SkyDOT: a publicly accessible variability database, containing multiple sky surveys and real-time data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Starr, D. L.; Wozniak, P. R.; Vestrand, W. T.

2002-01-01

SkyDOT (Sky Database for Objects in Time-Domain) is a Virtual Observatory currently comprised of data from the RAPTOR, ROTSE I, and OGLE II survey projects. This makes it a very large time domain database. In addition, the RAPTOR project provides SkyDOT with real-time variability data as well as stereoscopic information. With its web interface, we believe SkyDOT will be a very useful tool for both astronomers and the public. Our main task has been to construct an efficient relational database containing all existing data, while handling a real-time inflow of data. We also provide a useful web interface allowing easy access for both astronomers and the public. Initially, this server will allow common searches, specific queries, and access to light curves. In the future we will include machine learning classification tools and access to spectral information.
Low-Resolution Structure of the Full-Length Barley (Hordeum vulgare) SGT1 Protein in Solution, Obtained Using Small-Angle X-Ray Scattering

PubMed Central

Taube, Michał; Pieńkowska, Joanna R.; Jarmołowski, Artur; Kozak, Maciej

2014-01-01

SGT1 is an evolutionarily conserved eukaryotic protein involved in many important cellular processes. In plants, SGT1 is involved in resistance to disease. In a low ionic strength environment, the SGT1 protein tends to form dimers. The protein consists of three structurally independent domains (the tetratricopeptide repeats domain (TPR), the CHORD- and SGT1-containing domain (CS), and the SGT1-specific domain (SGS)), and two less conserved variable regions (VR1 and VR2). In the present study, we provide the low-resolution structure of the barley (Hordeum vulgare) SGT1 protein in solution and its dimer/monomer equilibrium using small-angle scattering of synchrotron radiation, ab-initio modeling and circular dichroism spectroscopy. The multivariate curve resolution least-square method (MCR-ALS) was applied to separate the scattering data of the monomeric and dimeric species from a complex mixture. The models of the barley SGT1 dimer and monomer were formulated using rigid body modeling with ab-initio structure prediction. Both oligomeric forms of barley SGT1 have elongated shapes with unfolded inter-domain regions. Circular dichroism spectroscopy confirmed that the barley SGT1 protein had a modular architecture, with an α-helical TPR domain, a β-sheet sandwich CS domain, and a disordered SGS domain separated by VR1 and VR2 regions. Using molecular docking and ab-initio protein structure prediction, a model of dimerization of the TPR domains was proposed. PMID:24714665
Evaluating, Comparing, and Interpreting Protein Domain Hierarchies

PubMed Central

2014-01-01

Abstract Arranging protein domain sequences hierarchically into evolutionarily divergent subgroups is important for investigating evolutionary history, for speeding up web-based similarity searches, for identifying sequence determinants of protein function, and for genome annotation. However, whether or not a particular hierarchy is optimal is often unclear, and independently constructed hierarchies for the same domain can often differ significantly. This article describes methods for statistically evaluating specific aspects of a hierarchy, for probing the criteria underlying its construction and for direct comparisons between hierarchies. Information theoretical notions are used to quantify the contributions of specific hierarchical features to the underlying statistical model. Such features include subhierarchies, sequence subgroups, individual sequences, and subgroup-associated signature patterns. Underlying properties are graphically displayed in plots of each specific feature's contributions, in heat maps of pattern residue conservation, in “contrast alignments,” and through cross-mapping of subgroups between hierarchies. Together, these approaches provide a deeper understanding of protein domain functional divergence, reveal uncertainties caused by inconsistent patterns of sequence conservation, and help resolve conflicts between competing hierarchies. PMID:24559108
Structural Characterization of the Boca/Mesd Maturation Factors for LDL-Receptor-Type beta Propeller Domains

DOE Office of Scientific and Technical Information (OSTI.GOV)

M Collins; W Hendrickson

2011-12-31

Folding and trafficking of low-density lipoprotein receptor (LDLR) family members, which play essential roles in development and homeostasis, are mediated by specific chaperones. The Boca/Mesd chaperone family specifically promotes folding and trafficking of the YWTD β propeller-EGF domain pair found in the ectodomain of all LDLR members. Limited proteolysis, NMR spectroscopy, analytical ultracentrifugation, and X-ray crystallography were used to define a conserved core composed of a structured domain that is preceded by a disordered N-terminal region. High-resolution structures of the ordered domain were determined for homologous proteins from three metazoans. Seven independent protomers reveal a novel ferredoxin-like superfamily fold with two distinct β sheet topologies. A conserved hydrophobic surface forms a dimer interface in each crystal, but these differ substantially at the atomic level, indicative of nonspecific hydrophobic interactions that may play a role in the chaperone activity of the Boca/Mesd family.
Structural Insight into the Core of CAD, the Multifunctional Protein Leading De Novo Pyrimidine Biosynthesis.

PubMed

Moreno-Morcillo, María; Grande-García, Araceli; Ruiz-Ramos, Alba; Del Caño-Ochoa, Francisco; Boskovic, Jasminka; Ramón-Maiques, Santiago

2017-06-06

CAD, the multifunctional protein initiating and controlling de novo biosynthesis of pyrimidines in animals, self-assembles into ∼1.5 MDa hexamers. The structures of the dihydroorotase (DHO) and aspartate transcarbamoylase (ATC) domains of human CAD have been previously determined, but we lack information on how these domains associate and interact with the rest of CAD forming a multienzymatic unit. Here, we prove that a construct covering human DHO and ATC oligomerizes as a dimer of trimers and that this arrangement is conserved in CAD-like from fungi, which holds an inactive DHO-like domain. The crystal structures of the ATC trimer and DHO-like dimer from the fungus Chaetomium thermophilum confirm the similarity with the human CAD homologs. These results demonstrate that, despite being inactive, the fungal DHO-like domain has a conserved structural function. We propose a model that sets the DHO and ATC complex as the central element in the architecture of CAD. Copyright © 2017 Elsevier Ltd. All rights reserved.
Axial U(1) current in Grabowska and Kaplan's formulation

NASA Astrophysics Data System (ADS)

Hamada, Yu; Kawai, Hikaru

2017-06-01

Recently, Grabowska and Kaplan [Phys. Rev. Lett. 116, 211602 (2016); Phys. Rev. D 94, 114504 (2016)] suggested a nonperturbative formulation of a chiral gauge theory, which consists of the conventional domain-wall fermion and a gauge field that evolves by gradient flow from one domain wall to the other. We introduce two sets of domain-wall fermions belonging to complex conjugate representations so that the effective theory is a 4D vector-like gauge theory. Then, as a natural definition of the axial-vector current, we consider a current that generates simultaneous phase transformations for the massless modes in 4 dimensions. However, this current is exactly conserved and does not reproduce the correct anomaly. In order to investigate this point precisely, we consider the mechanism of the conservation. We find that this current includes not only the axial current on the domain wall but also a contribution from the bulk, which is nonlocal in the sense of 4D fields. Therefore, the local current is obtained by subtracting the bulk contribution from it.
A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

PubMed Central

2010-01-01

Background Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. 
These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the repeat may be disseminated by HGT and intra-genomic shuffling. Conclusions We describe novel features of PARCELs (Palindromic Amphipathic Repeat Coding ELements), a set of widely distributed repeat protein domains and coding sequences that were likely acquired through HGT by diverse unicellular microbes, further mobilized and diversified within genomes, and co-opted for expression in the membrane proteome of some taxa. Disseminated by multiple gene-centric vehicles, ORFs harboring these elements enhance accessory gene pools as part of the "mobilome" connecting genomes of various clades, in taxa sharing common niches. PMID:20626840
Basement domain map of the conterminous United States and Alaska

USGS Publications Warehouse

Lund, Karen; Box, Stephen E.; Holm-Denoma, Christopher S.; San Juan, Carma A.; Blakely, Richard J.; Saltus, Richard W.; Anderson, Eric D.; DeWitt, Ed

2015-01-01

The tectonic settings for crustal types represented in the basement domains are subdivided into constituent geologic environments and the types of primary metals endowments and deposits in them are documented. The compositions, architecture, and original metals endowments are potentially important to assessments of primary mineral deposits and to the residence and recycling of metals in the crust of the United States portion of the North American continent. The databases can be configured to demonstrate the construction of the United States through time, to identify specific types of crust, or to identify domains potentially containing metal endowments of specific genetic types or endowed with specific metals. The databases can also be configured to illustrate other purposes chosen by users.
Arabidopsis VASCULAR-RELATED UNKNOWN PROTEIN1 Regulates Xylem Development and Growth by a Conserved Mechanism That Modulates Hormone Signaling

PubMed Central

Grienenberger, Etienne; Douglas, Carl J.

2014-01-01

Despite a strict conservation of the vascular tissues in vascular plants (tracheophytes), our understanding of the genetic basis underlying the differentiation of secondary cell wall-containing cells in the xylem of tracheophytes is still far from complete. Using coexpression analysis and phylogenetic conservation across sequenced tracheophyte genomes, we identified a number of Arabidopsis (Arabidopsis thaliana) genes of unknown function whose expression is correlated with secondary cell wall deposition. Among these, the Arabidopsis VASCULAR-RELATED UNKNOWN PROTEIN1 (VUP1) gene encodes a predicted protein of 24 kD with no annotated functional domains but containing domains that are highly conserved in tracheophytes. Here, we show that the VUP1 expression pattern, determined by promoter-β-glucuronidase reporter gene expression, is associated with vascular tissues, while vup1 loss-of-function mutants exhibit collapsed morphology of xylem vessel cells. Constitutive overexpression of VUP1 caused dramatic and pleiotropic developmental defects, including severe dwarfism, dark green leaves, reduced apical dominance, and altered photomorphogenesis, resembling brassinosteroid-deficient mutants. Constitutive overexpression of VUP homologs from multiple tracheophyte species induced similar defects. Whole-genome transcriptome analysis revealed that overexpression of VUP1 represses the expression of many brassinosteroid- and auxin-responsive genes. Additionally, deletion constructs and site-directed mutagenesis were used to identify critical domains and amino acids required for VUP1 function. Altogether, our data suggest a conserved role for VUP1 in regulating secondary wall formation during vascular development by tissue- or cell-specific modulation of hormone signaling pathways. PMID:24567189
A Potential Role for Drosophila Mucins in Development and Physiology

PubMed Central

Syed, Zulfeqhar A.; Härd, Torleif; Uv, Anne; van Dijk-Härd, Iris F.

2008-01-01

Vital vertebrate organs are protected from the external environment by a barrier that to a large extent consists of mucins. These proteins are characterized by poorly conserved repeated sequences that are rich in prolines and potentially glycosylated threonines and serines (PTS). We have now used the characteristics of the PTS repeat domain to identify Drosophila mucins in a simple bioinformatics approach. Searching the predicted protein database for proteins with at least 4 repeats and a high ST content, more than 30 mucin-like proteins were identified, ranging from 300–23000 amino acids in length. We find that Drosophila mucins are present at all stages of the fly life cycle, and that their transcripts localize to selective organs analogous to sites of vertebrate mucin expression. The results could allow for addressing basic questions about human mucin-related diseases in this model system. Additionally, many of the mucins are expressed in selective tissues during embryogenesis, thus revealing new potential functions for mucins as apical matrix components during organ morphogenesis. PMID:18725942
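The screening idea in this record (keep proteins with at least 4 repeats and a high serine/threonine content) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the exact-match tandem-repeat test, the repeat unit length of 8, and the 25% S/T cutoff are all assumptions chosen for illustration, and real PTS repeats are poorly conserved, so a practical screen would need a tolerant repeat finder.

```python
def st_fraction(seq: str) -> float:
    """Fraction of residues that are serine (S) or threonine (T)."""
    if not seq:
        return 0.0
    return sum(1 for aa in seq if aa in "ST") / len(seq)


def count_tandem_repeats(seq: str, unit_len: int) -> int:
    """Largest number of back-to-back exact copies of any unit of length unit_len.

    Simplification: only exact copies are counted (hypothetical criterion).
    """
    best = 0
    for start in range(len(seq) - unit_len + 1):
        unit = seq[start:start + unit_len]
        copies = 1
        pos = start + unit_len
        # Extend while the next window is an identical copy of the unit.
        while seq[pos:pos + unit_len] == unit:
            copies += 1
            pos += unit_len
        best = max(best, copies)
    return best


def looks_mucin_like(seq: str, unit_len: int = 8,
                     min_repeats: int = 4, min_st: float = 0.25) -> bool:
    """Apply both filters; unit_len and min_st are illustrative assumptions."""
    return (count_tandem_repeats(seq, unit_len) >= min_repeats
            and st_fraction(seq) >= min_st)


if __name__ == "__main__":
    candidate = "MKL" + "PTTTSAPS" * 5 + "GGA"   # toy sequence, not a real protein
    print(looks_mucin_like(candidate))
```

Only the "at least 4 repeats" threshold comes from the abstract; the other parameters would have to be tuned against known mucins.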
Deep transcriptome annotation enables the discovery and functional characterization of cryptic small proteins

PubMed Central

Delcourt, Vivian; Lucier, Jean-François; Gagnon, Jules; Beaudoin, Maxime C; Vanderperre, Benoît; Breton, Marc-André; Motard, Julie; Jacques, Jean-François; Brunelle, Mylène; Gagnon-Arsenault, Isabelle; Fournier, Isabelle; Ouangraoua, Aida; Hunting, Darel J; Cohen, Alan A; Landry, Christian R; Scott, Michelle S

2017-01-01

Recent functional, proteomic and ribosome profiling studies in eukaryotes have concurrently demonstrated the translation of alternative open-reading frames (altORFs) in addition to annotated protein coding sequences (CDSs). We show that a large number of small proteins could in fact be coded by these altORFs. The putative alternative proteins translated from altORFs have orthologs in many species and contain functional domains. Evolutionary analyses indicate that altORFs often show more extreme conservation patterns than their CDSs. Thousands of alternative proteins are detected in proteomic datasets by reanalysis using a database containing predicted alternative proteins. This is illustrated with specific examples, including altMiD51, a 70 amino acid mitochondrial fission-promoting protein encoded in MiD51/Mief1/SMCR7L, a gene encoding an annotated protein promoting mitochondrial fission. Our results suggest that many genes are multicoding genes and code for a large protein and one or several small proteins. PMID:29083303
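As a rough illustration of what predicting altORFs involves, the sketch below scans the three forward reading frames of a transcript for ATG-to-stop open reading frames and drops the one that matches the annotated CDS. It is a simplified sketch under stated assumptions, not the annotation method of the study: the function names, the default 30-codon minimum, and the restriction to ATG starts on one strand are all hypothetical choices.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(mrna: str, min_codons: int = 30):
    """Return (start, end, frame) for each ATG-to-stop ORF in the forward frames.

    Coordinates are 0-based and end-exclusive, and include the stop codon.
    min_codons counts the start codon but not the stop codon (assumed cutoff).
    """
    mrna = mrna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(mrna) - 2, 3):
            codon = mrna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first in-frame start codon
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # look for the next ORF in this frame
    return orfs


def alternative_orfs(mrna: str, cds_start: int, cds_end: int, min_codons: int = 30):
    """ORFs other than the annotated CDS (given as 0-based, end-exclusive)."""
    return [orf for orf in find_orfs(mrna, min_codons)
            if (orf[0], orf[1]) != (cds_start, cds_end)]


if __name__ == "__main__":
    toy = "ATGGCATGCCCTAGTTGA"   # toy transcript; annotated CDS spans 0..18
    print(alternative_orfs(toy, 0, 18, min_codons=2))
```

In the toy transcript, the second ORF sits in a different frame and overlaps the annotated CDS, which is the arrangement described for altMiD51 within the MiD51 gene.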
The psychological well-being of disability caregivers: examining the roles of family strain, family-to-work conflict, and perceived supervisor support.

PubMed

Li, Andrew; Shaffer, Jonathan; Bagger, Jessica

2015-01-01

We draw on the cross-domain model of work-family conflict and conservation of resources theory to examine the relationship between disability caregiving demands and the psychological well-being of employed caregivers. Using a sample of employed disability caregivers from a national survey, we found that the relationship between caregiving demands and family-to-work conflict was stronger when employees experienced high levels of strain from family. Additionally, we found high levels of family-to-work conflict were subsequently associated with decreases in life satisfaction and increases in depression, but only when perceived supervisor support was low. Overall, our findings suggest an indirect relationship between caregiving demands and psychological well-being that is mediated by family-to-work conflict and is conditional on family strain and perceived supervisor support. The theoretical and practical implications of these findings are discussed. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Insights into the Specificity of Lysine Acetyltransferases

DOE PAGES

Tucker, Alex C.; Taylor, Keenan C.; Rank, Katherine C.; ...

2014-11-07

Reversible lysine acetylation by protein acetyltransferases is a conserved regulatory mechanism that controls diverse cellular pathways. Gcn5-related N-acetyltransferases (GNATs), named after their founding member, are found in all domains of life. GNATs are known for their role as histone acetyltransferases, but non-histone bacterial protein acetyltransferases have been identified. Only structures of GNAT complexes with short histone peptide substrates are available in databases. Given the biological importance of this modification and the abundance of lysine in polypeptides, how specificity is attained for larger protein substrates is central to understanding acetyl-lysine-regulated networks. In this paper, we report the structure of a GNAT in complex with a globular protein substrate solved to 1.9 Å. GNAT binds the protein substrate with extensive surface interactions distinct from those reported for GNAT-peptide complexes. Finally, our data reveal determinants needed for the recognition of a protein substrate and provide insight into the specificity of GNATs.

Computational modeling of Repeat1 region of INI1/hSNF5: An evolutionary link with ubiquitin

PubMed Central

Bhutoria, Savita

2016-01-01

The structure of a protein can be very informative of its function. However, determining protein structures experimentally can often be very challenging. Computational methods have been used successfully in modeling structures with sufficient accuracy. Here we have used computational tools to predict the structure of an evolutionarily conserved and functionally significant domain of Integrase interactor (INI)1/hSNF5 protein. INI1 is a component of the chromatin remodeling SWI/SNF complex, a tumor suppressor and is involved in many protein‐protein interactions. It belongs to SNF5 family of proteins that contain two conserved repeat (Rpt) domains. Rpt1 domain of INI1 binds to HIV‐1 Integrase, and acts as a dominant negative mutant to inhibit viral replication. Rpt1 domain also interacts with oncogene c‐MYC and modulates its transcriptional activity. We carried out an ab initio modeling of a segment of INI1 protein containing the Rpt1 domain. The structural model suggested the presence of a compact and well defined ββαα topology as core structure in the Rpt1 domain of INI1. This topology in Rpt1 was similar to PFU domain of Phospholipase A2 Activating Protein, PLAA. Interestingly, PFU domain shares similarity with Ubiquitin and has ubiquitin binding activity. Because of the structural similarity between Rpt1 domain of INI1 and PFU domain of PLAA, we propose that Rpt1 domain of INI1 may participate in ubiquitin recognition or binding with ubiquitin or ubiquitin related proteins. This modeling study may shed light on the mode of interactions of Rpt1 domain of INI1 and is likely to facilitate future functional studies of INI1. PMID:27261671
Computational modeling of Repeat1 region of INI1/hSNF5: An evolutionary link with ubiquitin.

PubMed

Bhutoria, Savita; Kalpana, Ganjam V; Acharya, Seetharama A

2016-09-01

The structure of a protein can be very informative of its function. However, determining protein structures experimentally can often be very challenging. Computational methods have been used successfully in modeling structures with sufficient accuracy. Here we have used computational tools to predict the structure of an evolutionarily conserved and functionally significant domain of Integrase interactor (INI)1/hSNF5 protein. INI1 is a component of the chromatin remodeling SWI/SNF complex, a tumor suppressor and is involved in many protein-protein interactions. It belongs to SNF5 family of proteins that contain two conserved repeat (Rpt) domains. Rpt1 domain of INI1 binds to HIV-1 Integrase, and acts as a dominant negative mutant to inhibit viral replication. Rpt1 domain also interacts with oncogene c-MYC and modulates its transcriptional activity. We carried out an ab initio modeling of a segment of INI1 protein containing the Rpt1 domain. The structural model suggested the presence of a compact and well defined ββαα topology as core structure in the Rpt1 domain of INI1. This topology in Rpt1 was similar to PFU domain of Phospholipase A2 Activating Protein, PLAA. Interestingly, PFU domain shares similarity with Ubiquitin and has ubiquitin binding activity. Because of the structural similarity between Rpt1 domain of INI1 and PFU domain of PLAA, we propose that Rpt1 domain of INI1 may participate in ubiquitin recognition or binding with ubiquitin or ubiquitin related proteins. This modeling study may shed light on the mode of interactions of Rpt1 domain of INI1 and is likely to facilitate future functional studies of INI1. © 2016 The Protein Society.
β-Helical architecture of cytoskeletal bactofilin filaments revealed by solid-state NMR

PubMed Central

Vasa, Suresh; Lin, Lin; Shi, Chaowei; Habenstein, Birgit; Riedel, Dietmar; Kühn, Juliane; Thanbichler, Martin; Lange, Adam

2015-01-01

Bactofilins are a widespread class of bacterial filament-forming proteins, which serve as cytoskeletal scaffolds in various cellular pathways. They are characterized by a conserved architecture, featuring a central conserved domain (DUF583) that is flanked by variable terminal regions. Here, we present a detailed investigation of bactofilin filaments from Caulobacter crescentus by high-resolution solid-state NMR spectroscopy. De novo sequential resonance assignments were obtained for residues Ala39 to Phe137, spanning the conserved DUF583 domain. Analysis of the secondary chemical shifts shows that this core region adopts predominantly β-sheet secondary structure. Mutational studies of conserved hydrophobic residues located in the identified β-strand segments suggest that bactofilin folding and polymerization is mediated by an extensive and redundant network of hydrophobic interactions, consistent with the high intrinsic stability of bactofilin polymers. Transmission electron microscopy revealed a propensity of bactofilin to form filament bundles as well as sheet-like, 2D crystalline assemblies, which may represent the supramolecular arrangement of bactofilin in the native context. Based on the diffraction pattern of these 2D crystalline assemblies, scanning transmission electron microscopy measurements of the mass per length of BacA filaments, and the distribution of β-strand segments identified by solid-state NMR, we propose that the DUF583 domain adopts a β-helical architecture, in which 18 β-strand segments are arranged in six consecutive windings of a β-helix. PMID:25550503
β-Helical architecture of cytoskeletal bactofilin filaments revealed by solid-state NMR.

PubMed

Vasa, Suresh; Lin, Lin; Shi, Chaowei; Habenstein, Birgit; Riedel, Dietmar; Kühn, Juliane; Thanbichler, Martin; Lange, Adam

2015-01-13

Bactofilins are a widespread class of bacterial filament-forming proteins, which serve as cytoskeletal scaffolds in various cellular pathways. They are characterized by a conserved architecture, featuring a central conserved domain (DUF583) that is flanked by variable terminal regions. Here, we present a detailed investigation of bactofilin filaments from Caulobacter crescentus by high-resolution solid-state NMR spectroscopy. De novo sequential resonance assignments were obtained for residues Ala39 to Phe137, spanning the conserved DUF583 domain. Analysis of the secondary chemical shifts shows that this core region adopts predominantly β-sheet secondary structure. Mutational studies of conserved hydrophobic residues located in the identified β-strand segments suggest that bactofilin folding and polymerization is mediated by an extensive and redundant network of hydrophobic interactions, consistent with the high intrinsic stability of bactofilin polymers. Transmission electron microscopy revealed a propensity of bactofilin to form filament bundles as well as sheet-like, 2D crystalline assemblies, which may represent the supramolecular arrangement of bactofilin in the native context. Based on the diffraction pattern of these 2D crystalline assemblies, scanning transmission electron microscopy measurements of the mass per length of BacA filaments, and the distribution of β-strand segments identified by solid-state NMR, we propose that the DUF583 domain adopts a β-helical architecture, in which 18 β-strand segments are arranged in six consecutive windings of a β-helix.
Conservation of a pH-sensitive structure in the C-terminal region of spider silk extends across the entire silk gene family.

PubMed

Strickland, Michelle; Tudorica, Victor; Řezáč, Milan; Thomas, Neil R; Goodacre, Sara L

2018-06-01

Spiders produce multiple silks with different physical properties that allow them to occupy a diverse range of ecological niches, including the underwater environment. Despite this functional diversity, past molecular analyses show a high degree of amino acid sequence similarity between C-terminal regions of silk genes that appear to be independent of the physical properties of the resulting silks; instead, this domain is crucial to the formation of silk fibers. Here, we present an analysis of the C-terminal domain of all known types of spider silk and include silk sequences from the spider Argyroneta aquatica, which spins the majority of its silk underwater. Our work indicates that spiders have retained a highly conserved mechanism of silk assembly, despite the extraordinary diversification of species, silk types and applications of silk over 350 million years. Sequence analysis of the silk C-terminal domain across the entire gene family shows the conservation of two uncommon amino acids that are implicated in the formation of a salt bridge, a functional bond essential to protein assembly. This conservation extends to the novel sequences isolated from A. aquatica. This finding is relevant to research regarding the artificial synthesis of spider silk, suggesting that synthesis of all silk types will be possible using a single process.
Agricultural conservation planning framework: 3. Land use and field boundary database development and structure

USDA-ARS's Scientific Manuscript database

Conservation planning information is important in identifying options for watershed water quality improvement, and can be developed for use at field, farm, and watershed scales. Translation across scales is a key issue impeding progress at watershed scales because watershed improvement goals must be...
Agricultural conservation planning framework: 1. Developing multi-practice watershed planning scenarios and assessing nutrient reduction potential

USDA-ARS's Scientific Manuscript database

We show that spatial data on soils, land use, and high-resolution topography, combined with knowledge of conservation practice effectiveness, can be leveraged to identify and assess alternatives to reduce nutrient discharge from small (HUC12) agricultural watersheds. Databases comprising soil attrib...
A structural portrait of the PDZ domain family.

PubMed

Ernst, Andreas; Appleton, Brent A; Ivarsson, Ylva; Zhang, Yingnan; Gfeller, David; Wiesmann, Christian; Sidhu, Sachdev S

2014-10-23

PDZ (PSD-95/Discs-large/ZO1) domains are interaction modules that typically bind to specific C-terminal sequences of partner proteins and assemble signaling complexes in multicellular organisms. We have analyzed the existing database of PDZ domain structures in the context of a specificity tree based on binding specificities defined by peptide-phage binding selections. We have identified 16 structures of PDZ domains in complex with high-affinity ligands and have elucidated four additional structures to assemble a structural database that covers most of the branches of the PDZ specificity tree. A detailed comparison of the structures reveals features that are responsible for the diverse specificities across the PDZ domain family. Specificity differences can be explained by differences in PDZ residues that are in contact with the peptide ligands, but these contacts involve both side-chain and main-chain interactions. Most PDZ domains bind peptides in a canonical conformation in which the ligand main chain adopts an extended β-strand conformation by interacting in an antiparallel fashion with a PDZ β-strand. However, a subset of PDZ domains bind peptides with a bent main-chain conformation and the specificities of these non-canonical domains could not be explained based on canonical structures. Our analysis provides a structural portrait of the PDZ domain family, which serves as a guide in understanding the structural basis for the diverse specificities across the family. Copyright © 2014 Elsevier Ltd. All rights reserved.
The conserved N-terminal domain of herpes simplex virus 1 UL24 protein is sufficient to induce the spatial redistribution of nucleolin.

PubMed

Bertrand, Luc; Pearson, Angela

2008-05-01

UL24 is widely conserved among herpesviruses but its function during infection is poorly understood. Previously, we discovered a genetic link between UL24 and the herpes simplex virus 1-induced dispersal of the nucleolar protein nucleolin. Here, we report that in the absence of viral infection, transiently expressed UL24 accumulated in both the nucleus and the Golgi apparatus. In the majority of transfected cells, nuclear staining for UL24 was diffuse, but a minor staining pattern, whereby UL24 was present in nuclear foci corresponding to nucleoli, was also observed. Expression of UL24 correlated with the dispersal of nucleolin. This dispersal did not appear to be a consequence of a general disaggregation of nucleoli, as foci of fibrillarin staining persisted in cells expressing UL24. The conserved N-terminal region of UL24 was sufficient to cause this change in subcellular distribution of nucleolin. Interestingly, a bipartite nuclear localization signal predicted within the C terminus of UL24 was dispensable for nuclear localization. None of the five individual UL24 homology domains was required for nuclear or Golgi localization, but deletion of these domains resulted in the loss of nucleolin-dispersal activity. We determined that a nucleolar-targeting signal was contained within the first 60 aa of UL24. Our results show that the conserved N-terminal domain of UL24 is sufficient to specifically induce dispersal of nucleolin in the absence of other viral proteins or virus-induced cellular modifications. These results suggest that UL24 directly targets cellular factors that affect the composition of nucleoli.
COPT6 Is a Plasma Membrane Transporter That Functions in Copper Homeostasis in Arabidopsis and Is a Novel Target of SQUAMOSA Promoter-binding Protein-like 7

PubMed Central

Jung, Ha-il; Gayomba, Sheena R.; Rutzke, Michael A.; Craft, Eric; Kochian, Leon V.; Vatamaniuk, Olena K.

2012-01-01

Among the mechanisms controlling copper homeostasis in plants is the regulation of its uptake and tissue partitioning. Here we characterized a newly identified member of the conserved CTR/COPT family of copper transporters in Arabidopsis thaliana, COPT6. We showed that COPT6 resides at the plasma membrane and mediates copper accumulation when expressed in the Saccharomyces cerevisiae copper uptake mutant. Although the primary sequence of COPT6 contains the family conserved domains, including methionine-rich motifs in the extracellular N-terminal domain and a second transmembrane helix (TM2), it is different from the founding family member, S. cerevisiae Ctr1p. This conclusion was based on the finding that although the positionally conserved Met106 residue in the TM2 of COPT6 is functionally essential, the conserved Met27 in the N-terminal domain is not. Structure-function studies revealed that the N-terminal domain is dispensable for COPT6 function in copper-replete conditions but is important under copper-limiting conditions. In addition, COPT6 interacts with itself and with its homolog, COPT1, unlike Ctr1p, which interacts only with itself. Analyses of the expression pattern showed that although COPT6 is expressed in different cell types of different plant organs, the bulk of its expression is located in the vasculature. We also show that COPT6 expression is regulated by copper availability that, in part, is controlled by a master regulator of copper homeostasis, SPL7. Finally, studies using the A. thaliana copt6-1 mutant and plants overexpressing COPT6 revealed its essential role during copper limitation and excess. PMID:22865877
Conservation of the Human Integrin-Type Beta-Propeller Domain in Bacteria

PubMed Central

Chouhan, Bhanupratap; Denesyuk, Alexander; Heino, Jyrki; Johnson, Mark S.; Denessiouk, Konstantin

2011-01-01

Integrins are heterodimeric cell-surface receptors with key functions in cell-cell and cell-matrix adhesion. Integrin α and β subunits are present throughout the metazoans, but it is unclear whether the subunits predate the origin of multicellular organisms. Several component domains have been detected in bacteria, one of which, a specific 7-bladed β-propeller domain, is a unique feature of the integrin α subunits. Here, we describe a structure-derived motif, which incorporates key features of each blade from the X-ray structures of human αIIbβ3 and αVβ3, includes elements of the FG-GAP/Cage and Ca2+-binding motifs, and is specific only for the metazoan integrin domains. Separately, we searched all available sequences from bacteria and unicellular eukaryotic organisms for metazoan integrin-type β-propeller domains, requiring seven repeats, corresponding to the seven blades of the β-propeller domain, with the newly found structure-derived motif present in every repeat. As a result, among 47 available genomes of unicellular eukaryotes we could not find a single instance of seven repeats with the motif. Several sequences contained three repeats, a predicted transmembrane segment, and a short cytoplasmic motif associated with some integrins, but otherwise differ from the metazoan integrin α subunits. Among the available bacterial sequences, we found five examples containing seven sequential metazoan integrin-specific motifs within the seven repeats. The motifs differ in having one Ca2+-binding site per repeat, whereas metazoan integrins have three or four sites. The bacterial sequences are more conserved in terms of motif conservation and loop length, suggesting that the structure is more regular and compact than the example structures from human integrins. Although the bacterial examples are not full-length integrins, the full-length metazoan-type 7-bladed β-propeller domains are present, and sometimes two tandem copies are found. PMID:22022374
A Drosophila haemocyte-specific protein, hemolectin, similar to human von Willebrand factor.

PubMed Central

Goto, A; Kumagai, T; Kumagai, C; Hirose, J; Narita, H; Mori, H; Kadowaki, T; Beck, K; Kitagawa, Y

2001-01-01

We identified a novel Drosophila protein of approximately 400 kDa, hemolectin (d-Hml), secreted from haemocyte-derived Kc167 cells. Its 11.7 kbp cDNA contains an open reading frame of 3843 amino acid residues, with conserved domains in von Willebrand factor (VWF), coagulation factor V/VIII and complement factors. The d-hml gene is located on the third chromosome (position 70C1-5) and consists of 26 exons. The major part of d-Hml consists of well-known motifs with the organization: CP1-EG1-CP2-EG2-CP3-VD1-VD2-VD'-VD3-VC1-VD"-VD"'-FC1-FC2-VC2-LA1-VD4-VD5-VC3-VB1-VB2-VC4-VC5-CK1 (CP, complement-control protein domain; EG, epidermal-growth-factor-like domain; VB, VC, VD, VWF type B-, C- and D-like domains; VD', VD", VD"', truncated C-terminal VDs; FC, coagulation factor V/VIII type C domain; LA, low-density-lipoprotein-receptor class A domain; CK, cysteine knot domain). The organization of VD1-VD2-VD'-VD3, essential for VWF to be processed by furin, to bind to coagulation factor VIII and to form interchain disulphide linkages, is conserved. The 400 kDa form of d-Hml was sensitive to acidic cleavage near the boundary between VD2 and VD', where the cleavage site of pro-VWF is located. Agarose-gel electrophoresis of metabolically radiolabelled d-Hml suggested that it is secreted from Kc167 cells mainly as dimers. Resembling VWF, 7.9% (305 residues) of cysteine residues on the d-Hml sequence had well-conserved positions in each motif. Coinciding with the development of phagocytic haemocytes, d-hml transcript was detected in late embryos and larvae. Its low-level expression in adult flies was induced by injury at any position on the body. PMID:11563973
Convalescent Plasmodium falciparum-specific seroreactivity does not correlate with paediatric malaria severity or Plasmodium antigen exposure.

PubMed

Kessler, Anne; Campo, Joseph J; Harawa, Visopo; Mandala, Wilson L; Rogerson, Stephen J; Mowrey, Wenzhu B; Seydel, Karl B; Kim, Kami

2018-04-25

Antibody immunity is thought to be essential to prevent severe Plasmodium falciparum infection, but the exact correlates of protection are unknown. Over time, children in endemic areas acquire non-sterile immunity to malaria that correlates with development of antibodies to merozoite invasion proteins and parasite proteins expressed on the surface of infected erythrocytes. A 1000 feature P. falciparum 3D7 protein microarray was used to compare P. falciparum-specific seroreactivity during acute infection and 30 days after infection in 23 children with uncomplicated malaria (UM) and 25 children with retinopathy-positive cerebral malaria (CM). All children had broad P. falciparum antibody reactivity during acute disease. IgM reactivity decreased and IgG reactivity increased in convalescence. Antibody reactivity to CIDR domains of "virulent" PfEMP1 proteins was low with robust reactivity to the highly conserved, intracellular ATS domain of PfEMP1 in both groups. Although children with UM and CM differed markedly in parasite burden and PfEMP1 exposure during acute disease, neither acute nor convalescent PfEMP1 seroreactivity differed between groups. Greater seroprevalence to a conserved Group A-associated ICAM binding extracellular domain was observed relative to linked extracellular CIDRα1 domains in both case groups. Pooled immune IgG from Malawian adults revealed greater reactivity to PfEMP1 than observed in children. Children with uncomplicated and cerebral malaria have similar breadth and magnitude of P. falciparum antibody reactivity. The utility of protein microarrays to measure serological recognition of polymorphic PfEMP1 antigens needs to be studied further, but the study findings support the hypothesis that conserved domains of PfEMP1 are more prominent targets of cross reactive antibodies than variable domains in children with symptomatic malaria. 
Protein microarrays represent an additional tool to identify cross-reactive Plasmodium antigens including PfEMP1 domains that can be investigated as strain-transcendent vaccine candidates.
AglH, a thermophilic UDP-N-acetylglucosamine-1-phosphate:dolichyl phosphate GlcNAc-1-phosphotransferase initiating protein N-glycosylation pathway in Sulfolobus acidocaldarius, is capable of complementing the eukaryal Alg7.

PubMed

Meyer, Benjamin H; Shams-Eldin, Hosam; Albers, Sonja-Verena

2017-01-01

AglH, a predicted UDP-GlcNAc-1-phosphate:dolichyl phosphate GlcNAc-1-phosphotransferase, initiates the protein N-glycosylation pathway in the thermoacidophilic crenarchaeon Sulfolobus acidocaldarius. AglH successfully replaced the endogenous GlcNAc-1-phosphotransferase activity of Alg7 in a conditional lethal Saccharomyces cerevisiae strain, in which the first step of the eukaryal protein N-glycosylation process was repressed. This study is one of the few examples of cross-domain complementation demonstrating a conserved polyprenyl phosphate transferase reaction within the eukaryal and archaeal domains, as was demonstrated for Methanococcus voltae (Shams-Eldin et al. 2008). The topology prediction and the alignment of the AglH membrane protein with GlcNAc-1-phosphotransferases from the three domains of life show significant conservation of amino acids within the different proposed cytoplasmic loops. Alanine mutations of selected conserved amino acids in the putative cytoplasmic loops II (D100), IV (F220) and V (F264) demonstrated the importance of these amino acids for cross-domain AglH activity in in vitro complementation assays in S. cerevisiae. Furthermore, antibiotic treatment interfering directly with the activity of dolichyl phosphate GlcNAc-1-phosphotransferases confirmed the essentiality of N-glycosylation for cell survival.
Genome-wide identification and expression analysis of YTH domain-containing RNA-binding protein family in cucumber (Cucumis sativus).

PubMed

Zhou, Yong; Hu, Lifang; Jiang, Lunwei; Liu, Shiqiang

2018-06-01

YTH domain-containing RNA-binding proteins are involved in post-transcriptional regulation and play important roles in the growth and development as well as abiotic stress responses of plants. However, YTH genes have not been previously studied in cucumber (Cucumis sativus). In this study, a total of five YTH genes (CsYTH1-CsYTH5) were identified in cucumber, which could be mapped on three out of the seven cucumber chromosomes. All CsYTH proteins had highly conserved C-terminal YTH domains, and two of them (CsYTH1 and CsYTH4) harbored extra CCCH and P/Q/N-rich domains. The phylogenesis, conserved motifs and exon-intron structure of YTH genes from cucumber, Arabidopsis and rice were also analyzed. The phylogenetically closely clustered YTHs shared similar gene structures and conserved motifs. An analysis of the cis-acting regulatory elements in the upstream region of these genes resulted in the identification of many cis-elements related to stress, hormone and development. Expression analysis based on the transcriptome data showed that some CsYTHs had development- or tissue-specific expression. In addition, their expression levels were altered under various stresses such as salt, drought, cold, and abscisic acid (ABA) treatments. These findings lay the foundation for the functional analysis of CsYTHs in the future.
The Caenorhabditis elegans Iodotyrosine Deiodinase Ortholog SUP-18 Functions through a Conserved Channel SC-Box to Regulate the Muscle Two-Pore Domain Potassium Channel SUP-9

PubMed Central

de la Cruz, Ignacio Perez; Ma, Long; Horvitz, H. Robert

2014-01-01

Loss-of-function mutations in the Caenorhabditis elegans gene sup-18 suppress the defects in muscle contraction conferred by a gain-of-function mutation in SUP-10, a presumptive regulatory subunit of the SUP-9 two-pore domain K+ channel associated with muscle membranes. We cloned sup-18 and found that it encodes the C. elegans ortholog of mammalian iodotyrosine deiodinase (IYD), an NADH oxidase/flavin reductase that functions in iodine recycling and is important for the biosynthesis of thyroid hormones that regulate metabolism. The FMN-binding site of mammalian IYD is conserved in SUP-18, which appears to require catalytic activity to function. Genetic analyses suggest that SUP-10 can function with SUP-18 to activate SUP-9 through a pathway that is independent of the presumptive SUP-9 regulatory subunit UNC-93. We identified a novel evolutionarily conserved serine-cysteine-rich region in the C-terminal cytoplasmic domain of SUP-9 required for its specific activation by SUP-10 and SUP-18 but not by UNC-93. Since two-pore domain K+ channels regulate the resting membrane potentials of numerous cell types, we suggest that the SUP-18 IYD regulates the activity of the SUP-9 channel using NADH as a coenzyme and thus couples the metabolic state of muscle cells to muscle membrane excitability. PMID:24586202
Domain architectures of the Scm3p protein provide insights into centromere function and evolution.

PubMed

Aravind, L; Iyer, Lakshminarayan M; Wu, Carl

2007-10-15

Recently, Scm3p has been shown to be a nonhistone component of centromeric chromatin that binds stoichiometrically to CenH3-H4 histones, and to be required for the assembly of kinetochores in Saccharomyces cerevisiae. Scm3p is conserved across fungi, and displays a remarkable variation in protein size, ranging from approximately 200 amino acids in S. cerevisiae to approximately 1300 amino acids in Neurospora crassa. This is primarily due to a variable C-terminal segment that is linked to a conserved N-terminal, CenH3-interacting domain. We have discovered that the extended C-terminal region of Scm3p is strikingly characterized by lineage-specific fusions of single or multiple predicted DNA-binding domains (different versions of the MYB and C2H2 zinc finger domains, AT-hooks, and a novel cysteine-rich metal-chelating cluster) that are absent from the small versions of Scm3. Instead, S. cerevisiae point centromeres are recognized by components of the CBF3 DNA binding complex, which are conserved amongst close relatives of budding yeast, but are correspondingly absent from more distant fungi that possess regional centromeres. Hence, the C-terminal DNA binding motifs found in large Scm3p proteins may, along with CenH3, serve as a key epigenetic signal by recognizing and accommodating the lineage-specific diversity of centromere DNA in the course of evolution.
A small cellulose binding domain protein in Phytophthora is cell wall localized

USDA-ARS's Scientific Manuscript database

Cellulose binding domains (CBD) are structurally conserved regions linked to catalytic regions of cellulolytic enzymes. While widespread amongst saprophytic fungi that subsist on plant cell wall polysaccharides, they are not generally present in plant pathogenic fungi. A genome wide survey of CBDs w...
What Data to Use for Forest Conservation Planning? A Comparison of Coarse Open and Detailed Proprietary Forest Inventory Data in Finland

PubMed Central

Lehtomäki, Joona; Tuominen, Sakari; Toivonen, Tuuli; Leinonen, Antti

2015-01-01

The boreal region is facing intensifying resource extraction pressure, but the lack of comprehensive biodiversity data makes operative forest conservation planning difficult. Many countries have implemented forest inventory schemes and are making extensive and up-to-date forest databases increasingly available. Some of the more detailed inventory databases, however, remain proprietary and unavailable for conservation planning. Here, we investigate how well different open and proprietary forest inventory data sets suit the purpose of conservation prioritization in Finland. We also explore how much priorities are affected by using the less accurate but open data. First, we construct a set of indices for forest conservation value based on quantitative information commonly found in forest inventories. These include the maturity of the trees, tree species composition, and site fertility. Secondly, using these data and accounting for connectivity between forest types, we investigate the patterns in conservation priority. For prioritization, we use Zonation, a method and software for spatial conservation prioritization. We then validate the prioritizations by comparing them to known areas of high conservation value. We show that the overall priority patterns are relatively consistent across different data sources and analysis options. However, the coarse data cannot be used to accurately identify the high-priority areas as it misses much of the fine-scale variation in forest structures. We conclude that, while inventory data collected for forestry purposes may be useful for forest conservation purposes, it needs to be detailed enough to be able to account for more fine-scaled features of high conservation value. These results underline the importance of making detailed inventory data publicly available. Finally, we discuss how the prioritization methodology we used could be integrated into operative forest management, especially in countries in the boreal zone. 
PMID:26317227
What Data to Use for Forest Conservation Planning? A Comparison of Coarse Open and Detailed Proprietary Forest Inventory Data in Finland.

PubMed

Lehtomäki, Joona; Tuominen, Sakari; Toivonen, Tuuli; Leinonen, Antti

2015-01-01

The boreal region is facing intensifying resource extraction pressure, but the lack of comprehensive biodiversity data makes operative forest conservation planning difficult. Many countries have implemented forest inventory schemes and are making extensive and up-to-date forest databases increasingly available. Some of the more detailed inventory databases, however, remain proprietary and unavailable for conservation planning. Here, we investigate how well different open and proprietary forest inventory data sets suit the purpose of conservation prioritization in Finland. We also explore how much priorities are affected by using the less accurate but open data. First, we construct a set of indices for forest conservation value based on quantitative information commonly found in forest inventories. These include the maturity of the trees, tree species composition, and site fertility. Secondly, using these data and accounting for connectivity between forest types, we investigate the patterns in conservation priority. For prioritization, we use Zonation, a method and software for spatial conservation prioritization. We then validate the prioritizations by comparing them to known areas of high conservation value. We show that the overall priority patterns are relatively consistent across different data sources and analysis options. However, the coarse data cannot be used to accurately identify the high-priority areas as it misses much of the fine-scale variation in forest structures. We conclude that, while inventory data collected for forestry purposes may be useful for forest conservation purposes, it needs to be detailed enough to be able to account for more fine-scaled features of high conservation value. These results underline the importance of making detailed inventory data publicly available. Finally, we discuss how the prioritization methodology we used could be integrated into operative forest management, especially in countries in the boreal zone.

Structure of a two-CAP-domain protein from the human hookworm parasite Necator americanus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Asojo, Oluwatoyin A.

2011-05-01

The first structure of a two-CAP-domain protein, Na-ASP-1, from the major human hookworm parasite N. americanus refined to a resolution limit of 2.2 Å is presented. Major proteins secreted by the infective larval stage hookworms upon host entry include Ancylostoma secreted proteins (ASPs), which are characterized by one or two CAP (cysteine-rich secretory protein/antigen 5/pathogenesis related-1) domains. The CAP domain has been reported in diverse phylogenetically unrelated proteins, but has no confirmed function. The first structure of a two-CAP-domain protein, Na-ASP-1, from the major human hookworm parasite Necator americanus was refined to a resolution limit of 2.2 Å. The structure was solved by molecular replacement (MR) using Na-ASP-2, a one-CAP-domain ASP, as the search model. The correct MR solution could only be obtained by truncating the polyalanine model of Na-ASP-2 and removing several loops. The structure reveals two CAP domains linked by an extended loop. Overall, the carboxyl-terminal CAP domain is more similar to Na-ASP-2 than to the amino-terminal CAP domain. A large central cavity extends from the amino-terminal CAP domain to the carboxyl-terminal CAP domain, encompassing the putative CAP-binding cavity. The putative CAP-binding cavity is a characteristic cavity in the carboxyl-terminal CAP domain that contains a His and Glu pair. These residues are conserved in all single-CAP-domain proteins, but are absent in the amino-terminal CAP domain. The conserved His residues are oriented such that they appear to be capable of directly coordinating a zinc ion as observed for CAP proteins from reptile venoms. This first structure of a two-CAP-domain ASP can serve as a template for homology modeling of other two-CAP-domain proteins.
The shell-forming proteome of Lottia gigantea reveals both deep conservations and lineage-specific novelties.

PubMed

Marie, Benjamin; Jackson, Daniel J; Ramos-Silva, Paula; Zanella-Cléon, Isabelle; Guichard, Nathalie; Marin, Frédéric

2013-01-01

Proteins that are occluded within the molluscan shell, the so-called shell matrix proteins (SMPs), are an assemblage of biomolecules attractive to study for several reasons. They increase the fracture resistance of the shell by several orders of magnitude, determine the polymorph of CaCO3 deposited, and regulate crystal nucleation, growth initiation and termination. In addition, they are thought to control the shell microstructures. Understanding how these proteins have evolved is also likely to provide deep insight into events that supported the diversification and expansion of metazoan life during the Cambrian radiation 543 million years ago. Here, we present an analysis of SMPs isolated from the CaCO3 shell of the limpet Lottia gigantea, a gastropod that constructs an aragonitic cross-lamellar shell. We identified 39 SMPs by combining proteomic analysis with genomic and transcriptomic database interrogations. Among these proteins are various low-complexity domain-containing proteins, enzymes such as peroxidases, carbonic anhydrases and chitinases, acidic calcium-binding proteins and protease inhibitors. This list is likely to contain the most abundant SMPs of the shell matrix. It reveals the presence of both highly conserved and lineage-specific biomineralizing proteins. This mosaic evolutionary pattern suggests that there may be an ancestral molluscan SMP set upon which different conchiferan lineages have elaborated to produce the diversity of shell microstructures we observe today. © 2012 The Authors Journal compilation © 2012 FEBS.
The Escherichia coli Lpt transenvelope protein complex for lipopolysaccharide export is assembled via conserved structurally homologous domains.

PubMed

Villa, Riccardo; Martorana, Alessandra M; Okuda, Suguru; Gourlay, Louise J; Nardini, Marco; Sperandeo, Paola; Dehò, Gianni; Bolognesi, Martino; Kahne, Daniel; Polissi, Alessandra

2013-03-01

Lipopolysaccharide is a major glycolipid component in the outer leaflet of the outer membrane (OM), a peculiar permeability barrier of Gram-negative bacteria that prevents many toxic compounds from entering the cell. Lipopolysaccharide transport (Lpt) across the periplasmic space and its assembly at the Escherichia coli cell surface are carried out by a transenvelope complex of seven essential Lpt proteins spanning the inner membrane (LptBCFG), the periplasm (LptA), and the OM (LptDE), which appears to operate as a unique machinery. LptC is an essential inner membrane-anchored protein with a large periplasm-protruding domain. LptC binds the inner membrane LptBFG ABC transporter and interacts with the periplasmic protein LptA. However, its role in lipopolysaccharide transport is unclear. Here we show that LptC lacking the transmembrane region is viable and can bind the LptBFG inner membrane complex; thus, the essential LptC functions are located in the periplasmic domain. In addition, we characterize two previously described inactive single mutations at two conserved glycines (G56V and G153R, respectively) of the LptC periplasmic domain, showing that neither mutant is able to assemble the transenvelope machinery. However, while LptCG56V failed to copurify any Lpt component, LptCG153R was able to interact with the inner membrane protein complex LptBFG. Overall, our data further support the model whereby the bridge connecting the inner and outer membranes would be based on the conserved structurally homologous jellyroll domain shared by five out of the seven Lpt components.
Transcriptional activation is a conserved feature of the early embryonic factor Zelda that requires a cluster of four zinc fingers for DNA binding and a low-complexity activation domain.

PubMed

Hamm, Danielle C; Bondra, Eliana R; Harrison, Melissa M

2015-02-06

Delayed transcriptional activation of the zygotic genome is a nearly universal phenomenon in metazoans. Immediately following fertilization, development is controlled by maternally deposited products, and it is not until later stages that widespread activation of the zygotic genome occurs. Although the mechanisms driving this genome activation are currently unknown, the transcriptional activator Zelda (ZLD) has been shown to be instrumental in driving this process in Drosophila melanogaster. Here we define functional domains of ZLD required for both DNA binding and transcriptional activation. We show that the C-terminal cluster of four zinc fingers mediates binding to TAGteam DNA elements in the promoters of early expressed genes. All four zinc fingers are required for this activity, and splice isoforms lacking three of the four zinc fingers fail to activate transcription. These truncated splice isoforms dominantly suppress activation by the full-length, embryonically expressed isoform. We map the transcriptional activation domain of ZLD to a central region characterized by low complexity. Despite relatively little sequence conservation within this domain, ZLD orthologs from Drosophila virilis, Anopheles gambiae, and Nasonia vitripennis activate transcription in D. melanogaster cells. Transcriptional activation by these ZLD orthologs suggests that ZLD functions through conserved interactions with a protein cofactor(s). We have identified distinct DNA-binding and activation domains within the critical transcription factor ZLD that controls the initial activation of the zygotic genome. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
The Escherichia coli Lpt Transenvelope Protein Complex for Lipopolysaccharide Export Is Assembled via Conserved Structurally Homologous Domains

PubMed Central

Villa, Riccardo; Martorana, Alessandra M.; Okuda, Suguru; Gourlay, Louise J.; Nardini, Marco; Sperandeo, Paola; Dehò, Gianni; Bolognesi, Martino; Kahne, Daniel

2013-01-01

Lipopolysaccharide is a major glycolipid component in the outer leaflet of the outer membrane (OM), a peculiar permeability barrier of Gram-negative bacteria that prevents many toxic compounds from entering the cell. Lipopolysaccharide transport (Lpt) across the periplasmic space and its assembly at the Escherichia coli cell surface are carried out by a transenvelope complex of seven essential Lpt proteins spanning the inner membrane (LptBCFG), the periplasm (LptA), and the OM (LptDE), which appears to operate as a unique machinery. LptC is an essential inner membrane-anchored protein with a large periplasm-protruding domain. LptC binds the inner membrane LptBFG ABC transporter and interacts with the periplasmic protein LptA. However, its role in lipopolysaccharide transport is unclear. Here we show that LptC lacking the transmembrane region is viable and can bind the LptBFG inner membrane complex; thus, the essential LptC functions are located in the periplasmic domain. In addition, we characterize two previously described inactive single mutations at two conserved glycines (G56V and G153R, respectively) of the LptC periplasmic domain, showing that neither mutant is able to assemble the transenvelope machinery. However, while LptCG56V failed to copurify any Lpt component, LptCG153R was able to interact with the inner membrane protein complex LptBFG. Overall, our data further support the model whereby the bridge connecting the inner and outer membranes would be based on the conserved structurally homologous jellyroll domain shared by five out of the seven Lpt components. PMID:23292770
Paralog-Specific Patterns of Structural Disorder and Phosphorylation in the Vertebrate SH3-SH2-Tyrosine Kinase Protein Family.

PubMed

Dos Santos, Helena G; Siltberg-Liberles, Jessica

2016-09-19

The tyrosine kinases (TKs) are one of the largest multigene families in Metazoa. These are important multifunctional proteins that have evolved as dynamic switches that perform tyrosine phosphorylation and other noncatalytic activities regulated by various allosteric mechanisms. TKs interact with each other and with other molecules, ultimately activating and inhibiting different signaling pathways. TKs are implicated in cancer and almost 30 FDA-approved TK inhibitors are available. However, specific binding is a challenge when targeting an active site that has been conserved in multiple protein paralogs for millions of years. A cassette domain (CD) containing SH3-SH2-Tyrosine Kinase domains reoccurs in vertebrate nonreceptor TKs. Although part of the CD function is shared between TKs, it also presents TK-specific features. Here, the evolutionary dynamics of sequence, structure, and phosphorylation across the CD in 17 TK paralogs have been investigated in a large-scale study. We establish that TKs often have ortholog-specific structural disorder and phosphorylation patterns, while secondary structure elements, as expected, are highly conserved. Further, domain-specific differences are at play. Notably, we found the catalytic domain to fluctuate more in certain secondary structure elements than the regulatory domains. By elucidating how different properties evolve after gene duplications and which properties are specifically conserved within orthologs, the mechanistic understanding of protein evolution is enriched and regions supposedly critical for functional divergence across paralogs are highlighted. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
A Conserved Acidic Motif in the N-Terminal Domain of Nitrate Reductase Is Necessary for the Inactivation of the Enzyme in the Dark by Phosphorylation and 14-3-3 Binding

PubMed Central

Pigaglio, Emmanuelle; Durand, Nathalie; Meyer, Christian

1999-01-01

It has previously been shown that the N-terminal domain of tobacco (Nicotiana tabacum) nitrate reductase (NR) is involved in the inactivation of the enzyme by phosphorylation, which occurs in the dark (L. Nussaume, M. Vincentz, C. Meyer, J.P. Boutin, and M. Caboche [1995] Plant Cell 7: 611–621). The activity of a mutant NR protein lacking this N-terminal domain was no longer regulated by light-dark transitions. In this study smaller deletions were performed in the N-terminal domain of tobacco NR that removed protein motifs conserved among higher plant NRs. The resulting truncated NR-coding sequences were then fused to the cauliflower mosaic virus 35S RNA promoter and introduced in NR-deficient mutants of the closely related species Nicotiana plumbaginifolia. We found that the deletion of a conserved stretch of acidic residues led to an active NR protein that was more thermosensitive than the wild-type enzyme, but it was relatively insensitive to the inactivation by phosphorylation in the dark. Therefore, the removal of this acidic stretch seems to have the same effects on NR activation state as the deletion of the N-terminal domain. A hypothetical explanation for these observations is that a specific factor that impedes inactivation remains bound to the truncated enzyme. A synthetic peptide derived from this acidic protein motif was also found to be a good substrate for casein kinase II. PMID:9880364
CicerTransDB 1.0: a resource for expression and functional study of chickpea transcription factors.

PubMed

Gayali, Saurabh; Acharya, Shankar; Lande, Nilesh Vikram; Pandey, Aarti; Chakraborty, Subhra; Chakraborty, Niranjan

2016-07-29

Transcription factor (TF) databases are a major resource for systematic studies of TFs in specific species as well as related family members. Even though there are several publicly available multi-species databases, the information on the amount and diversity of TFs within individual species is fragmented, especially for newly sequenced genomes of non-model species of agricultural significance. We constructed CicerTransDB (Cicer Transcription Factor Database), the first database of its kind, which would provide a centralized putatively complete list of TFs in a food legume, chickpea. CicerTransDB, available at www.cicertransdb.esy.es, is based on chickpea (Cicer arietinum L.) annotation v 1.0. The database is an outcome of a genome-wide domain study and manual classification of TF families. This database provides not only information on the gene, but also gene ontology, domain and motif architecture. CicerTransDB v 1.0 comprises information on 1124 genes of chickpea and enables the user not only to search, browse and download sequences but also to retrieve sequence features. CicerTransDB also provides several single-click interfaces, transconnecting to various other databases to ease further analysis. Several webAPI(s) integrated in the database allow end-users direct access to data. A critical comparison of CicerTransDB with PlantTFDB (Plant Transcription Factor Database) revealed 68 novel TFs in the chickpea genome, hitherto unexplored. Database URL: http://www.cicertransdb.esy.es.
The Protein Information Resource: an integrated public resource of functional annotation of proteins

PubMed Central

Wu, Cathy H.; Huang, Hongzhan; Arminski, Leslie; Castro-Alvear, Jorge; Chen, Yongxing; Hu, Zhang-Zhi; Ledley, Robert S.; Lewis, Kali C.; Mewes, Hans-Werner; Orcutt, Bruce C.; Suzek, Baris E.; Tsugita, Akira; Vinayaka, C. R.; Yeh, Lai-Su L.; Zhang, Jian; Barker, Winona C.

2002-01-01

The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases). PMID:11752247
A Chronostratigraphic Relational Database Ontology

NASA Astrophysics Data System (ADS)

Platon, E.; Gary, A.; Sikora, P.

2005-12-01

A chronostratigraphic research database was donated by British Petroleum to the Stratigraphy Group at the Energy and Geoscience Institute (EGI), University of Utah. These data consist of over 2,000 measured sections representing over three decades of research into the application of the graphic correlation method. The data are global and include both microfossil (foraminifera, calcareous nannoplankton, spores, pollen, dinoflagellate cysts, etc.) and macrofossil data. The objective of the donation was to make the research data available to the public in order to encourage additional chronostratigraphy studies, specifically regarding graphic correlation. As part of the National Science Foundation's Cyberinfrastructure for the Geosciences (GEON) initiative these data have been made available to the public at http://css.egi.utah.edu. To encourage further research using the graphic correlation method, EGI has developed a software package, StrataPlot, that will soon be publicly available from the GEON website as a standalone software download. The EGI chronostratigraphy research database, although relatively large, has many data holes relative to some paleontological disciplines and geographical areas, so the challenge becomes how to expand the data available for chronostratigraphic studies using graphic correlation. There are several public or soon-to-be public databases available to chronostratigraphic research, but they have their own data structures and modes of presentation. The heterogeneous nature of these database schemas hinders their integration and makes it difficult for the user to retrieve and consolidate potentially valuable chronostratigraphic data. The integration of these data sources would facilitate rapid and comprehensive data searches, thus helping advance studies in chronostratigraphy. The GEON project will host a number of databases within the geology domain, some of which contain biostratigraphic data.
Ontologies are being developed to provide an integrated query system for searching across GEON's biostratigraphy databases, as well as databases available in the public domain. Although creating an ontology directly from the existing database metadata would have been effective and straightforward, our effort was directed towards creating a more efficient representation of our database, as well as a general representation of the biostratigraphic domain.
Structure of the GH1 domain of guanylate kinase-associated protein from Rattus norvegicus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tong, Junsen; Yang, Huiseon; Eom, Soo Hyun

2014-09-12

Highlights: • The crystal structure of GKAP homology domain 1 (GH1) was determined. • GKAP GH1 is a three-helix bundle connected by short flexible loops. • The predicted helix α4 associates weakly with the helix α3, suggesting a dynamic nature of the GH1 domain. Abstract: Guanylate-kinase-associated protein (GKAP) is a scaffolding protein that links NMDA receptor-PSD-95 to Shank–Homer complexes by protein–protein interactions at the synaptic junction. GKAP family proteins are characterized by the presence of a C-terminal conserved GKAP homology domain 1 (GH1) of unknown structure and function. In this study, the crystal structure of the GH1 domain of GKAP from Rattus norvegicus was determined in fusion with an N-terminal maltose-binding protein at 2.0 Å resolution. The structure of GKAP GH1 displays a three-helix bundle connected by short flexible loops. The predicted helix α4, which was not visible in the crystal structure, associates weakly with the helix α3, suggesting a dynamic nature of the GH1 domain. The strict conservation of the GH1 domain across GKAP family members and the lack of a catalytic active site required for enzyme activity imply that the GH1 domain might serve as a protein–protein interaction module for synaptic protein clustering.
Role of phosphatidylserine in the activation of Rho1-related Pkc1 signaling in Saccharomyces cerevisiae.

PubMed

Nomura, Wataru; Ito, Yusuke; Inoue, Yoshiharu

2017-02-01

Protein kinase C (PKC) belongs to a family of serine/threonine kinases and is evolutionarily conserved among eukaryotes. It contains several functional domains, with the C1 domain being identified as a membrane-targeting module. Diacylglycerol (DAG) and phorbol esters bind to the C1 domain to enhance its kinase activity. The C1 domain is conserved in PKC (Pkc1) in the budding yeast Saccharomyces cerevisiae; however, its kinase activity does not respond to DAG. Although the C1 domain of Pkc1 physically interacts with the small GTPase Rho1, the interaction between the C1 domain and lipids has not yet been characterized. We herein provide evidence to show the physical interaction between the C1 domain of Pkc1 and phosphatidylserine (PS), but not DAG. The stress-induced activation of Pkc1 signaling was abolished in a cho1 mutant, which was defective in PS synthase. The deletion of CHO1 perturbed the appropriate localization of Pkc1 at the bud tip, and impaired the physical interaction between Pkc1 and GTP-bound Rho1 in vivo. Our results suggest that PS is necessary for Pkc1 signaling due to its role in regulating the localization of Pkc1 as well as the physical interaction between Rho1 and Pkc1. Copyright © 2017 Elsevier Inc. All rights reserved.
Structural requirements for the assembly of LINC complexes and their function in cellular mechanical stiffness

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stewart-Hutchinson, P.J.; Hale, Christopher M.; Wirtz, Denis

The evolutionarily conserved interactions between KASH and SUN domain-containing proteins within the perinuclear space establish physical connections, called LINC complexes, between the nucleus and the cytoskeleton. Here, we show that the KASH domains of Nesprins 1, 2 and 3 interact promiscuously with luminal domains of Sun1 and Sun2. These constructs disrupt endogenous LINC complexes as indicated by the displacement of endogenous Nesprins from the nuclear envelope. We also provide evidence that KASH domains most probably fit a pocket provided by SUN domains and that post-translational modifications are dispensable for that interaction. We demonstrate that the disruption of endogenous LINC complexes affects cellular mechanical stiffness to an extent that compares to the loss of mechanical stiffness previously reported in embryonic fibroblasts derived from mice lacking A-type lamins, a mouse model of muscular dystrophies and cardiomyopathies. These findings support a model whereby physical connections between the nucleus and the cytoskeleton are mediated by interactions between diverse combinations of Sun proteins and Nesprins through their respective evolutionarily conserved domains. Furthermore, they emphasize, for the first time, the relevance of LINC complexes in cellular mechanical stiffness, suggesting a possible involvement of their disruption in various laminopathies, a group of human diseases linked to mutations of A-type lamins.
Drosophila Pumilio Protein Contains Multiple Autonomous Repression Domains That Regulate mRNAs Independently of Nanos and Brain Tumor

PubMed Central

Weidmann, Chase A.

2012-01-01

Drosophila melanogaster Pumilio is an RNA-binding protein that potently represses specific mRNAs. In developing embryos, Pumilio regulates a key morphogen, Hunchback, in collaboration with the cofactor Nanos. To investigate repression by Pumilio and Nanos, we created cell-based assays and found that Pumilio inhibits translation and enhances mRNA decay independent of Nanos. Nanos robustly stimulates repression through interactions with the Pumilio RNA-binding domain. We programmed Pumilio to recognize a new binding site, which garners repression of new target mRNAs. We show that cofactors Brain Tumor and eIF4E Homologous Protein are not obligatory for Pumilio and Nanos activity. The conserved RNA-binding domain of Pumilio was thought to be sufficient for its function. Instead, we demonstrate that three unique domains in the N terminus of Pumilio possess the major repressive activity and can function autonomously. The N termini of insect and vertebrate Pumilio and Fem-3 binding factors (PUFs) are related, and we show that corresponding regions of human PUM1 and PUM2 have repressive activity. Other PUF proteins lack these repression domains. Our findings suggest that PUF proteins have evolved new regulatory functions through protein sequences appended to their conserved PUF repeat RNA-binding domains. PMID:22064486
Drosophila Pumilio protein contains multiple autonomous repression domains that regulate mRNAs independently of Nanos and brain tumor.

PubMed

Weidmann, Chase A; Goldstrohm, Aaron C

2012-01-01

Drosophila melanogaster Pumilio is an RNA-binding protein that potently represses specific mRNAs. In developing embryos, Pumilio regulates a key morphogen, Hunchback, in collaboration with the cofactor Nanos. To investigate repression by Pumilio and Nanos, we created cell-based assays and found that Pumilio inhibits translation and enhances mRNA decay independent of Nanos. Nanos robustly stimulates repression through interactions with the Pumilio RNA-binding domain. We programmed Pumilio to recognize a new binding site, which garners repression of new target mRNAs. We show that cofactors Brain Tumor and eIF4E Homologous Protein are not obligatory for Pumilio and Nanos activity. The conserved RNA-binding domain of Pumilio was thought to be sufficient for its function. Instead, we demonstrate that three unique domains in the N terminus of Pumilio possess the major repressive activity and can function autonomously. The N termini of insect and vertebrate Pumilio and Fem-3 binding factors (PUFs) are related, and we show that corresponding regions of human PUM1 and PUM2 have repressive activity. Other PUF proteins lack these repression domains. Our findings suggest that PUF proteins have evolved new regulatory functions through protein sequences appended to their conserved PUF repeat RNA-binding domains.
BEND3 is involved in the human-specific repression of calreticulin: Implication for the evolution of higher brain functions in human.

PubMed

Aghajanirefah, A; Nguyen, L N; Ohadi, M

2016-01-15

Recent emerging evidence indicates that changes in gene expression levels are linked to human evolution. We have previously reported a human-specific nucleotide in the promoter sequence of the calreticulin (CALR) gene at position -220C, which is the site of action of valproic acid. Reversion of this nucleotide to the ancestral A-allele has been detected in patients with degrees of deficit in higher brain cognitive functions. This mutation has since been reported in the 1000 Genomes database at an approximate frequency of <0.0004 in humans (rs138452745). In the study reported here, we present an update on the status of rs138452745 across evolution, based on the Ensembl and NCBI databases. The DNA pulldown assay was also used to identify the proteins binding to the C- and A-alleles, using two cell lines, SK-N-BE and HeLa. Consistent with our previous findings, the C-allele is human-specific, and the A-allele is the rule across all other species (N=38). This nucleotide resides in a block of 12 nucleotides that is strictly conserved across evolution. The DNA pulldown experiments revealed that in both SK-N-BE and HeLa cells, the transcription repressor BEN domain containing 3 (BEND3) binds to the human-specific C-allele, whereas the nuclear factor I (NFI) family members, NF1A, B, C, and X, specifically bind to the ancestral A-allele. This binding pattern is consistent with a previously reported decreased promoter activity of the C-allele vs. the A-allele. We propose that there is a link between binding of BEND3 to the CALR rs138452745 C-allele and removal of the NFI binding site from this nucleotide, and the evolution of human-specific higher brain functions. To our knowledge, CALR rs138452745 is the first instance of enormous nucleotide conservation across evolution, except in the human species. Copyright © 2015 Elsevier B.V. All rights reserved.
Comparative Proteomics Reveals a Significant Bias Toward Alternative Protein Isoforms with Conserved Structure and Function

PubMed Central

Ezkurdia, Iakes; del Pozo, Angela; Frankish, Adam; Rodriguez, Jose Manuel; Harrow, Jennifer; Ashman, Keith; Valencia, Alfonso; Tress, Michael L.

2012-01-01

Advances in high-throughput mass spectrometry are making proteomics an increasingly important tool in genome annotation projects. Peptides detected in mass spectrometry experiments can be used to validate gene models and verify the translation of putative coding sequences (CDSs). Here, we have identified peptides that cover 35% of the genes annotated by the GENCODE consortium for the human genome as part of a comprehensive analysis of experimental spectra from two large publicly available mass spectrometry databases. We detected the translation to protein of “novel” and “putative” protein-coding transcripts as well as transcripts annotated as pseudogenes and nonsense-mediated decay targets. We provide a detailed overview of the population of alternatively spliced protein isoforms that are detectable by peptide identification methods. We found that 150 genes expressed multiple alternative protein isoforms. This constitutes the largest set of reliably confirmed alternatively spliced proteins yet discovered. Three groups of genes were highly overrepresented. We detected alternative isoforms for 10 of the 25 possible heterogeneous nuclear ribonucleoproteins, proteins with a key role in the splicing process. Alternative isoforms generated from interchangeable homologous exons and from short indels were also significantly enriched, both in human experiments and in parallel analyses of mouse and Drosophila proteomics experiments. Our results show that a surprisingly high proportion (almost 25%) of the detected alternative isoforms are only subtly different from their constitutive counterparts. Many of the alternative splicing events that give rise to these alternative isoforms are conserved in mouse. It was striking that very few of these conserved splicing events broke Pfam functional domains or would damage globular protein structures. This evidence of a strong bias toward subtle differences in CDS and likely conserved cellular function and structure is remarkable and strongly suggests that the translation of alternative transcripts may be subject to selective constraints. PMID:22446687
From Binding-Induced Dynamic Effects in SH3 Structures to Evolutionary Conserved Sectors.

PubMed

Zafra Ruano, Ana; Cilia, Elisa; Couceiro, José R; Ruiz Sanz, Javier; Schymkowitz, Joost; Rousseau, Frederic; Luque, Irene; Lenaerts, Tom

2016-05-01

Src Homology 3 domains are ubiquitous small interaction modules known to act as docking sites and regulatory elements in a wide range of proteins. Prior experimental NMR work on the SH3 domain of Src showed that ligand binding induces long-range dynamic changes consistent with an induced fit mechanism. The identification of the residues that participate in this mechanism produces a chart that allows for the exploration of the regulatory role of such domains in the activity of the encompassing protein. Here we show that a computational approach focusing on the changes in side chain dynamics through ligand binding identifies equivalent long-range effects in the Src SH3 domain. Mutation of a subset of the predicted residues elicits long-range effects on the binding energetics, emphasizing the relevance of these positions in the definition of intramolecular cooperative networks of signal transduction in this domain. We find further support for this mechanism through the analysis of seven other publicly available SH3 domain structures of which the sequences represent diverse SH3 classes. By comparing the eight predictions, we find that, in addition to a dynamic pathway that is relatively conserved throughout all SH3 domains, there are dynamic aspects specific to each domain and homologous subgroups. Our work shows, for the first time from a structural perspective, which transduction mechanisms are common between a subset of closely related and distal SH3 domains, while at the same time highlighting the differences in signal transduction that make each family member unique. These results resolve the missing link between structural predictions of dynamic changes and the domain sectors recently identified for SH3 domains through sequence analysis.
From Binding-Induced Dynamic Effects in SH3 Structures to Evolutionary Conserved Sectors

PubMed Central

Ruiz Sanz, Javier; Schymkowitz, Joost; Rousseau, Frederic

2016-01-01

Src Homology 3 domains are ubiquitous small interaction modules known to act as docking sites and regulatory elements in a wide range of proteins. Prior experimental NMR work on the SH3 domain of Src showed that ligand binding induces long-range dynamic changes consistent with an induced fit mechanism. The identification of the residues that participate in this mechanism produces a chart that allows for the exploration of the regulatory role of such domains in the activity of the encompassing protein. Here we show that a computational approach focusing on the changes in side chain dynamics through ligand binding identifies equivalent long-range effects in the Src SH3 domain. Mutation of a subset of the predicted residues elicits long-range effects on the binding energetics, emphasizing the relevance of these positions in the definition of intramolecular cooperative networks of signal transduction in this domain. We find further support for this mechanism through the analysis of seven other publicly available SH3 domain structures of which the sequences represent diverse SH3 classes. By comparing the eight predictions, we find that, in addition to a dynamic pathway that is relatively conserved throughout all SH3 domains, there are dynamic aspects specific to each domain and homologous subgroups. Our work shows, for the first time from a structural perspective, which transduction mechanisms are common between a subset of closely related and distal SH3 domains, while at the same time highlighting the differences in signal transduction that make each family member unique. These results resolve the missing link between structural predictions of dynamic changes and the domain sectors recently identified for SH3 domains through sequence analysis. PMID:27213566
The Council on Quality and Leadership in Supports for People with Disabilities: Personal Outcomes Chart Book.

ERIC Educational Resources Information Center

National Center on Outcomes Research, Council on Quality and Leadership, Towson, MD.

This report describes the genesis, definition and use of the Personal Outcomes database, a database designed to assess whether programs and services are being effective in helping individuals with disabilities. The database is based on 25 outcome measures in seven domains, including: (1) identity, which is designed to provide a sense of how people…

A cluster of diagnostic Hsp68 amino acid sites that are identified in Drosophila from the melanogaster species group are concentrated around beta-sheet residues involved with substrate binding.

PubMed

Kellett, Mark; McKechnie, Stephen W

2005-04-01

The coding region of the hsp68 gene has been amplified, cloned, and sequenced from 10 Drosophila species, 5 from the melanogaster subgroup and 5 from the montium subgroup. When the predicted amino acid sequences are compared with available Hsp70 sequences, patterns of conservation suggest that the C-terminal region should be subdivided according to predominant secondary structure. Conservation levels between Hsp68 and Hsp70 proteins were high in the N-terminal ATPase and adjacent beta-sheet domains, medium in the alpha-helix domain, and low in the C-terminal mobile domain (78%, 72%, 41%, and 21% identity, respectively). A number of amino acid sites were found to be "diagnostic" for Hsp68 (28 of approximately 635 residues). A few of these occur in the ATPase domain (385 residues) but most (75%) are concentrated in the beta-sheet and alpha-helix domains (34% of the protein) with none in the short mobile domain. Five of the diagnostic sites in the beta-sheet domain are clustered around, but not coincident with, functional sites known to be involved in substrate binding. Nearly all of the Hsp70 family length variation occurs in the mobile domain. Within montium subgroup species, 2 nearly identical hsp68 PCR products that differed in length are either different alleles or products of an ancestral hsp68 duplication.
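As a rough check on the clustering described in this abstract, the figures it reports (28 diagnostic sites, 75% of them in domains that make up 34% of the protein) can be compared with what an even spread of sites along the sequence would give. The sketch below is only illustrative arithmetic on those reported numbers; the function name and the uniform-spread baseline are assumptions, not part of the study.

```python
def site_enrichment(total_sites, fraction_in_region, region_fraction):
    """Compare observed sites in a region with a uniform-spread expectation.

    Hypothetical helper: the baseline assumes sites would otherwise be
    distributed evenly along the protein, which the abstract does not state.
    """
    observed = total_sites * fraction_in_region   # sites reported in the region
    expected = total_sites * region_fraction      # sites expected if spread evenly
    return observed, expected, observed / expected


# Figures taken from the abstract: 28 diagnostic sites, 75% of which fall in
# the beta-sheet and alpha-helix domains, which cover 34% of the protein.
observed, expected, fold = site_enrichment(28, 0.75, 0.34)
print(f"observed={observed:.0f}, expected={expected:.2f}, fold={fold:.2f}")
```

Under that assumed baseline, the two domains hold roughly twice as many diagnostic sites as their share of the sequence would predict.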
BioFrameNet: A FrameNet Extension to the Domain of Molecular Biology

ERIC Educational Resources Information Center

Dolbey, Andrew Eric

2009-01-01

In this study I introduce BioFrameNet, an extension of the Berkeley FrameNet lexical database to the domain of molecular biology. I examine the syntactic and semantic combinatorial possibilities exhibited in the lexical items used in this domain in order to get a better understanding of the grammatical properties of the language used in scientific…
An Evaluation of Database Solutions to Spatial Object Association

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kumar, V S; Kurc, T; Saltz, J

2008-06-24

Object association is a common problem encountered in many applications. Spatial object association, also referred to as crossmatch of spatial datasets, is the problem of identifying and comparing objects in two datasets based on their positions in a common spatial coordinate system. One of the datasets may correspond to a catalog of objects observed over time in a multi-dimensional domain; the other dataset may consist of objects observed in a snapshot of the domain at a time point. The use of database management systems to solve the object association problem provides portability across different platforms and also greater flexibility. Increasing dataset sizes in today's applications, however, have made object association a data/compute-intensive problem that requires targeted optimizations for efficient execution. In this work, we investigate how database-based crossmatch algorithms can be deployed on different database system architectures and evaluate the deployments to understand the impact of architectural choices on crossmatch performance and associated trade-offs. We investigate the execution of two crossmatch algorithms on (1) a parallel database system with active disk style processing capabilities, (2) a high-throughput network database (MySQL Cluster), and (3) shared-nothing databases with replication. We have conducted our study in the context of a large-scale astronomy application with real use-case scenarios.
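To make the crossmatch problem in this abstract concrete, here is a minimal in-memory sketch. It is not either of the two algorithms the authors evaluated, which the abstract does not describe; it assumes planar 2-D coordinates and a fixed match radius, and the function and variable names are hypothetical. Catalog objects are bucketed into grid cells so that each snapshot object is compared only with nearby candidates.

```python
from collections import defaultdict
from math import floor, hypot


def crossmatch(catalog, snapshot, radius):
    """Return sorted (catalog_index, snapshot_index) pairs within `radius`.

    Assumes flat 2-D (x, y) positions; real sky catalogs would need
    spherical geometry, which this sketch ignores.
    """
    # Bucket catalog objects into square grid cells of side `radius`.
    grid = defaultdict(list)
    for i, (x, y) in enumerate(catalog):
        grid[(floor(x / radius), floor(y / radius))].append(i)

    matches = []
    for j, (x, y) in enumerate(snapshot):
        cx, cy = floor(x / radius), floor(y / radius)
        # A match can only lie in the same cell or one of its 8 neighbours.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for i in grid.get((cx + dx, cy + dy), ()):
                    px, py = catalog[i]
                    if hypot(px - x, py - y) <= radius:
                        matches.append((i, j))
    return sorted(matches)


catalog = [(0.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
snapshot = [(0.1, 0.1), (5.2, 4.9), (20.0, 20.0)]
print(crossmatch(catalog, snapshot, radius=0.5))
```

The grid plays the role that a spatial index or partitioning scheme would play inside a database system; the trade-offs the abstract mentions arise when that bucketing and the distance test are distributed across nodes.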
A Standardized Protocol for the Prospective Follow-Up of Cleft Lip and Palate Patients.

PubMed

Salimi, Negar; Jolanta, Aleksejūnienė; Edwin, Yen; Angelina, Loo

2018-01-01

To develop a standardized all-encompassing protocol for the assessment of cleft lip and palate patients with clinical and research implications. Electronic database searches were conducted and 13 major cleft centers worldwide were contacted in order to prepare for the development of the protocol. In preparation, the available evidence was reviewed and potential fistula-related risk determinants from 4 different domains were identified. No standardized protocol for the assessment of cleft patients could be found in any of the electronic database searches that were conducted. Interviews with representatives from several major centers revealed that the majority of centers do not have a standardized comprehensive strategy for the reporting and follow-up of cleft lip and palate patients. The protocol was developed and consisted of the following domains of determinants: (1) the sociodemographic domain, (2) the cleft defect domain, (3) the surgery domain, and (4) the fistula domain. The proposed protocol has the potential to enhance the quality of patient care by ensuring that multiple patient-related aspects are consistently reported. It may also facilitate future multicenter research, which could contribute to the reduction of fistula occurrence in cleft lip and palate patients.
Identifying different transcribed proteins in the newly described Theraphosidae Pamphobeteus verdolaga.

PubMed

Estrada-Gómez, Sebastian; Vargas-Muñoz, Leidy Johana; Saldarriaga-Córdoba, Mónica; Cifuentes, Yeimy; Perafan, Carlos

2017-04-01

Theraphosidae spider venoms are well known for possessing a complex mixture of protein and non-protein compounds. The objective of this study was to report and identify different proteins translated from the venom gland DNA information of the recently described Theraphosidae spider Pamphobeteus verdolaga. Using a venom gland transcriptomic analysis, we reported a set of the first complete sequences of seven different proteins of the recently described Theraphosidae spider P. verdolaga. Protein analysis indicates the presence of different proteins in the venom composition of this new spider, some of them uncommon in the Theraphosidae family. MS/MS analysis of P. verdolaga showed different fragments matching sphingomyelinases (sicaritoxin), barytoxins, hexatoxins, latroinsectotoxins, and linear (zadotoxins) peptides. Only four of the MS/MS fragments showed 100% sequence similarity with one of the transcribed proteins. Transcriptomic analysis showed the presence of different groups of proteins such as phospholipases, hyaluronidases, and inhibitory cysteine knot (ICK) peptides, among others. The three databases of protein domains used in this study (Pfam, SMART and CDD) showed congruency in the search for a unique conserved protein domain for only four of the translated proteins. Those proteins matched EF-hand proteins, cysteine-rich secretory proteins, jingzhaotoxins, theraphotoxins and hexatoxins, from different Mygalomorphae spiders belonging to the families Theraphosidae, Barychelidae and Hexathelidae. None of the analyzed sequences showed a complete 100% similarity. Copyright © 2017 Elsevier Ltd. All rights reserved.
Conservation of Matrix Attachment Region-Binding Filament-Like Protein 1 among Higher Plants

PubMed Central

Harder, Patricia A.; Silverstein, Rebecca A.; Meier, Iris

2000-01-01

The interaction of chromatin with the nuclear matrix via matrix attachment regions (MARs) on the DNA is considered to be of fundamental importance for higher-order chromatin organization and the regulation of gene expression. We have previously isolated a novel nuclear matrix-localized protein (MFP1) from tomato (Lycopersicon esculentum) that preferentially binds to MAR DNA. Tomato MFP1 has a predicted filament-protein-like structure and is associated with the nuclear envelope via an N-terminal targeting domain. Based on the antigenic relationship, we report here that MFP1 is conserved in a large number of dicot and monocot species. Several cDNAs were cloned from tobacco (Nicotiana tabacum) and shown to correspond to two tobacco MFP1 genes. Comparison of the primary and predicted secondary structures of MFP1 from tomato, tobacco, and Arabidopsis indicates a high degree of conservation of the N-terminal targeting domain, the overall putative coiled-coil structure of the protein, and the C-terminal DNA-binding domain. In addition, we show that tobacco MFP1 is regulated in an organ-specific and developmental fashion, and that this regulation occurs at the level of transcription or RNA stability. PMID:10631266
Induction of ebolavirus cross-species immunity using retrovirus-like particles bearing the Ebola virus glycoprotein lacking the mucin-like domain

PubMed Central

2012-01-01

Background: The genus Ebolavirus includes five distinct viruses. Four of these viruses cause hemorrhagic fever in humans. Currently there are no licensed vaccines for any of them; however, several vaccines are under development. Ebola virus envelope glycoprotein (GP1,2) is highly immunogenic, but antibodies frequently arise against its least conserved mucin-like domain (MLD). We hypothesized that immunization with MLD-deleted GP1,2 (GPΔMLD) would induce cross-species immunity by making more conserved regions accessible to the immune system. Methods: To test this hypothesis, mice were immunized with retrovirus-like particles (retroVLPs) bearing Ebola virus GPΔMLD, DNA plasmids (plasmo-retroVLP) that can produce such retroVLPs in vivo, or plasmo-retroVLP followed by retroVLPs. Results: Cross-species neutralizing antibody and GP1,2-specific cellular immune responses were successfully induced. Conclusion: Our findings suggest that GPΔMLD presented through retroVLPs may provide a strategy for development of a vaccine against multiple ebolaviruses. Similar vaccination strategies may be adopted for other viruses whose envelope proteins contain highly variable regions that may mask more conserved domains from the immune system. PMID:22273269
Induction of ebolavirus cross-species immunity using retrovirus-like particles bearing the Ebola virus glycoprotein lacking the mucin-like domain.

PubMed

Ou, Wu; Delisle, Josie; Jacques, Jerome; Shih, Joanna; Price, Graeme; Kuhn, Jens H; Wang, Vivian; Verthelyi, Daniela; Kaplan, Gerardo; Wilson, Carolyn A

2012-01-25

The genus Ebolavirus includes five distinct viruses. Four of these viruses cause hemorrhagic fever in humans. Currently there are no licensed vaccines for any of them; however, several vaccines are under development. Ebola virus envelope glycoprotein (GP1,2) is highly immunogenic, but antibodies frequently arise against its least conserved mucin-like domain (MLD). We hypothesized that immunization with MLD-deleted GP1,2 (GPΔMLD) would induce cross-species immunity by making more conserved regions accessible to the immune system. To test this hypothesis, mice were immunized with retrovirus-like particles (retroVLPs) bearing Ebola virus GPΔMLD, DNA plasmids (plasmo-retroVLP) that can produce such retroVLPs in vivo, or plasmo-retroVLP followed by retroVLPs. Cross-species neutralizing antibody and GP1,2-specific cellular immune responses were successfully induced. Our findings suggest that GPΔMLD presented through retroVLPs may provide a strategy for development of a vaccine against multiple ebolaviruses. Similar vaccination strategies may be adopted for other viruses whose envelope proteins contain highly variable regions that may mask more conserved domains from the immune system.
Numerical Model Sensitivity to Heterogeneous Satellite Derived Vegetation Roughness

NASA Technical Reports Server (NTRS)

Jasinski, Michael; Eastman, Joseph; Borak, Jordan

2011-01-01

The sensitivity of a mesoscale weather prediction model to a 1 km satellite-based vegetation roughness initialization is investigated for a domain within the south central United States. Three different roughness databases are employed: i) a control or standard lookup table roughness that is a function only of land cover type, ii) a spatially heterogeneous roughness database, specific to the domain, that was previously derived using a physically based procedure and Moderate Resolution Imaging Spectroradiometer (MODIS) imagery, and iii) a MODIS climatologic roughness database that like (i) is a function only of land cover type, but possesses domain specific mean values from (ii). The model used is the Weather Research and Forecast Model (WRF) coupled to the Community Land Model within the Land Information System (LIS). For each simulation, a statistical comparison is made between modeled results and ground observations within a domain including Oklahoma, Eastern Arkansas, and Northwest Louisiana during a 4-day period within IHOP 2002. Sensitivity analysis compares the impact of the three roughness initializations on time-series temperature, precipitation probability of detection (POD), average wind speed, boundary layer height, and turbulent kinetic energy (TKE). Overall, the results indicate that, for the current investigation, replacement of the standard look-up table values with the satellite-derived values statistically improves model performance for most observed variables. Such natural roughness heterogeneity enhances the surface wind speed, PBL height and TKE production by up to 10 percent, with a lesser effect over grassland and a greater effect over mixed land cover domains.
Localizome: a server for identifying transmembrane topologies and TM helices of eukaryotic proteins utilizing domain information

PubMed Central

Lee, Sunghoon; Lee, Byungwook; Jang, Insoo; Kim, Sangsoo; Bhak, Jong

2006-01-01

The Localizome server predicts the transmembrane (TM) helix number and TM topology of a user-supplied eukaryotic protein and presents the result as an intuitive graphic representation. It utilizes hmmpfam to detect the presence of Pfam domains and a prediction algorithm, Phobius, to predict the TM helices. The results are combined and checked against the TM topology rules stored in a protein domain database called LocaloDom. LocaloDom is a curated database that contains TM topologies and TM helix numbers of known protein domains. It was constructed from Pfam domains combined with Swiss-Prot annotations and Phobius predictions. The Localizome server corrects the combined results of the user sequence to conform to the rules stored in LocaloDom. Compared with other programs, this server showed the highest accuracy for TM topology prediction: for soluble proteins, the accuracy and coverage were 99 and 75%, respectively, while for TM protein domain regions, they were 96 and 68%, respectively. With a graphical representation of TM topology and TM helix positions with the domain units, the Localizome server is a highly accurate and comprehensive information source for subcellular localization for soluble proteins as well as membrane proteins. The Localizome server can be found at . PMID:16845118
Structure of Methylobacterium extorquens malyl-CoA lyase: CoA-substrate binding correlates with domain shift

DOE PAGES

Gonzalez, Javier M.; Marti-Arbona, Ricardo; Chen, Julian C. -H.; ...

2017-01-27

Malyl-CoA lyase (MCL) is an Mg2+-dependent enzyme that catalyzes the reversible cleavage of (2S)-4-malyl-CoA to yield acetyl-CoA and glyoxylate. MCL enzymes, which are found in a variety of bacteria, are members of the citrate lyase-like family and are involved in the assimilation of one- and two-carbon compounds. Here, the 1.56 Å resolution X-ray crystal structure of MCL from Methylobacterium extorquens AM1 with bound Mg2+ is presented. Structural alignment with the closely related Rhodobacter sphaeroides malyl-CoA lyase complexed with Mg2+, oxalate and CoA allows a detailed analysis of the domain motion of the enzyme caused by substrate binding. Alignment of the structures shows that a simple hinge motion centered on the conserved residues Phe268 and Thr269 moves the C-terminal domain by about 30° relative to the rest of the molecule. Furthermore, this domain motion positions a conserved aspartate residue located in the C-terminal domain in the active site of the adjacent monomer, which may serve as a general acid/base in the catalytic mechanism.
Interactions between the PDZ domains of Bazooka (Par-3) and phosphatidic acid: in vitro characterization and role in epithelial development.

PubMed

Yu, Cao Guo; Harris, Tony J C

2012-09-01

Bazooka (Par-3) is a conserved polarity regulator that organizes molecular networks in a wide range of cell types. In epithelia, it functions as a plasma membrane landmark to organize the apical domain. Bazooka is a scaffold protein that interacts with proteins through its three PDZ (postsynaptic density 95, discs large, zonula occludens-1) domains and other regions. In addition, Bazooka has been shown to interact with phosphoinositides. Here we show that the Bazooka PDZ domains interact with the negatively charged phospholipid phosphatidic acid immobilized on solid substrates or in liposomes. The interaction requires multiple PDZ domains, and conserved patches of positively charged amino acid residues appear to mediate the interaction. Increasing or decreasing levels of diacylglycerol kinase or phospholipase D (enzymes that produce phosphatidic acid) reveals a role for phosphatidic acid in Bazooka embryonic epithelial activity but not its localization. Mutating residues implicated in phosphatidic acid binding revealed a possible role in Bazooka localization and function. These data implicate a closer connection between Bazooka and membrane lipids than previously recognized. Bazooka polarity landmarks may be conglomerates of proteins and plasma membrane lipids that modify each other's activities for an integrated effect on cell polarity.
Novel human mutation and CRISPR/Cas genome-edited mice reveal the importance of C-terminal domain of MSX1 in tooth and palate development

PubMed Central

Mitsui, Silvia Naomi; Yasue, Akihiro; Masuda, Kiyoshi; Naruto, Takuya; Minegishi, Yoshiyuki; Oyadomari, Seiichi; Noji, Sumihare; Imoto, Issei; Tanaka, Eiji

2016-01-01

Several mutations, located mainly in the MSX1 homeodomain, have been identified in non-syndromic tooth agenesis predominantly affecting premolars and third molars. We identified a novel frameshift mutation of the highly conserved C-terminal domain of MSX1, known as Msx homology domain 6 (MH6), in a Japanese family with non-syndromic tooth agenesis. To investigate the importance of MH6 in tooth development, Msx1 was targeted in mice with the CRISPR/Cas system. Although heterozygous MH6 disruption did not alter craniofacial development, homozygous mice exhibited agenesis of lower incisors with or without cleft palate at E16.5. In addition, agenesis of the upper third molars and the lower second and third molars was observed in 4-week-old mutant mice. Although the upper second molars were present, they were abnormally small. These results suggest that the C-terminal domain of MSX1 is important for tooth and palate development, and demonstrate that the CRISPR/Cas system can be used as a tool to assess causality of human disorders in vivo and to study the importance of conserved domains in genes. PMID:27917906
Novel human mutation and CRISPR/Cas genome-edited mice reveal the importance of C-terminal domain of MSX1 in tooth and palate development.

PubMed

Mitsui, Silvia Naomi; Yasue, Akihiro; Masuda, Kiyoshi; Naruto, Takuya; Minegishi, Yoshiyuki; Oyadomari, Seiichi; Noji, Sumihare; Imoto, Issei; Tanaka, Eiji

2016-12-05

Several mutations, located mainly in the MSX1 homeodomain, have been identified in non-syndromic tooth agenesis predominantly affecting premolars and third molars. We identified a novel frameshift mutation of the highly conserved C-terminal domain of MSX1, known as Msx homology domain 6 (MH6), in a Japanese family with non-syndromic tooth agenesis. To investigate the importance of MH6 in tooth development, Msx1 was targeted in mice with the CRISPR/Cas system. Although heterozygous MH6 disruption did not alter craniofacial development, homozygous mice exhibited agenesis of lower incisors with or without cleft palate at E16.5. In addition, agenesis of the upper third molars and the lower second and third molars was observed in 4-week-old mutant mice. Although the upper second molars were present, they were abnormally small. These results suggest that the C-terminal domain of MSX1 is important for tooth and palate development, and demonstrate that the CRISPR/Cas system can be used as a tool to assess causality of human disorders in vivo and to study the importance of conserved domains in genes.
Structure of Methylobacterium extorquens malyl-CoA lyase: CoA-substrate binding correlates with domain shift

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gonzalez, Javier M.; Marti-Arbona, Ricardo; Chen, Julian C. -H.

Malyl-CoA lyase (MCL) is an Mg2+-dependent enzyme that catalyzes the reversible cleavage of (2S)-4-malyl-CoA to yield acetyl-CoA and glyoxylate. MCL enzymes, which are found in a variety of bacteria, are members of the citrate lyase-like family and are involved in the assimilation of one- and two-carbon compounds. Here, the 1.56 Å resolution X-ray crystal structure of MCL from Methylobacterium extorquens AM1 with bound Mg2+ is presented. Structural alignment with the closely related Rhodobacter sphaeroides malyl-CoA lyase complexed with Mg2+, oxalate and CoA allows a detailed analysis of the domain motion of the enzyme caused by substrate binding. Alignment of the structures shows that a simple hinge motion centered on the conserved residues Phe268 and Thr269 moves the C-terminal domain by about 30° relative to the rest of the molecule. Furthermore, this domain motion positions a conserved aspartate residue located in the C-terminal domain in the active site of the adjacent monomer, which may serve as a general acid/base in the catalytic mechanism.
H2B ubiquitination: Conserved molecular mechanism, diverse physiologic functions of the E3 ligase during meiosis.

PubMed

Wang, Liying; Cao, Chunwei; Wang, Fang; Zhao, Jianguo; Li, Wei

2017-09-03

RNF20/Bre1-mediated H2B ubiquitination (H2Bub) has various physiologic functions. Recently, we found that H2Bub participates in meiotic recombination by promoting chromatin relaxation during meiosis. We then analyzed the phylogenetic relationships among the E3 ligase for H2Bub, its E2 Rad6 and their partner WW domain-containing adaptor with a coiled-coil (WAC) or Lge1, and found that the molecular mechanism underlying H2Bub is evolutionarily conserved from yeast to mammals. However, RNF20 has diverse physiologic functions in different organisms, which might be caused by the evolutionary divergence of their domain/motif architectures. In the current extra view, we not only elucidate the evolutionarily conserved molecular mechanism underlying H2Bub, but also discuss the diverse physiologic functions of RNF20 during meiosis.
Conserved and variable domains of RNase MRP RNA.

PubMed

Dávila López, Marcela; Rosenblad, Magnus Alm; Samuelsson, Tore

2009-01-01

Ribonuclease MRP is a eukaryotic ribonucleoprotein complex consisting of one RNA molecule and 7-10 protein subunits. One important function of MRP is to catalyze an endonucleolytic cleavage during processing of rRNA precursors. RNase MRP is evolutionarily related to RNase P, which is critical for tRNA processing. A large number of MRP RNA sequences that are now available have been used to identify conserved primary and secondary structure features of the molecule. MRP RNA has structural features in common with P RNA such as a conserved catalytic core, but it also has unique features and is characterized by a domain highly variable between species. Information regarding primary and secondary structure features is of interest not only in basic studies of the function of MRP RNA, but also because mutations in the RNA give rise to human genetic diseases such as cartilage-hair hypoplasia.
HABITAT DISTRIBUTION MODELS FOR 37 VERTEBRATE SPECIES ADDRESSED BY THE MULTI-SPECIES HABITAT CONSERVATION PLAN OF CLARK COUNTY, NEVADA

EPA Science Inventory

Thirty-seven species identified in the Clark County Multi-Species Habitat Conservation Plan were previously modeled through the Southwest Regional Gap Analysis Project. Existing SWReGAP habitat models and modeling databases were used to facilitate the revision of mo...
Bridging the gap between habitat-modeling research and bird conservation with dynamic landscape and population models

Treesearch

Frank R., III Thompson

2009-01-01

Habitat models are widely used in bird conservation planning to assess current habitat or populations and to evaluate management alternatives. These models include species-habitat matrix or database models, habitat suitability models, and statistical models that predict abundance. While extremely useful, these approaches have some limitations.
Rational site-directed mutations of the LLP-1 and LLP-2 lentivirus lytic peptide domains in the intracytoplasmic tail of human immunodeficiency virus type 1 gp41 indicate common functions in cell-cell fusion but distinct roles in virion envelope incorporation.

PubMed

Kalia, Vandana; Sarkar, Surojit; Gupta, Phalguni; Montelaro, Ronald C

2003-03-01

Two highly conserved cationic amphipathic alpha-helical motifs, designated lentivirus lytic peptides 1 and 2 (LLP-1 and LLP-2), have been characterized in the carboxyl terminus of the transmembrane (TM) envelope glycoprotein (Env) of lentiviruses. Although various properties have been attributed to these domains, their structural and functional significance is not clearly understood. To determine the specific contributions of the Env LLP domains to Env expression, processing, and incorporation and to viral replication and syncytium induction, site-directed LLP mutants of a primary dualtropic infectious human immunodeficiency virus type 1 (HIV-1) isolate (ME46) were examined. Substitutions were made for highly conserved arginine residues in either the LLP-1 or LLP-2 domain (MX1 or MX2, respectively) or in both domains (MX4). The HIV-1 mutants with altered LLP domains demonstrated distinct phenotypes. The LLP-1 mutants (MX1 and MX4) were replication defective and showed an average of 85% decrease in infectivity, which was associated with an evident decrease in gp41 incorporation into virions without a significant decrease in Env expression or processing in transfected 293T cells. In contrast, MX2 virus was replication competent and incorporated a full complement of Env into its virions, indicating a differential role for the LLP-1 domain in Env incorporation. Interestingly, the replication-competent MX2 virus was impaired in its ability to induce syncytia in T-cell lines. This defect in cell-cell fusion did not correlate with apparent defects in the levels of cell surface Env expression, oligomerization, or conformation. The lack of syncytium formation, however, correlated with a decrease of about 90% in MX2 Env fusogenicity compared to that of wild-type Env in quantitative luciferase-based cell-cell fusion assays. The LLP-1 mutant MX1 and MX4 Envs also exhibited an average of 80% decrease in fusogenicity. 
Altogether, these results demonstrate for the first time that the highly conserved LLP domains perform critical but distinct functions in Env incorporation and fusogenicity.

Annotation of Protein Domains Reveals Remarkable Conservation in the Functional Make up of Proteomes Across Superkingdoms

PubMed Central

Nasir, Arshan; Naeem, Aisha; Khan, Muhammad Jawad; Lopez-Nicora, Horacio D.; Caetano-Anollés, Gustavo

2011-01-01

The functional repertoire of a cell is largely embodied in its proteome, the collection of proteins encoded in the genome of an organism. The molecular functions of proteins are the direct consequence of their structure and structure can be inferred from sequence using hidden Markov models of structural recognition. Here we analyze the functional annotation of protein domain structures in almost a thousand sequenced genomes, exploring the functional and structural diversity of proteomes. We find there is a remarkable conservation in the distribution of domains with respect to the molecular functions they perform in the three superkingdoms of life. In general, most of the protein repertoire is spent in functions related to metabolic processes but there are significant differences in the usage of domains for regulatory and extra-cellular processes both within and between superkingdoms. Our results support the hypotheses that the proteomes of superkingdom Eukarya evolved via genome expansion mechanisms that were directed towards innovating new domain architectures for regulatory and extra/intracellular process functions needed, for example, to maintain the integrity of multicellular structure or to interact with environmental biotic and abiotic factors (e.g., cell signaling and adhesion, immune responses, and toxin production). Proteomes of microbial superkingdoms Archaea and Bacteria retained fewer numbers of domains and maintained simple and smaller protein repertoires. Viruses appear to play an important role in the evolution of superkingdoms. We finally identify a few genomic outliers that deviate significantly from the conserved functional design. These include Nanoarchaeum equitans, proteobacterial symbionts of insects with extremely reduced genomes, Tenericutes and Guillardia theta.
These organisms spend most of their domains on information functions, including translation and transcription, rather than on metabolism and harbor a domain repertoire characteristic of parasitic organisms. In contrast, the functional repertoire of the proteomes of the Planctomycetes-Verrucomicrobia-Chlamydiae superphylum was no different than the rest of bacteria, failing to support claims of them representing a separate superkingdom. In turn, Protista and Bacteria shared similar functional distribution patterns suggesting an ancestral evolutionary link between these groups. PMID:24710297
Coordination through databases can improve prescribed burning as a conservation tool to promote forest biodiversity.

PubMed

Ramberg, Ellinor; Strengbom, Joachim; Granath, Gustaf

2018-04-01

Prescribed fires are a common nature conservation practice. They are executed by several parties with limited coordination among them, and little consideration for wildfire occurrences and habitat requirements of fire-dependent species. Here, we gathered data on prescribed fires and wildfires in Sweden during 2011-2015 to (i) evaluate the importance and spatial extent of prescribed fires compared to wildfires and (ii) illustrate how a database can be used as a management tool for prescribed fires. We found that on average only 0.006% (prescribed 65%, wildfires 35%) of the Swedish forest burns per year, with 58% of the prescribed fires occurring on clearcuts. Also, both wildfires and prescribed fires seem to be important for the survival of fire-dependent species. A national fire database would simplify coordination and make planning and evaluation of prescribed fires more efficient. We propose an adaptive management strategy to improve the outcome of prescribed fires.
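The percentages quoted in this abstract can be combined with simple arithmetic. The following is a minimal sketch, assuming the 65%/35% split refers to shares of the annually burned forest area; the constant and function names are illustrative and do not come from the study.

```python
# Figures quoted in the abstract above (Sweden, 2011-2015).
TOTAL_BURNED_PCT = 0.006   # percent of Swedish forest that burns per year
PRESCRIBED_SHARE = 0.65    # assumed to be a share of the burned area
WILDFIRE_SHARE = 0.35      # assumed to be a share of the burned area


def share_of_forest(total_burned_pct: float, share: float) -> float:
    """Return the percent of all forest burned annually by one fire type."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return total_burned_pct * share


if __name__ == "__main__":
    prescribed = share_of_forest(TOTAL_BURNED_PCT, PRESCRIBED_SHARE)
    wildfire = share_of_forest(TOTAL_BURNED_PCT, WILDFIRE_SHARE)
    print(f"prescribed fires: {prescribed:.4f}% of forest per year")
    print(f"wildfires:        {wildfire:.4f}% of forest per year")
```

Under that assumption, prescribed fires would account for roughly 0.0039% of forest per year and wildfires for roughly 0.0021%.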
Putative calcium-binding domains of the Caenorhabditis elegans BK channel are dispensable for intoxication and ethanol activation

PubMed Central

Davis, S. J.; Scott, L. L.; Ordemann, G.; Philpo, A.; Cohn, J.; Pierce-Shimomura, J. T.

2016-01-01

Alcohol modulates the highly conserved, voltage- and calcium-activated potassium (BK) channel, which contributes to alcohol-mediated behaviors in species from worms to humans. Previous studies have shown that the calcium-sensitive domains, RCK1 and the Ca2+ bowl, are required for ethanol activation of the mammalian BK channel in vitro. In the nematode Caenorhabditis elegans, ethanol activates the BK channel in vivo, and deletion of the worm BK channel, SLO-1, confers strong resistance to intoxication. To determine if the conserved RCK1 and calcium bowl domains were also critical for intoxication and basal BK channel-dependent behaviors in C. elegans, we generated transgenic worms that express mutated SLO-1 channels predicted to have the RCK1, Ca2+ bowl or both domains rendered insensitive to calcium. As expected, mutating these domains inhibited basal function of SLO-1 in vivo as neck and body curvature of these mutants mimicked that of the BK null mutant. Unexpectedly, however, mutating these domains singly or together in SLO-1 had no effect on intoxication in C. elegans. Consistent with these behavioral results, we found that ethanol activated the SLO-1 channel in vitro with or without these domains. By contrast, in agreement with previous in vitro findings, C. elegans harboring a human BK channel with mutated calcium-sensing domains displayed resistance to intoxication. Thus, for the worm SLO-1 channel, the putative calcium-sensitive domains are critical for basal in vivo function but unnecessary for in vivo ethanol action. PMID:26113050
Using the structure-function linkage database to characterize functional domains in enzymes.

PubMed

Brown, Shoshana; Babbitt, Patricia

2014-12-12

The Structure-Function Linkage Database (SFLD; http://sfld.rbvi.ucsf.edu/) is a Web-accessible database designed to link enzyme sequence, structure, and functional information. This unit describes the protocols by which a user may query the database to predict the function of uncharacterized enzymes and to correct misannotated functional assignments. The information in this unit is especially useful in helping a user discriminate functional capabilities of a sequence that is only distantly related to characterized sequences in publicly available databases. Copyright © 2014 John Wiley & Sons, Inc.
[Public scientific knowledge distribution in health information, communication and information technology indexed in MEDLINE and LILACS databases].

PubMed

Packer, Abel Laerte; Tardelli, Adalberto Otranto; Castro, Regina Célia Figueiredo

2007-01-01

This study explores the distribution of international, regional and national scientific output in health information and communication, indexed in the MEDLINE and LILACS databases, between 1996 and 2005. A selection of articles was based on the hierarchical structure of Information Science in MeSH vocabulary. Four specific domains were determined: health information, medical informatics, scientific communications on healthcare and healthcare communications. The variables analyzed were: most-covered subjects and journals, author affiliation and publication countries and languages, in both databases. The Information Science category is represented in nearly 5% of MEDLINE and LILACS articles. The four domains under analysis showed a relative annual increase in MEDLINE. The Medical Informatics domain showed the highest number of records in MEDLINE, representing about half of all indexed articles. The importance of Information Science as a whole is more visible in publications from developed countries and the findings indicate the predominance of the United States, with significant growth in scientific output from China and South Korea and, to a lesser extent, Brazil.
GExplore: a web server for integrated queries of protein domains, gene expression and mutant phenotypes

PubMed Central

2009-01-01

Background: The majority of the genes even in well-studied multi-cellular model organisms have not been functionally characterized yet. Mining the numerous genome wide data sets related to protein function to retrieve potential candidate genes for a particular biological process remains a challenge. Description: GExplore has been developed to provide a user-friendly database interface for data mining at the gene expression/protein function level to help in hypothesis development and experiment design. It supports combinatorial searches for proteins with certain domains, tissue- or developmental stage-specific expression patterns, and mutant phenotypes. GExplore operates on a stand-alone database and has fast response times, which is essential for exploratory searches. The interface is not only user-friendly, but also modular so that it accommodates additional data sets in the future. Conclusion: GExplore is an online database for quick mining of data related to gene and protein function, providing a multi-gene display of data sets related to the domain composition of proteins as well as expression and phenotype data. GExplore is publicly available at: http://genome.sfu.ca/gexplore/ PMID:19917126
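The kind of combinatorial search described in this abstract (domain AND expression pattern AND phenotype) can be illustrated with a small filter. This is only a sketch under stated assumptions: the `GeneRecord` type, the `search` function, and the placeholder records are hypothetical and do not reflect GExplore's actual schema, interface, or data.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class GeneRecord:
    """Hypothetical record combining the three data types named in the abstract."""
    name: str
    domains: FrozenSet[str]
    tissues: FrozenSet[str]
    phenotypes: FrozenSet[str]


def search(records: Iterable[GeneRecord],
           domain: Optional[str] = None,
           tissue: Optional[str] = None,
           phenotype: Optional[str] = None) -> List[GeneRecord]:
    """Return records that satisfy every criterion that was supplied (logical AND)."""
    hits = []
    for record in records:
        if domain is not None and domain not in record.domains:
            continue
        if tissue is not None and tissue not in record.tissues:
            continue
        if phenotype is not None and phenotype not in record.phenotypes:
            continue
        hits.append(record)
    return hits


# Placeholder records, invented purely for illustration.
EXAMPLE_RECORDS = [
    GeneRecord("gene-A", frozenset({"kinase"}), frozenset({"neuron"}), frozenset({"uncoordinated"})),
    GeneRecord("gene-B", frozenset({"kinase"}), frozenset({"muscle"}), frozenset({"lethal"})),
    GeneRecord("gene-C", frozenset({"homeobox"}), frozenset({"neuron"}), frozenset({"uncoordinated"})),
]

if __name__ == "__main__":
    for hit in search(EXAMPLE_RECORDS, domain="kinase", tissue="neuron"):
        print(hit.name)
```

Treating an omitted criterion as "match anything" keeps exploratory queries cheap to broaden or narrow, which is the usage pattern the abstract emphasizes.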
A conservation ontology and knowledge base to support delivery of technical assistance to agricultural producers in the United States

USDA-ARS's Scientific Manuscript database

Information systems supporting the delivery of conservation technical assistance by the United States Department of Agriculture (USDA) to agricultural producers on working lands have become increasingly complex over the past 25 years. They are constrained by inconsistent coordination of domain knowl...
Diversified Structural Basis of a Conserved Molecular Mechanism for pH-Dependent Dimerization in Spider Silk N-Terminal Domains.

PubMed

Otikovs, Martins; Chen, Gefei; Nordling, Kerstin; Landreh, Michael; Meng, Qing; Jörnvall, Hans; Kronqvist, Nina; Rising, Anna; Johansson, Jan; Jaudzems, Kristaps

2015-08-17

Conversion of spider silk proteins from soluble dope to insoluble fibers involves pH-dependent dimerization of the N-terminal domain (NT). This conversion is tightly regulated to prevent premature precipitation and enable rapid silk formation at the end of the duct. Three glutamic acid residues that mediate this process in the NT from Euprosthenops australis major ampullate spidroin 1 are well conserved among spidroins. However, NTs of minor ampullate spidroins from several species, including Araneus ventricosus ((Av)MiSp NT), lack one of the glutamic acids. Here we investigate the pH-dependent structural changes of (Av)MiSp NT, revealing that it uses the same mechanism but involves a non-conserved glutamic acid residue instead. Homology modeling of the structures of other MiSp NTs suggests that these harbor different compensatory residues. This indicates that, despite sequence variations, the molecular mechanism underlying pH-dependent dimerization of NT is conserved among different silk types. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The MPS1 family of protein kinases.

PubMed

Liu, Xuedong; Winey, Mark

2012-01-01

MPS1 protein kinases are found widely, but not ubiquitously, in eukaryotes. This family of potentially dual-specific protein kinases is among several that regulate a number of steps of mitosis. The most widely conserved MPS1 kinase functions involve activities at the kinetochore in both the chromosome attachment and the spindle checkpoint. MPS1 kinases also function at centrosomes. Beyond mitosis, MPS1 kinases have been implicated in development, cytokinesis, and several different signaling pathways. Family members are identified by virtue of a conserved C-terminal kinase domain, though the N-terminal domain is quite divergent. The kinase domain of the human enzyme has been crystallized, revealing an unusual ATP-binding pocket. The activity, level, and subcellular localization of Mps1 family members are tightly regulated during cell-cycle progression. The mitotic functions of Mps1 kinases and their overexpression in some tumors have prompted the identification of Mps1 inhibitors and their active development as anticancer drugs.
Force-dependent isomerization kinetics of a highly conserved proline switch modulates the mechanosensing region of filamin

PubMed Central

Rognoni, Lorenz; Möst, Tobias; Žoldák, Gabriel; Rief, Matthias

2014-01-01

Proline switches, controlled by cis–trans isomerization, have emerged as a particularly effective regulatory mechanism in a wide range of biological processes. In this study, we use single-molecule mechanical measurements to develop a full kinetic and energetic description of a highly conserved proline switch in the force-sensing domain 20 of human filamin and how prolyl isomerization modulates the force-sensing mechanism. Proline isomerization toggles domain 20 between two conformations: a stable cis conformation with slow unfolding, which favors the autoinhibited closed conformation of filamin’s force-sensing domain pair 20–21, and a less stable, uninhibited conformation promoted by the trans form. The data provide detailed insight into the folding mechanisms that underpin the functionality of this binary switch and elucidate its remarkable efficiency in modulating force-sensing, thus combining two previously unconnected regulatory mechanisms, proline switches and mechanosensing. PMID:24706888
Telomere Capping Proteins are Structurally Related to RPA with an additional Telomere-Specific Domain

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gelinas, A.; Paschini, M; Reyes, F

Telomeres must be capped to preserve chromosomal stability. The conserved Stn1 and Ten1 proteins are required for proper capping of the telomere, although the mechanistic details of how they contribute to telomere maintenance are unclear. Here, we report the crystal structures of the C-terminal domain of the Saccharomyces cerevisiae Stn1 and the Schizosaccharomyces pombe Ten1 proteins. These structures reveal striking similarities to corresponding subunits in the replication protein A complex, further supporting an evolutionary link between telomere maintenance proteins and DNA repair complexes. Our structural and in vivo data of Stn1 identify a new domain that has evolved to support a telomere-specific role in chromosome maintenance. These findings endorse a model of an evolutionarily conserved mechanism of DNA maintenance that has developed as a result of increased chromosomal structural complexity.
OsBRI1 Activates BR Signaling by Preventing Binding between the TPR and Kinase Domains of OsBSK3 via Phosphorylation.

PubMed

Zhang, Baowen; Wang, Xiaolong; Zhao, Zhiying; Wang, Ruiju; Huang, Xiahe; Zhu, Yali; Yuan, Li; Wang, Yingchun; Xu, Xiaodong; Burlingame, Alma L; Gao, Yingjie; Sun, Yu; Tang, Wenqiang

2016-02-01

Many plant receptor kinases transduce signals through receptor-like cytoplasmic kinases (RLCKs); however, the molecular mechanisms that create an effective on-off switch are unknown. The receptor kinase BR INSENSITIVE1 (BRI1) transduces the brassinosteroid (BR) signal by phosphorylating members of the BR-signaling kinase (BSK) family of RLCKs, which contain a kinase domain and a C-terminal tetratricopeptide repeat (TPR) domain. Here, we show that the BR signaling function of BSKs is conserved in Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) and that the TPR domain of BSKs functions as a "phospho-switchable" autoregulatory domain to control BSKs' activity. Genetic studies revealed that OsBSK3 is a positive regulator of BR signaling in rice, while in vivo and in vitro assays demonstrated that OsBRI1 interacts directly with and phosphorylates OsBSK3. The TPR domain of OsBSK3, which interacts directly with the protein's kinase domain, serves as an autoinhibitory domain to prevent OsBSK3 from interacting with bri1-SUPPRESSOR1 (BSU1). Phosphorylation of OsBSK3 by OsBRI1 disrupts the interaction between its TPR and kinase domains, thereby increasing the binding between OsBSK3's kinase domain and BSU1. Our results not only demonstrate that OsBSK3 plays a conserved role in regulating BR signaling in rice, but also provide insight into the molecular mechanism by which BSK family proteins are inhibited under basal conditions but switched on by the upstream receptor kinase BRI1. © 2016 American Society of Plant Biologists. All Rights Reserved.
Update on terrestrial ecological classification in the highlands of West Virginia

Treesearch

James P. Vanderhorst

2010-01-01

The West Virginia Natural Heritage Program (WVNHP) maintains databases on the biological diversity of the state, including species and natural communities, to help focus conservation efforts by agencies and organizations. Information on terrestrial communities (also called vegetation, or habitat, depending on user or audience focus) is maintained in two databases. The...
The perturbation of tryptophan fluorescence by phenylalanine to alanine mutations identifies the hydrophobic core in a subset of bacterial Ig-like domains.

PubMed

Raman, Rajeev; Ptak, Christopher P; Hsieh, Ching-Lin; Oswald, Robert E; Chang, Yung-Fu; Sharma, Yogendra

2013-07-09

Many host-parasite interactions are mediated via surface-exposed proteins containing bacterial immunoglobulin-like (Big) domains. Here, we utilize the spectral properties of a conserved Trp to provide evidence that, along with a Phe, these residues are positioned within the hydrophobic core of a subset of Big_2 domains. The mutation of the Phe to Ala decreases Big_2 domain stability and impairs the ability of LigBCen2 to bind to the host protein, fibronectin.
Zygote arrest 1 (Zar1) is an evolutionarily conserved gene expressed in vertebrate ovaries.

PubMed

Wu, Xuemei; Wang, Pei; Brown, Christopher A; Zilinski, Carolyn A; Matzuk, Martin M

2003-09-01

Zygote arrest 1 (ZAR1) is an ovary-specific maternal factor that plays essential roles during the oocyte-to-embryo transition. In mice, the Zar1 mRNA is detected as a 1.4-kilobase (kb) transcript that is synthesized exclusively in growing oocytes. To further understand the functions of ZAR1, we have cloned the orthologous Zar1 cDNA and/or genes for mouse, rat, human, frog, zebrafish, and pufferfish. The entire mouse Zar1 gene and a related pseudogene span approximately 4.0 kb, contain four exons, and map to adjacent loci on mouse chromosome 5. The human ZAR1 orthologous gene similarly consists of four exons and resides on human chromosome 4p12, which is syntenic with the mouse Zar1 chromosomal locus. Rat (Rattus norvegicus) and pufferfish (Fugu rubripes) Zar1 genes were recognized by database mining and deduced protein alignment analysis. The rat Zar1 gene also maps to a region that is syntenic with the mouse Zar1 gene locus on rat chromosome 14. Frog (Xenopus laevis) and zebrafish (Danio rerio) Zar1 orthologs were cloned by reverse transcription-polymerase chain reaction and rapid amplification of cDNA ends analysis of ovarian mRNA. Unlike mouse and human, the frog Zar1 is detected in multiple tissues, including lung, muscle, and ovary. The Zar1 mRNA appears in the cytoplasm of oocytes and persists until the tailbud stage during frog embryogenesis. Mouse, rat, human, frog, zebrafish, and pufferfish Zar1 genes encode proteins of 361, 361, 424, 295, 329, and 320 amino acids, respectively, and share 50.8%-88.1% amino acid identity. Regions of the N-termini of these ZAR1 orthologs show high sequence identity among these various proteins. However, the C-terminal 103 amino acids of these proteins, encoded by exons 2-4, contain an atypical eight-cysteine Plant Homeo Domain motif and are highly conserved, sharing 80.6%-98.1% identity among these species. 
These findings suggest that the carboxyl-termini of these ZAR1 proteins contain an important functional domain that is conserved through vertebrate evolution and that may be necessary for normal female reproduction in the transition from oocyte to embryonic life.
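The protein lengths and the 103-residue conserved C-terminal region reported in this abstract allow a quick calculation of how much of each ortholog the conserved region covers. The sketch below also includes a simple percent-identity helper; the abstract does not say which identity convention the authors used, so the gap-excluding definition here is an assumption, and the aligned fragments in the usage lines are made up for illustration.

```python
# Protein lengths (amino acids) as reported in the abstract above.
ZAR1_LENGTHS = {
    "mouse": 361,
    "rat": 361,
    "human": 424,
    "frog": 295,
    "zebrafish": 329,
    "pufferfish": 320,
}
CONSERVED_C_TERMINAL_LENGTH = 103  # residues, from the abstract


def c_terminal_fraction(protein_length: int,
                        region_length: int = CONSERVED_C_TERMINAL_LENGTH) -> float:
    """Fraction of a protein covered by the conserved C-terminal region."""
    if protein_length <= 0 or region_length > protein_length:
        raise ValueError("region must fit inside a protein of positive length")
    return region_length / protein_length


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    This is one common convention among several; it is an assumption here.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not compared:
        return 0.0
    matches = sum(1 for a, b in compared if a == b)
    return 100.0 * matches / len(compared)


if __name__ == "__main__":
    for species, length in ZAR1_LENGTHS.items():
        print(f"{species}: {100 * c_terminal_fraction(length):.1f}% of the protein")
    # Made-up aligned fragments, not real ZAR1 sequence.
    print(f"{percent_identity('MKC-LHC', 'MKCALRC'):.1f}% identity")
```

By this calculation the conserved region spans roughly a quarter to a third of each ortholog, the smallest share being in the 424-residue human protein.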
Shark class II invariant chain reveals ancient conserved relationships with cathepsins and MHC class II.

PubMed

Criscitiello, Michael F; Ohta, Yuko; Graham, Matthew D; Eubanks, Jeannine O; Chen, Patricia L; Flajnik, Martin F

2012-03-01

The invariant chain (Ii) is the critical third chain required for the MHC class II heterodimer to be properly guided through the cell, loaded with peptide, and expressed on the surface of antigen presenting cells. Here, we report the isolation of the nurse shark Ii gene, and the comparative analysis of Ii splice variants, expression, genomic organization, predicted structure, and function throughout vertebrate evolution. Alternative splicing to yield Ii with and without the putative protease-protective, thyroglobulin-like domain is as ancient as the MHC-based adaptive immune system, as our analyses in shark and lizard further show conservation of this mechanism in all vertebrate classes except bony fish. Remarkable coordinate expression of Ii and class II was found in shark tissues. Conserved Ii residues and cathepsin L orthologs suggest their long co-evolution in the antigen presentation pathway, and genomic analyses suggest 450 million years of conserved Ii exon/intron structure. Other than an extended linker preceding the thyroglobulin-like domain in cartilaginous fish, the Ii gene and protein are predicted to have largely similar physiology from shark to man. Duplicated Ii genes found only in teleosts appear to have become sub-functionalized, as one form is predicted to play the same role as that mediated by Ii mRNA alternative splicing in all other vertebrate classes. No Ii homologs or potential ancestors of any of the functional Ii domains were found in the jawless fish or lower chordates. Copyright © 2011 Elsevier Ltd. All rights reserved.
Overcoming Antigenic Diversity by Enhancing the Immunogenicity of Conserved Epitopes on the Malaria Vaccine Candidate Apical Membrane Antigen-1

PubMed Central

Dutta, Sheetij; Dlugosz, Lisa S.; Drew, Damien R.; Ge, Xiopeng; Ababacar, Diouf; Rovira, Yazmin I.; Moch, J. Kathleen; Shi, Meng; Long, Carole A.; Foley, Michael; Beeson, James G.; Anders, Robin F.; Miura, Kazutoyo; Haynes, J. David; Batchelor, Adrian H.

2013-01-01

Malaria vaccine candidate Apical Membrane Antigen-1 (AMA1) induces protection, but only against parasite strains that are closely related to the vaccine. Overcoming the AMA1 diversity problem will require an understanding of the structural basis of cross-strain invasion inhibition. A vaccine containing four diverse allelic proteins 3D7, FVO, HB3 and W2mef (AMA1 Quadvax or QV) elicited polyclonal rabbit antibodies that similarly inhibited the invasion of four vaccine and 22 non-vaccine strains of P. falciparum. Comparing polyclonal anti-QV with antibodies against a strain-specific, monovalent, 3D7 AMA1 vaccine revealed that QV induced higher levels of broadly inhibitory antibodies which were associated with increased conserved face and domain-3 responses and reduced domain-2 response. Inhibitory monoclonal antibodies (mAb) raised against the QV reacted with a novel cross-reactive epitope at the rim of the hydrophobic trough on domain-1; this epitope mapped to the conserved face of AMA1 and it encompassed the 1e-loop. MAbs binding to the 1e-loop region (1B10, 4E8 and 4E11) were ∼10-fold more potent than previously characterized AMA1-inhibitory mAbs and a mode of action of these 1e-loop mAbs was the inhibition of AMA1 binding to its ligand RON2. Unlike the epitope of a previously characterized 3D7-specific mAb, 1F9, the 1e-loop inhibitory epitope was partially conserved across strains. Another novel mAb, 1E10, which bound to domain-3, was broadly inhibitory and it blocked the proteolytic processing of AMA1. By itself mAb 1E10 was weakly inhibitory but it synergized with a previously characterized, strain-transcending mAb, 4G2, which binds close to the hydrophobic trough on the conserved face and inhibits RON2 binding to AMA1. Novel inhibition susceptible regions and epitopes, identified here, can form the basis for improving the antigenic breadth and inhibitory response of AMA1 vaccines. 
Vaccination with a few diverse antigenic proteins could provide universal coverage by redirecting the immune response towards conserved epitopes. PMID:24385910
Small RNA and transcriptome deep sequencing proffers insight into floral gene regulation in Rosa cultivars

PubMed Central

2012-01-01

Background: Roses (Rosa sp.), which belong to the family Rosaceae, are the most economically important ornamental plants, making up 30% of the floriculture market. However, given high demand for roses, rose breeding programs are limited in molecular resources which can greatly enhance and speed breeding efforts. A better understanding of important genes that contribute to important floral development and desired phenotypes will lead to improved rose cultivars. For this study, we analyzed rose miRNAs and the rose flower transcriptome in order to generate a database to expound upon current knowledge regarding regulation of important floral characteristics. A rose genetic database will enable comprehensive analysis of gene expression and regulation via miRNA among different Rosa cultivars. Results: We produced more than 0.5 million reads from expressed sequences, totalling more than 110 million bp. From these, we generated 35,657, 31,434, 34,725, and 39,722 flower unigenes from Rosa hybrid: ‘Vital’, ‘Maroussia’, and ‘Sympathy’ and Rosa rugosa Thunb., respectively. The unigenes were assigned functional annotations, domains, metabolic pathways, Gene Ontology (GO) terms, Plant Ontology (PO) terms, and MIPS Functional Catalogue (FunCat) terms. Rose flower transcripts were compared with genes from whole genome sequences of Rosaceae members (apple, strawberry, and peach) and grape. We also produced approximately 40 million small RNA reads from flower tissue for Rosa, representing 267 unique miRNA tags. Among identified miRNAs, 25 of them were novel and 242 of them were conserved miRNAs. Statistical analyses of miRNA profiles revealed both shared and species-specific miRNAs, which presumably affect flower development and phenotypes. Conclusions: In this study, we constructed a Rose miRNA and transcriptome database, and we analyzed the miRNAs and transcriptome generated from the flower tissues of four Rosa cultivars.
The database provides a comprehensive genetic resource which can be used to better understand rose flower development and to identify candidate genes for important phenotypes. PMID:23171001
Small RNA and transcriptome deep sequencing proffers insight into floral gene regulation in Rosa cultivars.

PubMed

Kim, Jungeun; Park, June Hyun; Lim, Chan Ju; Lim, Jae Yun; Ryu, Jee-Youn; Lee, Bong-Woo; Choi, Jae-Pil; Kim, Woong Bom; Lee, Ha Yeon; Choi, Yourim; Kim, Donghyun; Hur, Cheol-Goo; Kim, Sukweon; Noh, Yoo-Sun; Shin, Chanseok; Kwon, Suk-Yoon

2012-11-21

Roses (Rosa sp.), which belong to the family Rosaceae, are the most economically important ornamental plants, making up 30% of the floriculture market. However, given the high demand for roses, rose breeding programs are limited in the molecular resources that could greatly enhance and speed breeding efforts. A better understanding of the genes that contribute to floral development and desired phenotypes will lead to improved rose cultivars. For this study, we analyzed rose miRNAs and the rose flower transcriptome in order to generate a database to expand upon current knowledge regarding regulation of important floral characteristics. A rose genetic database will enable comprehensive analysis of gene expression and regulation via miRNA among different Rosa cultivars. We produced more than 0.5 million reads from expressed sequences, totalling more than 110 million bp. From these, we generated 35,657, 31,434, 34,725, and 39,722 flower unigenes from Rosa hybrid: 'Vital', 'Maroussia', and 'Sympathy' and Rosa rugosa Thunb., respectively. The unigenes were assigned functional annotations, domains, metabolic pathways, Gene Ontology (GO) terms, Plant Ontology (PO) terms, and MIPS Functional Catalogue (FunCat) terms. Rose flower transcripts were compared with genes from whole genome sequences of Rosaceae members (apple, strawberry, and peach) and grape. We also produced approximately 40 million small RNA reads from flower tissue for Rosa, representing 267 unique miRNA tags. Among the identified miRNAs, 25 were novel and 242 were conserved. Statistical analyses of miRNA profiles revealed both shared and species-specific miRNAs, which presumably affect flower development and phenotypes. In this study, we constructed a Rose miRNA and transcriptome database, and we analyzed the miRNAs and transcriptome generated from the flower tissues of four Rosa cultivars. 
The database provides a comprehensive genetic resource which can be used to better understand rose flower development and to identify candidate genes for important phenotypes.
A Survey of Bioinformatics Database and Software Usage through Mining the Literature.

PubMed

Duck, Geraint; Nenadic, Goran; Filannino, Michele; Brass, Andy; Robertson, David L; Stevens, Robert

2016-01-01

Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differ between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT), though some are instead seeing rapid growth (e.g., the GO, R). We find a striking imbalance in resource usage, with the top 5% of resource names (133 names) accounting for 47% of total usage, and over 70% of the resources extracted being mentioned only once each. While these results highlight the dynamic and creative nature of bioinformatics research, they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371.
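
The imbalance reported in this abstract (a small fraction of resource names accounting for a large share of usage, and most names mentioned only once) is a cumulative-share statistic over a table of mention counts. The following is a minimal sketch of how such figures could be computed. The function names and the input format (a mapping from resource name to mention count) are assumptions made for illustration, and the toy counts are invented; none of this is taken from the study's dataset or code.

```python
def top_share(mention_counts, top_fraction=0.05):
    """Fraction of all mentions accounted for by the most-mentioned names.

    mention_counts: hypothetical mapping {resource name: number of mentions}.
    top_fraction:   share of names (by rank) to treat as the "top" group.
    """
    counts = sorted(mention_counts.values(), reverse=True)
    if not counts:
        return 0.0
    # Always include at least one name in the top group.
    k = max(1, int(len(counts) * top_fraction))
    return sum(counts[:k]) / sum(counts)


def single_mention_fraction(mention_counts):
    """Fraction of names that are mentioned exactly once."""
    if not mention_counts:
        return 0.0
    singles = sum(1 for c in mention_counts.values() if c == 1)
    return singles / len(mention_counts)


# Invented toy data: 20 names, one dominant, most mentioned once.
toy_counts = {"dominant": 47, "second": 20, "third": 16}
toy_counts.update({f"rare_{i}": 1 for i in range(17)})

print(top_share(toy_counts))                # share held by the top 5% of names
print(single_mention_fraction(toy_counts))  # share of names mentioned once
```

Sorting by count and taking a prefix keeps the calculation independent of how the names themselves were extracted, which is the part the abstract attributes to text mining.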

Dictionary-driven protein annotation.

PubMed

Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel

2002-09-01

Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. 
Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were released publicly after we built the Bio-Dictionary that is used in our experiments. Finally, we have computed the annotations of more than 70 complete genomes and made them available on the World Wide Web at http://cbcsrv.watson.ibm.com/Annotations/.
Developing an ontological explosion knowledge base for business continuity planning purposes.

PubMed

Mohammadfam, Iraj; Kalatpour, Omid; Golmohammadi, Rostam; Khotanlou, Hasan

2013-01-01

Industrial accidents are among the best-known challenges to business continuity. Many organisations have lost their reputation following devastating accidents. To manage the risks of such accidents, it is necessary to accumulate sufficient knowledge regarding their roots, causes and preventive techniques. The required knowledge might be obtained through various approaches, including databases. Unfortunately, many databases are hampered by (among other things) static data presentations, a lack of semantic features, and the inability to present accident knowledge as discrete domains. This paper proposes the use of Protégé software to develop a knowledge base for the domain of explosion accidents. Such a structure is better able to improve information retrieval than common accident databases. To accomplish this goal, a knowledge management process model was followed. The ontological explosion knowledge base (EKB) was built for further applications, including process accident knowledge retrieval and risk management. The paper will show how the EKB has a semantic feature that enables users to overcome some of the search constraints of existing accident databases.
Cloning and characterization of a novel human STAR domain containing cDNA KHDRBS2.

PubMed

Wang, Liu; Xu, Jian; Zeng, Li; Ye, Xin; Wu, Qihan; Dai, Jianfeng; Ji, Chaoneng; Gu, Shaohua; Zhao, Chunhua; Xie, Yi; Mao, Yumin

2002-12-01

KHDRBS2 (KH domain containing, RNA binding, signal transduction associated 2) is an RNA-binding protein that is tyrosine phosphorylated by Src during mitosis. It contains a KH domain, which is embedded in a larger conserved domain called the STAR domain. This protein has 99% sequence identity with rat SLM-1 (the Sam68-like mammalian protein 1) and 98% sequence identity with mouse SLM-1 in its STAR domain. KHDRBS2 has the characteristic Sam68 SH2 and SH3 domain binding sites. RT-PCR analysis showed that its transcript is ubiquitously expressed. The characterization of KHDRBS2 indicates it may link tyrosine kinase signaling cascades with some aspect of RNA metabolism.
PCR Cloning of Partial "nbs" Sequences from Grape ("Vitis aestivalis" Michx)

ERIC Educational Resources Information Center

Chang, Ming-Mei; DiGennaro, Peter; Macula, Anthony

2009-01-01

Plants defend themselves against pathogens via the expressions of disease resistance (R) genes. Many plant R gene products contain the characteristic nucleotide-binding site (NBS) and leucine-rich repeat (LRR) domains. There are highly conserved motifs within the NBS domain which could be targeted for polymerase chain reaction (PCR) cloning of R…
Noise Radiation From a Leading-Edge Slat

NASA Technical Reports Server (NTRS)

Lockard, David P.; Choudhari, Meelan M.

2009-01-01

This paper extends our previous computations of unsteady flow within the slat cove region of a multi-element high-lift airfoil configuration, which showed that both statistical and structural aspects of the experimentally observed unsteady flow behavior can be captured via 3D simulations over a computational domain of narrow spanwise extent. Although such narrow-domain simulations can account for the spanwise decorrelation of the slat cove fluctuations, the resulting database cannot be applied towards acoustic predictions of the slat without invoking additional approximations to synthesize the fluctuation field over the rest of the span. This deficiency is partially alleviated in the present work by increasing the spanwise extent of the computational domain from 37.3% of the slat chord to nearly 226% (i.e., 15% of the model span). The simulation database is used to verify consistency with previous computational results and, then, to develop predictions of the far-field noise radiation in conjunction with a frequency-domain Ffowcs Williams-Hawkings solver.
Dual Thermosensitive Hydrogels Assembled from the Conserved C-Terminal Domain of Spider Dragline Silk.

PubMed

Qian, Zhi-Gang; Zhou, Ming-Liang; Song, Wen-Wen; Xia, Xiao-Xia

2015-11-09

Stimuli-responsive hydrogels have great potential in biomedical and biotechnological applications. Because they are biodegradable and allow precise control over molecular weight, protein-based hydrogels and their applications have been extensively studied. However, protein hydrogels with dual thermosensitive properties are rarely reported. Here we present the first report of dual thermosensitive hydrogels assembled from the conserved C-terminal domain of spider dragline silk. First, we found that the recombinant C-terminal domain of major ampullate spidroin 1 (MaSp1) of the spider Nephila clavipes formed hydrogels when cooled to approximately 2 °C or heated to 65 °C. The conformational changes and self-assembly of the recombinant protein were studied using multiple methods to understand the mechanism of the gelation processes. It was proposed that the gelation in the low-temperature regime was dominated by hydrogen bonding and hydrophobic interaction between folded protein molecules, whereas the gelation in the high-temperature regime was due to cross-linking of the exposed hydrophobic patches resulting from partial unfolding of the protein upon heating. More interestingly, genetic fusion of the C-terminal domain to a short repetitive region of N. clavipes MaSp1 resulted in a chimeric protein that formed a hydrogel with significantly improved mechanical properties at low temperatures between 2 and 10 °C. Furthermore, the formation of similar hydrogels was observed for the recombinant C-terminal domains of dragline silk of different spider species, thus demonstrating the conserved ability to form dual thermosensitive hydrogels. These findings may be useful in the design and construction of novel protein hydrogels with tunable multiple thermosensitivity for future applications.
A comparative examination of odontogenic gene expression in both toothed and toothless amniotes

PubMed Central

Lainoff, Alexis J.; Moustakas-Verho, Jacqueline E.; Hu, Diane; Kallonen, Aki; Marcucio, Ralph S.; Hlusko, Leslea J.

2015-01-01

A well-known tenet of murine tooth development is that BMP4 and FGF8 antagonistically initiate odontogenesis, but whether this tenet is conserved across amniotes is largely unexplored. Moreover, changes in BMP4-signaling have previously been implicated in evolutionary tooth loss in Aves. Here we demonstrate that Bmp4, Msx1, and Msx2 expression is limited proximally in the red-eared slider turtle (Trachemys scripta) mandible at stages equivalent to those at which odontogenesis is initiated in mice, a similar finding to previously reported results in chicks. To address whether the limited domains in the turtle and the chicken indicate an evolutionary molecular parallelism, or whether the domains simply constitute an ancestral phenotype, we assessed gene expression in a toothed reptile (the American alligator, Alligator mississippiensis) and a toothed non-placental mammal (the gray short-tailed opossum, Monodelphis domestica). We demonstrate that the Bmp4 domain is limited proximally in M. domestica and that the Fgf8 domain is limited distally in A. mississippiensis just preceding odontogenesis. Additionally, we show that Msx1 and Msx2 expression patterns in these species differ from those found in mice. Our data suggest that a limited Bmp4 domain does not necessarily correlate with edentulism, and reveal that the initiation of odontogenesis in non-murine amniotes is more complex than previously imagined. Our data also suggest a partially conserved odontogenic program in T. scripta, as indicated by conserved Pitx2, Pax9, and Barx1 expression patterns and by the presence of a Shh-expressing palatal epithelium, which we hypothesize may represent potential dental rudiments based on the Testudinata fossil record. PMID:25678399
A conserved interaction that is essential for the biogenesis of histone locus bodies.

PubMed

Yang, Xiao-cui; Sabath, Ivan; Kunduru, Lalitha; van Wijnen, Andre J; Marzluff, William F; Dominski, Zbigniew

2014-12-05

Nuclear protein, ataxia-telangiectasia locus (NPAT) and FLICE-associated huge protein (FLASH) are two major components of discrete nuclear structures called histone locus bodies (HLBs). NPAT is a key co-activator of histone gene transcription, whereas FLASH through its N-terminal region functions in 3' end processing of histone primary transcripts. The C-terminal region of FLASH contains a highly conserved domain that is also present at the end of Yin Yang 1-associated protein-related protein (YARP) and its Drosophila homologue, Mute, previously shown to localize to HLBs in Drosophila cells. Here, we show that the C-terminal domain of human FLASH and YARP interacts with the C-terminal region of NPAT and that this interaction is essential and sufficient to drive FLASH and YARP to HLBs in HeLa cells. Strikingly, only the last 16 amino acids of NPAT are sufficient for the interaction. We also show that the C-terminal domain of Mute interacts with a short region at the end of the Drosophila NPAT orthologue, multi sex combs (Mxc). Altogether, our data indicate that the conserved C-terminal domain shared by FLASH, YARP, and Mute recognizes the C-terminal sequence of NPAT orthologues, thus acting as a signal targeting proteins to HLBs. Finally, we demonstrate that the C-terminal domain of human FLASH can be directly joined with its N-terminal region through alternative splicing. The resulting 190-amino acid MiniFLASH, despite lacking 90% of full-length FLASH, contains all regions necessary for 3' end processing of histone pre-mRNA in vitro and accumulates in HLBs. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.
Basic Tilted Helix Bundle - a new protein fold in human FKBP25/FKBP3 and HectD1.

PubMed

Helander, Sara; Montecchio, Meri; Lemak, Alexander; Farès, Christophe; Almlöf, Jonas; Yi, Yanjun; Yee, Adelinda; Arrowsmith, Cheryl; DhePaganon, Sirano; Sunnerhagen, Maria

2014-04-25

In this paper, we describe the structure of an N-terminal domain motif in nuclear-localized FKBP25 (residues 1-73), a member of the FKBP family, together with the structure of a sequence-related subdomain of the E3 ubiquitin ligase HectD1 that we show belongs to the same fold. This motif adopts a compact 5-helix bundle which we name the Basic Tilted Helix Bundle (BTHB) domain. A positively charged surface patch, structurally centered around the tilted helix H4, is present in both FKBP25 and HectD1 and is conserved in both proteins, suggesting a conserved functional role. We provide detailed comparative analysis of the structures of the two proteins and their sequence similarities, and analysis of the interaction of the proposed FKBP25 binding protein YY1. We suggest that the basic motif in BTHB is involved in the observed DNA binding of FKBP25, and that the function of this domain can be affected by regulatory YY1 binding and/or interactions with adjacent domains. Copyright © 2014 Elsevier Inc. All rights reserved.
The Relationship Among Sexual Attitudes, Sexual Fantasy, and Religiosity

PubMed Central

Ahrold, Tierney K.; Farmer, Melissa; Trapnell, Paul D.; Meston, Cindy M.

2015-01-01

Recent research on the impact of religiosity on sexuality has highlighted the role of the individual, and suggests that the effects of religious group and sexual attitudes and fantasy may be mediated through individual differences in spirituality. The present study investigated the role of religion in an ethnically diverse young adult sample (N = 1413, 69% women) using religious group as well as several religiosity domains: spirituality, intrinsic religiosity, paranormal beliefs, and fundamentalism. Differences between religious groups in conservative sexual attitudes were statistically significant but small; as predicted, spirituality mediated these effects. In contrast to the weak effects of religious group, spirituality, intrinsic religiosity, and fundamentalism were strong predictors of women’s conservative sexual attitudes; for men, intrinsic religiosity predicted sexual attitude conservatism but spirituality predicted attitudinal liberalism. For women, both religious group and religiosity domains were significant predictors of frequency of sexual fantasies while, for men, only religiosity domains were significant predictors. These results indicate that individual differences in religiosity domains were better predictors of sexual attitudes and fantasy than religious group and that these associations are moderated by gender. PMID:20364304
Crystal structures of the catalytic domains of pseudouridine synthases RluC and RluD from Escherichia coli.

PubMed

Mizutani, Kenji; Machida, Yoshitaka; Unzai, Satoru; Park, Sam-Yong; Tame, Jeremy R H

2004-04-20

The most frequent modification of RNA, the conversion of uridine bases to pseudouridines, is found in all living organisms and often in highly conserved locations in ribosomal and transfer RNA. RluC and RluD are homologous enzymes which each convert three specific uridine bases in Escherichia coli ribosomal 23S RNA to pseudouridine: bases 955, 2504, and 2580 in the case of RluC and 1911, 1915, and 1917 in the case of RluD. Both have an N-terminal S4 RNA binding domain. While the loss of RluC has little phenotypic effect, loss of RluD results in a much reduced growth rate. We have determined the crystal structures of the catalytic domain of RluC, and full-length RluD. The S4 domain of RluD appears to be highly flexible or unfolded and is completely invisible in the electron density map. Despite the conserved topology shared by the two proteins, the surface shape and charge distribution are very different. The models suggest significant differences in substrate binding by different pseudouridine synthases.
On the role of second number-conserving functional derivatives

NASA Astrophysics Data System (ADS)

Gál, Tamás

2006-06-01

It is found that number-conserving second derivatives, of functional differentiation constrained to the domain of functional variables ρ(x) of a given norm ∫ρ(x)dx, are not obtained via two successive number-conserving differentiations, contrary to the case of unrestricted second derivatives. Investigating the role of second number-conserving derivatives, with the density-functional formulation of time-dependent quantum mechanics in focus, it is shown how number-conserving differentiation handles the dual nature of the Kohn-Sham potential arising in the practical use of the theory. On the other hand, it is pointed out that number-conserving derivatives cannot resolve the causality paradox connected with the second derivative of the exchange-correlation part of the action density functional.
Developmental Gene Discovery in a Hemimetabolous Insect: De Novo Assembly and Annotation of a Transcriptome for the Cricket Gryllus bimaculatus

PubMed Central

Zeng, Victor; Ewen-Campen, Ben; Horch, Hadley W.; Roth, Siegfried; Mito, Taro; Extavour, Cassandra G.

2013-01-01

Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in Gryllus. 
PMID:23671567
Co-Conserved MAPK Features Couple D-Domain Docking Groove to Distal Allosteric Sites via the C-Terminal Flanking Tail

PubMed Central

Nguyen, Tuan; Ruan, Zheng; Oruganty, Krishnadev; Kannan, Natarajan

2015-01-01

Mitogen activated protein kinases (MAPKs) form a closely related family of kinases that control critical pathways associated with cell growth and survival. Although MAPKs have been extensively characterized at the biochemical, cellular, and structural level, an integrated evolutionary understanding of how MAPKs differ from other closely related protein kinases is currently lacking. Here, we perform statistical sequence comparisons of MAPKs and related protein kinases to identify sequence and structural features associated with MAPK functional divergence. We show, for the first time, that virtually all MAPK-distinguishing sequence features, including an unappreciated short insert segment in the β4-β5 loop, physically couple distal functional sites in the kinase domain to the D-domain peptide docking groove via the C-terminal flanking tail (C-tail). The coupling mediated by MAPK-specific residues confers an allosteric regulatory mechanism unique to MAPKs. In particular, the regulatory αC-helix conformation is controlled by a MAPK-conserved salt bridge interaction between an arginine in the αC-helix and an acidic residue in the C-tail. The salt-bridge interaction is modulated in unique ways in individual sub-families to achieve regulatory specificity. Our study is consistent with a model in which the C-tail co-evolved with the D-domain docking site to allosterically control MAPK activity. Our study provides testable mechanistic hypotheses for biochemical characterization of MAPK-conserved residues and new avenues for the design of allosteric MAPK inhibitors. PMID:25799139
Crystal structures of a halophilic archaeal malate synthase from Haloferax volcanii and comparisons with isoforms A and G

PubMed Central

2011-01-01

Background Malate synthase, one of the two enzymes unique to the glyoxylate cycle, is found in all three domains of life, and is crucial to the utilization of two-carbon compounds for net biosynthetic pathways such as gluconeogenesis. In addition to the main isoforms A and G, so named because of their differential expression in E. coli grown on either acetate or glycolate respectively, a third distinct isoform has been identified. These three isoforms differ considerably in size and sequence conservation. The A isoform (MSA) comprises ~530 residues, the G isoform (MSG) is ~730 residues, and this third isoform (MSH-halophilic) is ~430 residues in length. Both isoforms A and G have been structurally characterized in detail, but no structures have been reported for the H isoform which has been found thus far only in members of the halophilic Archaea. Results We have solved the structure of a malate synthase H (MSH) isoform member from Haloferax volcanii in complex with glyoxylate at 2.51 Å resolution, and also as a ternary complex with acetyl-coenzyme A and pyruvate at 1.95 Å. Like the A and G isoforms, MSH is based on a β8/α8 (TIM) barrel. Unlike previously solved malate synthase structures which are all monomeric, this enzyme is found in the native state as a trimer/hexamer equilibrium. Compared to isoforms A and G, MSH displays deletion of an N-terminal domain and a smaller deletion at the C-terminus. The MSH active site is closely superimposable with those of MSA and MSG, with the ternary complex indicating a nucleophilic attack on pyruvate by the enolate intermediate of acetyl-coenzyme A. Conclusions The reported structures of MSH from Haloferax volcanii allow a detailed analysis and comparison with previously solved structures of isoforms A and G. 
These structural comparisons provide insight into evolutionary relationships among these isoforms, and also indicate that despite the size and sequence variation, and the truncated C-terminal domain of the H isoform, the catalytic mechanism is conserved. Sequence analysis in light of the structure indicates that additional members of isoform H likely exist in the databases but have been misannotated. PMID:21569248
Coiled-coil protein composition of 22 proteomes--differences and common themes in subcellular infrastructure and traffic control.

PubMed

Rose, Annkatrin; Schraegle, Shannon J; Stahlberg, Eric A; Meier, Iris

2005-11-16

Long alpha-helical coiled-coil proteins are involved in diverse organizational and regulatory processes in eukaryotic cells. They provide cables and networks in the cyto- and nucleoskeleton, molecular scaffolds that organize membrane systems and tissues, motors, levers, rotating arms, and possibly springs. Mutations in long coiled-coil proteins have been implicated in a growing number of human diseases. Using the coiled-coil prediction program MultiCoil, we have previously identified all long coiled-coil proteins from the model plant Arabidopsis thaliana and have established a searchable Arabidopsis coiled-coil protein database. Here, we have identified all proteins with long coiled-coil domains from 21 additional fully sequenced genomes. Because regions predicted to form coiled-coils interfere with sequence homology determination, we have developed a sequence comparison and clustering strategy based on masking predicted coiled-coil domains. Comparing and grouping all long coiled-coil proteins from 22 genomes, the kingdom-specificity of coiled-coil protein families was determined. At the same time, a number of proteins with unknown function could be grouped with already characterized proteins from other organisms. MultiCoil predicts proteins with extended coiled-coil domains (more than 250 amino acids) to be largely absent from bacterial genomes, but present in archaea and eukaryotes. The structural maintenance of chromosomes proteins and their relatives are the only long coiled-coil protein family clearly conserved throughout all kingdoms, indicating their ancient nature. Motor proteins, membrane tethering and vesicle transport proteins are the dominant eukaryote-specific long coiled-coil proteins, suggesting that coiled-coil proteins have gained functions in the increasingly complex processes of subcellular infrastructure maintenance and trafficking control of the eukaryotic cell.
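
The masking step mentioned in this abstract (hiding predicted coiled-coil regions so that they do not dominate sequence comparison) can be illustrated with a short sketch. The abstract does not describe the authors' implementation, so everything here is an assumption: the function name, the interval convention (0-based, half-open), and the use of 'X' as the mask character are chosen only for illustration, and the predicted regions would in practice come from a separate predictor such as MultiCoil.

```python
def mask_regions(sequence, regions, mask_char="X"):
    """Replace residues inside each predicted region with mask_char.

    sequence: protein sequence as a string of one-letter codes.
    regions:  iterable of (start, end) pairs, 0-based and half-open
              (hypothetical format; overlapping pairs are allowed).
    """
    residues = list(sequence)
    for start, end in regions:
        # Clamp each interval to the bounds of the sequence.
        start = max(0, start)
        end = min(len(residues), end)
        for i in range(start, end):
            residues[i] = mask_char
    return "".join(residues)


# Invented toy sequence with one hypothetical predicted region.
masked = mask_regions("MKTAYIAKQR", [(2, 5)])
print(masked)
```

Masking in place, rather than deleting the region, keeps residue positions aligned with the original sequence, so any downstream similarity hits can still be reported in the original coordinates.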
Coiled-coil protein composition of 22 proteomes – differences and common themes in subcellular infrastructure and traffic control

PubMed Central

Rose, Annkatrin; Schraegle, Shannon J; Stahlberg, Eric A; Meier, Iris

2005-01-01

Background Long alpha-helical coiled-coil proteins are involved in diverse organizational and regulatory processes in eukaryotic cells. They provide cables and networks in the cyto- and nucleoskeleton, molecular scaffolds that organize membrane systems and tissues, motors, levers, rotating arms, and possibly springs. Mutations in long coiled-coil proteins have been implicated in a growing number of human diseases. Using the coiled-coil prediction program MultiCoil, we have previously identified all long coiled-coil proteins from the model plant Arabidopsis thaliana and have established a searchable Arabidopsis coiled-coil protein database. Results Here, we have identified all proteins with long coiled-coil domains from 21 additional fully sequenced genomes. Because regions predicted to form coiled-coils interfere with sequence homology determination, we have developed a sequence comparison and clustering strategy based on masking predicted coiled-coil domains. Comparing and grouping all long coiled-coil proteins from 22 genomes, the kingdom-specificity of coiled-coil protein families was determined. At the same time, a number of proteins with unknown function could be grouped with already characterized proteins from other organisms. Conclusion MultiCoil predicts proteins with extended coiled-coil domains (more than 250 amino acids) to be largely absent from bacterial genomes, but present in archaea and eukaryotes. The structural maintenance of chromosomes proteins and their relatives are the only long coiled-coil protein family clearly conserved throughout all kingdoms, indicating their ancient nature. Motor proteins, membrane tethering and vesicle transport proteins are the dominant eukaryote-specific long coiled-coil proteins, suggesting that coiled-coil proteins have gained functions in the increasingly complex processes of subcellular infrastructure maintenance and trafficking control of the eukaryotic cell. PMID:16288662
Nanobody Binding to a Conserved Epitope Promotes Norovirus Particle Disassembly

PubMed Central

Koromyslova, Anna D.

2014-01-01

ABSTRACT Human noroviruses are icosahedral single-stranded RNA viruses. The capsid protein is divided into shell (S) and protruding (P) domains, which are connected by a flexible hinge region. There are numerous genetically and antigenically distinct noroviruses, and the dominant strains evolve every other year. Vaccine and antiviral development is hampered by the difficulties in growing human norovirus in cell culture and the continually evolving strains. Here, we show the X-ray crystal structures of human norovirus P domains in complex with two different nanobodies. One nanobody, Nano-85, was broadly reactive, while the other, Nano-25, was strain specific. We showed that both nanobodies bound to the lower region on the P domain and had nanomolar affinities. The Nano-85 binding site mainly comprised highly conserved amino acids among the genetically distinct genogroup II noroviruses. Several of the conserved residues also were recognized by a broadly reactive monoclonal antibody, which suggested this region contained a dominant epitope. Superposition of the P domain nanobody complex structures into a cryoelectron microscopy particle structure revealed that both nanobodies bound at occluded sites on the particles. The flexible hinge region, which contained ∼10 to 12 amino acids, likely permitted a certain degree of P domain movement on the particles in order to accommodate the nanobodies. Interestingly, the Nano-85 binding interaction with intact particles caused the particles to disassemble in vitro. Altogether, these results suggested that the highly conserved Nano-85 binding epitope contained a trigger mechanism for particle disassembly. Principally, this epitope represents a potential site of norovirus vulnerability. IMPORTANCE We characterized two different nanobodies (Nano-85 and Nano-25) that bind to human noroviruses. Both nanobodies bound with high affinities to the lower region of the P domain, which was occluded on intact particles. 
Nano-25 was specific for GII.10, whereas Nano-85 bound several different GII genotypes, including GII.4, GII.10, and GII.12. We showed that Nano-85 was able to detect norovirus virions in clinical stool specimens using a sandwich enzyme-linked immunosorbent assay. Importantly, we found that Nano-85 binding to intact particles caused the particles to disassemble. We believe that with further testing, Nano-85 not only will work as a diagnostic reagent in norovirus detection systems but also could function as a broadly reactive GII norovirus antiviral. PMID:25520510
Penrose Well Temperatures

DOE Data Explorer

Christopherson, Karen

2013-03-15

Geothermal waters have been encountered in several wells near Penrose in Fremont County, Colorado. Most of the wells were drilled for oil and gas exploration and, in a few cases, production. This ESRI point shapefile utilizes data from 95 wells in and around the Penrose area provided by the Colorado Oil and Gas Conservation Commission (COGCC) database at http://cogcc.state.co.us/ . Temperature data from the database were used to calculate a temperature gradient for each well. This information was then used to estimate temperatures at various depths. Projection: UTM Zone 13, NAD27. Extent: West -105.224871, East -105.027633, North 38.486269, South 38.259507. Originators: Colorado Oil and Gas Conservation Commission (COGCC); Karen Christopherson.
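The gradient calculation described in this record can be sketched as follows. The record does not give the formula that was used; this sketch assumes a simple linear gradient between an assumed mean surface temperature and a single measured downhole temperature. The function names, the 10 °C surface default, and the example values are hypothetical.

```python
def temperature_gradient(downhole_temp_c, depth_m, surface_temp_c=10.0):
    """Linear gradient in degrees C per metre between the surface and one reading.

    The surface temperature default is an assumption, not a value from the dataset.
    """
    if depth_m <= 0:
        raise ValueError("depth_m must be positive")
    return (downhole_temp_c - surface_temp_c) / depth_m


def estimate_temperature(gradient_c_per_m, depth_m, surface_temp_c=10.0):
    """Extrapolate the linear gradient to another depth."""
    return surface_temp_c + gradient_c_per_m * depth_m


# Hypothetical well: 70 degrees C measured at 2000 m.
gradient = temperature_gradient(70.0, 2000.0)
print(gradient * 1000.0)                        # gradient in degrees C per km
print(estimate_temperature(gradient, 1000.0))   # estimated temperature at 1000 m
```

A linear model is the simplest choice; it ignores changes in rock conductivity with depth and any correction applied to raw well readings.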
Forest Conservation Opportunity Areas - Liberal Model (ECO_RES.COA_FORREST33)

EPA Pesticide Factsheets

This layer designates areas with potential for forest conservation. These are areas of natural or semi-natural forest land cover patches that are at least 75 meters away from roads and away from patch edges. These opportunity areas (OAs) were modeled by creating distance grids using the National Land Cover Database and the Census Bureau's TIGER roads files.
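A distance-grid selection of this kind can be sketched on a small raster. This is a simplified illustration, not the procedure used to build the layer: it assumes 30-meter cells, applies the 75-meter threshold to both roads and patch edges, treats the nearest non-forest cell as the patch edge, and does not treat the raster boundary as an edge. All function names are hypothetical.

```python
import math


def distance_grid(targets, cell_size_m):
    """Distance in metres from each cell centre to the nearest True cell in `targets`."""
    rows, cols = len(targets), len(targets[0])
    points = [(r, c) for r in range(rows) for c in range(cols) if targets[r][c]]
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            if points:
                nearest = min(math.hypot(r - pr, c - pc) for pr, pc in points)
                row.append(nearest * cell_size_m)
            else:
                row.append(math.inf)  # no target cells anywhere in the raster
        grid.append(row)
    return grid


def core_forest(forest, roads, cell_size_m=30.0, min_dist_m=75.0):
    """Forest cells at least `min_dist_m` from any road cell and any non-forest cell."""
    non_forest = [[not cell for cell in row] for row in forest]
    d_road = distance_grid(roads, cell_size_m)
    d_edge = distance_grid(non_forest, cell_size_m)
    return [
        [
            forest[r][c] and d_road[r][c] >= min_dist_m and d_edge[r][c] >= min_dist_m
            for c in range(len(forest[0]))
        ]
        for r in range(len(forest))
    ]


# Hypothetical 7 x 7 raster: a road along the top row, forest everywhere else.
forest = [[r != 0 for _ in range(7)] for r in range(7)]
roads = [[r == 0 for _ in range(7)] for r in range(7)]
core = core_forest(forest, roads)
for row in core:
    print("".join("#" if cell else "." for cell in row))
```

The brute-force nearest-cell search is only practical for small rasters; a real workflow would use a GIS or image-processing library's distance transform.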
