conserved core sequence: Topics by Science.gov

Sample records for conserved core sequence

DNA sequence analysis of ARS elements from chromosome III of Saccharomyces cerevisiae: identification of a new conserved sequence.

PubMed Central

Palzkill, T G; Oliver, S G; Newlon, C S

1986-01-01

Four fragments of Saccharomyces cerevisiae chromosome III DNA which carry ARS elements have been sequenced. Each fragment contains multiple copies of sequences that have at least 10 out of 11 bases of homology to a previously reported 11 bp core consensus sequence. A survey of these new ARS sequences and previously reported sequences revealed the presence of an additional 11 bp conserved element located on the 3' side of the T-rich strand of the core consensus. Subcloning analysis as well as deletion and transposon insertion mutagenesis of ARS fragments support a role for 3' conserved sequence in promoting ARS activity. PMID:3529036
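The "at least 10 out of 11 bases" criterion in the record above can be illustrated with a short scan. This is a minimal sketch, not the authors' procedure: the abstract does not state the consensus itself, so `CORE_CONSENSUS` below is an assumed placeholder written in IUPAC codes (a form often quoted for the ARS core consensus), and `near_matches` is a hypothetical helper name. Only the given strand is scanned.

```python
# IUPAC nucleotide codes mapped to the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "R": "AG", "Y": "CT", "N": "ACGT",
}

# Assumption: placeholder 11 bp consensus; replace with the sequence
# from the cited work before drawing any conclusions.
CORE_CONSENSUS = "WTTTAYRTTTW"


def near_matches(seq, consensus=CORE_CONSENSUS, min_matches=10):
    """Return (position, window, matches) for each window that agrees
    with the consensus at min_matches or more positions."""
    seq = seq.upper()
    k = len(consensus)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        # Count positions where the base is allowed by the consensus code.
        matches = sum(base in IUPAC[code] for base, code in zip(window, consensus))
        if matches >= min_matches:
            hits.append((i, window, matches))
    return hits
```

Calling `near_matches(fragment)` on a sequenced fragment lists every candidate element together with its match count; scanning the reverse complement as well would cover the opposite strand.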
Hepatitis C virus genotypes in Singapore and Indonesia.

PubMed

Ng, W C; Guan, R; Tan, M F; Seet, B L; Lim, C A; Ngiam, C M; Sjaifoellah Noer, H M; Lesmana, L

1995-01-01

5' untranslated and partial core (C) region sequences of hepatitis C virus (HCV) in 21 Singaporean and 15 Indonesian isolates were amplified by reverse-transcription polymerase chain reaction and sequenced with the use of conserved primer sequences deduced from HCV genomes identified in other geographical regions. The HCV genotypes are predominantly Simmonds type 1, with fewer of types 2 and 3; type 3 is currently not detected in Indonesia. The 5' untranslated sequences are related to HCV-1, DK-7 (Denmark), US-11 (United States of America), HCV-J4, SA-10 (South Africa), T-3 (Taiwan), HCV-J6, HCV-J8, Eb-1 and Eb-8. When compared with the prototype HCV-1, insertions are found within the 5' untranslated region of Singaporean isolates but not in the Indonesian isolates. There are Singaporean and Indonesian isolates that have sequences within the 5' untranslated region that differ slightly from each other. Microheterogeneity is observed in the core region of two Singaporean isolates and one Indonesian isolate. Finally, not all HCV isolates can be amplified with the conserved core sequence primers, whereas these isolates are readily amplified with 5' untranslated region conserved primers.
A single amino-acid substitution in the Ets domain alters core DNA binding specificity of Ets1 to that of the related transcription factors Elf1 and E74.

PubMed

Bosselut, R; Levin, J; Adjadj, E; Ghysdael, J

1993-11-11

Ets proteins form a family of sequence-specific DNA binding proteins which bind DNA through an 85-amino-acid conserved domain, the Ets domain, whose sequence is unrelated to any other characterized DNA binding domain. Unlike all other known Ets proteins, which bind specific DNA sequences centered over either GGAA or GGAT core motifs, E74 and Elf1 selectively bind to GGAA core-containing sites. Elf1 and E74 differ from other Ets proteins in three residues located in an otherwise highly conserved region of the Ets domain, referred to as conserved region III (CRIII). We show that a restricted selectivity for GGAA core-containing sites could be conferred to Ets1 upon changing a single lysine residue within CRIII to the threonine found in Elf1 and E74 at this position. Conversely, the reciprocal mutation in Elf1 confers to this protein the ability to bind to GGAT core-containing EBS. This, together with the fact that mutation of two invariant arginine residues in CRIII abolishes DNA binding, indicates that CRIII plays a key role in Ets domain recognition of the GGAA/T core motif and leads us to discuss a model of Ets proteins--core motif interaction.
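The GGAA versus GGAT core selectivity described in the record above can be made concrete with a small scan. This is a toy sketch under stated assumptions: the class labels `"Ets1-like"` and `"Elf1/E74-like"` and the functions `find_ets_cores` and `accepts_core` are hypothetical names, and the rule encodes only the core-motif preference stated in the abstract, not flanking-sequence effects or binding affinity.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def find_ets_cores(seq):
    """List (strand, position, core) for each GGAA or GGAT occurrence.
    Positions on the '-' strand are offsets into the reverse complement."""
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for i in range(len(s) - 3):
            core = s[i:i + 4]
            if core in ("GGAA", "GGAT"):
                hits.append((strand, i, core))
    return hits


# Core preference as summarized in the abstract (illustrative labels).
SELECTIVITY = {
    "Ets1-like": {"GGAA", "GGAT"},
    "Elf1/E74-like": {"GGAA"},
}


def accepts_core(protein_class, core):
    """True if the protein class is described as binding this core motif."""
    return core in SELECTIVITY[protein_class]
```

For example, `find_ets_cores("ACCGGAAGT")` reports a single GGAA core on the given strand, which both classes accept, whereas a GGAT core would be accepted only by the `"Ets1-like"` class.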
Selection of Optimal Polypurine Tract Region Sequences during Moloney Murine Leukemia Virus Replication

PubMed Central

Robson, Nicole D.; Telesnitsky, Alice

2000-01-01

Retrovirus plus-strand synthesis is primed by a cleavage remnant of the polypurine tract (PPT) region of viral RNA. In this study, we tested replication properties for Moloney murine leukemia viruses with targeted mutations in the PPT and in conserved sequences upstream, as well as for pools of mutants with randomized sequences in these regions. The importance of maintaining some purine residues within the PPT was indicated both by examining the evolution of random PPT pools and from the replication properties of targeted mutants. Although many different PPT sequences could support efficient replication and one mutant that contained two differences in the core PPT was found to replicate as well as the wild type, some sequences in the core PPT clearly conferred advantages over others. Contributions of sequences upstream of the core PPT were examined with deletion mutants. A conserved T-stretch within the upstream sequence was examined in detail and found to be unimportant to helper functions. Evolution of virus pools containing randomized T-stretch sequences demonstrated marked preference for the wild-type sequence in six of its eight positions. These findings demonstrate that maintenance of the T-rich element is more important to viral replication than is maintenance of the core PPT. PMID:11044073
Conserved Curvature of RNA Polymerase I Core Promoter Beyond rRNA Genes: The Case of the Tritryps

PubMed Central

Smircich, Pablo; Duhagon, María Ana; Garat, Beatriz

2015-01-01

In trypanosomatids, the RNA polymerase I (RNAPI)-dependent promoters controlling the ribosomal RNA (rRNA) genes have been well identified. Although the RNAPI transcription machinery recognizes the DNA conformation instead of the DNA sequence of promoters, no conformational study has been reported for these promoters. Here we present the in silico analysis of the intrinsic DNA curvature of the rRNA gene core promoters in Trypanosoma brucei, Trypanosoma cruzi, and Leishmania major. We found that, in spite of the absence of sequence conservation, these promoters hold conformational properties similar to other eukaryotic rRNA promoters. Our results also indicated that the intrinsic DNA curvature pattern is conserved within the Leishmania genus and also among strains of T. cruzi and T. brucei. Furthermore, we analyzed the impact of point mutations on the intrinsic curvature and on the promoter activity. In addition, we found that the core promoters of protein-coding genes transcribed by RNAPI in T. brucei show the same conserved conformational characteristics. Overall, our results indicate that DNA intrinsic curvature of the rRNA gene core promoters is conserved in these ancient eukaryotes and such conserved curvature might be a requirement of RNAPI machinery for transcription of not only rRNA genes but also protein-coding genes. PMID:26718450
Evolutionary and biophysical relationships among the papillomavirus E2 proteins.

PubMed

Blakaj, Dukagjin M; Fernandez-Fuentes, Narcis; Chen, Zigui; Hegde, Rashmi; Fiser, Andras; Burk, Robert D; Brenowitz, Michael

2009-01-01

Infection by human papillomavirus (HPV) may result in clinical conditions ranging from benign warts to invasive cancer. The HPV E2 protein represses oncoprotein transcription and is required for viral replication. HPV E2 binds to palindromic DNA sequences of highly conserved four base pair sequences flanking an identical length variable 'spacer'. E2 proteins directly contact the conserved but not the spacer DNA. Variation in naturally occurring spacer sequences results in differential protein affinity that is dependent on their sensitivity to the spacer DNA's unique conformational and/or dynamic properties. This article explores the biophysical character of this core viral protein with the goal of identifying characteristics that are associated with risk of virally caused malignancy. The amino acid sequence, 3D structure and electrostatic features of the E2 protein DNA binding domain are highly conserved; specific interactions with DNA binding sites have also been conserved. In contrast, the E2 protein's transactivation domain does not have extensive surfaces of highly conserved residues. Rather, regions of high conservation are localized to small surface patches. Implications to cancer biology are discussed.
Biogeographic Comparison of Lophelia-Associated Bacterial Communities in the Western Atlantic Reveals Conserved Core Microbiome

PubMed Central

Kellogg, Christina A.; Goldsmith, Dawn B.; Gray, Michael A.

2017-01-01

Over the last decade, publications on deep-sea corals have tripled. Most attention has been paid to Lophelia pertusa, a globally distributed scleractinian coral that creates critical three-dimensional habitat in the deep ocean. The bacterial community associated with L. pertusa has been previously described by a number of studies at sites in the Mediterranean Sea, Norwegian fjords, off Great Britain, and in the Gulf of Mexico (GOM). However, use of different methodologies prevents direct comparisons in most cases. Our objectives were to address intra-regional variation and to identify any conserved bacterial core community. We collected samples from three distinct colonies of L. pertusa at each of four locations within the western Atlantic: three sites within the GOM and one off the east coast of the United States. Amplicon libraries of 16S rRNA genes were generated using primers targeting the V4–V5 hypervariable region and 454 pyrosequencing. The dominant phylum was Proteobacteria (75–96%). At the family level, 80–95% of each sample was comprised of five groups: Pirellulaceae, Pseudonocardiaceae, Rhodobacteraceae, Sphingomonadaceae, and unclassified Oceanospirillales. Principal coordinate analysis based on weighted UniFrac distances showed a clear distinction between the GOM and Atlantic samples. Interestingly, the replicate samples from each location did not always cluster together, indicating there is not a strong site-specific influence. The core bacterial community, conserved in 100% of the samples, was dominated by the operational taxonomic units of genera Novosphingobium and Pseudonocardia, both known degraders of aromatic hydrocarbons. The sequence of another core member, Propionibacterium, was also found in prior studies of L. pertusa from Norway and Great Britain, suggesting a role as a conserved symbiont. By examining more than 40,000 sequences per sample, we found that GOM samples were dominated by the identified conserved core sequences, whereas open Atlantic samples had a much higher proportion of locally consistent bacteria. Further, predictive functional profiling highlights the potential for the L. pertusa microbiome to contribute to chemoautotrophy, nutrient cycling, and antibiotic production. PMID:28522997
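The notion of a core community "conserved in 100% of the samples" in the record above amounts to a prevalence filter over per-sample OTU counts. The following is a minimal sketch, assuming counts are held in plain dictionaries; the function names `core_otus` and `core_fraction` and the `min_prevalence` parameter are hypothetical, and the study's own pipeline is not described at this level in the abstract.

```python
def core_otus(sample_counts, min_prevalence=1.0):
    """Return the sorted OTUs detected (count > 0) in at least
    min_prevalence of the samples (1.0 means every sample)."""
    n = len(sample_counts)
    all_otus = set().union(*(counts.keys() for counts in sample_counts.values()))
    core = []
    for otu in sorted(all_otus):
        present = sum(1 for counts in sample_counts.values() if counts.get(otu, 0) > 0)
        if present / n >= min_prevalence:
            core.append(otu)
    return core


def core_fraction(counts, core):
    """Fraction of one sample's reads that belong to the core OTUs."""
    total = sum(counts.values())
    return sum(counts.get(otu, 0) for otu in core) / total if total else 0.0
```

Comparing `core_fraction` across samples gives one simple way to express how strongly each sample is dominated by core sequences; lowering `min_prevalence` relaxes the definition of the core.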
Ancient Exaptation of a CORE-SINE Retroposon into a Highly Conserved Mammalian Neuronal Enhancer of the Proopiomelanocortin Gene

PubMed Central

Bumaschny, Viviana F; Low, Malcolm J; Rubinstein, Marcelo

2007-01-01

The proopiomelanocortin gene (POMC) is expressed in the pituitary gland and the ventral hypothalamus of all jawed vertebrates, producing several bioactive peptides that function as peripheral hormones or central neuropeptides, respectively. We have recently determined that mouse and human POMC expression in the hypothalamus is conferred by the action of two 5′ distal and unrelated enhancers, nPE1 and nPE2. To investigate the evolutionary origin of the neuronal enhancer nPE2, we searched available vertebrate genome databases and determined that nPE2 is a highly conserved element in placentals, marsupials, and monotremes, whereas it is absent in nonmammalian vertebrates. Following an in silico paleogenomic strategy based on genome-wide searches for paralog sequences, we discovered that opossum and wallaby nPE2 sequences are highly similar to members of the superfamily of CORE-short interspersed nucleotide element (SINE) retroposons, in particular to MAR1 retroposons that are widely present in marsupial genomes. Thus, the neuronal enhancer nPE2 originated from the exaptation of a CORE-SINE retroposon in the lineage leading to mammals and remained under purifying selection in all mammalian orders for the last 170 million years. Expression studies performed in transgenic mice showed that two nonadjacent nPE2 subregions are essential to drive reporter gene expression into POMC hypothalamic neurons, providing the first functional example of an exapted enhancer derived from an ancient CORE-SINE retroposon. In addition, we found that this CORE-SINE family of retroposons is likely to still be active in American and Australian marsupial genomes and that several highly conserved exonic, intronic and intergenic sequences in the human genome originated from the exaptation of CORE-SINE retroposons. Together, our results provide clear evidence of the functional novelties that transposed elements contributed to their host genomes throughout evolution. PMID:17922573
The ARTT motif and a unified structural understanding of substrate recognition in ADP-ribosylating bacterial toxins and eukaryotic ADP-ribosyltransferases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Han, S.; Tainer, J.A.

2001-08-01

ADP-ribosylation is a widely occurring and biologically critical covalent chemical modification process in pathogenic mechanisms, intracellular signaling systems, DNA repair, and cell division. The reaction is catalyzed by ADP-ribosyltransferases, which transfer the ADP-ribose moiety of NAD to a target protein with nicotinamide release. A family of bacterial toxins and eukaryotic enzymes has been termed the mono-ADP-ribosyltransferases, in distinction to the poly-ADP-ribosyltransferases, which catalyze the addition of multiple ADP-ribose groups to the carboxyl terminus of eukaryotic nucleoproteins. Despite the limited primary sequence homology among the different ADP-ribosyltransferases, a central cleft bearing the NAD-binding pocket formed by the two perpendicular β-sheet core has been remarkably conserved between bacterial toxins and eukaryotic mono- and poly-ADP-ribosyltransferases. The majority of bacterial toxins and eukaryotic mono-ADP-ribosyltransferases are characterized by conserved His and catalytic Glu residues. In contrast, Diphtheria toxin, Pseudomonas exotoxin A, and eukaryotic poly-ADP-ribosyltransferases are characterized by conserved Arg and catalytic Glu residues. The NAD-binding core of a binary toxin and a C3-like toxin family identified an ARTT motif (ADP-ribosylating turn-turn motif) that is implicated in substrate specificity and recognition by structural and mutagenic studies. Here we apply structure-based sequence alignment and comparative structural analyses of all known structures of ADP-ribosyltransferases to suggest that this ARTT motif is functionally important in many ADP-ribosylating enzymes that bear a NAD binding cleft as characterized by conserved Arg and catalytic Glu residues. Overall, structure-based sequence analysis reveals common core structures and conserved active sites of ADP-ribosyltransferases to support similar NAD binding mechanisms but differing mechanisms of target protein binding via sequence variations within the ARTT motif structural framework. Thus, we propose here that the ARTT motif represents an experimentally testable general recognition motif region for many ADP-ribosyltransferases and thereby potentially provides a unified structural understanding of substrate recognition in ADP-ribosylation processes.
The Murine Norovirus Core Subgenomic RNA Promoter Consists of a Stable Stem-Loop That Can Direct Accurate Initiation of RNA Synthesis

PubMed Central

Yunus, Muhammad Amir; Lin, Xiaoyan; Bailey, Dalan; Karakasiliotis, Ioannis; Chaudhry, Yasmin; Vashist, Surender; Zhang, Guo; Thorne, Lucy; Kao, C. Cheng; Goodfellow, Ian

2014-01-01

ABSTRACT: All members of the Caliciviridae family of viruses produce a subgenomic RNA during infection. The subgenomic RNA typically encodes only the major and minor capsid proteins, but in murine norovirus (MNV), the subgenomic RNA also encodes the VF1 protein, which functions to suppress host innate immune responses. To date, the mechanism of norovirus subgenomic RNA synthesis has not been characterized. We have previously described the presence of an evolutionarily conserved RNA stem-loop structure on the negative-sense RNA, the complementary sequence of which codes for the viral RNA-dependent RNA polymerase (NS7). The conserved stem-loop is positioned 6 nucleotides 3′ of the start site of the subgenomic RNA in all caliciviruses. We demonstrate that the conserved stem-loop is essential for MNV viability. Mutant MNV RNAs with substitutions in the stem-loop replicated poorly until they accumulated mutations that revert to restore the stem-loop sequence and/or structure. The stem-loop sequence functions in a noncoding context, as it was possible to restore the replication of an MNV mutant by introducing an additional copy of the stem-loop between the NS7- and VP1-coding regions. Finally, in vitro biochemical data suggest that the stem-loop sequence is sufficient for the initiation of viral RNA synthesis by the recombinant MNV RNA-dependent RNA polymerase, confirming that the stem-loop forms the core of the norovirus subgenomic promoter. IMPORTANCE: Noroviruses are a significant cause of viral gastroenteritis, and it is important to understand the mechanism of norovirus RNA synthesis. Here we describe the identification of an RNA stem-loop structure that functions as the core of the norovirus subgenomic RNA promoter in cells and in vitro. This work provides new insights into the molecular mechanisms of norovirus RNA synthesis and the sequences that determine the recognition of viral RNA by the RNA-dependent RNA polymerase. PMID:25392209
The impact of p53 protein core domain structural alteration on ovarian cancer survival.

PubMed

Rose, Stephen L; Robertson, Andrew D; Goodheart, Michael J; Smith, Brian J; DeYoung, Barry R; Buller, Richard E

2003-09-15

Although survival with a p53 missense mutation is highly variable, p53-null mutation is an independent adverse prognostic factor for advanced stage ovarian cancer. By evaluating ovarian cancer survival based upon a structure function analysis of the p53 protein, we tested the hypothesis that not all missense mutations are equivalent. The p53 gene was sequenced from 267 consecutive ovarian cancers. The effect of individual missense mutations on p53 structure was analyzed using the International Agency for Research on Cancer p53 Mutational Database, which specifies the effects of p53 mutations on p53 core domain structure. Mutations in the p53 core domain were classified as either explained or not explained in structural or functional terms by their predicted effects on protein folding, protein-DNA contacts, or mutation in highly conserved residues. Null mutations were classified by their mechanism of origin. Mutations were sequenced from 125 tumors. Effects of 62 of the 82 missense mutations (76%) could be explained by alterations in the p53 protein. Twenty-three (28%) of the explained mutations occurred in highly conserved regions of the p53 core protein. Twenty-two nonsense point mutations and 21 frameshift null mutations were sequenced. Survival was independent of missense mutation type and mechanism of null mutation. The hypothesis that not all missense mutations are equivalent is, therefore, rejected. Furthermore, p53 core domain structural alteration secondary to missense point mutation is not functionally equivalent to a p53-null mutation. The poor prognosis associated with p53-null mutation is independent of the mutation mechanism.
In silico evolution of the Drosophila gap gene regulatory sequence under elevated mutational pressure.

PubMed

Chertkova, Aleksandra A; Schiffman, Joshua S; Nuzhdin, Sergey V; Kozlov, Konstantin N; Samsonova, Maria G; Gursky, Vitaly V

2017-02-07

Cis-regulatory sequences are often composed of many low-affinity transcription factor binding sites (TFBSs). Determining the evolutionary and functional importance of regulatory sequence composition is impeded without a detailed knowledge of the genotype-phenotype map. We simulate the evolution of regulatory sequences involved in Drosophila melanogaster embryo segmentation during early development. Natural selection evaluates gene expression dynamics produced by a computational model of the developmental network. We observe a dramatic decrease in the total number of transcription factor binding sites through the course of evolution. Despite a decrease in average sequence binding energies through time, the regulatory sequences tend towards organisations containing increased high affinity transcription factor binding sites. Additionally, the binding energies of separate sequence segments demonstrate ubiquitous mutual correlations through time. Fewer than 10% of initial TFBSs are maintained throughout the entire simulation, deemed 'core' sites. These sites have increased functional importance as assessed under wild-type conditions and their binding energy distributions are highly conserved. Furthermore, TFBSs within close proximity of core sites exhibit increased longevity, reflecting functional regulatory interactions with core sites. In response to elevated mutational pressure, evolution tends to sample regulatory sequence organisations with fewer, albeit on average, stronger functional transcription factor binding sites. These organisations are also shaped by the regulatory interactions among core binding sites with sites in their local vicinity.
CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design

PubMed Central

Rose, Timothy M.; Henikoff, Jorja G.; Henikoff, Steven

2003-01-01

We have developed a new primer design strategy for PCR amplification of distantly related gene sequences based on consensus-degenerate hybrid oligonucleotide primers (CODEHOPs). An interactive program has been written to design CODEHOP PCR primers from conserved blocks of amino acids within multiply-aligned protein sequences. Each CODEHOP consists of a pool of related primers containing all possible nucleotide sequences encoding 3–4 highly conserved amino acids within a 3′ degenerate core. A longer 5′ non-degenerate clamp region contains the most probable nucleotide predicted for each flanking codon. CODEHOPs are used in PCR amplification to isolate distantly related sequences encoding the conserved amino acid sequence. The primer design software and the CODEHOP PCR strategy have been utilized for the identification and characterization of new gene orthologs and paralogs in different plant, animal and bacterial species. In addition, this approach has been successful in identifying new pathogen species. The CODEHOP designer (http://blocks.fhcrc.org/codehop.html) is linked to BlockMaker and the Multiple Alignment Processor within the Blocks Database World Wide Web (http://blocks.fhcrc.org). PMID:12824413
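The primer layout described in the record above (a 3′ degenerate core covering 3–4 conserved amino acids, preceded by a 5′ non-degenerate clamp) can be sketched in a few lines. This is an illustration only, not the CODEHOP program: `codehop_pool` and its parameters are hypothetical names, the standard genetic code is assumed, and the clamp simply takes the first codon in table order for each residue unless a `preferred` codon is supplied, which stands in for the "most probable nucleotide" prediction that the real software derives from the alignment.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for idx, (b1, b2, b3) in enumerate(product(BASES, repeat=3)):
    CODONS.setdefault(AMINO[idx], []).append(b1 + b2 + b3)


def codehop_pool(block, core_len=4, preferred=None):
    """Build a pool of forward primers from a conserved amino acid block.

    The last core_len residues form the 3' degenerate core (every codon
    combination is enumerated); the residues before them form the 5'
    clamp, with one codon chosen per residue.
    """
    preferred = preferred or {}
    clamp_aas, core_aas = block[:-core_len], block[-core_len:]
    # Assumption: first codon in table order unless a preferred codon is given.
    clamp = "".join(preferred.get(aa, CODONS[aa][0]) for aa in clamp_aas)
    cores = product(*(CODONS[aa] for aa in core_aas))
    return [clamp + "".join(core) for core in cores]
```

Blocks ending in residues with few codons (for example Met or Trp) keep the pool small, since the pool size is the product of the codon counts of the core residues.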
CORAL: aligning conserved core regions across domain families.

PubMed

Fong, Jessica H; Marchler-Bauer, Aron

2009-08-01

Homologous protein families share highly conserved sequence and structure regions that are frequent targets for comparative analysis of related proteins and families. Many protein families, such as the curated domain families in the Conserved Domain Database (CDD), exhibit similar structural cores. To improve accuracy in aligning such protein families, we propose a profile-profile method CORAL that aligns individual core regions as gap-free units. CORAL computes optimal local alignment of two profiles with heuristics to preserve continuity within core regions. We benchmarked its performance on curated domains in CDD, which have pre-defined core regions, against COMPASS, HHalign and PSI-BLAST, using structure superpositions and comprehensive curator-optimized alignments as standards of truth. CORAL improves alignment accuracy on core regions over general profile methods, returning a balanced score of 0.57 for over 80% of all domain families in CDD, compared with the highest balanced score of 0.45 from other methods. Further, CORAL provides E-values to aid in detecting homologous protein families and, by respecting block boundaries, produces alignments with improved 'readability' that facilitate manual refinement. CORAL will be included in future versions of the NCBI Cn3D/CDTree software, which can be downloaded at http://www.ncbi.nlm.nih.gov/Structure/cdtree/cdtree.shtml. Supplementary data are available at Bioinformatics online.
Early Evolution of Conserved Regulatory Sequences Associated with Development in Vertebrates

PubMed Central

McEwen, Gayle K.; Goode, Debbie K.; Parker, Hugo J.; Woolfe, Adam; Callaway, Heather; Elgar, Greg

2009-01-01

Comparisons between diverse vertebrate genomes have uncovered thousands of highly conserved non-coding sequences, an increasing number of which have been shown to function as enhancers during early development. Despite their extreme conservation over 500 million years from humans to cartilaginous fish, these elements appear to be largely absent in invertebrates, and, to date, there has been little understanding of their mode of action or the evolutionary processes that have modelled them. We have now exploited emerging genomic sequence data for the sea lamprey, Petromyzon marinus, to explore the depth of conservation of this type of element in the earliest diverging extant vertebrate lineage, the jawless fish (agnathans). We searched for conserved non-coding elements (CNEs) at 13 human gene loci and identified lamprey elements associated with all but two of these gene regions. Although markedly shorter and less well conserved than within jawed vertebrates, identified lamprey CNEs are able to drive specific patterns of expression in zebrafish embryos, which are almost identical to those driven by the equivalent human elements. These CNEs are therefore a unique and defining characteristic of all vertebrates. Furthermore, alignment of lamprey and other vertebrate CNEs should permit the identification of persistent sequence signatures that are responsible for common patterns of expression and contribute to the elucidation of the regulatory language in CNEs. Identifying the core regulatory code for development, common to all vertebrates, provides a foundation upon which regulatory networks can be constructed and might also illuminate how large conserved regulatory sequence blocks evolve and become fixed in genomic DNA. PMID:20011110
The Replacement of 10 Non-Conserved Residues in the Core Protein of JFH-1 Hepatitis C Virus Improves Its Assembly and Secretion

PubMed Central

Etienne, Loïc; Blanchard, Emmanuelle; Boyer, Audrey; Desvignes, Virginie; Gaillard, Julien; Meunier, Jean-Christophe; Roingeard, Philippe; Hourioux, Christophe

2015-01-01

Hepatitis C virus (HCV) assembly is still poorly understood. It is thought that trafficking of the HCV core protein to the lipid droplet (LD) surface is essential for its multimerization and association with newly synthesized HCV RNA to form the viral nucleocapsid. We carried out a mapping analysis of several complete HCV genomes of all genotypes, and found that the genotype 2 JFH-1 core protein contained 10 residues different from those of other genotypes. The replacement of these 10 residues of the JFH-1 strain sequence with the most conserved residues deduced from sequence alignments greatly increased virus production. Confocal microscopy of the modified JFH-1 strain in cell culture showed that the mutated JFH-1 core protein, C10M, was present mostly at the endoplasmic reticulum (ER) membrane, but not at the surface of the LDs, even though its trafficking to these organelles was possible. The non-structural 5A protein of HCV was also redirected to ER membranes and colocalized with the C10M core protein. Using a Semliki forest virus vector to overproduce core protein, we demonstrated that the C10M core protein was able to form HCV-like particles, unlike the native JFH-1 core protein. Thus, the substitution of a few selected residues in the JFH-1 core protein modified the subcellular distribution and assembly properties of the protein. These findings suggest that the early steps of HCV assembly occur at the ER membrane rather than at the LD surface. The C10M-JFH-1 strain will be a valuable tool for further studies of HCV morphogenesis. PMID:26339783
Are sdAs helium core stars?

NASA Astrophysics Data System (ADS)

Pelisoli, Ingrid; Kepler, S. O.; Koester, Detlev

2017-12-01

Evolved stars with a helium core can be formed by non-conservative mass exchange interaction with a companion or by strong mass loss. Their masses are smaller than 0.5 M⊙. In the database of the Sloan Digital Sky Survey (SDSS), there are several thousand stars which were classified by the pipeline as dwarf O, B and A stars. Considering the lifetimes of these classes on the main sequence, and their distance modulus at the SDSS bright saturation, if these were common main sequence stars, there would be a considerable population of young stars very far from the galactic disk. Their spectra are dominated by Balmer lines which suggest effective temperatures around 8 000-10 000 K. Several thousand have significant proper motions, indicative of distances smaller than 1 kpc. Many show surface gravities intermediate between those of main sequence stars and white dwarfs, 4.75 < log g < 6.5, hence they have been called sdA stars. Their physical nature and evolutionary history remain a puzzle. We propose they are not H-core main sequence stars, but helium core stars and the outcomes of binary evolution. We report the discovery of two new extremely-low mass white dwarfs among the sdAs to support this statement.

Localization of yeast RNA polymerase I core subunits by immunoelectron microscopy.

PubMed Central

Klinger, C; Huet, J; Song, D; Petersen, G; Riva, M; Bautz, E K; Sentenac, A; Oudet, P; Schultz, P

1996-01-01

Immunoelectron microscopy was used to determine the spatial organization of the yeast RNA polymerase I core subunits on a three-dimensional model of the enzyme. Images of antibody-labeled enzymes were compared with the native enzyme to determine the localization of the antibody binding site on the surface of the model. Monoclonal antibodies were used as probes to identify the two largest subunits homologous to the bacterial beta and beta' subunits. The epitopes for the two monoclonal antibodies were mapped using subunit-specific phage display libraries, thus allowing a direct correlation of the structural data with functional information on conserved sequence elements. An epitope close to conserved region C of the beta-like subunit is located at the base of the finger-like domain, whereas a sequence between conserved regions C and D of the beta'-like subunit is located in the apical region of the enzyme. Polyclonal antibodies outlined the alpha-like subunit AC40 and subunit AC19 which were found co-localized also in the apical region of the enzyme. The spatial location of the subunits is correlated with their biological activity and the inhibitory effect of the antibodies. PMID:8887555
Prevalence of mutations in hepatitis C virus core protein associated with alteration of NF-kappaB activation.

PubMed

Mann, Elizabeth A; Stanford, Sandra; Sherman, Kenneth E

2006-10-01

The hepatitis C virus (HCV) core protein is a key structural element of the virion but also affects a number of cellular pathways, including nuclear factor kappaB (NF-kappaB) signaling. NF-kappaB is a transcription factor that regulates both anti-apoptotic and pro-inflammatory genes and its activation may contribute to HCV-mediated pathogenesis. Amino acid sequence divergence in core is seen at the genotype level as well as within patient isolates. Recent work has implicated amino acids 9-11 of core in the modulation of NF-kappaB activation. We report that the sequence RKT is highly conserved (93%) at this position across all HCV genotypes, based on sequences collected in the Los Alamos HCV database. Of the 13 types of variants present in the database, the two most prevalent substitutions are RQT and RKP. We further show that core encoding RKP fails to activate NF-kappaB signaling in vitro while NF-kappaB activation by core encoding RQT does not differ from control RKT core. The effect of RKP core is specific to NF-kappaB signaling as activator protein 1 (AP-1) activity is not altered. Further studies are needed to assess potential associations between specific amino acid substitutions at positions 9-11 and liver disease progression and/or response to treatment in individual patients.
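The conservation figure reported above comes from tallying the residues found at core positions 9-11 across database sequences. A minimal Python sketch of that kind of tally follows. The function name `motif_frequencies` and the toy sequences are hypothetical and are not taken from the Los Alamos HCV database; positions are 1-based and inclusive.

```python
from collections import Counter


def motif_frequencies(sequences, start=9, end=11):
    """Fraction of sequences carrying each motif at positions start..end.

    Positions are 1-based and inclusive. Sequences too short to cover the
    window are skipped rather than padded.
    """
    motifs = Counter(seq[start - 1:end] for seq in sequences if len(seq) >= end)
    total = sum(motifs.values())
    return {motif: count / total for motif, count in motifs.items()}


if __name__ == "__main__":
    # Toy N-terminal fragments: eight with RKT, one with RQT, one with RKP.
    toy = (["MSTNPKPQRKTKRNT"] * 8
           + ["MSTNPKPQRQTKRNT", "MSTNPKPQRKPKRNT"])
    for motif, freq in sorted(motif_frequencies(toy).items()):
        print(motif, freq)
```

On a real data set the input would be an alignment rather than raw sequences, so that positions 9-11 refer to the same residues in every isolate.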
Characterization of a native hammerhead ribozyme derived from schistosomes

PubMed Central

OSBORNE, EDITH M.; SCHAAK, JANELL E.; DEROSE, VICTORIA J.

2005-01-01

A recent re-examination of the role of the helices surrounding the conserved core of the hammerhead ribozyme has identified putative loop–loop interactions between stems I and II in native hammerhead sequences. These extended hammerhead sequences are more active at low concentrations of divalent cations than are minimal hammerheads. The loop–loop interactions are proposed to stabilize a more active conformation of the conserved core. Here, a kinetic and thermodynamic characterization of an extended hammerhead sequence derived from Schistosoma mansoni is performed. Biphasic kinetics are observed, suggesting the presence of at least two conformers, one cleaving with a fast rate and the other with a slow rate. Replacing loop II with a poly(U) sequence designed to eliminate the interaction between the two loops results in greatly diminished activity, suggesting that the loop–loop interactions do aid in forming a more active conformation. Previous studies with minimal hammerheads have shown deleterious effects of Rp-phosphorothioate substitutions at the cleavage site and 5′ to A9, both of which could be rescued with Cd2+. Here, phosphorothioate modifications at the cleavage site and 5′ to A9 were made in the schistosome-derived sequence. In Mg2+, both phosphorothioate substitutions decreased the overall fraction cleaved without significantly affecting the observed rate of cleavage. The addition of Cd2+ rescued cleavage in both cases, suggesting that these are still putative metal binding sites in this native sequence. PMID:15659358
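Biphasic cleavage kinetics of the kind described above are commonly modeled as the sum of two first-order processes, one for each conformer. The Python sketch below assumes that double-exponential form; the function name `fraction_cleaved`, its parameter names and the example values are hypothetical and are not the fitted values from the study.

```python
import math


def fraction_cleaved(t, a_fast, k_fast, a_slow, k_slow):
    """Fraction of ribozyme cleaved at time t under a two-population model.

    a_fast and a_slow are the amplitudes (fractions) of the fast- and
    slow-cleaving populations; k_fast and k_slow are their first-order rate
    constants, in reciprocal units of t.
    """
    fast = a_fast * (1.0 - math.exp(-k_fast * t))
    slow = a_slow * (1.0 - math.exp(-k_slow * t))
    return fast + slow


if __name__ == "__main__":
    # Illustrative parameters only: 50% fast population, 30% slow population.
    for t in (0, 1, 10, 100):
        print(t, round(fraction_cleaved(t, 0.5, 1.0, 0.3, 0.01), 3))
```

In such a model the plateau (`a_fast + a_slow`) is the overall fraction cleaved, so a modification can lower the plateau without changing either rate constant, which matches the pattern reported for the phosphorothioate substitutions in Mg2+.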
Biochemical Characterization of a Mycobacteriophage Derived DnaB Ortholog Reveals New Insight into the Evolutionary Origin of DnaB Helicases

PubMed Central

Bhowmik, Priyanka; Das Gupta, Sujoy K.

2015-01-01

The bacterial replicative helicases known as DnaB are considered to be members of the RecA superfamily. All members of this superfamily, including DnaB, have a conserved C-terminal domain, known as the RecA core. We unearthed a series of mycobacteriophage encoded proteins in which the RecA core domain alone was present. These proteins were phylogenetically related to each other and formed a distinct clade within the RecA superfamily. A mycobacteriophage encoded protein, Wildcat Gp80, that roots deep in the DnaB family, was found to possess a core domain having significant sequence homology (Expect value < 10⁻⁵) with members of this novel cluster. This indicated that Wildcat Gp80, and by extrapolation, other members of the DnaB helicase family, may have evolved from a single domain RecA core polypeptide belonging to this novel group. Biochemical investigations confirmed that Wildcat Gp80 was a helicase. Surprisingly, our investigations also revealed that a thioredoxin tagged truncated version of the protein in which the N-terminal sequences were removed was fully capable of supporting helicase activity, although its ATP dependence properties were different. DnaB helicase activity is thus primarily a function of the RecA core, although additional N-terminal sequences may be necessary for fine tuning its activity and stability. Based on sequence comparison and biochemical studies we propose that DnaB helicases may have evolved from single domain RecA core proteins having helicase activities of their own, through the incorporation of additional N-terminal sequences. PMID:26237048
The crystal structure of Erwinia amylovora AmyR, a member of the YbjN protein family, shows similarity to type III secretion chaperones but suggests different cellular functions

PubMed Central

Bartho, Joseph D.; Bellini, Dom; Wuerges, Jochen; Demitri, Nicola; Toccafondi, Mirco; Schmitt, Armin O.; Zhao, Youfu; Walsh, Martin A.

2017-01-01

AmyR is a stress and virulence associated protein from the plant pathogenic Enterobacteriaceae species Erwinia amylovora, and is a functionally conserved ortholog of YbjN from Escherichia coli. The crystal structure of E. amylovora AmyR reveals a class I type III secretion chaperone-like fold, despite the lack of sequence similarity between these two classes of protein and lacking any evidence of a secretion-associated role. The results indicate that AmyR, and YbjN proteins in general, function through protein-protein interactions without any enzymatic action. The YbjN proteins of Enterobacteriaceae show remarkably low sequence similarity with other members of the YbjN protein family in Eubacteria, yet a high level of structural conservation is observed. Across the YbjN protein family sequence conservation is limited to residues stabilising the protein core and dimerization interface, while interacting regions are only conserved between closely related species. This study presents the first structure of a YbjN protein from Enterobacteriaceae, the most highly divergent and well-studied subgroup of YbjN proteins, and an in-depth sequence and structural analysis of this important but poorly understood protein family. PMID:28426806
The crystal structure of Erwinia amylovora AmyR, a member of the YbjN protein family, shows similarity to type III secretion chaperones but suggests different cellular functions.

PubMed

Bartho, Joseph D; Bellini, Dom; Wuerges, Jochen; Demitri, Nicola; Toccafondi, Mirco; Schmitt, Armin O; Zhao, Youfu; Walsh, Martin A; Benini, Stefano

2017-01-01

AmyR is a stress and virulence associated protein from the plant pathogenic Enterobacteriaceae species Erwinia amylovora, and is a functionally conserved ortholog of YbjN from Escherichia coli. The crystal structure of E. amylovora AmyR reveals a class I type III secretion chaperone-like fold, despite the lack of sequence similarity between these two classes of protein and lacking any evidence of a secretion-associated role. The results indicate that AmyR, and YbjN proteins in general, function through protein-protein interactions without any enzymatic action. The YbjN proteins of Enterobacteriaceae show remarkably low sequence similarity with other members of the YbjN protein family in Eubacteria, yet a high level of structural conservation is observed. Across the YbjN protein family sequence conservation is limited to residues stabilising the protein core and dimerization interface, while interacting regions are only conserved between closely related species. This study presents the first structure of a YbjN protein from Enterobacteriaceae, the most highly divergent and well-studied subgroup of YbjN proteins, and an in-depth sequence and structural analysis of this important but poorly understood protein family.
Energy-converting [NiFe] hydrogenases: more than just H2 activation.

PubMed

Hedderich, Reiner; Forzi, Lucia

2005-01-01

The well-characterized [NiFe] hydrogenases have a key function in the H2 metabolism of various microorganisms. A subfamily of the [NiFe] hydrogenases with unique properties has recently been identified. The six conserved subunits that build the core of these membrane-bound hydrogenases share sequence similarity with subunits that form the catalytic core of energy-conserving NADH:quinone oxidoreductases (complex I). The physiological role of some of these hydrogenases is to catalyze the reduction of H+ with electrons derived from reduced ferredoxins or polyferredoxins. This exergonic reaction is coupled to energy conservation by means of electron-transport phosphorylation. Other members of this hydrogenase subfamily mainly function in providing the cell with reduced ferredoxin using H2 as electron donor in a reaction driven by reverse electron transport. These hydrogenases have therefore been designated as energy-converting [NiFe] hydrogenases. Copyright 2005 S. Karger AG, Basel.
Fanconi Anemia Core Complex Gene Promoters Harbor Conserved Transcription Regulatory Elements

PubMed Central

Meier, Daniel; Schindler, Detlev

2011-01-01

The Fanconi anemia (FA) gene family is a recent addition to the complex network of proteins that respond to and repair certain types of DNA damage in the human genome. Since little is known about the regulation of this novel group of genes at the DNA level, we characterized the promoters of the eight genes (FANCA, B, C, E, F, G, L and M) that compose the FA core complex. The promoters of these genes show the characteristic attributes of housekeeping genes, such as a high GC content and CpG islands, a lack of TATA boxes and a low conservation. The promoters functioned in a monodirectional way and were, in their most active regions, comparable in strength to the SV40 promoter in our reporter plasmids. They were also marked by a distinctive transcriptional start site (TSS). In the 5′ region of each promoter, we identified a region that was able to negatively regulate the promoter activity in HeLa and HEK 293 cells in isolation. The central and 3′ regions of the promoter sequences harbor binding sites for several common and rare transcription factors, including STAT, SMAD, E2F, AP1 and YY1, which indicates that there may be cross-connections to several established regulatory pathways. Electrophoretic mobility shift assays and siRNA experiments confirmed the shared regulatory responses between the prominent members of the TGF-β and JAK/STAT pathways and members of the FA core complex. Although the promoters are not well conserved, they share region and sequence specific regulatory motifs and transcription factor binding sites (TBFs), and we identified a bi-partite nature to these promoters. These results support a hypothesis based on the co-evolution of the FA core complex genes that was expanded to include their promoters. PMID:21826217
Genetic Structure and Selection of a Core Collection for Long Term Conservation of Avocado in Mexico

PubMed Central

Guzmán, Luis F.; Machida-Hirano, Ryoko; Borrayo, Ernesto; Cortés-Cruz, Moisés; Espíndola-Barquera, María del Carmen; Heredia García, Elena

2017-01-01

Mexico, as the center of origin of avocado (Persea americana Mill.), harbors a wide genetic diversity of this species, whose identification may provide the grounds to not only understand its unique population structure and domestication history, but also inform the efforts aimed at its conservation. Although molecular characterization of cultivated avocado germplasm has been studied by several research groups, this had not been the case in Mexico. In order to elucidate the genetic structure of avocado in Mexico and the sustainable use of its genetic resources, 318 avocado accessions conserved in the germplasm collection in the National Avocado Genebank were analyzed using 28 markers [9 expressed sequence tag-Simple Sequence Repeats (SSRs) and 19 genomic SSRs]. Deviation from Hardy-Weinberg equilibrium and high inter-locus linkage disequilibrium were observed, especially in drymifolia and guatemalensis. Total averages of the observed and expected heterozygosity were 0.59 and 0.75, respectively. Although clear genetic differentiation was not observed among the three botanical races (americana, drymifolia, and guatemalensis), the analyzed Mexican population can be classified into two groups that correspond to two different ecological regions. We developed a core collection using the K-means clustering method. The 36 individuals selected as the core collection successfully represented more than 80% of total alleles and showed heterozygosity values equal to or higher than those of the original collection, despite constituting slightly more than 10% of the latter. Accessions selected as members of the core collection have now become candidates to be introduced into cryopreservation, implying a minimum loss of genetic diversity and a back-up for existing field collections of such important genetic resources. PMID:28286510
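The summary statistics quoted above (observed and expected heterozygosity, and the share of alleles captured by a core collection) can be sketched for a single locus as follows. This is a minimal Python illustration that assumes diploid genotypes and the simple estimator He = 1 − Σp², which may differ from the estimator the authors used; the function names and the toy genotypes are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity at one diploid locus.

    genotypes is a list of (allele, allele) pairs. Expected heterozygosity
    uses He = 1 - sum(p_i ** 2), without a small-sample correction.
    """
    n = len(genotypes)
    observed = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for pair in genotypes for allele in pair)
    expected = 1.0 - sum((c / (2 * n)) ** 2 for c in counts.values())
    return observed, expected


def allele_coverage(core, full):
    """Fraction of the alleles in the full collection present in the core."""
    core_alleles = {allele for pair in core for allele in pair}
    full_alleles = {allele for pair in full for allele in pair}
    return len(core_alleles & full_alleles) / len(full_alleles)


if __name__ == "__main__":
    # Toy SSR genotypes (allele sizes in bp) for one locus.
    full = [(150, 150), (150, 154), (154, 154), (150, 154), (158, 158)]
    core = [(150, 154), (158, 158)]
    print(heterozygosity(full))
    print(allele_coverage(core, full))
```

In practice both quantities would be averaged over all 28 markers, and an observed value well below the expected one (0.59 against 0.75 in the abstract) is one sign of deviation from Hardy-Weinberg equilibrium.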
Conformation of Tax-response elements in the human T-cell leukemia virus type I promoter.

PubMed

Cox, J M; Sloan, L S; Schepartz, A

1995-12-01

HTLV-I Tax is believed to activate viral gene expression by binding bZIP proteins (such as CREB) and increasing their affinities for proviral TRE target sites. Each 21 bp TRE target site contains an imperfect copy of the intrinsically bent CRE target site (the TRE core) surrounded by highly conserved flanking sequences. These flanking sequences are essential for maximal increases in DNA affinity and transactivation, but they are not, apparently, contacted by protein. Here we employ non-denaturing gel electrophoresis to evaluate TRE conformation in the presence and absence of bZIP proteins, and to explore the role of DNA conformation in viral transactivation. Our results show that the TRE-1 flanking sequences modulate the structure and modestly increase the affinity of a CREB bZIP peptide for the TRE-1 core recognition sequence. These flanking sequences are also essential for a maximal increase in stability of the CREB-DNA complex in the presence of Tax. The CRE-like TRE core and the TRE flanking sequences are both essential for formation of stable CREB-TRE-1 and Tax-CREB-TRE-1 complexes. These two DNA segments may have co-evolved into a unique structure capable of recognizing Tax and a bZIP protein.
Evolution of the arginase fold and functional diversity

PubMed Central

Dowling, Daniel P.; Costanzo, Luigi Di; Gennadios, Heather A.; Christianson, David W.

2009-01-01

The large number of protein structures deposited in the Protein Data Bank allows for the identification of novel structural superfamilies based on conservation of fold in addition to conservation of amino acid sequence. Since sequence diverges more rapidly than fold in protein evolution, proteins with little or no significant sequence identity are occasionally observed to adopt similar folds, thereby reflecting unanticipated evolutionary relationships. Here, we review the unique α/β fold first observed in the manganese metalloenzyme rat liver arginase, consisting of a parallel 8 stranded β-sheet surrounded by several helices, and its evolutionary relationship with the zinc-requiring and/or iron-requiring histone deacetylases and acetylpolyamine amidohydrolases. Structural comparisons reveal key features of the core α/β fold that contribute to the divergent metal ion specificity and stoichiometry required for the chemical and biological functions of these enzymes. PMID:18360740
An evolutionary analysis identifies a conserved pentapeptide stretch containing the two essential lysine residues for rice L-myo-inositol 1-phosphate synthase catalytic activity

PubMed Central

Basak, Papri; Maitra-Majee, Susmita; Das, Jayanta Kumar; Mukherjee, Abhishek; Ghosh Dastidar, Shubhra; Pal Choudhury, Pabitra

2017-01-01

A molecular evolutionary analysis of a well conserved protein helps to determine the essential amino acids in the core catalytic region. Based on the chemical properties of amino acid residues, phylogenetic analysis of a total of 172 homologous sequences of a highly conserved enzyme, L-myo-inositol 1-phosphate synthase or MIPS from evolutionarily diverse organisms was performed. This study revealed the presence of six phylogenetically conserved blocks, out of which four embrace the catalytic core of the functional protein. Further, specific amino acid modifications targeting the lysine residues, known to be important for MIPS catalysis, were performed at the catalytic site of a MIPS from monocotyledonous model plant, Oryza sativa (OsMIPS1). Following this study, OsMIPS mutants with deletion or replacement of lysine residues in the conserved blocks were made. Based on the enzyme kinetics performed on the deletion/replacement mutants, phylogenetic and structural comparison with the already established crystal structures from non-plant sources, an evolutionarily conserved peptide stretch was identified at the active pocket which contains the two most important lysine residues essential for catalytic activity. PMID:28950028
Comparative genomics and prediction of conditionally dispensable sequences in legume-infecting Fusarium oxysporum formae speciales facilitates identification of candidate effectors.

PubMed

Williams, Angela H; Sharma, Mamta; Thatcher, Louise F; Azam, Sarwar; Hane, James K; Sperschneider, Jana; Kidd, Brendan N; Anderson, Jonathan P; Ghosh, Raju; Garg, Gagan; Lichtenzveig, Judith; Kistler, H Corby; Shea, Terrance; Young, Sarah; Buck, Sally-Anne G; Kamphuis, Lars G; Saxena, Rachit; Pande, Suresh; Ma, Li-Jun; Varshney, Rajeev K; Singh, Karam B

2016-03-05

Soil-borne fungi of the Fusarium oxysporum species complex cause devastating wilt disease on many crops including legumes that supply human dietary protein needs across many parts of the globe. We present and compare draft genome assemblies for three legume-infecting formae speciales (ff. spp.): F. oxysporum f. sp. ciceris (Foc-38-1) and f. sp. pisi (Fop-37622), significant pathogens of chickpea and pea respectively, the world's second and third most important grain legumes, and lastly f. sp. medicaginis (Fom-5190a) for which we developed a model legume pathosystem utilising Medicago truncatula. Focusing on the identification of pathogenicity gene content, we leveraged the reference genomes of Fusarium pathogens F. oxysporum f. sp. lycopersici (tomato-infecting) and F. solani (pea-infecting) and their well-characterised core and dispensable chromosomes to predict genomic organisation in the newly sequenced legume-infecting isolates. Dispensable chromosomes are not essential for growth and in Fusarium species are known to be enriched in host-specificity and pathogenicity-associated genes. Comparative genomics of the publicly available Fusarium species revealed differential patterns of sequence conservation across F. oxysporum formae speciales, with legume-pathogenic formae speciales not exhibiting greater sequence conservation between them relative to non-legume-infecting formae speciales, possibly indicating the lack of a common ancestral source for legume pathogenicity. Combining predicted dispensable gene content with in planta expression in the model legume-infecting isolate, we identified small conserved regions and candidate effectors, four of which shared greatest similarity to proteins from another legume-infecting ff. spp. We demonstrate that distinction of core and potential dispensable genomic regions of novel F. oxysporum genomes is an effective tool to facilitate effector discovery and the identification of gene content possibly linked to host specificity. While the legume-infecting isolates didn't share large genomic regions of pathogenicity-related content, smaller regions and candidate effector proteins were highly conserved, suggesting that they may play specific roles in inducing disease on legume hosts.
Evolution of the cytoskeleton

PubMed Central

Erickson, Harold P.

2009-01-01

The eukaryotic cytoskeleton appears to have evolved from ancestral precursors related to prokaryotic FtsZ and MreB. FtsZ and MreB show 40−50% sequence identity across different bacterial and archaeal species. Here I suggest that this represents the limit of divergence that is consistent with maintaining their functions for cytokinesis and cell shape. Previous analyses have noted that tubulin and actin are highly conserved across eukaryotic species, but so divergent from their prokaryotic relatives as to be hardly recognizable from sequence comparisons. One suggestion for this extreme divergence of tubulin and actin is that it occurred as they evolved very different functions from FtsZ and MreB. I will present new arguments favoring this suggestion, and speculate on pathways. Moreover, the extreme conservation of tubulin and actin across eukaryotic species is not due to an intrinsic lack of variability, but is attributed to their acquisition of elaborate mechanisms for assembly dynamics and their interactions with multiple motor and binding proteins. A new structure-based sequence alignment identifies amino acids that are conserved from FtsZ to tubulins. The highly conserved amino acids are not those forming the subunit core or protofilament interface, but those involved in binding and hydrolysis of GTP. PMID:17563102
Spliced leader RNA of trypanosomes: in vivo mutational analysis reveals extensive and distinct requirements for trans splicing and cap4 formation.

PubMed Central

Lücke, S; Xu, G L; Palfi, Z; Cross, M; Bellofatto, V; Bindereif, A

1996-01-01

In trypanosomes mRNAs are generated through trans splicing. The spliced leader (SL) RNA, which donates the 5'-terminal mini-exon to each of the protein coding exons, plays a central role in the trans splicing process. We have established in vivo assays to study in detail trans splicing, cap4 modification, and RNP assembly of the SL RNA in the trypanosomatid species Leptomonas seymouri. First, we found that extensive sequences within the mini-exon are required for SL RNA function in vivo, although a conserved length of 39 nt is not essential. In contrast, the intron sequence appears to be surprisingly tolerant to mutation; only the stem-loop II structure is indispensable. The asymmetry of the sequence requirements in the stem I region suggests that this domain may exist in different functional conformations. Second, distinct mini-exon sequences outside the modification site are important for efficient cap4 formation. Third, all SL RNA mutations tested allowed core RNP assembly, suggesting flexible requirements for core protein binding. In sum, the results of our mutational analysis provide evidence for a discrete domain structure of the SL RNA and help to explain the strong phylogenetic conservation of the mini-exon sequence and of the overall SL RNA secondary structure; they also suggest that there may be certain differences between trans splicing in nematodes and trypanosomes. This approach provides a basis for studying RNA-RNA interactions in the trans spliceosome. PMID:8861965
Comparative and genetic analysis of the four sequenced Paenibacillus polymyxa genomes reveals a diverse metabolism and conservation of genes relevant to plant-growth promotion and competitiveness.

PubMed

Eastman, Alexander W; Heinrichs, David E; Yuan, Ze-Chun

2014-10-03

Members of the genus Paenibacillus are important plant growth-promoting rhizobacteria that can serve as bio-reactors. Paenibacillus polymyxa promotes the growth of a variety of economically important crops. Our lab recently completed the genome sequence of Paenibacillus polymyxa CR1. As of January 2014, four P. polymyxa genomes have been completely sequenced but no comparative genomic analyses have been reported. Here we report the comparative and genetic analyses of four sequenced P. polymyxa genomes, which revealed a significantly conserved core genome. Complex metabolic pathways and regulatory networks were highly conserved and allow P. polymyxa to rapidly respond to dynamic environmental cues. Genes responsible for phytohormone synthesis, phosphate solubilization, iron acquisition, transcriptional regulation, σ-factors, stress responses, transporters and biomass degradation were well conserved, indicating an intimate association with plant hosts and the rhizosphere niche. In addition, genes responsible for antimicrobial resistance and non-ribosomal peptide/polyketide synthesis are present in both the core and accessory genome of each strain. Comparative analyses also reveal variations in the accessory genome, including large plasmids present in strains M1 and SC2. Furthermore, a considerable number of strain-specific genes and genomic islands are irregularly distributed throughout each genome. Although a variety of plant-growth promoting traits are encoded by all strains, only P. polymyxa CR1 encodes the unique nitrogen fixation cluster found in other Paenibacillus sp. Our study revealed that genomic loci relevant to host interaction and ecological fitness are highly conserved within the P. polymyxa genomes analysed, despite variations in the accessory genome. This work suggests that plant-growth promotion by P. polymyxa is mediated largely through phytohormone production, increased nutrient availability and bio-control mechanisms. This study provides an in-depth understanding of the genome architecture of this species, thus facilitating future genetic engineering and applications in agriculture, industry and medicine. Furthermore, this study highlights the current gap in our understanding of complex plant biomass metabolism in Gram-positive bacteria.
Core histone genes of Giardia intestinalis: genomic organization, promoter structure, and expression

PubMed Central

Yee, Janet; Tang, Anita; Lau, Wei-Ling; Ritter, Heather; Delport, Dewald; Page, Melissa; Adam, Rodney D; Müller, Miklós; Wu, Gang

2007-01-01

Background Giardia intestinalis is a protist found in freshwaters worldwide, and is the most common cause of parasitic diarrhea in humans. The phylogenetic position of this parasite is still much debated. Histones are small, highly conserved proteins that associate tightly with DNA to form chromatin within the nucleus. There are two classes of core histone genes in higher eukaryotes: DNA replication-independent histones and DNA replication-dependent ones. Results We identified two copies each of the core histone H2a, H2b and H3 genes, and three copies of the H4 gene, at separate locations on chromosomes 3, 4 and 5 within the genome of Giardia intestinalis, but no gene encoding a H1 linker histone could be recognized. The copies of each gene share extensive DNA sequence identities throughout their coding and 5' noncoding regions, which suggests these copies have arisen from relatively recent gene duplications or gene conversions. The transcription start sites are at triplet A sequences 1–27 nucleotides upstream of the translation start codon for each gene. We determined that a 50 bp region upstream from the start of the histone H4 coding region is the minimal promoter, and a highly conserved 15 bp sequence called the histone motif (him) is essential for its activity. The Giardia core histone genes are constitutively expressed at approximately equivalent levels and their mRNAs are polyadenylated. Competition gel-shift experiments suggest that a factor within the protein complex that binds him may also be a part of the protein complexes that bind other promoter elements described previously in Giardia. Conclusion In contrast to other eukaryotes, the Giardia genome has only a single class of core histone genes that encode replication-independent histones. Our inability to locate a gene encoding the linker histone H1 leads us to speculate that the H1 protein may not be required for the compaction of Giardia's small and gene-rich genome. PMID:17425802
Identification of Cis-Acting Promoter Elements in Cold- and Dehydration-Induced Transcriptional Pathways in Arabidopsis, Rice, and Soybean

PubMed Central

Maruyama, Kyonoshin; Todaka, Daisuke; Mizoi, Junya; Yoshida, Takuya; Kidokoro, Satoshi; Matsukura, Satoko; Takasaki, Hironori; Sakurai, Tetsuya; Yamamoto, Yoshiharu Y.; Yoshiwara, Kyouko; Kojima, Mikiko; Sakakibara, Hitoshi; Shinozaki, Kazuo; Yamaguchi-Shinozaki, Kazuko

2012-01-01

The genomes of three plants, Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), and soybean (Glycine max), have been sequenced, and their many genes and promoters have been predicted. In Arabidopsis, cis-acting promoter elements involved in cold- and dehydration-responsive gene expression have been extensively analysed; however, the characteristics of such cis-acting promoter sequences in cold- and dehydration-inducible genes of rice and soybean remain to be clarified. In this study, we performed microarray analyses using the three species, and compared characteristics of identified cold- and dehydration-inducible genes. Transcription profiles of the cold- and dehydration-responsive genes were similar among these three species, showing representative upregulated (dehydrin/LEA) and downregulated (photosynthesis-related) genes. All (4^6 = 4096) hexamer sequences in the promoters of the three species were investigated, revealing the frequency of conserved sequences in cold- and dehydration-inducible promoters. A core sequence of the abscisic acid-responsive element (ABRE) was the most conserved in dehydration-inducible promoters of all three species, suggesting that transcriptional regulation for dehydration-inducible genes is similar among these three species, with the ABRE-dependent transcriptional pathway. In contrast, for cold-inducible promoters, the conserved hexamer sequences were diversified among these three species, suggesting the existence of diverse transcriptional regulatory pathways for cold-inducible genes among the species. PMID:22184637
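The hexamer survey described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names and the toy promoter string are hypothetical, and the code only shows how the 4096 possible hexamers arise and how overlapping hexamers might be counted in one promoter sequence.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def all_hexamers():
    """Return every possible DNA hexamer (4**6 = 4096 strings)."""
    return ["".join(p) for p in product(BASES, repeat=6)]


def count_hexamers(promoter):
    """Count overlapping hexamers in one promoter.

    Windows containing symbols other than A, C, G, T (for example N)
    are skipped.
    """
    seq = promoter.upper()
    counts = Counter()
    for i in range(len(seq) - 5):
        window = seq[i:i + 6]
        if all(base in BASES for base in window):
            counts[window] += 1
    return counts


# Hypothetical toy promoter, not taken from the study.
toy_promoter = "TTACGTGGCACGTGTC"
counts = count_hexamers(toy_promoter)

print(len(all_hexamers()))  # 4096
print(counts["ACGTGG"])     # 1
```

Comparing such counts between stress-inducible promoters and a background set would be the natural next step, but the statistics used in the study are not given in the abstract and are not reproduced here.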
[Inverse PCR amplification of the complete major capsid protein gene of lymphocystis disease virus isolated from Rachycentron canadum and the phylogenetic analysis of the virus].

PubMed

Fu, Xiao-Zhe; Shi, Cun-Bin; Li, Ning-Qiu; Pan, Hou-Jun; Chang, Ou-Qin; Wu, Shu-Qin

2007-09-01

The major capsid protein (MCP) gene of lymphocystis disease virus isolated from Rachycentron canadum (LCDV-rc) was amplified and analysed. A 457 bp DNA core fragment was amplified with degenerate primers designed according to the conserved sequences of the MCP gene of iridoviruses; the flanking sequences adjacent to the core region were then amplified by inverse PCR, and the complete sequence was obtained by combining all of them. The open reading frame of the gene is 1380 bp in length, encoding a putative protein of 459 aa with a molecular weight of 51.12 kD and a pI of 6.87. A phylogenetic tree constructed from the MCP amino acid sequences of iridoviruses indicated that LCDV-rc is most closely related to the other lymphocystis viruses, which together form a single branch. Accordingly, LCDV-rc is identified as a Lymphocystivirus.
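The relationship between the reported ORF length and protein length can be checked with a short calculation. The sketch below assumes that the 1380 bp figure includes the stop codon (the abstract does not say so explicitly); the function name is hypothetical.

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Return the number of amino acids encoded by an ORF of orf_bp bases.

    Assumption: when includes_stop is True, the final codon is a stop
    codon and does not contribute an amino acid.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


print(protein_length_from_orf(1380))  # 459
```

Under that assumption, 1380 bp corresponds to 460 codons, one of which is the stop codon, which agrees with the 459 aa stated in the abstract.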

Bioinformatic Analyses of Unique (Orphan) Core Genes of the Genus Acidithiobacillus: Functional Inferences and Use As Molecular Probes for Genomic and Metagenomic/Transcriptomic Interrogation

PubMed Central

González, Carolina; Lazcano, Marcelo; Valdés, Jorge; Holmes, David S.

2016-01-01

Using phylogenomic and gene compositional analyses, five highly conserved gene families have been detected in the core genome of the phylogenetically coherent genus Acidithiobacillus of the class Acidithiobacillia. These core gene families are absent in the closest extant genus Thermithiobacillus tepidarius that subtends the Acidithiobacillus genus and roots the deepest in this class. The predicted proteins encoded by these core gene families are not detected by a BLAST search in the NCBI non-redundant database of more than 90 million proteins using a relaxed cut-off of 1.0e−5. None of the five families has a clear functional prediction. However, bioinformatic scrutiny, using pI prediction, motif/domain searches, cellular location predictions, genomic context analyses, and chromosome topology studies together with previously published transcriptomic and proteomic data, suggests that some may have functions associated with membrane remodeling during cell division perhaps in response to pH stress. Despite the high level of amino acid sequence conservation within each family, there is sufficient nucleotide variation of the respective genes to permit the use of the DNA sequences to distinguish different species of Acidithiobacillus, making them useful additions to the armamentarium of tools for phylogenetic analysis. Since the protein families are unique to the Acidithiobacillus genus, they can also be leveraged as probes to detect the genus in environmental metagenomes and metatranscriptomes, including industrial biomining operations, and acid mine drainage (AMD). PMID:28082953
Bioinformatic Analyses of Unique (Orphan) Core Genes of the Genus Acidithiobacillus: Functional Inferences and Use As Molecular Probes for Genomic and Metagenomic/Transcriptomic Interrogation.

PubMed

González, Carolina; Lazcano, Marcelo; Valdés, Jorge; Holmes, David S

2016-01-01

Using phylogenomic and gene compositional analyses, five highly conserved gene families have been detected in the core genome of the phylogenetically coherent genus Acidithiobacillus of the class Acidithiobacillia. These core gene families are absent in the closest extant genus Thermithiobacillus tepidarius that subtends the Acidithiobacillus genus and roots the deepest in this class. The predicted proteins encoded by these core gene families are not detected by a BLAST search in the NCBI non-redundant database of more than 90 million proteins using a relaxed cut-off of 1.0e−5. None of the five families has a clear functional prediction. However, bioinformatic scrutiny, using pI prediction, motif/domain searches, cellular location predictions, genomic context analyses, and chromosome topology studies together with previously published transcriptomic and proteomic data, suggests that some may have functions associated with membrane remodeling during cell division perhaps in response to pH stress. Despite the high level of amino acid sequence conservation within each family, there is sufficient nucleotide variation of the respective genes to permit the use of the DNA sequences to distinguish different species of Acidithiobacillus, making them useful additions to the armamentarium of tools for phylogenetic analysis. Since the protein families are unique to the Acidithiobacillus genus, they can also be leveraged as probes to detect the genus in environmental metagenomes and metatranscriptomes, including industrial biomining operations, and acid mine drainage (AMD).
Genomic dissection of conserved transcriptional regulation in intestinal epithelial cells

PubMed Central

Camp, J. Gray; Weiser, Matthew; Cocchiaro, Jordan L.; Kingsley, David M.; Furey, Terrence S.; Sheikh, Shehzad Z.; Rawls, John F.

2017-01-01

The intestinal epithelium serves critical physiologic functions that are shared among all vertebrates. However, it is unknown how the transcriptional regulatory mechanisms underlying these functions have changed over the course of vertebrate evolution. We generated genome-wide mRNA and accessible chromatin data from adult intestinal epithelial cells (IECs) in zebrafish, stickleback, mouse, and human species to determine if conserved IEC functions are achieved through common transcriptional regulation. We found evidence for substantial common regulation and conservation of gene expression regionally along the length of the intestine from fish to mammals and identified a core set of genes comprising a vertebrate IEC signature. We also identified transcriptional start sites and other putative regulatory regions that are differentially accessible in IECs in all 4 species. Although these sites rarely showed sequence conservation from fish to mammals, surprisingly, they drove highly conserved IEC expression in a zebrafish reporter assay. Common putative transcription factor binding sites (TFBS) found at these sites in multiple species indicate that sequence conservation alone is insufficient to identify much of the functionally conserved IEC regulatory information. Among the rare, highly sequence-conserved, IEC-specific regulatory regions, we discovered an ancient enhancer upstream from her6/HES1 that is active in a distinct population of Notch-positive cells in the intestinal epithelium. Together, these results show how combining accessible chromatin and mRNA datasets with TFBS prediction and in vivo reporter assays can reveal tissue-specific regulatory information conserved across 420 million years of vertebrate evolution. We define an IEC transcriptional regulatory network that is shared between fish and mammals and establish an experimental platform for studying how evolutionarily distilled regulatory information commonly controls IEC development and physiology. 
PMID:28850571
The Malarial Host-Targeting Signal Is Conserved in the Irish Potato Famine Pathogen

PubMed Central

Liolios, Konstantinos; Win, Joe; Kanneganti, Thirumala-Devi; Young, Carolyn; Kamoun, Sophien; Haldar, Kasturi

2006-01-01

Animal and plant eukaryotic pathogens, such as the human malaria parasite Plasmodium falciparum and the potato late blight agent Phytophthora infestans, are widely divergent eukaryotic microbes. Yet they both produce secretory virulence and pathogenic proteins that alter host cell functions. In P. falciparum, export of parasite proteins to the host erythrocyte is mediated by leader sequences shown to contain a host-targeting (HT) motif centered on an RxLx (E, D, or Q) core: this motif appears to signify a major pathogenic export pathway with hundreds of putative effectors. Here we show that a secretory protein of P. infestans, which is perceived by plant disease resistance proteins and induces hypersensitive plant cell death, contains a leader sequence that is equivalent to the Plasmodium HT-leader in its ability to export fusion of green fluorescent protein (GFP) from the P. falciparum parasite to the host erythrocyte. This export is dependent on an RxLR sequence conserved in P. infestans leaders, as well as in leaders of all ten secretory oomycete proteins shown to function inside plant cells. The RxLR motif is also detected in hundreds of secretory proteins of P. infestans, Phytophthora sojae, and Phytophthora ramorum and has high value in predicting host-targeted leaders. A consensus motif further reveals E/D residues enriched within ~25 amino acids downstream of the RxLR, which are also needed for export. Together the data suggest that in these plant pathogenic oomycetes, a consensus HT motif may reside in an extended sequence of ~25–30 amino acids, rather than in a short linear sequence. Evidence is presented that although the consensus is much shorter in P. falciparum, information sufficient for vacuolar export is contained in a region of ~30 amino acids, which includes sequences flanking the HT core. Finally, positional conservation between Phytophthora RxLR and P. falciparum RxLx (E, D, Q) is consistent with the idea that the context of their presentation is constrained. These studies provide the first evidence to our knowledge that eukaryotic microbes share equivalent pathogenic HT signals and thus conserved mechanisms to access host cells across plant and animal kingdoms that may present unique targets for prophylaxis across divergent pathogens. PMID:16733545
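The two motif descriptions in this abstract, RxLx (E, D, or Q) for P. falciparum and RxLR followed by E/D-enriched sequence for oomycetes, can be written as simple pattern searches. The sketch below is only an illustration under stated assumptions: the regular expressions are a literal reading of the motifs as written in the abstract, the 25-residue window is taken from the "~25 amino acids downstream" wording, and the function name and toy sequences are hypothetical. It is not the prediction method used in the study.

```python
import re

# Literal readings of the motifs named in the abstract ("x" = any residue).
PLASMODIUM_HT = re.compile(r"R.L.[EDQ]")
OOMYCETE_RXLR = re.compile(r"R.LR")


def find_rxlr(protein, window=25):
    """Locate RxLR motifs and count E/D residues in the downstream window.

    Returns a list of (start_index, matched_motif, acidic_count) tuples.
    Overlapping motif occurrences are not reported by this simple scan.
    """
    hits = []
    for match in OOMYCETE_RXLR.finditer(protein):
        downstream = protein[match.end():match.end() + window]
        acidic = sum(downstream.count(residue) for residue in "ED")
        hits.append((match.start(), match.group(), acidic))
    return hits


# Hypothetical toy sequences, not real effector proteins.
toy_oomycete = "MKLSAVRSLRGAEEDDSTEERAAL"
toy_plasmodium = "MNRILSE"

print(find_rxlr(toy_oomycete))                       # [(6, 'RSLR', 6)]
print(PLASMODIUM_HT.search(toy_plasmodium).group())  # RILSE
```

A real screen would also need to check for a signal peptide and restrict the motif to the leader region, which this sketch leaves out.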
Ectomycorrhizal diversity and community structure in stands of Quercus oleoides in the seasonally dry tropical forests of Costa Rica

NASA Astrophysics Data System (ADS)

Desai, Nikhilesh S.; Wilson, Andrew W.; Powers, Jennifer S.; Mueller, Gregory M.; Egerton-Warburton, Louise M.

2016-12-01

Most conservation efforts in seasonally dry tropical forests have overlooked less obvious targets for conservation, such as mycorrhizal fungi, that are critical to plant growth and ecosystem structure. We documented the diversity of ectomycorrhizal (EMF) and arbuscular mycorrhizal (AMF) fungal communities in Quercus oleoides (Fagaceae) in Guanacaste province, Costa Rica. Soil cores and sporocarps were collected from regenerating Q. oleoides plots differing in stand age (early vs late regeneration) during the wet season. Sequencing of the nuclear ribosomal ITS region in EMF root tips and sporocarps identified 37 taxa in the Basidiomycota; EMF Ascomycota were uncommon. The EMF community was dominated by one species (Thelephora sp. 1; 70% of soil cores), more than half of all EMF species were found only once in an individual soil core, and there were few conspecific taxa. Most EMF taxa were also restricted to either Early or Late plots. Levels of EMF species richness and diversity, and AMF root colonization were similar between plots. Our results highlight the need for comprehensive spatiotemporal samplings of EMF communities in Q. oleoides to identify and prioritize rare EMF for conservation, and document their genetic and functional diversity.
The hepatitis C virus Core protein is a potent nucleic acid chaperone that directs dimerization of the viral (+) strand RNA in vitro

PubMed Central

Cristofari, Gaël; Ivanyi-Nagy, Roland; Gabus, Caroline; Boulant, Steeve; Lavergne, Jean-Pierre; Penin, François; Darlix, Jean-Luc

2004-01-01

The hepatitis C virus (HCV) is an important human pathogen causing chronic hepatitis, liver cirrhosis and hepatocellular carcinoma. HCV is an enveloped virus with a positive-sense, single-stranded RNA genome encoding a single polyprotein that is processed to generate viral proteins. Several hundred molecules of the structural Core protein are thought to coat the genome in the viral particle, as do nucleocapsid (NC) protein molecules in Retroviruses, another class of enveloped viruses containing a positive-sense RNA genome. Retroviral NC proteins also possess nucleic acid chaperone properties that play critical roles in the structural remodelling of the genome during retrovirus replication. This analogy between HCV Core and retroviral NC proteins prompted us to investigate the putative nucleic acid chaperoning properties of the HCV Core protein. Here we report that Core protein chaperones the annealing of complementary DNA and RNA sequences and the formation of the most stable duplex by strand exchange. These results show that the HCV Core is a nucleic acid chaperone similar to retroviral NC proteins. We also find that the Core protein directs dimerization of HCV (+) RNA 3′ untranslated region which is promoted by a conserved palindromic sequence possibly involved at several stages of virus replication. PMID:15141033
The hepatitis C virus Core protein is a potent nucleic acid chaperone that directs dimerization of the viral (+) strand RNA in vitro.

PubMed

Cristofari, Gaël; Ivanyi-Nagy, Roland; Gabus, Caroline; Boulant, Steeve; Lavergne, Jean-Pierre; Penin, François; Darlix, Jean-Luc

2004-01-01

The hepatitis C virus (HCV) is an important human pathogen causing chronic hepatitis, liver cirrhosis and hepatocellular carcinoma. HCV is an enveloped virus with a positive-sense, single-stranded RNA genome encoding a single polyprotein that is processed to generate viral proteins. Several hundred molecules of the structural Core protein are thought to coat the genome in the viral particle, as do nucleocapsid (NC) protein molecules in Retroviruses, another class of enveloped viruses containing a positive-sense RNA genome. Retroviral NC proteins also possess nucleic acid chaperone properties that play critical roles in the structural remodelling of the genome during retrovirus replication. This analogy between HCV Core and retroviral NC proteins prompted us to investigate the putative nucleic acid chaperoning properties of the HCV Core protein. Here we report that Core protein chaperones the annealing of complementary DNA and RNA sequences and the formation of the most stable duplex by strand exchange. These results show that the HCV Core is a nucleic acid chaperone similar to retroviral NC proteins. We also find that the Core protein directs dimerization of HCV (+) RNA 3' untranslated region which is promoted by a conserved palindromic sequence possibly involved at several stages of virus replication.
Conservation in the face of diversity: multistrain analysis of an intracellular bacterium

USDA-ARS's Scientific Manuscript database

Comparisons of multiple strains revealed that A. marginale has a closed-core genome with few highly plastic regions, which include the msp2 and msp3 genes, as well as the aaap locus. Comparison of the Florida and St. Maries genome sequences found that SNPs comprise 0.8% of the longer Florida genome,...
The most conserved genome segments for life detection on Earth and other planets.

PubMed

Isenbarger, Thomas A; Carr, Christopher E; Johnson, Sarah Stewart; Finney, Michael; Church, George M; Gilbert, Walter; Zuber, Maria T; Ruvkun, Gary

2008-12-01

On Earth, very simple but powerful methods to detect and classify broad taxa of life by the polymerase chain reaction (PCR) are now standard practice. Using DNA primers corresponding to the 16S ribosomal RNA gene, one can survey a sample from any environment for its microbial inhabitants. Due to massive meteoritic exchange between Earth and Mars (as well as other planets), a reasonable case can be made for life on Mars or other planets to be related to life on Earth. In this case, the supremely sensitive technologies used to study life on Earth, including in extreme environments, can be applied to the search for life on other planets. Though the 16S gene has become the standard for life detection on Earth, no genome comparisons have established that the ribosomal genes are, in fact, the most conserved DNA segments across the kingdoms of life. We present here a computational comparison of full genomes from 13 diverse organisms from the Archaea, Bacteria, and Eucarya to identify genetic sequences conserved across the widest divisions of life. Our results identify the 16S and 23S ribosomal RNA genes as well as other universally conserved nucleotide sequences in genes encoding particular classes of transfer RNAs and within the nucleotide binding domains of ABC transporters as the most conserved DNA sequence segments across phylogeny. This set of sequences defines a core set of DNA regions that have changed the least over billions of years of evolution and provides a means to identify and classify divergent life, including ancestrally related life on other planets.
A dehydrin cognate protein from pea (Pisum sativum L.) with an atypical pattern of expression.

PubMed

Robertson, M; Chandler, P M

1994-11-01

Dehydrins are a family of proteins characterised by conserved amino acid motifs, and induced in plants by dehydration or treatment with ABA. An antiserum was raised against a synthetic oligopeptide based on the most highly conserved dehydrin amino acid motif, the lysine-rich block (core sequence KIKEK-LPG). This antiserum detected a novel M(r) 40,000 polypeptide and enabled isolation of a corresponding cDNA clone, pPsB61 (B61). The deduced amino acid sequence contained two lysine-rich blocks; however, the remainder of the sequence differed markedly from other pea dehydrins. Surprisingly, the sequence contained a stretch of serine residues, a characteristic common to dehydrins from many plant species but which is missing in pea dehydrin. The expression patterns of B61 mRNA and polypeptide were distinctively different from those of the pea dehydrins during seed development, germination and in young seedlings exposed to dehydration stress or treated with ABA. In particular, dehydration stress led to slightly reduced levels of B61 RNA, and ABA application to young seedlings had no marked effect on its abundance. The M(r) 40,000 polypeptide is thus related to pea dehydrin by the presence of the most highly conserved amino acid sequence motifs, but lacks the characteristic expression pattern of dehydrin. By analogy with heat shock cognate proteins we refer to this protein as a dehydrin cognate.
A family of selfish minicircular chromosomes with jumbled chloroplast gene fragments from a dinoflagellate.

PubMed

Zhang, Z; Cavalier-Smith, T; Green, B R

2001-08-01

Chloroplast genes of several dinoflagellate species are located on unigenic DNA minicircular chromosomes. We have now completely sequenced five aberrant minicircular chromosomes from the dinoflagellate Heterocapsa triquetra. These probably nonfunctional DNA circles lack complete genes, with each being composed of several short fragments of two or three different chloroplast genes and a common conserved region with a tripartite 9G-9A-9G core like the putative replicon origin of functional single-gene circular chloroplast chromosomes. Their sequences imply that all five circles evolved by differential deletions and duplications from common ancestral circles bearing fragments of four genes: psbA, psbC, 16S rRNA, and 23S rRNA. It appears that recombination between separate unigenic chromosomes initially gave intermediate heterodimers, which were subsequently stabilized by deletions that included part or all of one putative replicon origin. We suggest that homologous recombination at the 9G-9A-9G core regions produced a psbA/psbC heterodimer which generated two distinct chimeric circles by differential deletions and duplications. A 23S/16S rRNA heterodimer more likely formed by illegitimate recombination between 16S and 23S rRNA genes. Homologous recombination between the 9G-9A-9G core regions of both heterodimers and additional differential deletions and duplications could then have yielded the other three circles. Near identity of the gene fragments and 9G-9A-9G cores, despite diverging adjacent regions, may be maintained by gene conversion. The conserved organization of the 9G-9A-9G cores alone favors the idea that they are replicon origins and suggests that they may enable the aberrant minicircles to parasitize the chloroplast's replication machinery as selfish circles.
Mechanism for Coordinated RNA Packaging and Genome Replication by Rotavirus Polymerase VP1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lu, Xiaohui; McDonald, Sarah M.; Tortorici, M. Alejandra

2009-04-08

Rotavirus RNA-dependent RNA polymerase VP1 catalyzes RNA synthesis within a subviral particle. This activity depends on core shell protein VP2. A conserved sequence at the 3' end of plus-strand RNA templates is important for polymerase association and genome replication. We have determined the structure of VP1 at 2.9 Å resolution, as apoenzyme and in complex with RNA. The cage-like enzyme is similar to reovirus λ3, with four tunnels leading to or from a central, catalytic cavity. A distinguishing characteristic of VP1 is specific recognition, by conserved features of the template-entry channel, of four bases, UGUG, in the conserved 3' sequence. Well-defined interactions with these bases position the RNA so that its 3' end overshoots the initiating register, producing a stable but catalytically inactive complex. We propose that specific 3' end recognition selects rotavirus RNA for packaging and that VP2 activates the autoinhibited VP1/RNA complex to coordinate packaging and genome replication.
Molecular characterization and epidemic history of hepatitis C virus using core sequences of isolates from Central Province, Saudi Arabia.

PubMed

Shier, Medhat K; Iles, James C; El-Wetidy, Mohammad S; Ali, Hebatallah H; Al Qattan, Mohammad M

2017-01-01

The source of HCV transmission in Saudi Arabia is unknown. This study aimed to determine HCV genotypes in a representative sample of chronically infected patients in Saudi Arabia. All HCV isolates were genotyped and subtyped by sequencing of the HCV core region and 54 new HCV isolates were identified. Three sets of primers targeting the core region were used for both amplification and sequencing of all isolates resulting in a 326 bp fragment. Most HCV isolates were genotype 4 (85%), whereas only a few isolates were recognized as genotype 1 (15%). With the assistance of Genbank database and BLAST, subtyping results showed that most of genotype 4 isolates were 4d whereas most of genotype 1 isolates were 1b. Nucleotide conservation and variation rates of HCV core sequences showed that 4a and 1b have the highest levels of variation. Phylogenetic analysis of sequences by Maximum Likelihood and Bayesian Coalescent methods was used to explore the source of HCV transmission by investigating the relationship between Saudi Arabia and other countries in the Middle East and Africa. Coalescent analysis showed that transmissions of HCV from Egypt to Saudi Arabia are estimated to have occurred in three major clusters: 4d was introduced into the country before 1900, the major 4a clade's MRCA was introduced between 1900 and 1920, and the remaining lineages were introduced between 1940 and 1960 from Egypt and Middle Africa. Results showed that no lineages seem to have crossed from Egypt to Saudi Arabia in the last 15 years. Finally, sequencing and characterization of new HCV isolates from Saudi Arabia will enrich the HCV database and help further studies related to treatment and management of the virus.
Molecular characterization and epidemic history of hepatitis C virus using core sequences of isolates from Central Province, Saudi Arabia

PubMed Central

Iles, James C.; El-Wetidy, Mohammad S.; Ali, Hebatallah H.; Al Qattan, Mohammad M.

2017-01-01

The source of HCV transmission in Saudi Arabia is unknown. This study aimed to determine HCV genotypes in a representative sample of chronically infected patients in Saudi Arabia. All HCV isolates were genotyped and subtyped by sequencing of the HCV core region and 54 new HCV isolates were identified. Three sets of primers targeting the core region were used for both amplification and sequencing of all isolates resulting in a 326 bp fragment. Most HCV isolates were genotype 4 (85%), whereas only a few isolates were recognized as genotype 1 (15%). With the assistance of Genbank database and BLAST, subtyping results showed that most of genotype 4 isolates were 4d whereas most of genotype 1 isolates were 1b. Nucleotide conservation and variation rates of HCV core sequences showed that 4a and 1b have the highest levels of variation. Phylogenetic analysis of sequences by Maximum Likelihood and Bayesian Coalescent methods was used to explore the source of HCV transmission by investigating the relationship between Saudi Arabia and other countries in the Middle East and Africa. Coalescent analysis showed that transmissions of HCV from Egypt to Saudi Arabia are estimated to have occurred in three major clusters: 4d was introduced into the country before 1900, the major 4a clade’s MRCA was introduced between 1900 and 1920, and the remaining lineages were introduced between 1940 and 1960 from Egypt and Middle Africa. Results showed that no lineages seem to have crossed from Egypt to Saudi Arabia in the last 15 years. Finally, sequencing and characterization of new HCV isolates from Saudi Arabia will enrich the HCV database and help further studies related to treatment and management of the virus. PMID:28863156
Cis-regulatory signatures of orthologous stress-associated bZIP transcription factors from rice, sorghum and Arabidopsis based on phylogenetic footprints

PubMed Central

2012-01-01

Background: The potential contribution of upstream sequence variation to the unique features of orthologous genes is just beginning to be unraveled. A core subset of stress-associated bZIP transcription factors from rice (Oryza sativa) formed ten clusters of orthologous groups (COG) with genes from the monocot sorghum (Sorghum bicolor) and dicot Arabidopsis (Arabidopsis thaliana). The total cis-regulatory information content of each stress-associated COG was examined by phylogenetic footprinting to reveal ortholog-specific, lineage-specific and species-specific conservation patterns. Results: The most apparent pattern observed was the occurrence of spatially conserved ‘core modules’ among the COGs but not among paralogs. These core modules are comprised of various combinations of two to four putative transcription factor binding site (TFBS) classes associated with either developmental or stress-related functions. Outside the core modules are specific stress (ABA, oxidative, abiotic, biotic) or organ-associated signals, which may be functioning as ‘regulatory fine-tuners’ and further define lineage-specific and species-specific cis-regulatory signatures. Orthologous monocot and dicot promoters have distinct TFBS classes involved in disease and oxidative-regulated expression, while the orthologous rice and sorghum promoters have distinct combinations of root-specific signals, a pattern that is not particularly conserved in Arabidopsis. Conclusions: Patterns of cis-regulatory conservation imply that each ortholog has distinct signatures, further suggesting that they are potentially unique in a regulatory context despite the presumed conservation of broad biological function during speciation.
Based on the observed patterns of conservation, we postulate that core modules are likely primary determinants of basal developmental programming, which may be integrated with and further elaborated by additional intrinsic or extrinsic signals in conjunction with lineage-specific or species-specific regulatory fine-tuners. This synergy may be critical for finer-scale spatio-temporal regulation, hence unique expression profiles of homologous transcription factors from different species with distinct zones of ecological adaptation such as rice, sorghum and Arabidopsis. The patterns revealed from these comparisons set the stage for further empirical validation by functional genomics. PMID:22992304
Sequence diversity within the reovirus S2 gene: reovirus genes reassort in nature, and their termini are predicted to form a panhandle motif.

PubMed Central

Chapell, J D; Goral, M I; Rodgers, S E; dePamphilis, C W; Dermody, T S

1994-01-01

To better understand genetic diversity within mammalian reoviruses, we determined S2 nucleotide and deduced sigma 2 amino acid sequences of nine reovirus strains and compared these sequences with those of prototype strains of the three reovirus serotypes. The S2 gene and sigma 2 protein are highly conserved among the four type 1, one type 2, and seven type 3 strains studied. Phylogenetic analyses based on S2 nucleotide sequences of the 12 reovirus strains indicate that diversity within the S2 gene is independent of viral serotype. Additionally, we found marked topological differences between phylogenetic trees generated from S1 and S2 gene nucleotide sequences of the seven type 3 strains. These results demonstrate that reovirus S1 and S2 genes have distinct evolutionary histories, thus providing phylogenetic evidence for lateral transfer of reovirus genes in nature. When variability among the 12 sigma 2-encoding S2 nucleotide sequences was analyzed at synonymous positions, we found that approximately 60 nucleotides at the 5' terminus and 30 nucleotides at the 3' terminus were markedly conserved in comparison with other sigma 2-encoding regions of S2. Predictions of RNA secondary structures indicate that the more conserved S2 sequences participate in the formation of an extended region of duplex RNA interrupted by a pair of stem-loops. Among the 12 deduced sigma 2 amino acid sequences examined, substitutions were observed at only 11% of amino acid positions. This finding suggests that constraints on the structure or function of sigma 2, perhaps in part because of its location in the virion core, have limited sequence diversity within this protein. PMID:8289378
Active diuretic peptidomimetic insect kinin analogs that contain Beta-turn mimetic motif 4-aminopyroglutamate and lack native peptide bonds

USDA-ARS's Scientific Manuscript database

The multifunctional arthropod 'insect kinins' share the evolutionarily conserved C-terminal pentapeptide core sequence Phe-X1-X2-Trp-Gly-NH2, where X1 = His, Asn, Ser, or Tyr and X2 = Ser, Pro, or Ala. Insect kinins regulate diuresis in many species of insects, including the cricket. Insect kinins...
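The core motif described above can be written as a simple pattern match. The following is a minimal sketch, assuming one-letter amino acid codes (Phe = F, Trp = W, Gly = G, and so on) and ignoring the C-terminal amide, which a plain sequence string cannot represent. The function name and the test peptides are hypothetical and are not taken from the record.

```python
import re

# Phe-X1-X2-Trp-Gly at the C-terminus, with
# X1 = His, Asn, Ser, or Tyr and X2 = Ser, Pro, or Ala.
# The terminal -NH2 (amidation) is not encoded in a one-letter string.
KININ_CORE = re.compile(r"F[HNSY][SPA]WG$")


def has_kinin_core(peptide: str) -> bool:
    """Return True if the peptide ends with the insect kinin core pentapeptide."""
    return KININ_CORE.search(peptide.strip().upper()) is not None


if __name__ == "__main__":
    # Hypothetical peptides, used only to exercise the pattern.
    for candidate in ["FHSWG", "AAFYPWG", "FGSWG", "FHSWGA"]:
        print(candidate, has_kinin_core(candidate))
```

Anchoring the pattern with `$` reflects the statement that the core is C-terminal; dropping the anchor would also accept internal occurrences.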
Comprehensive analysis of single molecule sequencing-derived complete genome and whole transcriptome of Hyposidra talaca nuclear polyhedrosis virus.

PubMed

Nguyen, Thong T; Suryamohan, Kushal; Kuriakose, Boney; Janakiraman, Vasantharajan; Reichelt, Mike; Chaudhuri, Subhra; Guillory, Joseph; Divakaran, Neethu; Rabins, P E; Goel, Ridhi; Deka, Bhabesh; Sarkar, Suman; Ekka, Preety; Tsai, Yu-Chih; Vargas, Derek; Santhosh, Sam; Mohan, Sangeetha; Chin, Chen-Shan; Korlach, Jonas; Thomas, George; Babu, Azariah; Seshagiri, Somasekar

2018-06-12

We sequenced the Hyposidra talaca NPV (HytaNPV) double-stranded circular DNA genome using PacBio single molecule sequencing technology. We found that the HytaNPV genome is 139,089 bp long with a GC content of 39.6%. It encodes 141 open reading frames (ORFs) including the 37 baculovirus core genes, 25 genes conserved among lepidopteran baculoviruses, 72 genes known in baculovirus, and 7 genes unique to the HytaNPV genome. It is a group II alphabaculovirus that codes for the F protein and lacks the gp64 gene found in group I alphabaculoviruses. Using RNA-seq, we confirmed the expression of the ORFs identified in the HytaNPV genome. Phylogenetic analysis showed HytaNPV to be closest to BusuNPV, SujuNPV and EcobNPV that infect other tea pests, Buzura suppressaria, Sucra jujuba, and Ectropis obliqua, respectively. We identified repeat elements and a conserved non-coding baculovirus element in the genome. Analysis of the putative promoter sequences identified motifs consistent with the temporal expression of the genes observed in the RNA-seq data.
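As a quick consistency check on the figures quoted above, the four ORF categories can be summed and compared with the reported total, and the GC percentage can be converted into an approximate base count. This is only an arithmetic sketch: the dictionary keys are paraphrased labels, and the GC base count is an estimate derived from the rounded 39.6% figure, not a number reported in the record.

```python
# ORF categories as reported in the abstract above (labels paraphrased).
orf_categories = {
    "baculovirus core genes": 37,
    "conserved among lepidopteran baculoviruses": 25,
    "other genes known in baculoviruses": 72,
    "unique to HytaNPV": 7,
}

total_orfs = sum(orf_categories.values())

genome_length_bp = 139_089
gc_fraction = 0.396  # rounded value from the abstract

# Estimate only: the true count depends on the unrounded GC content.
approx_gc_bases = round(genome_length_bp * gc_fraction)

if __name__ == "__main__":
    print("Total ORFs:", total_orfs)  # 141, matching the reported total
    print("Approximate G+C bases:", approx_gc_bases)
```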
Saturation scanning of ubiquitin variants reveals a common hot spot for binding to USP2 and USP21.

PubMed

Leung, Isabel; Dekel, Ayelet; Shifman, Julia M; Sidhu, Sachdev S

2016-08-02

A detailed understanding of the molecular mechanisms whereby ubiquitin (Ub) recognizes enzymes in the Ub proteasome system is crucial for understanding the biological function of Ub. Many structures of Ub complexes have been solved and, in most cases, reveal a large structural epitope on a common face of the Ub molecule. However, owing to the generally weak nature of these interactions, it has been difficult to map in detail the functional contributions of individual Ub side chains to affinity and specificity. Here we took advantage of Ub variants (Ubvs) that bind tightly to particular Ub-specific proteases (USPs) and used phage display and saturation scanning mutagenesis to comprehensively map functional epitopes within the structural epitopes. We found that Ubvs that bind to USP2 or USP21 contain a remarkably similar core functional epitope, or "hot spot," consisting mainly of positions that are conserved as the wild type sequence, but also some positions that prefer mutant sequences. The Ubv core functional epitope contacts residues that are conserved in the human USP family, and thus it is likely important for the interactions of Ub across many family members.
A dimer of the lymphoid protein RAG1 recognizes the recombination signal sequence and the complex stably incorporates the high mobility group protein HMG2.

PubMed

Rodgers, K K; Villey, I J; Ptaszek, L; Corbett, E; Schatz, D G; Coleman, J E

1999-07-15

RAG1 and RAG2 are the two lymphoid-specific proteins required for the cleavage of DNA sequences known as the recombination signal sequences (RSSs) flanking V, D or J regions of the antigen-binding genes. Previous studies have shown that RAG1 alone is capable of binding to the RSS, whereas RAG2 only binds as a RAG1/RAG2 complex. We have expressed recombinant core RAG1 (amino acids 384-1008) in Escherichia coli and demonstrated catalytic activity when combined with RAG2. This protein was then used to determine its oligomeric forms and the dissociation constant of binding to the RSS. Electrophoretic mobility shift assays show that up to three oligomeric complexes of core RAG1 form with a single RSS. Core RAG1 was found to exist as a dimer both when free in solution and as the minimal species bound to the RSS. Competition assays show that RAG1 recognizes both the conserved nonamer and heptamer sequences of the RSS. Zinc analysis shows the core to contain two zinc ions. The purified RAG1 protein overexpressed in E.coli exhibited the expected cleavage activity when combined with RAG2 purified from transfected 293T cells. The high mobility group protein HMG2 is stably incorporated into the recombinant RAG1/RSS complex and can increase the affinity of RAG1 for the RSS in the absence of RAG2.

Unusual features of fibrillarin cDNA and gene structure in Euglena gracilis: evolutionary conservation of core proteins and structural predictions for methylation-guide box C/D snoRNPs throughout the domain Eucarya.

PubMed

Russell, Anthony G; Watanabe, Yoh-ichi; Charette, J Michael; Gray, Michael W

2005-01-01

Box C/D ribonucleoprotein (RNP) particles mediate O2'-methylation of rRNA and other cellular RNA species. In higher eukaryotic taxa, these RNPs are more complex than their archaeal counterparts, containing four core protein components (Snu13p, Nop56p, Nop58p and fibrillarin) compared with three in Archaea. This increase in complexity raises questions about the evolutionary emergence of the eukaryote-specific proteins and structural conservation in these RNPs throughout the eukaryotic domain. In protists, the primarily unicellular organisms comprising the bulk of eukaryotic diversity, the protein composition of box C/D RNPs has not yet been extensively explored. This study describes the complete gene, cDNA and protein sequences of the fibrillarin homolog from the protozoon Euglena gracilis, the first such information to be obtained for a nucleolus-localized protein in this organism. The E.gracilis fibrillarin gene contains a mixture of intron types exhibiting markedly different sizes. In contrast to most other E.gracilis mRNAs characterized to date, the fibrillarin mRNA lacks a spliced leader (SL) sequence. The predicted fibrillarin protein sequence itself is unusual in that it contains a glycine-lysine (GK)-rich domain at its N-terminus rather than the glycine-arginine-rich (GAR) domain found in most other eukaryotic fibrillarins. In an evolutionarily diverse collection of protists that includes E.gracilis, we have also identified putative homologs of the other core protein components of box C/D RNPs, thereby providing evidence that the protein composition seen in the higher eukaryotic complexes was established very early in eukaryotic cell evolution.
Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

PubMed Central

Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

2005-01-01

We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila. PMID:15632085
HMG-D is an architecture-specific protein that preferentially binds to DNA containing the dinucleotide TG.

PubMed Central

Churchill, M E; Jones, D N; Glaser, T; Hefner, H; Searles, M A; Travers, A A

1995-01-01

The high mobility group (HMG) protein HMG-D from Drosophila melanogaster is a highly abundant chromosomal protein that is closely related to the vertebrate HMG domain proteins HMG1 and HMG2. In general, chromosomal HMG domain proteins lack sequence specificity. However, using both NMR spectroscopy and standard biochemical techniques we show that binding of HMG-D to a single DNA site is sequence selective. The preferred duplex DNA binding site comprises at least 5 bp and contains the deformable dinucleotide TG embedded in A/T-rich sequences. The TG motif constitutes a common core element in the binding sites of the well-characterized sequence-specific HMG domain proteins. We show that a conserved aromatic residue in helix 1 of the HMG domain may be involved in recognition of this core sequence. In common with other HMG domain proteins HMG-D binds preferentially to DNA sites that are stably bent and underwound, therefore HMG-D can be considered an architecture-specific protein. Finally, we show that HMG-D bends DNA and may confer a superhelical DNA conformation at a natural DNA binding site in the Drosophila fushi tarazu scaffold-associated region. PMID:7720717
Structure-Related Roles for the Conservation of the HIV-1 Fusion Peptide Sequence Revealed by Nuclear Magnetic Resonance.

PubMed

Serrano, Soraya; Huarte, Nerea; Rujas, Edurne; Andreu, David; Nieva, José L; Jiménez, María Angeles

2017-10-17

Despite extensive characterization of the human immunodeficiency virus type 1 (HIV-1) hydrophobic fusion peptide (FP), the structure-function relationships underlying its extraordinary degree of conservation remain poorly understood. Specifically, the fact that the tandem repeat of the FLGFLG tripeptide is absolutely conserved suggests that high hydrophobicity may not suffice to unleash FP function. Here, we have compared the nuclear magnetic resonance (NMR) structures adopted in nonpolar media by two FP surrogates, wtFP-tag and scrFP-tag, which had equal hydrophobicity but contained wild-type and scrambled core sequences LFLGFLG and FGLLGFL, respectively. In addition, these peptides were tagged at their C-termini with an epitope sequence that folded independently, thereby allowing Western blot detection without interfering with FP structure. We observed similar α-helical FP conformations for both specimens dissolved in the low-polarity medium 25% (v/v) 1,1,1,3,3,3-hexafluoro-2-propanol (HFIP), but important differences in contact with micelles of the membrane mimetic dodecylphosphocholine (DPC). Thus, whereas wtFP-tag preserved a helix displaying a Gly-rich ridge, the scrambled sequence lost in great part the helical structure upon being solubilized in DPC. Western blot analyses further revealed the capacity of wtFP-tag to assemble trimers in membranes, whereas membrane oligomers were not observed in the case of the scrFP-tag sequence. We conclude that, beyond hydrophobicity, preserving sequence order is an important feature for defining the secondary structures and oligomeric states adopted by the HIV FP in membranes.
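The design logic of the scrambled control can be checked directly from the two core sequences given above: both should have the same amino acid composition (and therefore the same summed hydrophobicity on any additive scale), while only the wild-type core keeps the FLGFLG repeat. The sketch below assumes the Kyte-Doolittle hydropathy scale for illustration; the record does not say which scale the authors used, and the variable names are hypothetical.

```python
import math
from collections import Counter

WT_CORE = "LFLGFLG"   # wild-type core sequence from the abstract
SCR_CORE = "FGLLGFL"  # scrambled core sequence from the abstract

# Kyte-Doolittle hydropathy values (an assumed scale, for illustration only).
KD = {"L": 3.8, "F": 2.8, "G": -0.4}


def hydropathy_sum(seq: str) -> float:
    """Sum per-residue hydropathy values over a sequence."""
    return sum(KD[aa] for aa in seq)


same_composition = Counter(WT_CORE) == Counter(SCR_CORE)
equal_hydropathy = math.isclose(hydropathy_sum(WT_CORE), hydropathy_sum(SCR_CORE))
wt_has_repeat = "FLGFLG" in WT_CORE   # tandem FLG repeat preserved
scr_has_flg = "FLG" in SCR_CORE       # no intact FLG unit after scrambling

if __name__ == "__main__":
    print(same_composition, equal_hydropathy, wt_has_repeat, scr_has_flg)
```

Because the two peptides differ only in residue order, any difference in structure or oligomerization can be attributed to sequence order rather than to overall hydrophobicity, which is the point the abstract makes.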
A conserved intronic U1 snRNP-binding sequence promotes trans-splicing in Drosophila

PubMed Central

Gao, Jun-Li; Fan, Yu-Jie; Wang, Xiu-Ye; Zhang, Yu; Pu, Jia; Li, Liang; Shao, Wei; Zhan, Shuai; Hao, Jianjiang

2015-01-01

Unlike typical cis-splicing, trans-splicing joins exons from two separate transcripts to produce chimeric mRNA and has been detected in most eukaryotes. Trans-splicing in trypanosomes and nematodes has been characterized as a spliced leader RNA-facilitated reaction; in contrast, its mechanism in higher eukaryotes remains unclear. Here we investigate mod(mdg4), a classic trans-spliced gene in Drosophila, and report that two critical RNA sequences in the middle of the last 5′ intron, TSA and TSB, promote trans-splicing of mod(mdg4). In TSA, a 13-nucleotide (nt) core motif is conserved across Drosophila species and is essential and sufficient for trans-splicing, which binds U1 small nuclear RNP (snRNP) through strong base-pairing with U1 snRNA. In TSB, a conserved secondary structure acts as an enhancer. Deletions of TSA and TSB using the CRISPR/Cas9 system result in developmental defects in flies. Although it is not clear how the 5′ intron finds the 3′ introns, compensatory changes in U1 snRNA rescue trans-splicing of TSA mutants, demonstrating that U1 recruitment is critical to promote trans-splicing in vivo. Furthermore, TSA core-like motifs are found in many other trans-spliced Drosophila genes, including lola. These findings represent a novel mechanism of trans-splicing, in which RNA motifs in the 5′ intron are sufficient to bring separate transcripts into close proximity to promote trans-splicing. PMID:25838544
Modular architecture of the T4 phage superfamily: A conserved core genome and a plastic periphery

DOE Office of Scientific and Technical Information (OSTI.GOV)

Comeau, Andre M.; Bertrand, Claire; Letarov, Andrei

2007-06-05

Among the most numerous objects in the biosphere, phages show enormous diversity in morphology and genetic content. We have sequenced 7 T4-like phages and compared their genome architecture. All seven phages share a core genome with T4 that is interrupted by several hyperplastic regions (HPRs) where most of their divergence occurs. The core primarily includes homologues of essential T4 genes, such as the virion structure and DNA replication genes. In contrast, the HPRs contain mostly novel genes of unknown function and origin. A few of the HPR genes that can be assigned putative functions, such as a series of novel Internal Proteins, are implicated in phage adaptation to the host. Thus, the T4-like genome appears to be partitioned into discrete segments that fulfil different functions and behave differently in evolution. Such partitioning may be critical for these large and complex phages to maintain their flexibility, while simultaneously allowing them to conserve their highly successful virion design and mode of replication.
Maintaining replication origins in the face of genomic change.

PubMed

Di Rienzi, Sara C; Lindstrom, Kimberly C; Mann, Tobias; Noble, William S; Raghuraman, M K; Brewer, Bonita J

2012-10-01

Origins of replication present a paradox to evolutionary biologists. As a collection, they are absolutely essential genomic features, but individually are highly redundant and nonessential. It is therefore difficult to predict to what extent and in what regard origins are conserved over evolutionary time. Here, through a comparative genomic analysis of replication origins and chromosomal replication patterns in the budding yeasts Saccharomyces cerevisiae and Lachancea waltii, we assess to what extent replication origins survived genomic change produced from 150 million years of evolution. We find that L. waltii origins exhibit a core consensus sequence and nucleosome occupancy pattern highly similar to those of S. cerevisiae origins. We further observe that the overall progression of chromosomal replication is similar between L. waltii and S. cerevisiae. Nevertheless, few origins show evidence of being conserved in location between the two species. Among the conserved origins are those surrounding centromeres and adjacent to histone genes, suggesting that proximity to an origin may be important for their regulation. We conclude that, over evolutionary time, origins maintain sequence, structure, and regulation, but are continually being created and destroyed, with the result that their locations are generally not conserved.
Maintaining replication origins in the face of genomic change

PubMed Central

Di Rienzi, Sara C.; Lindstrom, Kimberly C.; Mann, Tobias; Noble, William S.; Raghuraman, M.K.; Brewer, Bonita J.

2012-01-01

Origins of replication present a paradox to evolutionary biologists. As a collection, they are absolutely essential genomic features, but individually are highly redundant and nonessential. It is therefore difficult to predict to what extent and in what regard origins are conserved over evolutionary time. Here, through a comparative genomic analysis of replication origins and chromosomal replication patterns in the budding yeasts Saccharomyces cerevisiae and Lachancea waltii, we assess to what extent replication origins survived genomic change produced from 150 million years of evolution. We find that L. waltii origins exhibit a core consensus sequence and nucleosome occupancy pattern highly similar to those of S. cerevisiae origins. We further observe that the overall progression of chromosomal replication is similar between L. waltii and S. cerevisiae. Nevertheless, few origins show evidence of being conserved in location between the two species. Among the conserved origins are those surrounding centromeres and adjacent to histone genes, suggesting that proximity to an origin may be important for their regulation. We conclude that, over evolutionary time, origins maintain sequence, structure, and regulation, but are continually being created and destroyed, with the result that their locations are generally not conserved. PMID:22665441
Unique core genomes of the bacterial family Vibrionaceae: insights into niche adaptation and speciation.

PubMed

Kahlke, Tim; Goesmann, Alexander; Hjerde, Erik; Willassen, Nils Peder; Haugen, Peik

2012-05-10

The criteria for defining bacterial species and even the concept of bacterial species itself are under debate, and the discussion is apparently intensifying as more genome sequence data is becoming available. However, it is still unclear how the new advances in genomics should be used most efficiently to address this question. In this study we identify genes that are common to any group of genomes in our dataset, to determine whether genes specific to a particular taxon exist and to investigate their potential role in adaptation of bacteria to their specific niche. These genes were named unique core genes. Additionally, we investigate the existence and importance of unique core genes that are found in isolates of phylogenetically non-coherent groups. These groups of isolates, that share a genetic feature without sharing a closest common ancestor, are termed genophyletic groups. The bacterial family Vibrionaceae was used as the model, and we compiled and compared genome sequences of 64 different isolates. Using the software orthoMCL we determined clusters of homologous genes among the investigated genome sequences. We used multilocus sequence analysis to build a host phylogeny and mapped the numbers of unique core genes of all distinct groups of isolates onto the tree. The results show that unique core genes are more likely to be found in monophyletic groups of isolates. Genophyletic groups of isolates, in contrast, are less common, especially for large groups of isolates. The subsequent annotation of unique core genes that are present in genophyletic groups indicates a high degree of horizontally transferred genes. Finally, the annotation of the unique core genes of Vibrio cholerae revealed genes involved in aerotaxis and biosynthesis of the iron-chelator vibriobactin. The presented work indicates that genes specific for any taxon inside the bacterial family Vibrionaceae exist.
These unique core genes encode conserved metabolic functions that can shed light on the adaptation of a species to its ecological niche. Additionally, our study suggests that unique core genes can be used to aid classification of bacteria and contribute to a bacterial species definition on a genomic level. Furthermore, these genes may be of importance in clinical diagnostics and drug development.
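The notion of unique core genes described above (gene clusters shared by every genome in a group and absent from every genome outside it) can be expressed with set operations. The sketch below is an illustration under stated assumptions, not the authors' pipeline: it presumes that homologous gene clusters have already been assigned (for example by a clustering tool), and the isolate names, cluster identifiers, and function name are all hypothetical.

```python
from typing import Dict, Iterable, Set


def unique_core_genes(clusters_by_genome: Dict[str, Set[str]],
                      group: Iterable[str]) -> Set[str]:
    """Clusters present in every genome of `group` and in no genome outside it."""
    members = set(group)
    if not members:
        raise ValueError("group must contain at least one genome")
    in_group = [clusters_by_genome[g] for g in members]
    out_group = [c for g, c in clusters_by_genome.items() if g not in members]
    core = set.intersection(*in_group)       # shared by all group members
    return core - set().union(*out_group)    # and absent everywhere else


if __name__ == "__main__":
    # Hypothetical presence/absence data for four isolates.
    example = {
        "isolate_A": {"c1", "c2", "c3"},
        "isolate_B": {"c1", "c2", "c4"},
        "isolate_C": {"c1", "c5"},
        "isolate_D": {"c1", "c4", "c5"},
    }
    print(unique_core_genes(example, ["isolate_A", "isolate_B"]))
```

Applying the same function to groups that are not monophyletic on the host phylogeny corresponds to searching for what the abstract calls genophyletic groups.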
Atomic interaction networks in the core of protein domains and their native folds.

PubMed

Soundararajan, Venkataramanan; Raman, Rahul; Raguram, S; Sasisekharan, V; Sasisekharan, Ram

2010-02-23

Vastly divergent sequences populate a majority of protein folds. In the quest to identify features that are conserved within protein domains belonging to the same fold, we set out to examine the entire protein universe on a fold-by-fold basis. We report that the atomic interaction network in the solvent-unexposed core of protein domains is fold-conserved, extraordinary sequence divergence notwithstanding. Further, we find that this feature, termed protein core atomic interaction network (or PCAIN), is significantly distinguishable across different folds, thus appearing to be a "signature" of a domain's native fold. As part of this study, we computed the PCAINs for 8698 representative protein domains from families across the 1018 known protein folds to construct our seed database and an automated framework was developed for PCAIN-based characterization of the protein fold universe. A test set of randomly selected domains that are not in the seed database was classified with over 97% accuracy, independent of sequence divergence. As an application of this novel fold signature, a PCAIN-based scoring scheme was developed for comparative (homology-based) structure prediction, with 1-2 angstroms (mean 1.61 Å) Cα RMSD generally observed between computed structures and reference crystal structures. Our results are consistent across the full spectrum of test domains including those from recent CASP experiments and most notably in the 'twilight' and 'midnight' zones wherein <30% and <10% target-template sequence identity prevails (mean twilight RMSD of 1.69 Å). We further demonstrate the utility of the PCAIN protocol to derive biological insight into protein structure-function relationships, by modeling the structure of the YopM effector novel E3 ligase (NEL) domain from plague-causative bacterium Yersinia pestis and discussing its implications for host adaptive and innate immune modulation by the pathogen.
Considering the several high-throughput, sequence-identity-independent applications demonstrated in this work, we suggest that the PCAIN is a fundamental fold feature that could be a valuable addition to the arsenal of protein modeling and analysis tools.
Atomic Interaction Networks in the Core of Protein Domains and Their Native Folds

PubMed Central

Soundararajan, Venkataramanan; Raman, Rahul; Raguram, S.; Sasisekharan, V.; Sasisekharan, Ram

2010-01-01

Vastly divergent sequences populate a majority of protein folds. In the quest to identify features that are conserved within protein domains belonging to the same fold, we set out to examine the entire protein universe on a fold-by-fold basis. We report that the atomic interaction network in the solvent-unexposed core of protein domains is fold-conserved, extraordinary sequence divergence notwithstanding. Further, we find that this feature, termed protein core atomic interaction network (or PCAIN), is significantly distinguishable across different folds, thus appearing to be a “signature” of a domain's native fold. As part of this study, we computed the PCAINs for 8698 representative protein domains from families across the 1018 known protein folds to construct our seed database and an automated framework was developed for PCAIN-based characterization of the protein fold universe. A test set of randomly selected domains that are not in the seed database was classified with over 97% accuracy, independent of sequence divergence. As an application of this novel fold signature, a PCAIN-based scoring scheme was developed for comparative (homology-based) structure prediction, with 1–2 angstroms (mean 1.61 Å) Cα RMSD generally observed between computed structures and reference crystal structures. Our results are consistent across the full spectrum of test domains including those from recent CASP experiments and most notably in the ‘twilight’ and ‘midnight’ zones wherein <30% and <10% target-template sequence identity prevails (mean twilight RMSD of 1.69 Å). We further demonstrate the utility of the PCAIN protocol to derive biological insight into protein structure-function relationships, by modeling the structure of the YopM effector novel E3 ligase (NEL) domain from plague-causative bacterium Yersinia pestis and discussing its implications for host adaptive and innate immune modulation by the pathogen.
Considering the several high-throughput, sequence-identity-independent applications demonstrated in this work, we suggest that the PCAIN is a fundamental fold feature that could be a valuable addition to the arsenal of protein modeling and analysis tools. PMID:20186337
Core genome conservation of Staphylococcus haemolyticus limits sequence based population structure analysis.

PubMed

Cavanagh, Jorunn Pauline; Klingenberg, Claus; Hanssen, Anne-Merethe; Fredheim, Elizabeth Aarag; Francois, Patrice; Schrenzel, Jacques; Flægstad, Trond; Sollid, Johanna Ericson

2012-06-01

The notoriously multi-resistant Staphylococcus haemolyticus is an emerging pathogen causing serious infections in immunocompromised patients. Defining the population structure is important to detect outbreaks and spread of antimicrobial resistant clones. Currently, the standard typing technique is pulsed-field gel electrophoresis (PFGE). In this study we describe novel molecular typing schemes for S. haemolyticus using multi locus sequence typing (MLST) and multi locus variable number of tandem repeats (VNTR) analysis. Seven housekeeping genes (MLST) and five VNTR loci (MLVF) were selected for the novel typing schemes. A panel of 45 human and veterinary S. haemolyticus isolates was investigated. The collection had diverse PFGE patterns (38 PFGE types) and was sampled over a 20-year period from eight countries. MLST resolved 17 sequence types (Simpson's index of diversity [SID]=0.877) and MLVF resolved 14 repeat types (SID=0.831). We found low sequence diversity. Phylogenetic analysis clustered the isolates in three (MLST) and one (MLVF) clonal complexes, respectively. Taken together, neither the MLST nor the MLVF scheme was suitable to resolve the population structure of this S. haemolyticus collection. Future MLVF and MLST schemes will benefit from addition of more variable core genome sequences identified by comparing different fully sequenced S. haemolyticus genomes. Copyright © 2012 Elsevier B.V. All rights reserved.
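The SID values quoted above summarize how evenly isolates are spread across types. A minimal sketch of the calculation follows, assuming the Hunter-Gaston form of Simpson's index that is commonly used for typing schemes, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)); the abstract does not state which formula was applied, and the per-type isolate counts are not given, so the labels below are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable


def simpson_diversity(type_labels: Iterable[Hashable]) -> float:
    """Simpson's index of diversity (Hunter-Gaston form) for a list of type assignments."""
    counts = list(Counter(type_labels).values())
    n_total = sum(counts)
    if n_total < 2:
        raise ValueError("at least two isolates are required")
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


if __name__ == "__main__":
    # Hypothetical assignments: four isolates, three types.
    print(simpson_diversity(["ST1", "ST1", "ST2", "ST3"]))
```

The index is the probability that two isolates drawn at random without replacement belong to different types, so a value near 1 means the scheme separates most isolates, and 0 means every isolate has the same type.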
The utility of transcriptomics in fish conservation.

PubMed

Connon, Richard E; Jeffries, Ken M; Komoroske, Lisa M; Todgham, Anne E; Fangue, Nann A

2018-01-29

There is growing recognition of the need to understand the mechanisms underlying organismal resilience (i.e. tolerance, acclimatization) to environmental change to support the conservation management of sensitive and economically important species. Here, we discuss how functional genomics can be used in conservation biology to provide a cellular-level understanding of organismal responses to environmental conditions. In particular, the integration of transcriptomics with physiological and ecological research is increasingly playing an important role in identifying functional physiological thresholds predictive of compensatory responses and detrimental outcomes, transforming the way we can study issues in conservation biology. Notably, with technological advances in RNA sequencing, transcriptome-wide approaches can now be applied to species where no prior genomic sequence information is available to develop species-specific tools and investigate sublethal impacts that can contribute to population declines over generations and undermine prospects for long-term conservation success. Here, we examine the use of transcriptomics as a means of determining organismal responses to environmental stressors and use key study examples of conservation concern in fishes to highlight the added value of transcriptome-wide data to the identification of functional response pathways. Finally, we discuss the gaps between the core science and policy frameworks and how thresholds identified through transcriptomic evaluations provide evidence that can be more readily used by resource managers. © 2018. Published by The Company of Biologists Ltd.
Use of ancient sedimentary DNA as a novel conservation tool for high-altitude tropical biodiversity.

PubMed

Boessenkool, Sanne; McGlynn, Gayle; Epp, Laura S; Taylor, David; Pimentel, Manuel; Gizaw, Abel; Nemomissa, Sileshi; Brochmann, Christian; Popp, Magnus

2014-04-01

Conservation of biodiversity may in the future increasingly depend upon the availability of scientific information to set suitable restoration targets. In traditional paleoecology, sediment-based pollen provides a means to define preanthropogenic impact conditions, but problems in establishing the exact provenance and ecologically meaningful levels of taxonomic resolution of the evidence are limiting. We explored the extent to which the use of sedimentary ancient DNA (sedaDNA) may complement pollen data in reconstructing past alpine environments in the tropics. We constructed a record of afro-alpine plants retrieved from DNA preserved in sediment cores from 2 volcanic crater sites in the Albertine Rift, eastern Africa. The record extended well beyond the onset of substantial anthropogenic effects on tropical mountains. To ensure high-quality taxonomic inference from the sedaDNA sequences, we built an extensive DNA reference library covering the majority of the afro-alpine flora, by sequencing DNA from taxonomically verified specimens. Comparisons with pollen records from the same sediment cores showed that plant diversity recovered with sedaDNA improved vegetation reconstructions based on pollen records by revealing both additional taxa and providing increased taxonomic resolution. Furthermore, combining the 2 measures assisted in distinguishing vegetation change at different geographic scales; sedaDNA almost exclusively reflects local vegetation, whereas pollen can potentially originate from a wide area that in highlands in particular can span several ecozones. Our results suggest that sedaDNA may provide information on restoration targets and the nature and magnitude of human-induced environmental changes, including in high conservation priority, biodiversity hotspots, where understanding of preanthropogenic impact (or reference) conditions is highly limited. © 2013 Society for Conservation Biology.
Long-read whole genome sequencing and comparative analysis of six strains of the human pathogen Orientia tsutsugamushi.

PubMed

Batty, Elizabeth M; Chaemchuen, Suwittra; Blacksell, Stuart; Richards, Allen L; Paris, Daniel; Bowden, Rory; Chan, Caroline; Lachumanan, Ramkumar; Day, Nicholas; Donnelly, Peter; Chen, Swaine; Salje, Jeanne

2018-06-01

Orientia tsutsugamushi is a clinically important but neglected obligate intracellular bacterial pathogen of the Rickettsiaceae family that causes the potentially life-threatening human disease scrub typhus. In contrast to the genome reduction seen in many obligate intracellular bacteria, early genetic studies of Orientia have revealed one of the most repetitive bacterial genomes sequenced to date. The dramatic expansion of mobile elements has hampered efforts to generate complete genome sequences using short read sequencing methodologies, and consequently there have been few studies of the comparative genomics of this neglected species. We report new high-quality genomes of O. tsutsugamushi, generated using PacBio single molecule long read sequencing, for six strains: Karp, Kato, Gilliam, TA686, UT76 and UT176. In comparative genomics analyses of these strains together with existing reference genomes from Ikeda and Boryong strains, we identify a relatively small core genome of 657 genes, grouped into core gene islands and separated by repeat regions, and use the core genes to infer the first whole-genome phylogeny of Orientia. Complete assemblies of multiple Orientia genomes verify initial suggestions that these are remarkable organisms. They have larger genomes compared with most other Rickettsiaceae, with widespread amplification of repeat elements and massive chromosomal rearrangements between strains. At the gene level, Orientia has a relatively small set of universally conserved genes, similar to other obligate intracellular bacteria, and the relative expansion in genome size can be accounted for by gene duplication and repeat amplification. Our study demonstrates the utility of long read sequencing to investigate complex bacterial genomes and characterise genomic variation.
Crystal structure of the Msx-1 homeodomain/DNA complex.

PubMed

Hovde, S; Abate-Shen, C; Geiger, J H

2001-10-09

The Msx-1 homeodomain protein plays a crucial role in craniofacial, limb, and nervous system development. Homeodomain DNA-binding domains are comprised of 60 amino acids that show a high degree of evolutionary conservation. We have determined the structure of the Msx-1 homeodomain complexed to DNA at 2.2 Å resolution. The structure has an unusually well-ordered N-terminal arm with a unique trajectory across the minor groove of the DNA. DNA specificity conferred by bases flanking the core TAAT sequence is explained by well-ordered water-mediated interactions at Q50. Most interactions seen at the TAAT sequence are typical of the interactions seen in other homeodomain structures. Comparison of the Msx-1-HD structure to all other high resolution HD-DNA complex structures indicates a remarkably well-conserved sphere of hydration between the DNA and protein in these complexes.
RUDI, a short interspersed element of the V-SINE superfamily widespread in molluscan genomes.

PubMed

Luchetti, Andrea; Šatović, Eva; Mantovani, Barbara; Plohl, Miroslav

2016-06-01

Short interspersed elements (SINEs) are non-autonomous retrotransposons that are widespread in eukaryotic genomes. They exhibit a chimeric sequence structure consisting of a small RNA-related head, an anonymous body and an AT-rich tail. Although their turnover and de novo emergence is rapid, some SINE elements found in distantly related species retain similarity in certain core segments (or highly conserved domains, HCD). We have characterized a new SINE element named RUDI in the bivalve molluscs Ruditapes decussatus and R. philippinarum and found this element to be widely distributed in the genomes of a number of mollusc species. An unexpected structural feature of RUDI is the HCD domain type V, which was first found in non-amniote vertebrate SINEs and in the SINE from one cnidarian species. In addition to the V domain, the overall sequence conservation pattern of RUDI elements resembles that found in ancient AmnSINE (~310 Myr old) and Au SINE (~320 Myr old) families, suggesting that RUDI might be among the most ancient SINE families. Sequence conservation suggests a monophyletic origin of RUDI. Nucleotide variability and phylogenetic analyses suggest long-term vertical inheritance combined with at least one horizontal transfer event as the most parsimonious explanation for the observed taxonomic distribution.
Structural insight into the specificity of the B3 DNA-binding domains provided by the co-crystal structure of the C-terminal fragment of BfiI restriction enzyme

PubMed Central

Golovenko, Dmitrij; Manakova, Elena; Zakrys, Linas; Zaremba, Mindaugas; Sasnauskas, Giedrius; Gražulis, Saulius; Siksnys, Virginijus

2014-01-01

The B3 DNA-binding domains (DBDs) of plant transcription factors (TF) and DBDs of EcoRII and BfiI restriction endonucleases (EcoRII-N and BfiI-C) share a common structural fold, classified as the DNA-binding pseudobarrel. The B3 DBDs in the plant TFs recognize a diverse set of target sequences. The only available co-crystal structure of the B3-like DBD is that of EcoRII-N (recognition sequence 5′-CCTGG-3′). In order to understand the structural and molecular mechanisms of specificity of B3 DBDs, we have solved the crystal structure of BfiI-C (recognition sequence 5′-ACTGGG-3′) complexed with 12-bp cognate oligoduplex. Structural comparison of BfiI-C–DNA and EcoRII-N–DNA complexes reveals a conserved DNA-binding mode and a conserved pattern of interactions with the phosphodiester backbone. The determinants of the target specificity are located in the loops that emanate from the conserved structural core. The BfiI-C–DNA structure presented here expands a range of templates for modeling of the DNA-bound complexes of the B3 family of plant TFs. PMID:24423868
The N14 anti-afamin antibody Fab: a rare VL1 CDR glycosylation, crystallographic re-sequencing, molecular plasticity and conservative versus enthusiastic modelling.

PubMed

Naschberger, Andreas; Fürnrohr, Barbara G; Lenac Rovis, Tihana; Malic, Suzana; Scheffzek, Klaus; Dieplinger, Hans; Rupp, Bernhard

2016-12-01

The monoclonal antibody N14 is used as a detection antibody in ELISA kits for the human glycoprotein afamin, a member of the albumin family, which has recently gained interest in the capture and stabilization of Wnt signalling proteins, and for its role in metabolic syndrome and papillary thyroid carcinoma. As a rare occurrence, the N14 Fab is N-glycosylated at Asn26L at the onset of the VL1 antigen-binding loop, with the α-1-6 core fucosylated complex glycan facing out of the L1 complementarity-determining region. The crystal structures of two non-apparent (pseudo) isomorphous crystals of the N14 Fab were analyzed, which differ significantly in the elbow angles, thereby cautioning against the overinterpretation of domain movements upon antigen binding. In addition, the map quality at 1.9 Å resolution was sufficient to crystallographically re-sequence the variable VL and VH domains and to detect discrepancies in the hybridoma-derived sequence. Finally, a conservatively refined parsimonious model is presented and its statistics are compared with those from a less conservatively built model that has been modelled more enthusiastically. Improvements to the PDB validation reports affecting ligands, clashscore and buried surface calculations are suggested.
Plant polyadenylation factors: conservation and variety in the polyadenylation complex in plants.

PubMed

Hunt, Arthur G; Xing, Denghui; Li, Qingshun Q

2012-11-20

Polyadenylation, an essential step in eukaryotic gene expression, requires both cis-elements and a plethora of trans-acting polyadenylation factors. The polyadenylation factors are largely conserved across mammals and fungi. The conservation also seems to extend to plants, based on the analyses of Arabidopsis polyadenylation factors. To extend this observation, we systematically identified the orthologs of yeast and human polyadenylation factors from 10 plant species chosen based on both the availability of their genome sequences and their positions in the evolutionary tree, which render them representatives of different plant lineages. The evolutionary trajectories revealed several interesting features of plant polyadenylation factors. First, the number of genes encoding plant polyadenylation factors was clearly increased from "lower" to "higher" plants. Second, the gene expansion in higher plants was biased to some polyadenylation factors, particularly those involved in RNA binding. Finally, while there are clear commonalities, the differences in the polyadenylation apparatus were obvious across different species, suggesting an ongoing process of evolutionary change. These features lead to a model in which the plant polyadenylation complex consists of a conserved core, which is rather rigid in terms of evolutionary conservation, and a panoply of peripheral subunits, which are less conserved and associated with the core in various combinations, forming a collection of somewhat distinct complex assemblies. The multiple forms of plant polyadenylation complex, together with the diversified polyA signals, may explain the intensive alternative polyadenylation (APA) and its regulatory role in biological functions of higher plants.

Conserved Structural Elements in the V3 Crown of HIV-1 gp120

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jiang, X.; Burke, V; Totrov, M

2010-01-01

Binding of the third variable region (V3) of the HIV-1 envelope glycoprotein gp120 to the cell-surface coreceptors CCR5 or CXCR4 during viral entry suggests that there are conserved structural elements in this sequence-variable region. These conserved elements could serve as epitopes to be targeted by a vaccine against HIV-1. Here we perform a systematic structural analysis of representative human anti-V3 monoclonal antibodies in complex with V3 peptides, revealing that the crown of V3 has four conserved structural elements: an arch, a band, a hydrophobic core and the peptide backbone. These are either unaffected by or are subject to minimal sequence variation. As these regions are targeted by cross-clade neutralizing human antibodies, they provide a blueprint for the design of vaccine immunogens that could elicit broadly cross-reactive protective antibodies.
A sequence-based genetic map of Medicago truncatula and comparison of marker colinearity with M. sativa.

PubMed Central

Choi, Hong-Kyu; Kim, Dongjin; Uhm, Taesik; Limpens, Eric; Lim, Hyunju; Mun, Jeong-Hwan; Kalo, Peter; Penmetsa, R Varma; Seres, Andrea; Kulikova, Olga; Roe, Bruce A; Bisseling, Ton; Kiss, Gyorgy B; Cook, Douglas R

2004-01-01

A core genetic map of the legume Medicago truncatula has been established by analyzing the segregation of 288 sequence-characterized genetic markers in an F(2) population composed of 93 individuals. These molecular markers correspond to 141 ESTs, 80 BAC end sequence tags, and 67 resistance gene analogs, covering 513 cM. In the case of EST-based markers we used an intron-targeted marker strategy with primers designed to anneal in conserved exon regions and to amplify across intron regions. Polymorphisms were significantly more frequent in intron vs. exon regions, thus providing an efficient mechanism to map transcribed genes. Genetic and cytogenetic analysis produced eight well-resolved linkage groups, which have been previously correlated with eight chromosomes by means of FISH with mapped BAC clones. We anticipated that mapping of conserved coding regions would have utility for comparative mapping among legumes; thus 60 of the EST-based primer pairs were designed to amplify orthologous sequences across a range of legume species. As an initial test of this strategy, we used primers designed against M. truncatula exon sequences to rapidly map genes in M. sativa. The resulting comparative map, which includes 68 bridging markers, indicates that the two Medicago genomes are highly similar and establishes the basis for a Medicago composite map. PMID:15082563
Genomic Structure of the Luciferase Gene from the Bioluminescent Beetle, Nyctophila cf. Caucasica

PubMed Central

Day, John C.; Chaichi, Mohammad J.; Najafil, Iraj; Whiteley, Andrew S.

2006-01-01

The gene coding for beetle luciferase, the enzyme responsible for bioluminescence in over two thousand coleopteran species has, to date, only been characterized from one Palearctic species of Lampyridae. Here we report the characterization of the luciferase gene from a female beetle of an Iranian lampyrid species, Nyctophila cf. caucasica (Coleoptera:Lampyridae). The luciferase gene was composed of seven exons, coding for 547 amino acids, separated by six introns spanning 1976 bp of genomic DNA. The deduced amino acid sequences of the luciferase gene of N. caucasica showed 98.9% homology to that of the Palearctic species Lampyris noctiluca. Analysis of the 810 bp upstream region of the luciferase gene revealed three TATA boxes and several other consensus transcriptional factor recognition sequences presenting evidence for a putative core promoter region conserved in Lampyrinae from -190 through to -155 upstream of the luciferase start codon. Along with the core promoter region the luciferase gene was compared with orthologous sequences from other lampyrid species and found to have greatest identity to Lampyris turkistanicus and Lampyris noctiluca. The significant sequence identity to the former is discussed in relation to taxonomic issues of Iranian lampyrids. PMID:20298115
Dynamic Evolution of Pathogenicity Revealed by Sequencing and Comparative Genomics of 19 Pseudomonas syringae Isolates

PubMed Central

Romanchuk, Artur; Chang, Jeff H.; Mukhtar, M. Shahid; Cherkis, Karen; Roach, Jeff; Grant, Sarah R.; Jones, Corbin D.; Dangl, Jeffery L.

2011-01-01

Closely related pathogens may differ dramatically in host range, but the molecular, genetic, and evolutionary basis for these differences remains unclear. In many Gram-negative bacteria, including the phytopathogen Pseudomonas syringae, type III effectors (TTEs) are essential for pathogenicity, instrumental in structuring host range, and exhibit wide diversity between strains. To capture the dynamic nature of virulence gene repertoires across P. syringae, we screened 11 diverse strains for novel TTE families and coupled this nearly saturating screen with the sequencing and assembly of 14 phylogenetically diverse isolates from a broad collection of diseased host plants. TTE repertoires vary dramatically in size and content across all P. syringae clades; surprisingly few TTEs are conserved and present in all strains. Those that are likely provide basal requirements for pathogenicity. We demonstrate that functional divergence within one conserved locus, hopM1, leads to dramatic differences in pathogenicity, and we demonstrate that phylogenetics-informed mutagenesis can be used to identify functionally critical residues of TTEs. The dynamism of the TTE repertoire is mirrored by diversity in pathways affecting the synthesis of secreted phytotoxins, highlighting the likely role of both types of virulence factors in determination of host range. We used these 14 draft genome sequences, plus five additional genome sequences previously reported, to identify the core genome for P. syringae and we compared this core to that of two closely related non-pathogenic pseudomonad species. These data revealed the recent acquisition of a 1 Mb megaplasmid by a sub-clade of cucumber pathogens. This megaplasmid encodes a type IV secretion system and a diverse set of unknown proteins, which dramatically increases both the genomic content of these strains and the pan-genome of the species. PMID:21799664
Crystal Structure of the Marburg Virus Nucleoprotein Core Domain Chaperoned by a VP35 Peptide Reveals a Conserved Drug Target for Filovirus.

PubMed

Zhu, Tengfei; Song, Hao; Peng, Ruchao; Shi, Yi; Qi, Jianxun; Gao, George F

2017-09-15

Filovirus nucleoprotein (NP), viral protein 35 (VP35), and polymerase L are essential for viral replication and nucleocapsid formation. Here, we identify a 28-residue peptide (NP binding peptide [NPBP]) from Marburg virus (MARV) VP35 through sequence alignment with previously identified Ebola virus (EBOV) NPBP, which bound to the core region (residues 18 to 344) of the N-terminal portion of MARV NP with high affinity. The crystal structure of the MARV NP core/NPBP complex at a resolution of 2.6 Å revealed that NPBP binds to the C-terminal region of the NP core via electrostatic and nonpolar interactions. Further structural analysis revealed that the MARV and EBOV NP cores hold a conserved binding pocket for NPBP, and this pocket could serve as a promising target for the design of universal drugs against filovirus infection. In addition, cross-binding assays confirmed that the NP core of MARV or EBOV can bind the NPBP from the other virus, although with moderately reduced binding affinities that result from termini that are distinct between the MARV and EBOV NPBPs. IMPORTANCE Historically, Marburg virus (MARV) has caused severe disease with up to 90% lethality. Among the viral proteins produced by MARV, NP and VP35 are both multifunctional proteins that are essential for viral replication. In its relative, Ebola virus (EBOV), an N-terminal peptide from VP35 binds to the NP N-terminal region with high affinity. Whether this is a common mechanism among filoviruses is an unsolved question. Here, we present the crystal structure of a complex that consists of the core domain of MARV NP and the NPBP peptide from VP35. As we compared MARV NPBP with EBOV NPBP, several different features at the termini were identified. Although these differences reduce the affinity of the NP core for NPBPs across genera, a conserved pocket in the C-terminal region of the NP core makes cross-species binding possible. 
Our results expand our knowledge of filovirus NP-VP35 interactions and provide more details for therapeutic intervention. Copyright © 2017 American Society for Microbiology.
Protein interface classification by evolutionary analysis

PubMed Central

2012-01-01

Background Distinguishing biologically relevant interfaces from lattice contacts in protein crystals is a fundamental problem in structural biology. Despite efforts towards the computational prediction of interface character, many issues are still unresolved. Results We present here a protein-protein interface classifier that relies on evolutionary data to detect the biological character of interfaces. The classifier uses a simple geometric measure, number of core residues, and two evolutionary indicators based on the sequence entropy of homolog sequences. Both aim at detecting differential selection pressure between interface core and rim or rest of surface. The core residues, defined as fully buried residues (>95% burial), appear to be fundamental determinants of biological interfaces: their number is in itself a powerful discriminator of interface character and together with the evolutionary measures it is able to clearly distinguish evolved biological contacts from crystal ones. We demonstrate that this definition of core residues leads to distinctively better results than earlier definitions from the literature. The stringent selection and quality filtering of structural and sequence data was key to the success of the method. Most importantly we demonstrate that a more conservative selection of homolog sequences - with relatively high sequence identities to the query - is able to produce a clearer signal than previous attempts. Conclusions An evolutionary approach like the one presented here is key to the advancement of the field, which so far was missing an effective method exploiting the evolutionary character of protein interfaces. Its coverage and performance will only improve over time thanks to the incessant growth of sequence databases. Currently our method reaches an accuracy of 89% in classifying interfaces of the Ponstingl 2003 datasets and it lends itself to a variety of useful applications in structural biology and bioinformatics. 
We made the corresponding software implementation available to the community as an easy-to-use graphical web interface at http://www.eppic-web.org. PMID:23259833
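The evolutionary indicators described in the record above rest on the sequence entropy of homolog alignments, compared between interface core residues and the rim or remaining surface. The abstract does not give the exact formula, so the following is only a minimal sketch that assumes plain Shannon entropy per alignment column. The function names, the toy alignment and the core/rim position lists are all hypothetical and are not taken from the published software.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gaps ('-')."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mean_entropy(alignment, positions):
    """Average column entropy over the given 0-based alignment positions."""
    entropies = [column_entropy([seq[i] for seq in alignment]) for i in positions]
    return sum(entropies) / len(entropies)


# Hypothetical toy alignment of four homologs, four columns each.
alignment = ["ACDE", "ACDF", "ACEG", "ACEH"]
core_positions = [0, 1]  # assumed fully buried interface residues
rim_positions = [2, 3]   # assumed rim residues

print(mean_entropy(alignment, core_positions))  # 0.0 (fully conserved)
print(mean_entropy(alignment, rim_positions))   # 1.5 (more variable)
```

Under this simplified reading, a core that is clearly less variable than the rim would point toward a biologically selected interface, whereas similar entropies would be more consistent with a crystal contact.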
Evolutionary conservation analysis increases the colocalization of predicted exonic splicing enhancers in the BRCA1 gene with missense sequence changes and in-frame deletions, but not polymorphisms

PubMed Central

Pettigrew, Christopher; Wayte, Nicola; Lovelock, Paul K; Tavtigian, Sean V; Chenevix-Trench, Georgia; Spurdle, Amanda B; Brown, Melissa A

2005-01-01

Introduction Aberrant pre-mRNA splicing can be more detrimental to the function of a gene than changes in the length or nature of the encoded amino acid sequence. Although predicting the effects of changes in consensus 5' and 3' splice sites near intron:exon boundaries is relatively straightforward, predicting the possible effects of changes in exonic splicing enhancers (ESEs) remains a challenge. Methods As an initial step toward determining which ESEs predicted by the web-based tool ESEfinder in the breast cancer susceptibility gene BRCA1 are likely to be functional, we have determined their evolutionary conservation and compared their location with known BRCA1 sequence variants. Results Using the default settings of ESEfinder, we initially detected 669 potential ESEs in the coding region of the BRCA1 gene. Increasing the threshold score reduced the total number to 464, while taking into consideration the proximity to splice donor and acceptor sites reduced the number to 211. Approximately 11% of these ESEs (23/211) either are identical at the nucleotide level in human, primates, mouse, cow, dog and opossum Brca1 (conserved) or are detectable by ESEfinder in the same position in the Brca1 sequence (shared). The frequency of conserved and shared predicted ESEs between human and mouse is higher in BRCA1 exons (2.8 per 100 nucleotides) than in introns (0.6 per 100 nucleotides). Of conserved or shared putative ESEs, 61% (14/23) were predicted to be affected by sequence variants reported in the Breast Cancer Information Core database. Applying the filters described above increased the colocalization of predicted ESEs with missense changes, in-frame deletions and unclassified variants predicted to be deleterious to protein function, whereas they decreased the colocalization with known polymorphisms or unclassified variants predicted to be neutral. 
Conclusion In this report we show that evolutionary conservation analysis may be used to improve the specificity of an ESE prediction tool. This is the first report on the prediction of the frequency and distribution of ESEs in the BRCA1 gene, and it is the first reported attempt to predict which ESEs are most likely to be functional and therefore which sequence variants in ESEs are most likely to be pathogenic. PMID:16280041
Composite conserved promoter-terminator motifs (PeSLs) that mediate modular shuffling in the diverse T4-like myoviruses.

PubMed

Comeau, André M; Arbiol, Christine; Krisch, Henry M

2014-06-19

The diverse T4-like phages (Tquatrovirinae) infect a wide array of gram-negative bacterial hosts. The genome architecture of these phages is generally well conserved, most of the phylogenetically variable genes being grouped together in a series of hyperplastic regions (HPRs) that are interspersed among large blocks of conserved core genes. Recent evidence from a pair of closely related T4-like phages has suggested that small, composite terminator/promoter sequences (promoter-early stem loops [PeSLs]) were implicated in mediating the high levels of genetic plasticity by indels occurring within the HPRs. Here, we present the genome sequence analysis of two T4-like phages, PST (168 kb, 272 open reading frames [ORFs]) and nt-1 (248 kb, 405 ORFs). These two phages were chosen for comparative sequence analysis because, although they are closely related to phages that have been previously sequenced (T4 and KVP40, respectively), they have different host ranges. In each case, one member of the pair infects a bacterial strain that is a human pathogen, whereas the other phage's host is a nonpathogen. Despite belonging to phylogenetically distant branches of the T4-likes, these pairs of phage have diverged from each other in part by a mechanism apparently involving PeSL-mediated recombination. This analysis confirms a role of PeSL sequences in the generation of genomic diversity by serving as a point of genetic exchange between otherwise unrelated sequences within the HPRs. Finally, the palette of divergent genes swapped by PeSL-mediated homologous recombination is discussed in the context of the PeSLs' potentially important role in facilitating phage adaption to new hosts and environments. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Strategies to Improve Efficiency and Specificity of Degenerate Primers in PCR.

PubMed

Campos, Maria Jorge; Quesada, Alberto

2017-01-01

PCR with degenerate primers can be used to identify the coding sequence of an unknown protein or to detect a genetic variant within a gene family. These primers, which are complex mixtures of slightly different oligonucleotide sequences, can be optimized to increase the efficiency and/or specificity of PCR in the amplification of a sequence of interest by the introduction of mismatches with the target sequence and balancing their position toward the primers' 5'- or 3'-ends. In this work, we explain in detail examples of rational design of primers in two different applications, including the use of specific determinants at the 3'-end, to: (1) improve PCR efficiency with coding sequences for members of a protein family by full degeneracy at a core box of conserved genetic information, with the reduction of degeneracy at the 5'-end, and (2) optimize specificity of allelic discrimination of closely related orthologs by 5'-end degenerate primers.
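A degenerate primer is a mixture of oligonucleotides written compactly with IUPAC ambiguity codes, and its degeneracy is the number of distinct sequences in that mixture. The sketch below illustrates this general idea only; it is not the design procedure of the record above. The IUPAC table is standard, while the function names and the example primer are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    n = 1
    for base in primer.upper():
        n *= len(IUPAC[base])
    return n


def expand(primer):
    """List every concrete sequence in the degenerate primer mixture."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer with three ambiguous positions (N, Y, R).
primer = "ATGGNYTR"
print(degeneracy(primer))  # 16
print(expand("AY"))        # ['AC', 'AT']
```

Keeping the ambiguous positions in the conserved core box and the unambiguous bases elsewhere keeps this number small, which is one way to read the trade-off between efficiency and specificity that the record describes.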
Synthetic Core Promoters as Universal Parts for Fine-Tuning Expression in Different Yeast Species

PubMed Central

2016-01-01

Synthetic biology and metabolic engineering experiments frequently require the fine-tuning of gene expression to balance and optimize protein levels of regulators or metabolic enzymes. A key concept of synthetic biology is the development of modular parts that can be used in different contexts. Here, we have applied a computational multifactor design approach to generate de novo synthetic core promoters and 5′ untranslated regions (UTRs) for yeast cells. In contrast to upstream cis-regulatory modules (CRMs), core promoters are typically not subject to specific regulation, making them ideal engineering targets for gene expression fine-tuning. 112 synthetic core promoter sequences were designed on the basis of the sequence/function relationship of natural core promoters, nucleosome occupancy and the presence of short motifs. The synthetic core promoters were fused to the Pichia pastoris AOX1 CRM, and the resulting activity spanned more than a 200-fold range (0.3% to 70.6% of the wild type AOX1 level). The top-ten synthetic core promoters with highest activity were fused to six additional CRMs (three in P. pastoris and three in Saccharomyces cerevisiae). Inducible CRM constructs showed significantly higher activity than constitutive CRMs, reaching up to 176% of natural core promoters. Comparing the activity of the same synthetic core promoters fused to different CRMs revealed high correlations only for CRMs within the same organism. These data suggest that modularity is maintained to some extent but only within the same organism. Due to the conserved role of eukaryotic core promoters, this rational design concept may be transferred to other organisms as a generic engineering tool. PMID:27973777
Production of mouse monoclonal antibody against Streptococcus dysgalactiae GapC protein and mapping its conserved B-cell epitope.

PubMed

Zhang, Limeng; Zhang, Hua; Fan, Ziyao; Zhou, Xue; Yu, Liquan; Sun, Hunan; Wu, Zhijun; Yu, Yongzhong; Song, Baifen; Ma, Jinzhu; Tong, Chunyu; Zhu, Zhanbo; Cui, Yudong

2015-02-01

Streptococcus dysgalactiae (S. dysgalactiae) GapC protein is a protective antigen that induces partial immunity against S. dysgalactiae infection in animals. To identify the conserved B-cell epitope of S. dysgalactiae GapC, a mouse monoclonal antibody 1E11 (mAb1E11) against GapC was generated and used to screen a phage-displayed 12-mer random peptide library (Ph.D.-12). Eleven positive clones recognized by mAb1E11 were identified, most of which matched the consensus motif TGFFAKK. Sequence of the motif exactly matched amino acids 97-103 of the S. dysgalactiae GapC. In addition, the epitope (97)TGFFAKK(103) showed high homology among different streptococcus species. Site-directed mutagenic analysis further confirmed that residues G98, F99, F100 and K103 formed the core of (97)TGFFAKK(103), and this core motif was the minimal determinant of the B-cell epitope recognized by the mAb1E11. Collectively, the identification of conserved B-cell epitope within S. dysgalactiae GapC highlights the possibility of developing the epitope-based vaccine. Copyright © 2014 Elsevier Ltd. All rights reserved.
Next-Generation Sequence Analysis of the Genome of RFHVMn, the Macaque Homolog of Kaposi's Sarcoma (KS)-Associated Herpesvirus, from a KS-Like Tumor of a Pig-Tailed Macaque

PubMed Central

Bruce, A. Gregory; Ryan, Jonathan T.; Thomas, Mathew J.; Peng, Xinxia; Grundhoff, Adam; Tsai, Che-Chung

2013-01-01

The complete sequence of retroperitoneal fibromatosis-associated herpesvirus Macaca nemestrina (RFHVMn), the pig-tailed macaque homolog of Kaposi's sarcoma-associated herpesvirus (KSHV), was determined by next-generation sequence analysis of a Kaposi's sarcoma (KS)-like macaque tumor. Colinearity of genes was observed with the KSHV genome, and the core herpesvirus genes had strong sequence homology to the corresponding KSHV genes. RFHVMn lacked homologs of open reading frame 11 (ORF11) and KSHV ORFs K5 and K6, which appear to have been generated by duplication of ORFs K3 and K4 after the divergence of KSHV and RFHV. RFHVMn contained positional homologs of all other unique KSHV genes, although some showed limited sequence similarity. RFHVMn contained a number of candidate microRNA genes. Although there was little sequence similarity with KSHV microRNAs, one candidate contained the same seed sequence as the positional homolog, kshv-miR-K12-10a, suggesting functional overlap. RNA transcript splicing was highly conserved between RFHVMn and KSHV, and strong sequence conservation was noted in specific promoters and putative origins of replication, predicting important functional similarities. Sequence comparisons indicated that RFHVMn and KSHV developed in long-term synchrony with the evolution of their hosts, and both viruses phylogenetically group within the RV1 lineage of Old World primate rhadinoviruses. RFHVMn is the closest homolog of KSHV to be completely sequenced and the first sequenced RV1 rhadinovirus homolog of KSHV from a nonhuman Old World primate. The strong genetic and sequence similarity between RFHVMn and KSHV, coupled with similarities in biology and pathology, demonstrate that RFHVMn infection in macaques offers an important and relevant model for the study of KSHV in humans. PMID:24109218
Genome analyses of the sunflower pathogen Plasmopara halstedii provide insights into effector evolution in downy mildews and Phytophthora.

PubMed

Sharma, Rahul; Xia, Xiaojuan; Cano, Liliana M; Evangelisti, Edouard; Kemen, Eric; Judelson, Howard; Oome, Stan; Sambles, Christine; van den Hoogen, D Johan; Kitner, Miloslav; Klein, Joël; Meijer, Harold J G; Spring, Otmar; Win, Joe; Zipper, Reinhard; Bode, Helge B; Govers, Francine; Kamoun, Sophien; Schornack, Sebastian; Studholme, David J; Van den Ackerveken, Guido; Thines, Marco

2015-10-05

Downy mildews are the most speciose group of oomycetes and affect crops of great economic importance. So far, there is only a single deeply-sequenced downy mildew genome available, from Hyaloperonospora arabidopsidis. Further genomic resources for downy mildews are required to study their evolution, including pathogenicity effector proteins, such as RxLR effectors. Plasmopara halstedii is a devastating pathogen of sunflower and a potential pathosystem model to study downy mildews, as several Avr-genes and R-genes have been predicted and unlike Arabidopsis downy mildew, large quantities of almost contamination-free material can be obtained easily. Here a high-quality draft genome of Plasmopara halstedii is reported and analysed with respect to various aspects, including genome organisation, secondary metabolism, effector proteins and comparative genomics with other sequenced oomycetes. Interestingly, the present analyses revealed further variation of the RxLR motif, suggesting an important role of the conservation of the dEER-motif. Orthology analyses revealed the conservation of 28 RxLR-like core effectors among Phytophthora species. Only six putative RxLR-like effectors were shared by the two sequenced downy mildews, highlighting the fast and largely independent evolution of two of the three major downy mildew lineages. This is seemingly supported by phylogenomic results, in which downy mildews did not appear to be monophyletic. The genome resource will be useful for developing markers for monitoring the pathogen population and might provide the basis for new approaches to fight Phytophthora and downy mildew pathogens by targeting core pathogenicity effectors.
Acquisition and evolution of plant pathogenesis-associated gene clusters and candidate determinants of tissue-specificity in xanthomonas.

PubMed

Lu, Hong; Patil, Prabhu; Van Sluys, Marie-Anne; White, Frank F; Ryan, Robert P; Dow, J Maxwell; Rabinowicz, Pablo; Salzberg, Steven L; Leach, Jan E; Sonti, Ramesh; Brendel, Volker; Bogdanove, Adam J

2008-01-01

Xanthomonas is a large genus of plant-associated and plant-pathogenic bacteria. Collectively, members cause diseases on over 392 plant species. Individually, they exhibit marked host- and tissue-specificity. The determinants of this specificity are unknown. To assess potential contributions to host- and tissue-specificity, pathogenesis-associated gene clusters were compared across genomes of eight Xanthomonas strains representing vascular or non-vascular pathogens of rice, brassicas, pepper and tomato, and citrus. The gum cluster for extracellular polysaccharide is conserved except for gumN and sequences downstream. The xcs and xps clusters for type II secretion are conserved, except in the rice pathogens, in which xcs is missing. In the otherwise conserved hrp cluster, sequences flanking the core genes for type III secretion vary with respect to insertion sequence element and putative effector gene content. Variation at the rpf (regulation of pathogenicity factors) cluster is more pronounced, though genes with established functional relevance are conserved. A cluster for synthesis of lipopolysaccharide varies highly, suggesting multiple horizontal gene transfers and reassortments, but this variation does not correlate with host- or tissue-specificity. Phylogenetic trees based on amino acid alignments of gum, xps, xcs, hrp, and rpf cluster products generally reflect strain phylogeny. However, amino acid residues at four positions correlate with tissue specificity, revealing hpaA and xpsD as candidate determinants. Examination of genome sequences of xanthomonads Xylella fastidiosa and Stenotrophomonas maltophilia revealed that the hrp, gum, and xcs clusters are recent acquisitions in the Xanthomonas lineage. 
Our results provide insight into the ancestral Xanthomonas genome and indicate that differentiation with respect to host- and tissue-specificity involved not major modifications or wholesale exchange of clusters, but subtle changes in a small number of genes or in non-coding sequences, and/or differences outside the clusters, potentially among regulatory targets or secretory substrates.
Disrupted auto-regulation of the spliceosomal gene SNRPB causes cerebro–costo–mandibular syndrome

PubMed Central

Lynch, Danielle C.; Revil, Timothée; Schwartzentruber, Jeremy; Bhoj, Elizabeth J.; Innes, A. Micheil; Lamont, Ryan E.; Lemire, Edmond G.; Chodirker, Bernard N.; Taylor, Juliet P.; Zackai, Elaine H.; McLeod, D. Ross; Kirk, Edwin P.; Hoover-Fong, Julie; Fleming, Leah; Savarirayan, Ravi; Boycott, Kym; MacKenzie, Alex; Brudno, Michael; Bulman, Dennis; Dyment, David; Majewski, Jacek; Jerome-Majewska, Loydie A.; Parboosingh, Jillian S.; Bernier, Francois P.

2014-01-01

Elucidating the function of highly conserved regulatory sequences is a significant challenge in genomics today. Certain intragenic highly conserved elements have been associated with regulating levels of core components of the spliceosome and alternative splicing of downstream genes. Here we identify mutations in one such element, a regulatory alternative exon of SNRPB as the cause of cerebro–costo–mandibular syndrome. This exon contains a premature termination codon that triggers nonsense-mediated mRNA decay when included in the transcript. These mutations cause increased inclusion of the alternative exon and decreased overall expression of SNRPB. We provide evidence for the functional importance of this conserved intragenic element in the regulation of alternative splicing and development, and suggest that the evolution of such a regulatory mechanism has contributed to the complexity of mammalian development. PMID:25047197
Disrupted auto-regulation of the spliceosomal gene SNRPB causes cerebro-costo-mandibular syndrome.

PubMed

Lynch, Danielle C; Revil, Timothée; Schwartzentruber, Jeremy; Bhoj, Elizabeth J; Innes, A Micheil; Lamont, Ryan E; Lemire, Edmond G; Chodirker, Bernard N; Taylor, Juliet P; Zackai, Elaine H; McLeod, D Ross; Kirk, Edwin P; Hoover-Fong, Julie; Fleming, Leah; Savarirayan, Ravi; Majewski, Jacek; Jerome-Majewska, Loydie A; Parboosingh, Jillian S; Bernier, Francois P

2014-07-22

Elucidating the function of highly conserved regulatory sequences is a significant challenge in genomics today. Certain intragenic highly conserved elements have been associated with regulating levels of core components of the spliceosome and alternative splicing of downstream genes. Here we identify mutations in one such element, a regulatory alternative exon of SNRPB as the cause of cerebro-costo-mandibular syndrome. This exon contains a premature termination codon that triggers nonsense-mediated mRNA decay when included in the transcript. These mutations cause increased inclusion of the alternative exon and decreased overall expression of SNRPB. We provide evidence for the functional importance of this conserved intragenic element in the regulation of alternative splicing and development, and suggest that the evolution of such a regulatory mechanism has contributed to the complexity of mammalian development.
The genome sequence of taurine cattle: a window to ruminant biology and evolution.

PubMed

Elsik, Christine G; Tellam, Ross L; Worley, Kim C; Gibbs, Richard A; Muzny, Donna M; Weinstock, George M; Adelson, David L; Eichler, Evan E; Elnitski, Laura; Guigó, Roderic; Hamernik, Debora L; Kappes, Steve M; Lewin, Harris A; Lynn, David J; Nicholas, Frank W; Reymond, Alexandre; Rijnkels, Monique; Skow, Loren C; Zdobnov, Evgeny M; Schook, Lawrence; Womack, James; Alioto, Tyler; Antonarakis, Stylianos E; Astashyn, Alex; Chapple, Charles E; Chen, Hsiu-Chuan; Chrast, Jacqueline; Câmara, Francisco; Ermolaeva, Olga; Henrichsen, Charlotte N; Hlavina, Wratko; Kapustin, Yuri; Kiryutin, Boris; Kitts, Paul; Kokocinski, Felix; Landrum, Melissa; Maglott, Donna; Pruitt, Kim; Sapojnikov, Victor; Searle, Stephen M; Solovyev, Victor; Souvorov, Alexandre; Ucla, Catherine; Wyss, Carine; Anzola, Juan M; Gerlach, Daniel; Elhaik, Eran; Graur, Dan; Reese, Justin T; Edgar, Robert C; McEwan, John C; Payne, Gemma M; Raison, Joy M; Junier, Thomas; Kriventseva, Evgenia V; Eyras, Eduardo; Plass, Mireya; Donthu, Ravikiran; Larkin, Denis M; Reecy, James; Yang, Mary Q; Chen, Lin; Cheng, Ze; Chitko-McKown, Carol G; Liu, George E; Matukumalli, Lakshmi K; Song, Jiuzhou; Zhu, Bin; Bradley, Daniel G; Brinkman, Fiona S L; Lau, Lilian P L; Whiteside, Matthew D; Walker, Angela; Wheeler, Thomas T; Casey, Theresa; German, J Bruce; Lemay, Danielle G; Maqbool, Nauman J; Molenaar, Adrian J; Seo, Seongwon; Stothard, Paul; Baldwin, Cynthia L; Baxter, Rebecca; Brinkmeyer-Langford, Candice L; Brown, Wendy C; Childers, Christopher P; Connelley, Timothy; Ellis, Shirley A; Fritz, Krista; Glass, Elizabeth J; Herzig, Carolyn T A; Iivanainen, Antti; Lahmers, Kevin K; Bennett, Anna K; Dickens, C Michael; Gilbert, James G R; Hagen, Darren E; Salih, Hanni; Aerts, Jan; Caetano, Alexandre R; Dalrymple, Brian; Garcia, Jose Fernando; Gill, Clare A; Hiendleder, Stefan G; Memili, Erdogan; Spurlock, Diane; Williams, John L; Alexander, Lee; Brownstein, Michael J; Guan, Leluo; Holt, Robert A; Jones, Steven J M; Marra, Marco A; 
Moore, Richard; Moore, Stephen S; Roberts, Andy; Taniguchi, Masaaki; Waterman, Richard C; Chacko, Joseph; Chandrabose, Mimi M; Cree, Andy; Dao, Marvin Diep; Dinh, Huyen H; Gabisi, Ramatu Ayiesha; Hines, Sandra; Hume, Jennifer; Jhangiani, Shalini N; Joshi, Vandita; Kovar, Christie L; Lewis, Lora R; Liu, Yih-Shin; Lopez, John; Morgan, Margaret B; Nguyen, Ngoc Bich; Okwuonu, Geoffrey O; Ruiz, San Juana; Santibanez, Jireh; Wright, Rita A; Buhay, Christian; Ding, Yan; Dugan-Rocha, Shannon; Herdandez, Judith; Holder, Michael; Sabo, Aniko; Egan, Amy; Goodell, Jason; Wilczek-Boney, Katarzyna; Fowler, Gerald R; Hitchens, Matthew Edward; Lozado, Ryan J; Moen, Charles; Steffen, David; Warren, James T; Zhang, Jingkun; Chiu, Readman; Schein, Jacqueline E; Durbin, K James; Havlak, Paul; Jiang, Huaiyang; Liu, Yue; Qin, Xiang; Ren, Yanru; Shen, Yufeng; Song, Henry; Bell, Stephanie Nicole; Davis, Clay; Johnson, Angela Jolivet; Lee, Sandra; Nazareth, Lynne V; Patel, Bella Mayurkumar; Pu, Ling-Ling; Vattathil, Selina; Williams, Rex Lee; Curry, Stacey; Hamilton, Cerissa; Sodergren, Erica; Wheeler, David A; Barris, Wes; Bennett, Gary L; Eggen, André; Green, Ronnie D; Harhay, Gregory P; Hobbs, Matthew; Jann, Oliver; Keele, John W; Kent, Matthew P; Lien, Sigbjørn; McKay, Stephanie D; McWilliam, Sean; Ratnakumar, Abhirami; Schnabel, Robert D; Smith, Timothy; Snelling, Warren M; Sonstegard, Tad S; Stone, Roger T; Sugimoto, Yoshikazu; Takasuga, Akiko; Taylor, Jeremy F; Van Tassell, Curtis P; Macneil, Michael D; Abatepaulo, Antonio R R; Abbey, Colette A; Ahola, Virpi; Almeida, Iassudara G; Amadio, Ariel F; Anatriello, Elen; Bahadue, Suria M; Biase, Fernando H; Boldt, Clayton R; Carroll, Jeffery A; Carvalho, Wanessa A; Cervelatti, Eliane P; Chacko, Elsa; Chapin, Jennifer E; Cheng, Ye; Choi, Jungwoo; Colley, Adam J; de Campos, Tatiana A; De Donato, Marcos; Santos, Isabel K F de Miranda; de Oliveira, Carlo J F; Deobald, Heather; Devinoy, Eve; Donohue, Kaitlin E; Dovc, Peter; Eberlein, Annett; 
Fitzsimmons, Carolyn J; Franzin, Alessandra M; Garcia, Gustavo R; Genini, Sem; Gladney, Cody J; Grant, Jason R; Greaser, Marion L; Green, Jonathan A; Hadsell, Darryl L; Hakimov, Hatam A; Halgren, Rob; Harrow, Jennifer L; Hart, Elizabeth A; Hastings, Nicola; Hernandez, Marta; Hu, Zhi-Liang; Ingham, Aaron; Iso-Touru, Terhi; Jamis, Catherine; Jensen, Kirsty; Kapetis, Dimos; Kerr, Tovah; Khalil, Sari S; Khatib, Hasan; Kolbehdari, Davood; Kumar, Charu G; Kumar, Dinesh; Leach, Richard; Lee, Justin C-M; Li, Changxi; Logan, Krystin M; Malinverni, Roberto; Marques, Elisa; Martin, William F; Martins, Natalia F; Maruyama, Sandra R; Mazza, Raffaele; McLean, Kim L; Medrano, Juan F; Moreno, Barbara T; Moré, Daniela D; Muntean, Carl T; Nandakumar, Hari P; Nogueira, Marcelo F G; Olsaker, Ingrid; Pant, Sameer D; Panzitta, Francesca; Pastor, Rosemeire C P; Poli, Mario A; Poslusny, Nathan; Rachagani, Satyanarayana; Ranganathan, Shoba; Razpet, Andrej; Riggs, Penny K; Rincon, Gonzalo; Rodriguez-Osorio, Nelida; Rodriguez-Zas, Sandra L; Romero, Natasha E; Rosenwald, Anne; Sando, Lillian; Schmutz, Sheila M; Shen, Libing; Sherman, Laura; Southey, Bruce R; Lutzow, Ylva Strandberg; Sweedler, Jonathan V; Tammen, Imke; Telugu, Bhanu Prakash V L; Urbanski, Jennifer M; Utsunomiya, Yuri T; Verschoor, Chris P; Waardenberg, Ashley J; Wang, Zhiquan; Ward, Robert; Weikard, Rosemarie; Welsh, Thomas H; White, Stephen N; Wilming, Laurens G; Wunderlich, Kris R; Yang, Jianqi; Zhao, Feng-Qi

2009-04-24

To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides a resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.
Amino acid sequence analysis of the annexin super-gene family of proteins.

PubMed

Barton, G J; Newman, R H; Freemont, P S; Crumpton, M J

1991-06-15

The annexins are a widespread family of calcium-dependent membrane-binding proteins. No common function has been identified for the family and, until recently, no crystallographic data existed for an annexin. In this paper we draw together 22 available annexin sequences consisting of 88 similar repeat units, and apply the techniques of multiple sequence alignment, pattern matching, secondary structure prediction and conservation analysis to the characterisation of the molecules. The analysis clearly shows that the repeats cluster into four distinct families and that greatest variation occurs within the repeat 3 units. Multiple alignment of the 88 repeats shows amino acids with conserved physicochemical properties at 22 positions, with only Gly at position 23 being absolutely conserved in all repeats. Secondary structure prediction techniques identify five conserved helices in each repeat unit and patterns of conserved hydrophobic amino acids are consistent with one face of a helix packing against the protein core in predicted helices a, c, d, e. Helix b is generally hydrophobic in all repeats, but contains a striking pattern of repeat-specific residue conservation at position 31, with Arg in repeats 4 and Glu in repeats 2, but unconserved amino acids in repeats 1 and 3. This suggests repeats 2 and 4 may interact via a buried saltbridge. The loop between predicted helices a and b of repeat 3 shows features distinct from the equivalent loop in repeats 1, 2 and 4, suggesting an important structural and/or functional role for this region. No compelling evidence emerges from this study for uteroglobin and the annexins sharing similar tertiary structures, or for uteroglobin representing a derivative of a primordial one-repeat structure that underwent duplication to give the present day annexins. The analyses performed in this paper are re-evaluated in the Appendix, in the light of the recently published X-ray structure for human annexin V. 
The structure confirms most of the predictions and shows the power of techniques for the determination of tertiary structural information from the amino acid sequences of an aligned protein family.
Diversity and Divergence of Dinoflagellate Histone Proteins

PubMed Central

Marinov, Georgi K.; Lynch, Michael

2015-01-01

Histone proteins and the nucleosomal organization of chromatin are near-universal eukaryotic features, with the exception of dinoflagellates. Previous studies have suggested that histones do not play a major role in the packaging of dinoflagellate genomes, although several genomic and transcriptomic surveys have detected a full set of core histone genes. Here, transcriptomic and genomic sequence data from multiple dinoflagellate lineages are analyzed, and the diversity of histone proteins and their variants characterized, with particular focus on their potential post-translational modifications and the conservation of the histone code. In addition, the set of putative epigenetic mark readers and writers, chromatin remodelers and histone chaperones are examined. Dinoflagellates clearly express the most derived set of histones among all autonomous eukaryote nuclei, consistent with a combination of relaxation of sequence constraints imposed by the histone code and the presence of numerous specialized histone variants. The histone code itself appears to have diverged significantly in some of its components, yet others are conserved, implying conservation of the associated biochemical processes. Specifically, and with major implications for the function of histones in dinoflagellates, the results presented here strongly suggest that transcription through nucleosomal arrays happens in dinoflagellates. Finally, the plausible roles of histones in dinoflagellate nuclei are discussed. PMID:26646152
A novel paired domain DNA recognition motif can mediate Pax2 repression of gene transcription.

PubMed

Håvik, B; Ragnhildstveit, E; Lorens, J B; Saelemyr, K; Fauske, O; Knudsen, L K; Fjose, A

1999-12-20

The paired domain (PD) is an evolutionarily conserved DNA-binding domain encoded by the Pax gene family of developmental regulators. The Pax proteins are transcription factors and are involved in a variety of processes such as brain development, patterning of the central nervous system (CNS), and B-cell development. In this report we demonstrate that the zebrafish Pax2 PD can interact with a novel type of DNA sequences in vitro, the triple-A motif, consisting of a heptameric nucleotide sequence G/CAAACA/TC with an invariant core of three adjacent adenosines. This recognition sequence was found to be conserved in known natural Pax5 repressor elements involved in controlling the expression of the p53 and J-chain genes. By identifying similar high affinity binding sites in potential target genes of the Pax2 protein, including the pax2 gene itself, we obtained further evidence that the triple-A sites are biologically significant. The putative natural target sites also provide a basis for defining an extended consensus recognition sequence. In addition, we observed in transformation assays a direct correlation between Pax2 repressor activity and the presence of triple-A sites. The results suggest that a transcriptional regulatory function of Pax proteins can be modulated by PD binding to different categories of target sequences. Copyright 1999 Academic Press.
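The heptameric triple-A motif G/CAAACA/TC described above can be written as the regular expression `[GC]AAAC[AT]C`. The short Python sketch below scans one strand of a DNA string for it; the test sequence and the function name are made up for illustration, and the reverse-complement strand is deliberately not searched.

```python
import re

# G/C, then the invariant AAA core, then C, then A/T, then C.
# The lookahead lets overlapping sites be reported as well.
TRIPLE_A = re.compile(r"(?=([GC]AAAC[AT]C))")


def find_triple_a(seq):
    """Return (0-based start, matched heptamer) for each site on the given strand."""
    return [(m.start(), m.group(1)) for m in TRIPLE_A.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "TTGAAACACGGCAAACTCAA"  # hypothetical sequence, not from the paper
    print(find_triple_a(toy))
```

A pattern match of this kind only flags candidate sites; it says nothing about binding affinity, which the study assessed experimentally.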

Motif finding in DNA sequences based on skipping nonconserved positions in background Markov chains.

PubMed

Zhao, Xiaoyan; Sze, Sing-Hoi

2011-05-01

One strategy to identify transcription factor binding sites is through motif finding in upstream DNA sequences of potentially co-regulated genes. Despite extensive efforts, none of the existing algorithms perform very well. We consider a string representation that allows arbitrary ignored positions within the nonconserved portion of single motifs, and use O(2^l) Markov chains to model the background distributions of motifs of length l while skipping these positions within each Markov chain. By focusing initially on positions that have fixed nucleotides to define core occurrences, we develop an algorithm to identify motifs of moderate lengths. We compare the performance of our algorithm to other motif finding algorithms on a few benchmark data sets, and show that significant improvement in accuracy can be obtained when the sites are sufficiently conserved within a given sample, while comparable performance is obtained when the site conservation rate is low. A software program (PosMotif) and detailed results are available online at http://faculty.cse.tamu.edu/shsze/posmotif.
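The idea of a motif string with ignored positions can be illustrated with a small Python sketch. It is not the PosMotif algorithm and contains no background Markov model; it only shows, under the assumption that `.` marks an ignored position, how fixed positions define occurrences and why a motif of length l admits 2^l fixed/ignored patterns. All names and the toy sequence are hypothetical.

```python
from itertools import product


def matches(pattern, window):
    """True if the window agrees with the pattern at every fixed position."""
    return len(pattern) == len(window) and all(
        p == "." or p == w for p, w in zip(pattern, window)
    )


def occurrences(pattern, seq):
    """0-based start positions where the gapped pattern occurs in seq."""
    k = len(pattern)
    return [i for i in range(len(seq) - k + 1) if matches(pattern, seq[i:i + k])]


def skip_masks(length):
    """All fixed (True) / ignored (False) choices for a motif of this length."""
    return list(product((True, False), repeat=length))


if __name__ == "__main__":
    print(occurrences("TG..CA", "ATGACCATGTTCAG"))  # toy data
    print(len(skip_masks(6)))  # 2 ** 6 possible masks
```

The exponential number of masks is what makes it attractive to start from the fixed positions rather than enumerate every mask for longer motifs.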
The genome sequence of Agrotis segetum granulovirus, isolate AgseGV-DA, reveals a new Betabaculovirus species of a slow killing granulovirus.

PubMed

Gueli Alletti, Gianpiero; Eigenbrod, Marina; Carstens, Eric B; Kleespies, Regina G; Jehle, Johannes A

2017-06-01

The European isolate Agrotis segetum granulovirus DA (AgseGV-DA) is a slow killing, type I granulovirus due to low dose-mortality responses within seven days post infection and a tissue tropism of infection restricted solely to the fat body of infected Agrotis segetum host larvae. The genome of AgseGV-DA was completely sequenced and compared to the whole genome sequences of the Chinese isolates AgseGV-XJ and AgseGV-L1. All three isolates share highly conserved genomes. The AgseGV-DA genome is 131,557 bp in length and encodes 149 putative open reading frames, including 37 baculovirus core genes and the per os infectivity factor ac110. Comprehensive investigations of repeat regions identified one putative non-hr like origin of replication in AgseGV-DA. Phylogenetic analysis based on concatenated amino acid alignments of 37 baculovirus core genes as well as pairwise distances based on the nucleotide alignments of partial granulin, lef-8 and lef-9 sequences with deposited betabaculoviruses confirmed AgseGV-DA, AgseGV-XJ and AgseGV-L1 as representative isolates of the same Betabaculovirus species. AgseGV encodes a distinct putative enhancin, distantly related to enhancins from other granuloviruses. Copyright © 2017. Published by Elsevier Inc.
The X-ray Crystallographic Structure and Activity Analysis of a Pseudomonas-Specific Subfamily of the HAD Enzyme Superfamily Evidences a Novel Biochemical Function

DOE Office of Scientific and Technical Information (OSTI.GOV)

Peisach, E.; Wang, L.; Burroughs, A.

2008-01-01

The haloacid dehalogenase (HAD) superfamily is a large family of proteins dominated by phosphotransferases. Thirty-three sequence families within the HAD superfamily (HADSF) have been identified to assist in function assignment. One such family includes the enzyme phosphoacetaldehyde hydrolase (phosphonatase). Phosphonatase possesses the conserved Rossmannoid core domain and a C1-type cap domain. Other members of this family do not possess a cap domain and because the cap domain of phosphonatase plays an important role in active site desolvation and catalysis, the function of the capless family members must be unique. A representative of the capless subfamily, PSPTO_2114, from the plant pathogen Pseudomonas syringae, was targeted for catalytic activity and structure analyses. The X-ray structure of PSPTO_2114 reveals a capless homodimer that conserves some but not all of the intersubunit contacts contributed by the core domains of the phosphonatase homodimer. The region of the PSPTO_2114 that corresponds to the catalytic scaffold of phosphonatase (and other HAD phosphotransferases) positions amino acid residues that are ill suited for Mg2+ cofactor binding and mediation of phosphoryl group transfer between donor and acceptor substrates. The absence of phosphotransferase activity in PSPTO_2114 was confirmed by kinetic assays. To explore PSPTO_2114 function, the conservation of sequence motifs extending outside of the HADSF catalytic scaffold was examined. The stringently conserved residues among PSPTO_2114 homologs were mapped onto the PSPTO_2114 three-dimensional structure to identify a surface region unique to the family members that do not possess a cap domain. The hypothesis that this region is used in protein-protein recognition is explored to define, for the first time, HADSF proteins which have acquired a function other than that of a catalyst. Proteins 2008.
Multiple isoforms for the catalytic subunit of PKA in the basal fungal lineage Mucor circinelloides.

PubMed

Fernández Núñez, Lucas; Ocampo, Josefina; Gottlieb, Alexandra M; Rossi, Silvia; Moreno, Silvia

2016-12-01

Protein kinase A (PKA) activity is involved in dimorphism of the basal fungal lineage Mucor. From the recently sequenced genome of Mucor circinelloides we could predict ten catalytic subunits of PKA. From sequence alignment and structural prediction we conclude that the catalytic core of the isoforms is conserved, and the difference between them resides in their amino termini. This high number of isoforms is maintained in the subdivision Mucoromycotina. Each paralogue, when compared to the ones from other fungi, is more homologous to one of its orthologs than to its paralogs. All of these fungal isoforms cannot be included in the class I or II in which fungal protein kinases have been classified. mRNA levels for each isoform were measured during aerobic and anaerobic growth. The expression of each isoform is differential and associated to a particular growth stage. We reanalyzed the sequence of PKAC (GI 20218944), the only cloned sequence available until now for a catalytic subunit of M. circinelloides. PKAC cannot be classified as a PKA because of its difference in the conserved C-tail; it shares with PKB a conserved C2 domain in the N-terminus. No catalytic activity could be measured for this protein nor predicted bioinformatically. It can thus be classified as a pseudokinase. Its importance cannot be underestimated since it is expressed at the mRNA level in different stages of growth, and its deletion is lethal. Copyright © 2016 British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Multiple genome alignment for identifying the core structure among moderately related microbial genomes.

PubMed

Uchiyama, Ikuo

2008-10-31

Identifying the set of intrinsically conserved genes, or the genomic core, among related genomes is crucial for understanding prokaryotic genomes where horizontal gene transfers are common. Although core genome identification appears to be obvious among very closely related genomes, it becomes more difficult when more distantly related genomes are compared. Here, we consider the core structure as a set of sufficiently long segments in which gene orders are conserved so that they are likely to have been inherited mainly through vertical transfer, and developed a method for identifying the core structure by finding the order of pre-identified orthologous groups (OGs) that maximally retains the conserved gene orders. The method was applied to genome comparisons of two well-characterized families, Bacillaceae and Enterobacteriaceae, and identified their core structures comprising 1438 and 2125 OGs, respectively. The core sets contained most of the essential genes and their related genes, which were primarily included in the intersection of the two core sets comprising around 700 OGs. The definition of the genomic core based on gene order conservation was demonstrated to be more robust than the simpler approach based only on gene conservation. We also investigated the core structures in terms of G+C content homogeneity and phylogenetic congruence, and found that the core genes primarily exhibited the expected characteristic, i.e., being indigenous and sharing the same history, more than the non-core genes. The results demonstrate that our strategy of genome alignment based on gene order conservation can provide an effective approach to identify the genomic core among moderately related microbial genomes.
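The contrast drawn above between a core defined by gene conservation alone and one that also requires conserved gene order can be sketched in Python as follows. This is a toy illustration under strong simplifying assumptions (linear gene lists, no strand information, exact adjacency only) and is not the alignment method of the paper; the genome names, orthologous group labels and function names are invented.

```python
def core_by_presence(genomes):
    """Orthologous groups (OGs) present in every genome (gene conservation only)."""
    return set.intersection(*(set(order) for order in genomes.values()))


def conserved_adjacencies(genomes):
    """Unordered pairs of neighbouring OGs that are adjacent in every genome."""
    def pairs(order):
        return {frozenset(p) for p in zip(order, order[1:])}

    return set.intersection(*(pairs(order) for order in genomes.values()))


if __name__ == "__main__":
    # Hypothetical gene orders, written as lists of OG identifiers.
    toy = {
        "genomeA": ["og1", "og2", "og3", "og4", "og5"],
        "genomeB": ["og1", "og2", "og3", "og9", "og5", "og4"],
        "genomeC": ["og5", "og3", "og2", "og1", "og4"],
    }
    print(sorted(core_by_presence(toy)))
    print(sorted(sorted(pair) for pair in conserved_adjacencies(toy)))
```

In this toy data all five shared OGs pass the presence test, but only the og1-og2-og3 segment keeps its neighbourhood in every genome, which is the kind of distinction an order-based core definition is meant to capture.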
Prediction of the protein components of a putative Calanus finmarchicus (Crustacea, Copepoda) circadian signaling system using a de novo assembled transcriptome

PubMed Central

Christie, Andrew E.; Fontanilla, Tiana M.; Nesbit, Katherine T.; Lenz, Petra H.

2013-01-01

Diel vertical migration and seasonal diapause are critical life history events for the copepod Calanus finmarchicus. While much is known about these behaviors phenomenologically, little is known about their molecular underpinnings. Recent studies in insects suggest that some circadian genes/proteins also contribute to the establishment of seasonal diapause. Thus, it is possible that in Calanus these distinct timing regimes share some genetic components. To begin to address this possibility, we used the well-established Drosophila melanogaster circadian system as a reference for mining clock transcripts from a 200,000+ sequence Calanus transcriptome; the proteins encoded by the identified transcripts were also deduced and characterized. Sequences encoding homologs of the Drosophila core clock proteins CLOCK, CYCLE, PERIOD and TIMELESS were identified, as was one encoding CRYPTOCHROME 2, a core clock protein in ancestral insect systems, but absent in Drosophila. Calanus transcripts encoding proteins known to modulate the Drosophila core clock were also identified and characterized, e.g. CLOCKWORK ORANGE, DOUBLETIME, SHAGGY and VRILLE. Alignment and structural analyses of the deduced Calanus proteins with their Drosophila counterparts revealed extensive sequence conservation, particularly in functional domains. Interestingly, reverse BLAST analyses of these sequences against all arthropod proteins typically revealed non-Drosophila isoforms to be most similar to the Calanus queries. This, in combination with the presence of both CRYPTOCHROME 1 (a clock input pathway protein) and CRYPTOCHROME 2 in Calanus, suggests that the organization of the copepod circadian system is an ancestral one, more similar to that of insects like Danaus plexippus than to that of Drosophila. PMID:23727418
Massive Gene Transfer and Extensive RNA Editing of a Symbiotic Dinoflagellate Plastid Genome

PubMed Central

Mungpakdee, Sutada; Shinzato, Chuya; Takeuchi, Takeshi; Kawashima, Takeshi; Koyanagi, Ryo; Hisata, Kanako; Tanaka, Makiko; Goto, Hiroki; Fujie, Manabu; Lin, Senjie; Satoh, Nori; Shoguchi, Eiichi

2014-01-01

Genome sequencing of Symbiodinium minutum revealed that 95 of 109 plastid-associated genes have been transferred to the nuclear genome and subsequently expanded by gene duplication. Only 14 genes remain in plastids and occur as DNA minicircles. Each minicircle (1.8–3.3 kb) contains one gene and a conserved noncoding region containing putative promoters and RNA-binding sites. Nine types of RNA editing, including a novel G/U type, were discovered in minicircle transcripts but not in genes transferred to the nucleus. In contrast to DNA editing sites in dinoflagellate mitochondria, which tend to be highly conserved across all taxa, editing sites employed in DNA minicircles are highly variable from species to species. Editing is crucial for core photosystem protein function. It restores evolutionarily conserved amino acids and increases peptidyl hydropathy. It also increases protein plasticity necessary to initiate photosystem complex assembly. PMID:24881086
Conserved and variable domains of RNase MRP RNA.

PubMed

Dávila López, Marcela; Rosenblad, Magnus Alm; Samuelsson, Tore

2009-01-01

Ribonuclease MRP is a eukaryotic ribonucleoprotein complex consisting of one RNA molecule and 7-10 protein subunits. One important function of MRP is to catalyze an endonucleolytic cleavage during processing of rRNA precursors. RNase MRP is evolutionarily related to RNase P, which is critical for tRNA processing. A large number of MRP RNA sequences that are now available have been used to identify conserved primary and secondary structure features of the molecule. MRP RNA has structural features in common with P RNA, such as a conserved catalytic core, but it also has unique features and is characterized by a domain that is highly variable between species. Information regarding primary and secondary structure features is of interest not only in basic studies of the function of MRP RNA, but also because mutations in the RNA give rise to human genetic diseases such as cartilage-hair hypoplasia.
Archaeal Haloarcula californiae Icosahedral Virus 1 Highlights Conserved Elements in Icosahedral Membrane-Containing DNA Viruses from Extreme Environments.

PubMed

Demina, Tatiana A; Pietilä, Maija K; Svirskaitė, Julija; Ravantti, Janne J; Atanasova, Nina S; Bamford, Dennis H; Oksanen, Hanna M

2016-07-19

Despite their high genomic diversity, all known viruses are structurally constrained to a limited number of virion morphotypes. One morphotype of viruses infecting bacteria, archaea, and eukaryotes is the tailless icosahedral morphotype with an internal membrane. Although it is considered an abundant morphotype in extreme environments, only seven such archaeal viruses are known. Here, we introduce Haloarcula californiae icosahedral virus 1 (HCIV-1), a halophilic euryarchaeal virus originating from salt crystals. HCIV-1 also retains its infectivity under low-salinity conditions, showing that it is able to adapt to environmental changes. The release of progeny virions resulting from cell lysis was evidenced by reduced cellular oxygen consumption, leakage of intracellular ATP, and binding of an indicator ion to ruptured cell membranes. The virion contains at least 12 different protein species, lipids selectively acquired from the host cell membrane, and a 31,314-bp-long linear double-stranded DNA (dsDNA). The overall genome organization and sequence show high similarity to the genomes of archaeal viruses in the Sphaerolipoviridae family. Phylogenetic analysis based on the major conserved components needed for virion assembly (the major capsid proteins and the packaging ATPase) placed HCIV-1 along with the alphasphaerolipoviruses in a distinct, well-supported clade. On the basis of its virion morphology and sequence similarities, most notably, those of its core virion components, we propose that HCIV-1 is a member of the PRD1-adenovirus structure-based lineage together with other sphaerolipoviruses. This addition to the lineage reinforces the notion of the ancient evolutionary links observed between the viruses and further highlights the limits of the choices found in nature for formation of a virion.
Under conditions of extreme salinity, the majority of the organisms present are archaea, which encounter substantial selective pressure, being constantly attacked by viruses. Regardless of the enormous viral sequence diversity, all known viruses can be clustered into a few structure-based viral lineages based on their core virion components. Our description of a new halophilic virus-host system adds significant insights into the largely unstudied field of archaeal viruses and, in general, of life under extreme conditions. Comprehensive molecular characterization of HCIV-1 shows that this icosahedral internal membrane-containing virus exhibits conserved elements responsible for virion organization. This places the virus neatly in the PRD1-adenovirus structure-based lineage. HCIV-1 further highlights the limited diversity of virus morphotypes despite the astronomical number of viruses in the biosphere. The observed high conservation in the core virion elements should be considered in addressing such fundamental issues as the origin and evolution of viruses and their interplay with their hosts. Copyright © 2016 Demina et al.
Comparative Genomics of 12 Strains of Erwinia amylovora Identifies a Pan-Genome with a Large Conserved Core

PubMed Central

Mann, Rachel A.; Smits, Theo H. M.; Bühlmann, Andreas; Blom, Jochen; Goesmann, Alexander; Frey, Jürg E.; Plummer, Kim M.; Beer, Steven V.; Luck, Joanne; Duffy, Brion; Rodoni, Brendan

2013-01-01

The plant pathogen Erwinia amylovora can be divided into two host-specific groupings; strains infecting a broad range of hosts within the Rosaceae subfamily Spiraeoideae (e.g., Malus, Pyrus, Crataegus, Sorbus) and strains infecting Rubus (raspberries and blackberries). Comparative genomic analysis of 12 strains representing distinct populations (e.g., geographic, temporal, host origin) of E. amylovora was used to describe the pan-genome of this major pathogen. The pan-genome contains 5751 coding sequences and is highly conserved relative to other phytopathogenic bacteria comprising on average 89% conserved, core genes. The chromosomes of Spiraeoideae-infecting strains were highly homogeneous, while greater genetic diversity was observed between Spiraeoideae- and Rubus-infecting strains (and among individual Rubus-infecting strains), the majority of which was attributed to variable genomic islands. Based on genomic distance scores and phylogenetic analysis, the Rubus-infecting strain ATCC BAA-2158 was genetically more closely related to the Spiraeoideae-infecting strains of E. amylovora than it was to the other Rubus-infecting strains. Analysis of the accessory genomes of Spiraeoideae- and Rubus-infecting strains has identified putative host-specific determinants including variation in the effector protein HopX1Ea and a putative secondary metabolite pathway only present in Rubus-infecting strains. PMID:23409014
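The pan-genome and core figures quoted in this abstract rest on simple set arithmetic over per-strain gene content. The following is a minimal sketch of that arithmetic, assuming each strain has already been reduced to a set of gene-family identifiers; the strain names, gene identifiers, and function names are hypothetical, and the clustering step that produces gene families in a real analysis is not shown.

```python
def pan_and_core(strains):
    """Return (pan-genome, core genome) as sets of gene-family ids."""
    gene_sets = list(strains.values())
    pan = set().union(*gene_sets)            # families seen in any strain
    core = set.intersection(*gene_sets)      # families seen in every strain
    return pan, core


def mean_core_fraction(strains):
    """Average, over strains, of the share of each genome that is core."""
    _, core = pan_and_core(strains)
    fractions = [len(core) / len(genes) for genes in strains.values()]
    return sum(fractions) / len(fractions)


# Hypothetical toy data, not taken from the study.
strains = {
    "strain_1": {"g1", "g2", "g3", "g4"},
    "strain_2": {"g1", "g2", "g3", "g5"},
    "strain_3": {"g1", "g2", "g3"},
}

pan, core = pan_and_core(strains)
print(len(pan), len(core), round(mean_core_fraction(strains), 3))
```

Averaging the core share per genome, rather than dividing the core by the pan-genome, is one plausible reading of "on average 89% conserved, core genes"; the abstract does not say which denominator the authors used.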
Comparative genomics of 12 strains of Erwinia amylovora identifies a pan-genome with a large conserved core.

PubMed

Mann, Rachel A; Smits, Theo H M; Bühlmann, Andreas; Blom, Jochen; Goesmann, Alexander; Frey, Jürg E; Plummer, Kim M; Beer, Steven V; Luck, Joanne; Duffy, Brion; Rodoni, Brendan

2013-01-01

The plant pathogen Erwinia amylovora can be divided into two host-specific groupings; strains infecting a broad range of hosts within the Rosaceae subfamily Spiraeoideae (e.g., Malus, Pyrus, Crataegus, Sorbus) and strains infecting Rubus (raspberries and blackberries). Comparative genomic analysis of 12 strains representing distinct populations (e.g., geographic, temporal, host origin) of E. amylovora was used to describe the pan-genome of this major pathogen. The pan-genome contains 5751 coding sequences and is highly conserved relative to other phytopathogenic bacteria comprising on average 89% conserved, core genes. The chromosomes of Spiraeoideae-infecting strains were highly homogeneous, while greater genetic diversity was observed between Spiraeoideae- and Rubus-infecting strains (and among individual Rubus-infecting strains), the majority of which was attributed to variable genomic islands. Based on genomic distance scores and phylogenetic analysis, the Rubus-infecting strain ATCC BAA-2158 was genetically more closely related to the Spiraeoideae-infecting strains of E. amylovora than it was to the other Rubus-infecting strains. Analysis of the accessory genomes of Spiraeoideae- and Rubus-infecting strains has identified putative host-specific determinants including variation in the effector protein HopX1(Ea) and a putative secondary metabolite pathway only present in Rubus-infecting strains.
Identification of a conserved B-cell epitope on the GapC protein of Streptococcus dysgalactiae.

PubMed

Zhang, Limeng; Zhou, Xue; Fan, Ziyao; Tang, Wei; Chen, Liang; Dai, Jian; Wei, Yuhua; Zhang, Jianxin; Yang, Xuan; Yang, Xijing; Liu, Daolong; Yu, Liquan; Zhang, Hua; Wu, Zhijun; Yu, Yongzhong; Sun, Hunan; Cui, Yudong

2015-01-01

Streptococcus dysgalactiae (S. dysgalactiae) GapC is a highly conserved surface dehydrogenase among Streptococcus spp. and is responsible for inducing protective antibody immune responses in animals. However, the B-cell epitopes of S. dysgalactiae GapC have not been well characterized. In this study, a monoclonal antibody 1F2 (mAb1F2) against S. dysgalactiae GapC was generated by the hybridoma technique and used to screen a phage-displayed 12-mer random peptide library (Ph.D.-12) for mapping the linear B-cell epitope. The mAb1F2 recognized phages displaying peptides with the consensus motif TRINDLT. The amino acid sequence of the motif exactly matched (30)TRINDLT(36) of S. dysgalactiae GapC. Subsequently, site-directed mutagenic analysis further demonstrated that residues R31, I32, N33, D34 and L35 formed the core of (30)TRINDLT(36), and this core motif was the minimal determinant of the B-cell epitope recognized by the mAb1F2. The epitope (30)TRINDLT(36) showed high homology among different Streptococcus species. Overall, our findings characterized a conserved B-cell epitope, which will be useful for the further study of epitope-based vaccines. Copyright © 2015 Elsevier Ltd. All rights reserved.
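Mapping a consensus motif such as TRINDLT back onto a protein amounts to an exact substring search reported in 1-based residue numbering. The sketch below shows that step only. The `toy_protein` string is a made-up placeholder built so that the motif falls at residues 30 to 36; it is not the real GapC sequence, and `find_motif` is a hypothetical helper name.

```python
def find_motif(sequence, motif):
    """Return 1-based inclusive (start, end) for every motif occurrence."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        start = sequence.find(motif, start + 1)  # allow overlapping hits
    return hits


# Placeholder sequence: 29 filler residues, the motif, then a short tail.
toy_protein = "A" * 29 + "TRINDLT" + "GGSK"

hits = find_motif(toy_protein, "TRINDLT")
print(hits)

# Residues 31-35 (1-based) are the core reported in the abstract.
core_residues = toy_protein[30:35]
print(core_residues)
```

Python slices are 0-based and end-exclusive, so residues 31 to 35 correspond to `toy_protein[30:35]`.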
The Genome Sequence of Taurine Cattle: A window to ruminant biology and evolution

PubMed Central

Elsik, Christine G.; Tellam, Ross L.; Worley, Kim C.

2010-01-01

To understand the biology and evolution of ruminants, the cattle genome was sequenced to ∼7× coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1,217 are absent or undetected in non-eutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides an enabling resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production. PMID:19390049
Newly discovered young CORE-SINEs in marsupial genomes.

PubMed

Munemasa, Maruo; Nikaido, Masato; Nishihara, Hidenori; Donnellan, Stephen; Austin, Christopher C; Okada, Norihiro

2008-01-15

Although recent mammalian genome projects have uncovered a large part of the genomic components of various groups, several repetitive sequences still remain to be characterized and classified for particular groups. The short interspersed repetitive elements (SINEs) distributed among marsupial genomes are one example. We have identified and characterized two new SINEs from marsupial genomes that belong to the CORE-SINE family, characterized by a highly conserved "CORE" domain. PCR and genomic dot blot analyses revealed that the distribution of each SINE shows distinct patterns among the marsupial genomes, implying different timing of their retroposition during the evolution of marsupials. The members of Mar3 (Marsupialia 3) SINE are distributed throughout the genomes of all marsupials, whereas the Mac1 (Macropodoidea 1) SINE is distributed specifically in the genomes of kangaroos. Sequence alignment of the Mar3 SINEs revealed that they can be further divided into four subgroups, each of which has diagnostic nucleotides. The insertion patterns of each SINE at particular genomic loci, together with the distribution patterns of each SINE, suggest that the Mar3 SINEs have intensively amplified after the radiation of diprotodontians, whereas the Mac1 SINE has amplified only slightly after the divergence of hypsiprimnodons from other macropods. By compiling the information of CORE-SINEs characterized to date, we propose a comprehensive picture of how SINE evolution occurred in the genomes of marsupials.
A conserved genetic module that encodes the major virion components in both the coliphage T4 and the marine cyanophage S-PM2

PubMed Central

Hambly, Emma; Tétart, Francoise; Desplats, Carine; Wilson, William H.; Krisch, Henry M.; Mann, Nicholas H.

2001-01-01

Sequence analysis of a 10-kb region of the genome of the marine cyanomyovirus S-PM2 reveals a homology to coliphage T4 that extends as a contiguous block from gene (g)18 to g23. The order of the S-PM2 genes in this region is similar to that of T4, but there are insertions and deletions of small ORFs of unknown function. In T4, g18 codes for the tail sheath, g19, the tail tube, g20, the head portal protein, g21, the prohead core protein, g22, a scaffolding protein, and g23, the major capsid protein. Thus, the entire module that determines the structural components of the phage head and contractile tail is conserved between T4 and this cyanophage. The significant differences in the morphology of these phages must reflect the considerable divergence of the amino acid sequence of their homologous virion proteins, which uniformly exceeds 50%. We suggest that their enormous diversity in the sea could be a result of genetic shuffling between disparate phages mediated by such commonly shared modules. These conserved sequences could facilitate genetic exchange by providing partially homologous substrates for recombination between otherwise divergent phage genomes. Such a mechanism would thus expand the pool of phage genes accessible by recombination to all those phages that share common modules. PMID:11553768
The reduced genomes of Parcubacteria (OD1) contain signatures of a symbiotic lifestyle

PubMed Central

Nelson, William C.; Stegen, James C.

2015-01-01

Candidate phylum OD1 bacteria (also referred to as Parcubacteria) have been identified in a broad range of anoxic environments through community survey analysis. Although none of these species have been isolated in the laboratory, several genome sequences have been reconstructed from metagenomic sequence data and single-cell sequencing. The organisms have small (generally <1 Mb) genomes with severely reduced metabolic capabilities. We have reconstructed 8 partial to near-complete OD1 genomes from oxic groundwater samples, and compared them against existing genomic data. The conserved core gene set comprises 202 genes, or ~28% of the genomic complement. “Housekeeping” genes and genes for biosynthesis of peptidoglycan and Type IV pilus production are conserved. Gene sets for biosynthesis of cofactors, amino acids, nucleotides, and fatty acids are absent entirely or greatly reduced. The only aspects of energy metabolism conserved are the non-oxidative branch of the pentose-phosphate shunt and central glycolysis. These organisms also lack some activities conserved in almost all other known bacterial genomes, including signal recognition particle, pseudouridine synthase A, and FAD synthase. Pan-genome analysis indicates a broad genotypic diversity and perhaps a highly fluid gene complement, indicating historical adaptation to a wide range of growth environments and a high degree of specialization. The genomes were examined for signatures suggesting either a free-living, streamlined lifestyle, or a symbiotic lifestyle. The lack of biosynthetic capabilities and DNA repair, along with the presence of potential attachment and adhesion proteins suggest that the Parcubacteria are ectosymbionts or parasites of other organisms. The wide diversity of genes that potentially mediate cell-cell contact suggests a broad range of partner/prey organisms across the phylum. PMID:26257709
The reduced genomes of Parcubacteria (OD1) contain signatures of a symbiotic lifestyle

DOE PAGES

Nelson, William C.; Stegen, James C.

2015-07-21

Candidate phylum OD1 bacteria (also referred to as Parcubacteria) have been identified in a broad range of anoxic environments through community survey analysis. Although none of these species have been isolated in the laboratory, several genome sequences have been reconstructed from metagenomic sequence data and single-cell sequencing. The organisms have small (generally <1 Mb) genomes with severely reduced metabolic capabilities. We have reconstructed 8 partial to near-complete OD1 genomes from oxic groundwater samples, and compared them against existing genomic data. The conserved core gene set comprises 202 genes, or ~28% of the genomic complement. “Housekeeping” genes and genes for biosynthesis of peptidoglycan and Type IV pilus production are conserved. Gene sets for biosynthesis of cofactors, amino acids, nucleotides, and fatty acids are absent entirely or greatly reduced. The only aspects of energy metabolism conserved are the non-oxidative branch of the pentose-phosphate shunt and central glycolysis. These organisms also lack some activities conserved in almost all other known bacterial genomes, including signal recognition particle, pseudouridine synthase A, and FAD synthase. Pan-genome analysis indicates a broad genotypic diversity and perhaps a highly fluid gene complement, indicating historical adaptation to a wide range of growth environments and a high degree of specialization. The genomes were examined for signatures suggesting either a free-living, streamlined lifestyle, or a symbiotic lifestyle. The lack of biosynthetic capabilities and DNA repair, along with the presence of potential attachment and adhesion proteins suggest that the Parcubacteria are ectosymbionts or parasites of other organisms. The wide diversity of genes that potentially mediate cell-cell contact suggests a broad range of partner/prey organisms across the phylum.
The reduced genomes of Parcubacteria (OD1) contain signatures of a symbiotic lifestyle

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nelson, William C.; Stegen, James C.

2015-07-21

Candidate phylum OD1 bacteria (also referred to as Parcubacteria) have been identified in a broad range of anoxic environments through community survey analysis. Although none of these species have been isolated in the laboratory, several genome sequences have been reconstructed from metagenomic sequence data and single-cell sequencing. The organisms have small (generally <1 Mb) genomes with severely reduced metabolic capabilities. We have reconstructed 8 partial to near-complete OD1 genomes from oxic groundwater samples, and compared them against existing genomic data. The conserved core gene set comprises 202 genes, or ~28% of the genomic complement. ‘Housekeeping’ genes and genes for biosynthesis of peptidoglycan and Type IV pilus production are conserved. Gene sets for biosynthesis of cofactors, amino acids, nucleotides and fatty acids are absent entirely or greatly reduced. The only aspects of energy metabolism conserved are the non-oxidative branch of the pentose-phosphate shunt and central glycolysis. These organisms also lack some activities conserved in almost all other known bacterial genomes, including signal recognition particle, pseudouridine synthase A, and FAD synthase. Pan-genome analysis indicates a broad genotypic diversity and perhaps a highly fluid gene complement, indicating historical adaptation to a wide range of growth environments and a high degree of specialization. The genomes were examined for signatures suggesting either a free-living, streamlined lifestyle or a symbiotic lifestyle. The lack of biosynthetic capabilities and DNA repair, along with the presence of potential attachment and adhesion proteins suggest the Parcubacteria are ectosymbionts or parasites of other organisms. The wide diversity of genes that potentially mediate cell-cell contact suggests a broad range of partner/prey organisms across the phylum.
A conserved αβ transmembrane interface forms the core of a compact T-cell receptor–CD3 structure within the membrane

PubMed Central

Krshnan, Logesvaran; Park, Soohyung; Im, Wonpil; Call, Melissa J.; Call, Matthew E.

2016-01-01

The T-cell antigen receptor (TCR) is an assembly of eight type I single-pass membrane proteins that occupies a central position in adaptive immunity. Many TCR-triggering models invoke an alteration in receptor complex structure as the initiating event, but both the precise subunit organization and the pathway by which ligand-induced alterations are transferred to the cytoplasmic signaling domains are unknown. Here, we show that the receptor complex transmembrane (TM) domains form an intimately associated eight-helix bundle organized by a specific interhelical TCR TM interface. The salient features of this core structure are absolutely conserved between αβ and γδ TCR sequences and throughout vertebrate evolution, and mutations at key interface residues caused defects in the formation of stable TCRαβ:CD3δε:CD3γε:ζζ complexes. These findings demonstrate that the eight TCR–CD3 subunits form a compact and precisely organized structure within the membrane and provide a structural basis for further investigation of conformationally regulated models of transbilayer TCR signaling. PMID:27791034
A conserved αβ transmembrane interface forms the core of a compact T-cell receptor-CD3 structure within the membrane.

PubMed

Krshnan, Logesvaran; Park, Soohyung; Im, Wonpil; Call, Melissa J; Call, Matthew E

2016-10-25

The T-cell antigen receptor (TCR) is an assembly of eight type I single-pass membrane proteins that occupies a central position in adaptive immunity. Many TCR-triggering models invoke an alteration in receptor complex structure as the initiating event, but both the precise subunit organization and the pathway by which ligand-induced alterations are transferred to the cytoplasmic signaling domains are unknown. Here, we show that the receptor complex transmembrane (TM) domains form an intimately associated eight-helix bundle organized by a specific interhelical TCR TM interface. The salient features of this core structure are absolutely conserved between αβ and γδ TCR sequences and throughout vertebrate evolution, and mutations at key interface residues caused defects in the formation of stable TCRαβ:CD3δε:CD3γε:ζζ complexes. These findings demonstrate that the eight TCR-CD3 subunits form a compact and precisely organized structure within the membrane and provide a structural basis for further investigation of conformationally regulated models of transbilayer TCR signaling.

Coral-Associated Bacterial Diversity Is Conserved across Two Deep-Sea Anthothela Species

PubMed Central

Lawler, Stephanie N.; Kellogg, Christina A.; France, Scott C.; Clostio, Rachel W.; Brooke, Sandra D.; Ross, Steve W.

2016-01-01

Cold-water corals, similar to tropical corals, contain diverse and complex microbial assemblages. These bacteria provide essential biological functions within coral holobionts, facilitating increased nutrient utilization and production of antimicrobial compounds. To date, few cold-water octocoral species have been analyzed to explore the diversity and abundance of their microbial associates. For this study, 23 samples of the family Anthothelidae were collected from Norfolk (n = 12) and Baltimore Canyons (n = 11) from the western Atlantic in August 2012 and May 2013. Genetic testing found that these samples comprised two Anthothela species (Anthothela grandiflora and Anthothela sp.) and Alcyonium grandiflorum. DNA was extracted and sequenced with primers targeting the V4–V5 variable region of the 16S rRNA gene using 454 pyrosequencing with GS FLX Titanium chemistry. Results demonstrated that the coral host was the primary driver of bacterial community composition. Al. grandiflorum, dominated by Alteromonadales and Pirellulales, had much higher species richness and a distinct bacterial community compared to Anthothela samples. Anthothela species (A. grandiflora and Anthothela sp.) had very similar bacterial communities, dominated by Oceanospirillales and Spirochaetes. Additional analysis of core-conserved bacteria at 90% sample coverage revealed genus level conservation across Anthothela samples. This core included unclassified Oceanospirillales, Kiloniellales, Campylobacterales, and genus Spirochaeta. Members of this core were previously recognized for their functional capabilities in nitrogen cycling and suggest the possibility of a nearly complete nitrogen cycle within Anthothela species. Overall, many of the bacterial associates identified in this study have the potential to contribute to the acquisition and cycling of nutrients within the coral holobiont. PMID:27092120
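The phrase "core-conserved bacteria at 90% sample coverage" describes a prevalence threshold: a taxon counts as core if it is detected in at least 90% of samples. A minimal sketch of that filter follows, assuming each sample has already been reduced to the set of taxa detected in it. The sample names and the presence pattern are invented for illustration, and the study's actual pipeline and abundance cutoffs are not reproduced here.

```python
from collections import Counter


def core_taxa(samples, min_coverage=0.9):
    """Taxa detected in at least `min_coverage` of the samples."""
    counts = Counter(taxon for taxa in samples.values() for taxon in set(taxa))
    n_samples = len(samples)
    return {t for t, c in counts.items() if c / n_samples >= min_coverage}


# Hypothetical presence data for ten samples.
samples = {f"s{i}": {"Spirochaeta"} for i in range(10)}
for i in range(9):                      # present in 9 of 10 samples
    samples[f"s{i}"].add("Oceanospirillales_unclassified")
for i in range(5):                      # present in 5 of 10 samples
    samples[f"s{i}"].add("Alteromonadales")

core = core_taxa(samples)
print(sorted(core))
```

Lowering `min_coverage` widens the core, so any reported core membership is only meaningful together with the threshold used.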
Coral-associated bacterial diversity is conserved across two deep-sea Anthothela species

USGS Publications Warehouse

Lawler, Stephanie N.; Kellogg, Christina A.; France, Scott C; Clostio, Rachel W; Brooke, Sandra D.; Ross, Steve W.

2016-01-01

Cold-water corals, similar to tropical corals, contain diverse and complex microbial assemblages. These bacteria provide essential biological functions within coral holobionts, facilitating increased nutrient utilization and production of antimicrobial compounds. To date, few cold-water octocoral species have been analyzed to explore the diversity and abundance of their microbial associates. For this study, 23 samples of the family Anthothelidae were collected from Norfolk (n = 12) and Baltimore Canyons (n = 11) from the western Atlantic in August 2012 and May 2013. Genetic testing found that these samples comprised two Anthothela species (Anthothela grandiflora and Anthothela sp.) and Alcyonium grandiflorum. DNA was extracted and sequenced with primers targeting the V4-V5 variable region of the 16S rRNA gene using 454 pyrosequencing with GS FLX Titanium chemistry. Results demonstrated that the coral host was the primary driver of bacterial community composition. Al. grandiflorum, dominated by Alteromonadales and Pirellulales, had much higher species richness and a distinct bacterial community compared to Anthothela samples. Anthothela species (A. grandiflora and Anthothela sp.) had very similar bacterial communities, dominated by Oceanospirillales and Spirochaetes. Additional analysis of core-conserved bacteria at 90% sample coverage revealed genus level conservation across Anthothela samples. This core included unclassified Oceanospirillales, Kiloniellales, Campylobacterales, and genus Spirochaeta. Members of this core were previously recognized for their functional capabilities in nitrogen cycling and suggest the possibility of a nearly complete nitrogen cycle within Anthothela species. Overall, many of the bacterial associates identified in this study have the potential to contribute to the acquisition and cycling of nutrients within the coral holobiont.
Complete sequence determination of a novel reptile iridovirus isolated from soft-shelled turtle and evolutionary analysis of Iridoviridae

PubMed Central

Huang, Youhua; Huang, Xiaohong; Liu, Hong; Gong, Jie; Ouyang, Zhengliang; Cui, Huachun; Cao, Jianhao; Zhao, Yingtao; Wang, Xiujie; Jiang, Yulin; Qin, Qiwei

2009-01-01

Background Soft-shelled turtle iridovirus (STIV) is the causative agent of severe systemic diseases in cultured soft-shelled turtles (Trionyx sinensis). To our knowledge, the only molecular information available on STIV mainly concerns the highly conserved STIV major capsid protein. The complete sequence of the STIV genome is not yet available. Therefore, determining the genome sequence of STIV and providing a detailed bioinformatic analysis of its genome content and evolution status will facilitate further understanding of the taxonomic elements of STIV and the molecular mechanisms of reptile iridovirus pathogenesis. Results We determined the complete nucleotide sequence of the STIV genome using 454 Life Science sequencing technology. The STIV genome is 105 890 bp in length with a base composition of 55.1% G+C. Computer assisted analysis revealed that the STIV genome contains 105 potential open reading frames (ORFs), which encode polypeptides ranging from 40 to 1,294 amino acids and 20 microRNA candidates. Among the putative proteins, 20 share homology with the ancestral proteins of the nuclear and cytoplasmic large DNA viruses (NCLDVs). Comparative genomic analysis showed that STIV has the highest degree of sequence conservation and a colinear arrangement of genes with frog virus 3 (FV3), followed by Tiger frog virus (TFV), Ambystoma tigrinum virus (ATV), Singapore grouper iridovirus (SGIV), Grouper iridovirus (GIV) and other iridovirus isolates. Phylogenetic analysis based on conserved core genes and complete genome sequence of STIV with other virus genomes was performed. Moreover, analysis of the gene gain-and-loss events in the family Iridoviridae suggested that the genes encoded by iridoviruses have evolved for favoring adaptation to different natural host species. Conclusion This study has provided the complete genome sequence of STIV. 
Phylogenetic analysis suggested that STIV and FV3 are strains of the same viral species belonging to the Ranavirus genus in the Iridoviridae family. Given virus-host co-evolution and the phylogenetic relationship among vertebrates from fish to reptiles, we propose that iridovirus might transmit between reptiles and amphibians and that STIV and FV3 are strains of the same viral species in the Ranavirus genus. PMID:19439104
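The base-composition figure quoted in this abstract (55.1% G+C) is a simple count over the genome sequence. As a minimal sketch only, assuming the genome is available as a plain string of bases, the percentage could be computed as below. The function name `gc_content` and the toy sequence are illustrative and are not taken from the study.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Symbols other than A, C, G and T (for example N) are ignored,
    which is an assumption of this sketch rather than a stated method.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Toy sequence for illustration; this is not STIV data.
print(round(gc_content("ATGCGCGGTTAACC"), 1))
```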
Localized Plasticity in the Streamlined Genomes of Vinyl Chloride Respiring Dehalococcoides

DOE Office of Scientific and Technical Information (OSTI.GOV)

McMurdie, Paul J.; Behrens, Sebastien F.; Muller, Jochen A.

2009-06-30

Vinyl chloride (VC) is a human carcinogen and widespread priority pollutant. Here we report the first, to our knowledge, complete genome sequences of microorganisms able to respire VC, Dehalococcoides sp. strains VS and BAV1. Notably, the respective VC reductase encoding genes, vcrAB and bvcAB, were found embedded in distinct genomic islands (GEIs) with different predicted integration sites, suggesting that these genes were acquired horizontally and independently by distinct mechanisms. A comparative analysis that included two previously sequenced Dehalococcoides genomes revealed a contextually conserved core that is interrupted by two high plasticity regions (HPRs) near the Ori. These HPRs contain the majority of GEIs and strain-specific genes identified in the four Dehalococcoides genomes, an elevated number of repeated elements including insertion sequences (IS), as well as 91 of 96 rdhAB, genes that putatively encode terminal reductases in organohalide respiration. Only three core rdhA orthologous groups were identified, and only one of these groups is supported by synteny. The low number of core rdhAB, contrasted with the high rdhAB numbers per genome (up to 36 in strain VS), as well as their colocalization with GEIs and other signatures for horizontal transfer, suggests that niche adaptation via organohalide respiration is a fundamental ecological strategy in Dehalococcoides. This adaptation has been exacted through multiple mechanisms of recombination that are mainly confined within HPRs of an otherwise remarkably stable, syntenic, streamlined genome among the smallest of any free-living microorganism.
Infection of capilloviruses requires subgenomic RNAs whose transcription is controlled by promoter-like sequences conserved among flexiviruses.

PubMed

Komatsu, Ken; Hirata, Hisae; Fukagawa, Takako; Yamaji, Yasuyuki; Okano, Yukari; Ishikawa, Kazuya; Adachi, Tatsushi; Maejima, Kensaku; Hashimoto, Masayoshi; Namba, Shigetou

2012-07-01

The first open-reading frame (ORF) of apple stem grooving virus (ASGV), of the genus Capillovirus, encodes an apparently chimeric polyprotein containing conserved regions for replicase (Rep) and coat protein (CP). However, our previous study revealed that ASGV mutants with distinct and discontinuous Rep- and CP-coding regions successfully infect plants, indicating that CP expressed via a subgenomic RNA (sgRNA) is sufficient for viability of the virus. Here we identified a transcription start site of the CP sgRNA and revealed that CP translated from the sgRNA is essential for ASGV infection. We mapped the transcription start sites of both the CP and the movement protein (MP) sgRNAs of ASGV and found a hexanucleotide motif, UUAGGU, conserved upstream from both sgRNA transcription start sites. Mutational analysis of the putative CP initiation codon and of the UUAGGU sequence upstream from the transcription start site of CP sgRNA demonstrated their importance for ASGV accumulation. Our results also demonstrated that potato virus T (PVT), an unassigned species closely related to ASGV, produces two sgRNAs putatively deployed for the CP and MP expression and that the same hexanucleotide motif as found in ASGV is located upstream from the transcription start sites of both sgRNAs. This motif, which constituted putative core elements of the sgRNA promoter, is broadly conserved among viruses in the families Alphaflexiviridae and Betaflexiviridae, suggesting that the gene expression strategy of the viruses in both families has been conserved throughout evolution. Copyright © 2012 Elsevier B.V. All rights reserved.
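The search for a conserved hexanucleotide upstream of a mapped transcription start site can be illustrated with a short scan. The following is a hedged sketch, not the authors' pipeline: the function name `motif_positions_upstream`, the 50-nt default window, and the toy sequence are all assumptions made for illustration. Only the motif UUAGGU comes from the abstract.

```python
def motif_positions_upstream(rna: str, tss: int, motif: str = "UUAGGU",
                             window: int = 50) -> list[int]:
    """Return 0-based start positions of `motif` lying entirely within
    `window` nucleotides upstream of the transcription start site `tss`.

    DNA input is accepted and converted to RNA (T -> U). The window size
    is a hypothetical parameter; the abstract does not give a distance.
    """
    rna = rna.upper().replace("T", "U")
    start = max(0, tss - window)
    region = rna[start:tss]          # sequence strictly upstream of the TSS
    hits = []
    pos = region.find(motif)
    while pos != -1:
        hits.append(start + pos)     # report coordinates on the full sequence
        pos = region.find(motif, pos + 1)
    return hits


# Toy sequence with the motif at positions 4-9 and an assumed TSS at 15.
print(motif_positions_upstream("AAAAUUAGGUCCCCCGAUG", tss=15))
```

Running the same scan against each sgRNA start site, and against related genomes, is one way to check whether a motif sits at a consistent offset from the start sites.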
Crystal structure and novel recognition motif of rho ADP-ribosylating C3 exoenzyme from Clostridium botulinum: structural insights for recognition specificity and catalysis.

PubMed

Han, S; Arvai, A S; Clancy, S B; Tainer, J A

2001-01-05

Clostridium botulinum C3 exoenzyme inactivates the small GTP-binding protein family Rho by ADP-ribosylating asparagine 41, which depolymerizes the actin cytoskeleton. C3 thus represents a major family of the bacterial toxins that transfer the ADP-ribose moiety of NAD to specific amino acids in acceptor proteins to modify key biological activities in eukaryotic cells, including protein synthesis, differentiation, transformation, and intracellular signaling. The 1.7 Å resolution C3 exoenzyme structure establishes the conserved features of the core NAD-binding beta-sandwich fold with other ADP-ribosylating toxins despite little sequence conservation. Importantly, the central core of the C3 exoenzyme structure is distinguished by the absence of an active site loop observed in many other ADP-ribosylating toxins. Unlike the ADP-ribosylating toxins that possess the active site loop near the central core, the C3 exoenzyme replaces the active site loop with an alpha-helix, alpha3. Moreover, structural and sequence similarities with the catalytic domain of vegetative insecticidal protein 2 (VIP2), an actin ADP-ribosyltransferase, unexpectedly implicate two adjacent, protruding turns, which join beta5 and beta6 of the toxin core fold, as a novel recognition specificity motif for this newly defined toxin family. Turn 1 evidently positions the solvent-exposed, aromatic side-chain of Phe209 to interact with the hydrophobic region of Rho adjacent to its GTP-binding site. Turn 2 evidently both places the Gln212 side-chain for hydrogen bonding to recognize Rho Asn41 for nucleophilic attack on the anomeric carbon of NAD ribose and holds the key Glu214 catalytic side-chain in the adjacent catalytic pocket. This proposed bipartite ADP-ribosylating toxin turn-turn (ARTT) motif places the VIP2 and C3 toxin classes into a single ARTT family characterized by analogous target protein recognition via turn 1 aromatic and turn 2 hydrogen-bonding side-chain moieties. 
Turn 2 centrally anchors the catalytic Glu214 within the ARTT motif, and furthermore distinguishes the C3 toxin class by a conserved turn 2 Gln and the VIP2 binary toxin class by a conserved turn 2 Glu for appropriate target side-chain hydrogen-bonding recognition. Taken together, these structural results provide a molecular basis for understanding the coupled activity and recognition specificity for C3 and for the newly defined ARTT toxin family, which acts in the depolymerization of the actin cytoskeleton. This beta5 to beta6 region of the toxin fold represents an experimentally testable and potentially general recognition motif region for other ADP-ribosylating toxins that have a similar beta-structure framework. Copyright 2001 Academic Press.
Divergent N-Terminal Sequences Target an Inducible Testis Deubiquitinating Enzyme to Distinct Subcellular Structures

PubMed Central

Lin, Haijiang; Keriel, Anne; Morales, Carlos R.; Bedard, Nathalie; Zhao, Qing; Hingamp, Pascal; Lefrançois, Stephane; Combaret, Lydie; Wing, Simon S.

2000-01-01

Ubiquitin-specific processing proteases (UBPs) presently form the largest enzyme family in the ubiquitin system, characterized by a core region containing conserved motifs surrounded by divergent sequences, most commonly at the N-terminal end. The functions of these divergent sequences remain unclear. We identified two isoforms of a novel testis-specific UBP, UBP-t1 and UBP-t2, which contain identical core regions but distinct N termini, thereby permitting dissection of the functions of these two regions. Both isoforms were germ cell specific and developmentally regulated. Immunocytochemistry revealed that UBP-t1 was induced in step 16 to 19 spermatids while UBP-t2 was expressed in step 18 to 19 spermatids. Immunoelectron microscopy showed that UBP-t1 was found in the nucleus while UBP-t2 was extranuclear and was found in residual bodies. For the first time, we show that the differential subcellular localization was due to the distinct N-terminal sequences. When transfected into COS-7 cells, the core region was expressed throughout the cell but the UBP-t1 and UBP-t2 isoforms were concentrated in the nucleus and the perinuclear region, respectively. Fusions of each N-terminal end with green fluorescent protein yielded the same subcellular localization as the native proteins, indicating that the N-terminal ends were sufficient for determining differential localization. Interestingly, UBP-t2 colocalized with anti-γ-tubulin immunoreactivity, indicating that like several other components of the ubiquitin system, a deubiquitinating enzyme is associated with the centrosome. Regulated expression and alternative N termini can confer specificity of UBP function by restricting its temporal and spatial loci of action. PMID:10938131
Comparative genomics of four closely related Clostridium perfringens bacteriophages reveals variable evolution among core genes with therapeutic potential

PubMed Central

2011-01-01

Background Because biotechnological uses of bacteriophage gene products as alternatives to conventional antibiotics will require a thorough understanding of their genomic context, we sequenced and analyzed the genomes of four closely related phages isolated from Clostridium perfringens, an important agricultural and human pathogen. Results Phage whole-genome tetra-nucleotide signatures and proteomic tree topologies correlated closely with host phylogeny. Comparisons of our phage genomes to 26 others revealed three shared COGs; of particular interest within this core genome was an endolysin (PF01520, an N-acetylmuramoyl-L-alanine amidase) and a holin (PF04531). Comparative analyses of the evolutionary history and genomic context of these common phage proteins revealed two important results: 1) strongly significant host-specific sequence variation within the endolysin, and 2) a protein domain architecture apparently unique to our phage genomes in which the endolysin is located upstream of its associated holin. Endolysin sequences from our phages were one of two very distinct genotypes distinguished by variability within the putative enzymatically-active domain. The shared or core genome was comprised of genes with multiple sequence types belonging to five pfam families, and genes belonging to 12 pfam families, including the holin genes, which were nearly identical. Conclusions Significant genomic diversity exists even among closely-related bacteriophages. Holins and endolysins represent conserved functions across divergent phage genomes and, as we demonstrate here, endolysins can have significant variability and host-specificity even among closely-related genomes. Endolysins in our phage genomes may be subject to different selective pressures than the rest of the genome. These findings may have important implications for potential biotechnological applications of phage gene products. PMID:21631945
Plant centromere organization: a dynamic structure with conserved functions.

PubMed

Ma, Jianxin; Wing, Rod A; Bennetzen, Jeffrey L; Jackson, Scott A

2007-03-01

Although the structural features of centromeres from most multicellular eukaryotes remain to be characterized, recent analyses of the complete sequences of two centromeric regions of rice, together with data from Arabidopsis thaliana and maize, have illuminated the considerable size variation and sequence divergence of plant centromeres. Despite the severe suppression of meiotic chromosomal exchange in centromeric and pericentromeric regions of rice, the centromere core shows high rates of unequal homologous recombination in the absence of chromosomal exchange, resulting in frequent and extensive DNA rearrangement. Not only is the sequence of centromeric tandem and non-tandem repeats highly variable but also the copy number, spacing, order and orientation, providing ample natural variation as the basis for selection of superior centromere performance. This review article focuses on the structural and evolutionary dynamics of plant centromere organization and the potential molecular mechanisms responsible for the rapid changes of centromeric components.
Genomic Definition of Hypervirulent and Multidrug-Resistant Klebsiella pneumoniae Clonal Groups

PubMed Central

Bialek-Davenet, Suzanne; Criscuolo, Alexis; Ailloud, Florent; Passet, Virginie; Jones, Louis; Delannoy-Vieillard, Anne-Sophie; Garin, Benoit; Le Hello, Simon; Arlet, Guillaume; Nicolas-Chanoine, Marie-Hélène; Decré, Dominique

2014-01-01

Multidrug-resistant and highly virulent Klebsiella pneumoniae isolates are emerging, but the clonal groups (CGs) corresponding to these high-risk strains have remained imprecisely defined. We aimed to identify K. pneumoniae CGs on the basis of genome-wide sequence variation and to provide a simple bioinformatics tool to extract virulence and resistance gene data from genomic data. We sequenced 48 K. pneumoniae isolates, mostly of serotypes K1 and K2, and compared the genomes with 119 publicly available genomes. A total of 694 highly conserved genes were included in a core-genome multilocus sequence typing scheme, and cluster analysis of the data enabled precise definition of globally distributed hypervirulent and multidrug-resistant CGs. In addition, we created a freely accessible database, BIGSdb-Kp, to enable rapid extraction of medically and epidemiologically relevant information from genomic sequences of K. pneumoniae. Although drug-resistant and virulent K. pneumoniae populations were largely nonoverlapping, isolates with combined virulence and resistance features were detected. PMID:25341126
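A core-genome multilocus sequence typing scheme reduces each isolate to a profile of allele numbers, one per conserved gene, so that isolates can be compared by counting mismatched loci. The sketch below illustrates that idea under stated assumptions: single-linkage grouping and the `max_diff` threshold are illustrative choices, since the abstract says only that cluster analysis was used, and the locus names, isolate names and allele numbers are invented.

```python
from itertools import combinations


def allelic_distance(p, q):
    """Count loci at which two profiles carry different alleles.

    Loci absent from either profile, or with a missing call (None),
    are skipped.
    """
    shared = [locus for locus in p
              if locus in q and p[locus] is not None and q[locus] is not None]
    return sum(p[locus] != q[locus] for locus in shared)


def single_linkage_groups(profiles, max_diff):
    """Group isolates connected by chains of pairs differing at <= max_diff loci."""
    parent = {name: name for name in profiles}

    def find(x):
        # Union-find lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if allelic_distance(profiles[a], profiles[b]) <= max_diff:
            parent[find(a)] = find(b)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical three-locus profiles; a real scheme would use hundreds of loci.
profiles = {
    "iso1": {"g1": 1, "g2": 1, "g3": 1},
    "iso2": {"g1": 1, "g2": 1, "g3": 2},
    "iso3": {"g1": 5, "g2": 6, "g3": 7},
}
print(single_linkage_groups(profiles, max_diff=1))
```

The threshold controls how permissive a group is: a larger `max_diff` merges more distantly related isolates into the same group.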
RNA chaperoning and intrinsic disorder in the core proteins of Flaviviridae.

PubMed

Ivanyi-Nagy, Roland; Lavergne, Jean-Pierre; Gabus, Caroline; Ficheux, Damien; Darlix, Jean-Luc

2008-02-01

RNA chaperone proteins are essential partners of RNA in living organisms and viruses. They are thought to assist in the correct folding and structural rearrangements of RNA molecules by resolving misfolded RNA species in an ATP-independent manner. RNA chaperoning is probably an entropy-driven process, mediated by the coupled binding and folding of intrinsically disordered protein regions and the kinetically trapped RNA. Previously, we have shown that the core protein of hepatitis C virus (HCV) is a potent RNA chaperone that can drive profound structural modifications of HCV RNA in vitro. We now examined the RNA chaperone activity and the disordered nature of core proteins from different Flaviviridae genera, namely that of HCV, GBV-B (GB virus B), WNV (West Nile virus) and BVDV (bovine viral diarrhoea virus). Despite low-sequence similarities, all four proteins demonstrated general nucleic acid annealing and RNA chaperone activities. Furthermore, heat resistance of core proteins, as well as far-UV circular dichroism spectroscopy suggested that a well-defined 3D protein structure is not necessary for core-induced RNA structural rearrangements. These data provide evidence that RNA chaperoning, possibly mediated by intrinsically disordered protein segments, is conserved in Flaviviridae core proteins. Thus, besides nucleocapsid formation, core proteins may function in RNA structural rearrangements taking place during virus replication.
RNA chaperoning and intrinsic disorder in the core proteins of Flaviviridae

PubMed Central

Ivanyi-Nagy, Roland; Lavergne, Jean-Pierre; Gabus, Caroline; Ficheux, Damien; Darlix, Jean-Luc

2008-01-01

RNA chaperone proteins are essential partners of RNA in living organisms and viruses. They are thought to assist in the correct folding and structural rearrangements of RNA molecules by resolving misfolded RNA species in an ATP-independent manner. RNA chaperoning is probably an entropy-driven process, mediated by the coupled binding and folding of intrinsically disordered protein regions and the kinetically trapped RNA. Previously, we have shown that the core protein of hepatitis C virus (HCV) is a potent RNA chaperone that can drive profound structural modifications of HCV RNA in vitro. We now examined the RNA chaperone activity and the disordered nature of core proteins from different Flaviviridae genera, namely that of HCV, GBV-B (GB virus B), WNV (West Nile virus) and BVDV (bovine viral diarrhoea virus). Despite low-sequence similarities, all four proteins demonstrated general nucleic acid annealing and RNA chaperone activities. Furthermore, heat resistance of core proteins, as well as far-UV circular dichroism spectroscopy suggested that a well-defined 3D protein structure is not necessary for core-induced RNA structural rearrangements. These data provide evidence that RNA chaperoning—possibly mediated by intrinsically disordered protein segments—is conserved in Flaviviridae core proteins. Thus, besides nucleocapsid formation, core proteins may function in RNA structural rearrangements taking place during virus replication. PMID:18033802
Sequencing Conservation Actions Through Threat Assessments in the Southeastern United States

Treesearch

Robert D. Sutter; Christopher C. Szell

2006-01-01

The identification of conservation priorities is one of the leading issues in conservation biology. We present a project of The Nature Conservancy, called Sequencing Conservation Actions, which prioritizes conservation areas and identifies foci for crosscutting strategies at various geographic scales. We use the term "Sequencing" to mean an ordering of actions over...
Human Adenovirus Core Protein V Is Targeted by the Host SUMOylation Machinery To Limit Essential Viral Functions.

PubMed

Freudenberger, Nora; Meyer, Tina; Groitl, Peter; Dobner, Thomas; Schreiner, Sabrina

2018-02-15

Human adenoviruses (HAdV) are nonenveloped viruses containing a linear, double-stranded DNA genome surrounded by an icosahedral capsid. To allow proper viral replication, the genome is imported through the nuclear pore complex associated with viral core proteins. Until now, the role of these incoming virion proteins during the early phase of infection was poorly understood. The core protein V is speculated to bridge the core and the surrounding capsid. It binds the genome in a sequence-independent manner and localizes in the nucleus of infected cells, accumulating at nucleoli. Here, we show that protein V contains conserved SUMO conjugation motifs (SCMs). Mutation of these consensus motifs resulted in reduced SUMOylation of the protein; thus, protein V represents a novel target of the host SUMOylation machinery. To understand the role of protein V SUMO posttranslational modification during productive HAdV infection, we generated a replication-competent HAdV with SCM mutations within the protein V coding sequence. Phenotypic analyses revealed that these SCM mutations are beneficial for adenoviral replication. Blocking protein V SUMOylation at specific sites shifts the onset of viral DNA replication to earlier time points during infection and promotes viral gene expression. Simultaneously, the altered kinetics within the viral life cycle are accompanied by more efficient proteasomal degradation of host determinants and increased virus progeny production than that observed during wild-type infection. Taken together, our studies show that protein V SUMOylation reduces virus growth; hence, protein V SUMOylation represents an important novel aspect of the host antiviral strategy to limit virus replication and thereby points to potential intervention strategies. IMPORTANCE Many decades of research have revealed that HAdV structural proteins promote viral entry and mainly physical stability of the viral genome in the capsid. 
Our work over the last years showed that this concept needs expansion as the functions are more diverse. We showed that capsid protein VI regulates the antiviral response by modulation of the transcription factor Daxx during infection. Moreover, core protein VII interacts with SPOC1 restriction factor, which is beneficial for efficient viral gene expression. Here, we were able to show that core protein V also represents a novel substrate of the host SUMOylation machinery and contains several conserved SCMs; mutation of these consensus motifs reduced SUMOylation of the protein. Unexpectedly, we observed that introducing these mutations into HAdV promotes adenoviral replication. In conclusion, we offer novel insights into adenovirus core proteins and provide evidence that SUMOylation of HAdV factors regulates replication efficiency. Copyright © 2018 American Society for Microbiology.
Core Promoter Functions in the Regulation of Gene Expression of Drosophila Dorsal Target Genes

PubMed Central

Zehavi, Yonathan; Kuznetsov, Olga; Ovadia-Shochat, Avital; Juven-Gershon, Tamar

2014-01-01

Developmental processes are highly dependent on transcriptional regulation by RNA polymerase II. The RNA polymerase II core promoter is the ultimate target of a multitude of transcription factors that control transcription initiation. Core promoters consist of core promoter motifs, e.g. the initiator, TATA box, and the downstream core promoter element (DPE), which confer specific properties to the core promoter. Here, we explored the importance of core promoter functions in the dorsal-ventral developmental gene regulatory network. This network includes multiple genes that are activated by different nuclear concentrations of Dorsal, an NFκB homolog transcription factor, along the dorsal-ventral axis. We show that over two-thirds of Dorsal target genes contain DPE sequence motifs, which is significantly higher than the proportion of DPE-containing promoters in Drosophila genes. We demonstrate that multiple Dorsal target genes are evolutionarily conserved and functionally dependent on the DPE. Furthermore, we have analyzed the activation of key Dorsal target genes by Dorsal, as well as by another Rel family transcription factor, Relish, and the dependence of their activation on the DPE motif. Using hybrid enhancer-promoter constructs in Drosophila cells and embryo extracts, we have demonstrated that the core promoter composition is an important determinant of transcriptional activity of Dorsal target genes. Taken together, our results provide evidence for the importance of core promoter composition in the regulation of Dorsal target genes. PMID:24634215
Creating entanglement using integrals of motion

NASA Astrophysics Data System (ADS)

Olshanii, Maxim; Scoquart, Thibault; Yampolsky, Dmitry; Dunjko, Vanja; Jackson, Steven Glenn

2018-01-01

A quantum Galilean cannon is a one-dimensional sequence of N hard-core particles with special mass ratios and a hard wall; conservation laws due to the reflection group A_N prevent both classical stochastization and quantum diffraction. It is realizable through species-alternating mutually repulsive bosonic soliton trains. We show that an initial disentangled state can evolve into one where the heavy and light particles are entangled, and we propose a sensor, containing N_total atoms, with a √N_total times higher sensitivity than in a one-atom sensor with N_total repetitions.
Characteristics of the nuclear (18S, 5.8S, 28S and 5S) and mitochondrial (12S and 16S) rRNA genes of Apis mellifera (Insecta: Hymenoptera): structure, organization, and retrotransposable elements

PubMed Central

Gillespie, J J; Johnston, J S; Cannone, J J; Gutell, R R

2006-01-01

As an accompanying manuscript to the release of the honey bee genome, we report the entire sequence of the nuclear (18S, 5.8S, 28S and 5S) and mitochondrial (12S and 16S) ribosomal RNA (rRNA)-encoding gene sequences (rDNA) and related internally and externally transcribed spacer regions of Apis mellifera (Insecta: Hymenoptera: Apocrita). Additionally, we predict secondary structures for the mature rRNA molecules based on comparative sequence analyses with other arthropod taxa and reference to recently published crystal structures of the ribosome. In general, the structures of honey bee rRNAs are in agreement with previously predicted rRNA models from other arthropods in core regions of the rRNA, with little additional expansion in non-conserved regions. Our multiple sequence alignments are made available on several public databases and provide a preliminary establishment of a global structural model of all rRNAs from the insects. Additionally, we provide conserved stretches of sequences flanking the rDNA cistrons that comprise the externally transcribed spacer regions (ETS) and part of the intergenic spacer region (IGS), including several repetitive motifs. Finally, we report the occurrence of retrotransposition in the nuclear large subunit rDNA, as R2 elements are present in the usual insertion points found in other arthropods. Interestingly, functional R1 elements usually present in the genomes of insects were not detected in the honey bee rRNA genes. The reverse transcriptase products of the R2 elements are deduced from their putative open reading frames and structurally aligned with those from another hymenopteran insect, the jewel wasp Nasonia (Pteromalidae). Stretches of conserved amino acids shared between Apis and Nasonia are illustrated and serve as potential sites for primer design, as target amplicons within these R2 elements may serve as novel phylogenetic markers for Hymenoptera. 
Given the impending completion of the sequencing of the Nasonia genome, we expect our report eventually to shed light on the evolution of the hymenopteran genome within higher insects, particularly regarding the relative maintenance of conserved rDNA genes, related variable spacer regions and retrotransposable elements. PMID:17069639
The Drosophila Translational Control Element (TCE) Is Required for High-Level Transcription of Many Genes That Are Specifically Expressed in Testes

PubMed Central

Anderson, Ashley K.; Ohler, Uwe; Wassarman, David A.

2012-01-01

To investigate the importance of core promoter elements for tissue-specific transcription of RNA polymerase II genes, we examined testis-specific transcription in Drosophila melanogaster. Bioinformatic analyses of core promoter sequences from 190 genes that are specifically expressed in testes identified a 10 bp A/T-rich motif that is identical to the translational control element (TCE). The TCE functions in the 5′ untranslated region of Mst(3)CGP mRNAs to repress translation, and it also functions in a heterologous gene to regulate transcription. We found that among genes with focused initiation patterns, the TCE is significantly enriched in core promoters of genes that are specifically expressed in testes but not in core promoters of genes that are specifically expressed in other tissues. The TCE is variably located in core promoters and is conserved in melanogaster subgroup species, but conservation dramatically drops in more distant species. In transgenic flies, short (300–400 bp) genomic regions containing a TCE directed testis-specific transcription of a reporter gene. Mutation of the TCE significantly reduced but did not abolish reporter gene transcription indicating that the TCE is important but not essential for transcription activation. Finally, mutation of testis-specific TFIID (tTFIID) subunits significantly reduced the transcription of a subset of endogenous TCE-containing but not TCE-lacking genes, suggesting that tTFIID activity is limited to TCE-containing genes but that tTFIID is not an obligatory regulator of TCE-containing genes. Thus, the TCE is a core promoter element in a subset of genes that are specifically expressed in testes. Furthermore, the TCE regulates transcription in the context of short genomic regions, from variable locations in the core promoter, and both dependently and independently of tTFIID. 
These findings set the stage for determining the mechanism by which the TCE regulates testis-specific transcription and understanding the dual role of the TCE in translational and transcriptional regulation. PMID:22984601
The Drosophila Translational Control Element (TCE) is required for high-level transcription of many genes that are specifically expressed in testes.

PubMed

Katzenberger, Rebeccah J; Rach, Elizabeth A; Anderson, Ashley K; Ohler, Uwe; Wassarman, David A

2012-01-01

To investigate the importance of core promoter elements for tissue-specific transcription of RNA polymerase II genes, we examined testis-specific transcription in Drosophila melanogaster. Bioinformatic analyses of core promoter sequences from 190 genes that are specifically expressed in testes identified a 10 bp A/T-rich motif that is identical to the translational control element (TCE). The TCE functions in the 5' untranslated region of Mst(3)CGP mRNAs to repress translation, and it also functions in a heterologous gene to regulate transcription. We found that among genes with focused initiation patterns, the TCE is significantly enriched in core promoters of genes that are specifically expressed in testes but not in core promoters of genes that are specifically expressed in other tissues. The TCE is variably located in core promoters and is conserved in melanogaster subgroup species, but conservation dramatically drops in more distant species. In transgenic flies, short (300-400 bp) genomic regions containing a TCE directed testis-specific transcription of a reporter gene. Mutation of the TCE significantly reduced but did not abolish reporter gene transcription indicating that the TCE is important but not essential for transcription activation. Finally, mutation of testis-specific TFIID (tTFIID) subunits significantly reduced the transcription of a subset of endogenous TCE-containing but not TCE-lacking genes, suggesting that tTFIID activity is limited to TCE-containing genes but that tTFIID is not an obligatory regulator of TCE-containing genes. Thus, the TCE is a core promoter element in a subset of genes that are specifically expressed in testes. Furthermore, the TCE regulates transcription in the context of short genomic regions, from variable locations in the core promoter, and both dependently and independently of tTFIID. 
These findings set the stage for determining the mechanism by which the TCE regulates testis-specific transcription and understanding the dual role of the TCE in translational and transcriptional regulation.
On the relationship between residue structural environment and sequence conservation in proteins.

PubMed

Liu, Jen-Wei; Lin, Jau-Ji; Cheng, Chih-Wen; Lin, Yu-Feng; Hwang, Jenn-Kang; Huang, Tsun-Tsao

2017-09-01

Residues that are crucial to protein function or structure are usually evolutionarily conserved. To identify the important residues in a protein, sequence conservation is estimated, and current methods rely upon the unbiased collection of homologous sequences. Surprisingly, our previous studies have shown that sequence conservation is closely correlated with the weighted contact number (WCN), a measure of packing density for a residue's structural environment, calculated based only on the Cα positions of a protein structure. Moreover, studies have shown that sequence conservation is correlated with environment-related structural properties calculated based on different protein substructures, such as a protein's all atoms, backbone atoms, side-chain atoms, or side-chain centroid. To determine whether the Cα atomic positions are adequate to show the relationship between residue environment and sequence conservation, here we compared Cα atoms with other substructures in their contributions to sequence conservation. Our results show that Cα positions are substantially equivalent to the other substructures in calculations of various measures of residue environment. As a result, the overlapping contributions between Cα atoms and the other substructures are high, yielding similar structure-conservation relationships. Taking the WCN as an example, the average overlapping contribution to sequence conservation is 87% between Cα and all-atom substructures. These results indicate that the Cα atoms of a protein structure alone could reflect sequence conservation at the residue level. © 2017 Wiley Periodicals, Inc.
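The abstract does not spell out how the weighted contact number is computed. A commonly used definition, assumed here, sums the inverse squared distances from one residue's Cα atom to every other Cα atom: WCN_i = Σ_{j≠i} 1/r_ij². The sketch below applies that definition to a list of coordinates. The function name `weighted_contact_numbers` and the toy coordinates are illustrative and are not drawn from the paper.

```python
import math


def weighted_contact_numbers(coords):
    """Return WCN_i = sum over j != i of 1 / r_ij**2 for each point.

    `coords` is a sequence of (x, y, z) tuples, one per residue (for
    example Cα positions). Coincident points would give a zero distance
    and raise ZeroDivisionError; this sketch does not guard against that.
    """
    wcn = []
    for i, a in enumerate(coords):
        total = 0.0
        for j, b in enumerate(coords):
            if i != j:
                total += 1.0 / math.dist(a, b) ** 2
        wcn.append(total)
    return wcn


# Three collinear toy points spaced one unit apart; not real protein data.
print(weighted_contact_numbers([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]))
```

In the toy case the middle point has the highest value because it has two close neighbours, which mirrors the idea that densely packed residues score higher than exposed ones.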

Non-3D domain swapped crystal structure of truncated zebrafish alphaA crystallin

PubMed Central

Laganowsky, A; Eisenberg, D

2010-01-01

In previous work on truncated alpha crystallins (Laganowsky et al., Protein Sci 2010; 19:1031–1043), we determined crystal structures of the alpha crystallin core, a seven beta-stranded immunoglobulin-like domain, with its conserved C-terminal extension. These extensions swap into neighboring cores forming oligomeric assemblies. The extension is palindromic in sequence, binding in either of two directions. Here, we report the crystal structure of a truncated alphaA crystallin (AAC) from zebrafish (Danio rerio) revealing C-terminal extensions in a non three-dimensional (3D) domain swapped, “closed” state. The extension is quasi-palindromic, bound within its own zebrafish core domain, lying in the opposite direction to that of bovine AAC, which is bound within an adjacent core domain (Laganowsky et al., Protein Sci 2010; 19:1031–1043). Our findings establish that the C-terminal extension of alpha crystallin proteins can be either 3D domain swapped or non-3D domain swapped. This duality provides another molecular mechanism for alpha crystallin proteins to maintain the polydispersity that is crucial for eye lens transparency. PMID:20669149
Massive gene transfer and extensive RNA editing of a symbiotic dinoflagellate plastid genome.

PubMed

Mungpakdee, Sutada; Shinzato, Chuya; Takeuchi, Takeshi; Kawashima, Takeshi; Koyanagi, Ryo; Hisata, Kanako; Tanaka, Makiko; Goto, Hiroki; Fujie, Manabu; Lin, Senjie; Satoh, Nori; Shoguchi, Eiichi

2014-05-31

Genome sequencing of Symbiodinium minutum revealed that 95 of 109 plastid-associated genes have been transferred to the nuclear genome and subsequently expanded by gene duplication. Only 14 genes remain in plastids and occur as DNA minicircles. Each minicircle (1.8-3.3 kb) contains one gene and a conserved noncoding region containing putative promoters and RNA-binding sites. Nine types of RNA editing, including a novel G/U type, were discovered in minicircle transcripts but not in genes transferred to the nucleus. In contrast to DNA editing sites in dinoflagellate mitochondria, which tend to be highly conserved across all taxa, editing sites employed in DNA minicircles are highly variable from species to species. Editing is crucial for core photosystem protein function. It restores evolutionarily conserved amino acids and increases peptidyl hydropathy. It also increases protein plasticity necessary to initiate photosystem complex assembly. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
High-Throughput Genetic Identification of Functionally Important Regions of the Yeast DEAD-Box Protein Mss116p

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mohr, Georg; Del Campo, Mark; Turner, Kathryn G.

The Saccharomyces cerevisiae DEAD-box protein Mss116p is a general RNA chaperone that functions in splicing mitochondrial group I and group II introns. Recent X-ray crystal structures of Mss116p in complex with ATP analogs and single-stranded RNA show that the helicase core induces a bend in the bound RNA, as in other DEAD-box proteins, while a C-terminal extension (CTE) induces a second bend, resulting in RNA crimping. Here, we illuminate these structures by using high-throughput genetic selections, unigenic evolution, and analyses of in vivo splicing activity to comprehensively identify functionally important regions and permissible amino acid substitutions throughout Mss116p. The functionally important regions include those containing conserved sequence motifs involved in ATP and RNA binding or interdomain interactions, as well as previously unidentified regions, including surface loops that may function in protein-protein interactions. The genetic selections recapitulate major features of the conserved helicase motifs seen in other DEAD-box proteins but also show surprising variations, including multiple novel variants of motif III (SAT). Patterns of amino acid substitutions indicate that the RNA bend induced by the helicase core depends on ionic and hydrogen-bonding interactions with the bound RNA; identify a subset of critically interacting residues; and indicate that the bend induced by the CTE results primarily from a steric block. Finally, we identified two conserved regions - one the previously noted post II region in the helicase core and the other in the CTE - that may help displace or sequester the opposite RNA strand during RNA unwinding.
Location of core diagnostic information across various sequences in brain MRI and implications for efficiency of MRI scanner utilization.

PubMed

Sharma, Aseem; Chatterjee, Arindam; Goyal, Manu; Parsons, Matthew S; Bartel, Seth

2015-04-01

Targeting redundancy within MRI can improve its cost-effective utilization. We sought to quantify potential redundancy in our brain MRI protocols. In this retrospective review, we aggregated 207 consecutive adults who underwent brain MRI and reviewed their medical records to document clinical indication, core diagnostic information provided by MRI, and its clinical impact. Contributory imaging abnormalities constituted positive core diagnostic information whereas absence of imaging abnormalities constituted negative core diagnostic information. The senior author selected core sequences deemed sufficient for extraction of core diagnostic information. For validating core sequences selection, four readers assessed the relative ease of extracting core diagnostic information from the core sequences. Potential redundancy was calculated by comparing the average number of core sequences to the average number of sequences obtained. Scanning had been performed using 9.4±2.8 sequences over 37.3±12.3 minutes. Core diagnostic information was deemed extractable from 2.1±1.1 core sequences, with an assumed scanning time of 8.6±4.8 minutes, reflecting a potential redundancy of 74.5%±19.1%. Potential redundancy was least in scans obtained for treatment planning (14.9%±25.7%) and highest in scans obtained for follow-up of benign diseases (81.4%±12.6%). In 97.4% of cases, all four readers considered core diagnostic information to be either easily extractable from core sequences or the ease to be equivalent to that from the entire study. With only one MRI lacking clinical impact (0.48%), overutilization did not seem to contribute to potential redundancy. High potential redundancy that can be targeted for more efficient scanner utilization exists in brain MRI protocols.
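The redundancy measure described above can be illustrated with a short sketch. It assumes redundancy is the share of acquired sequences that are not core sequences; the abstract does not give the exact formula, so the function below is a hypothetical reading. Applied to the reported means (2.1 core sequences out of 9.4), it gives about 77.7%, which need not equal the reported 74.5% if the study averaged the measure on a different basis.

```python
def potential_redundancy(core_count, total_count):
    """Percentage of acquired sequences beyond the core set (assumed definition)."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return 100.0 * (1.0 - core_count / total_count)


# Means reported in the abstract: 2.1 core sequences, 9.4 sequences acquired.
redundancy_from_means = potential_redundancy(2.1, 9.4)
```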
Transcription and ncRNAs: at the cent(rome)re of kinetochore assembly and maintenance.

PubMed

Scott, Kristin C

2013-12-01

Centromeres are sites of chromosomal spindle attachment during mitosis and meiosis. Centromeres are defined, in part, by a distinct chromatin landscape in which histone H3 is replaced by the conserved histone H3 variant, CENP-A. Sequences competent for centromere formation and function vary among organisms and are typically composed of repetitive DNA. It is unclear how such diverse genomic signals are integrated with the epigenetic mechanisms that govern CENP-A incorporation at a single locus on each chromosome. Recent work highlights the intriguing possibility that the transcriptional properties of centromeric core DNA contribute to centromere identity and maintenance through cell division. Moreover, core-derived noncoding RNAs (ncRNAs) have emerged as active participants in the regulation and control of centromere activity in plants and mammals. This paper reviews the transcriptional properties of eukaryotic centromeres and discusses the known roles of core-derived ncRNAs in chromatin integrity, kinetochore assembly, and centromere activity.
Pan-Genomic Analysis Provides Insights into the Genomic Variation and Evolution of Salmonella Paratyphi A

PubMed Central

Chen, Chunxia; Cui, Xiaoying; Yu, Jun; Xiao, Jingfa; Kan, Biao

2012-01-01

Salmonella Paratyphi A (S. Paratyphi A) is a highly adapted, human-specific pathogen that causes paratyphoid fever. Cases of paratyphoid fever have recently been increasing, and the disease is becoming a major public health concern, especially in Eastern and Southern Asia. To investigate the genomic variation and evolution of S. Paratyphi A, a pan-genomic analysis was performed on five newly sequenced S. Paratyphi A strains and two other reference strains. A whole genome comparison revealed that the seven genomes are collinear and that their organization is highly conserved. The high rate of substitutions in part of the core genome indicates that there are frequent homologous recombination events. Based on the changes in the pan-genome size and cluster number (both in the core functional genes and core pseudogenes), it can be inferred that the sharply increasing number of pseudogene clusters may have strong correlation with the inactivation of functional genes, and indicates that the S. Paratyphi A genome is being degraded. PMID:23028950
Structural modeling identifies Plasmodium vivax 4-diphosphocytidyl-2C-methyl-d-erythritol kinase (IspE) as a plausible new antimalarial drug target.

PubMed

Kadian, Kavita; Vijay, Sonam; Gupta, Yash; Rawal, Ritu; Singh, Jagbir; Anvikar, Anup; Pande, Veena; Sharma, Arun

2018-08-01

Malaria parasites utilize the methylerythritol phosphate (MEP) pathway for synthesis of isoprenoid precursors, which are essential for maturation and survival of parasites during erythrocytic and gametocytic stages. The absence of the MEP pathway in the human host establishes MEP pathway enzymes as a repertoire of essential drug targets. The fourth enzyme, 4-diphosphocytidyl-2C-methyl-d-erythritol kinase (IspE), has been proven essential in pathogenic bacteria; however, it has not yet been studied in any Plasmodium species. This study was undertaken to investigate genetic polymorphism and concomitant structural implications of the Plasmodium vivax IspE (PvIspE) by employing sequencing, modeling and bioinformatics approaches. We report that the PvIspE gene displayed six non-synonymous mutations, which were restricted to non-conserved regions within the gene, in isolates from seven topographically distinct malaria-endemic regions of India. Phylogenetic studies showed that PvIspE occupies a unique status within the Plasmodium genus and that the Plasmodium vivax IspE gene has a distant and non-conserved relation with the human ortholog mevalonate kinase (MAVK). Structural modeling analysis revealed that all PvIspE Indian isolates have a critically conserved canonical galacto-homoserine-mevalonate-phosphomevalonate kinase (GHMP) domain within the active site, lying in a deep cleft sandwiched between the ATP- and CDPME-binding domains. The active core region was highly conserved among all clinical isolates, possibly owing to its rigid, >60% β-pleated architecture. The mapped structural analysis revealed that the active site of PvIspE is critically conserved, both in sequence and spatially, among all Indian isolates, with no significant changes in the active site. Our study strengthens the candidature of the Plasmodium vivax IspE enzyme as a future target for novel antimalarials. Copyright © 2018 Elsevier B.V. All rights reserved.
Comparative genomics of Burkholderia multivorans, a ubiquitous pathogen with a highly conserved genomic structure

PubMed Central

Cooper, Vaughn S.; Hatcher, Philip J.; Verheyde, Bart; Carlier, Aurélien; Vandamme, Peter

2017-01-01

The natural environment serves as a reservoir of opportunistic pathogens. A well-established method for studying the epidemiology of such opportunists is multilocus sequence typing, which in many cases has defined strains predisposed to causing infection. Burkholderia multivorans is an important pathogen in people with cystic fibrosis (CF) and its epidemiology suggests that strains are acquired from non-human sources such as the natural environment. This raises the central question of whether the isolation source (CF or environment) or the multilocus sequence type (ST) of B. multivorans better predicts their genomic content and functionality. We identified four pairs of B. multivorans isolates, representing distinct STs and consisting of one CF and one environmental isolate each. All genomes were sequenced using the PacBio SMRT sequencing technology, which resulted in eight high-quality B. multivorans genome assemblies. The present study demonstrated that the genomic structure of the examined B. multivorans STs is highly conserved and that the B. multivorans genomic lineages are defined by their ST. Orthologous protein families were not uniformly distributed among chromosomes, with core orthologs being enriched on the primary chromosome and ST-specific orthologs being enriched on the second and third chromosome. The ST-specific orthologs were enriched in genes involved in defense mechanisms and secondary metabolism, corroborating the strain-specificity of these virulence characteristics. Finally, the same B. multivorans genomic lineages occur in both CF and environmental samples and on different continents, demonstrating their ubiquity and evolutionary persistence. PMID:28430818
Welcome to pandoraviruses at the ‘Fourth TRUC’ club

PubMed Central

Sharma, Vikas; Colson, Philippe; Chabrol, Olivier; Scheid, Patrick; Pontarotti, Pierre; Raoult, Didier

2015-01-01

Nucleocytoplasmic large DNA viruses, or representatives of the proposed order Megavirales, belong to families of giant viruses that infect a broad range of eukaryotic hosts. Megaviruses have been previously described to comprise a fourth monophylogenetic TRUC (things resisting uncompleted classification) together with cellular domains in the universal tree of life. Recently described pandoraviruses have large (1.9–2.5 MB) and highly divergent genomes. In the present study, we updated the classification of pandoraviruses and other reported giant viruses. Phylogenetic trees were constructed based on six informational genes. Hierarchical clustering was performed based on a set of informational genes from Megavirales members and cellular organisms. Homologous sequences were selected from cellular organisms using TimeTree software, comprising comprehensive, and representative sets of members from Bacteria, Archaea, and Eukarya. Phylogenetic analyses based on three conserved core genes clustered pandoraviruses with phycodnaviruses, exhibiting their close relatedness. Additionally, hierarchical clustering analyses based on informational genes grouped pandoraviruses with Megavirales members as a super group distinct from cellular organisms. Thus, the analyses based on core conserved genes revealed that pandoraviruses are new genuine members of the ‘Fourth TRUC’ club, encompassing distinct life forms compared with cellular organisms. PMID:26042093
Complete Genome Analysis of Thermus parvatiensis and Comparative Genomics of Thermus spp. Provide Insights into Genetic Variability and Evolution of Natural Competence as Strategic Survival Attributes

PubMed Central

Tripathi, Charu; Mishra, Harshita; Khurana, Himani; Dwivedi, Vatsala; Kamra, Komal; Negi, Ram K.; Lal, Rup

2017-01-01

Thermophilic environments represent an interesting niche. Among thermophiles, the genus Thermus is among the most studied genera. In this study, we have sequenced the genome of Thermus parvatiensis strain RL, a thermophile isolated from Himalayan hot water springs (temperature >96°C), using the PacBio RSII SMRT technique. The small genome (2.01 Mbp) comprises a chromosome (1.87 Mbp) and a plasmid (143 Kbp), designated in this study as pTP143. Annotation revealed a high number of repair genes and a compact genome, but a highly plastic plasmid carrying transposases, integrases, mobile elements and hypothetical proteins (44%). We performed a comparative genomic study of the group Thermus with the aim of analysing the phylogenetic relatedness as well as the niche-specific attributes prevalent among the group. We compared the reference genome RL with 16 Thermus genomes to assess their phylogenetic relationships based on 16S rRNA gene sequences, average nucleotide identity (ANI), conserved marker genes (31 and 400), pan genome and tetranucleotide frequency. The core genome of the analyzed genomes contained 1,177 core genes, and many singleton genes were detected in individual genomes, reflecting a conserved core but adaptive pan repertoire. We demonstrated the presence of metagenomic islands (chromosome:5, plasmid:5) by recruiting raw metagenomic data (from the same niche) against the genomic replicons of T. parvatiensis. We also dissected the CRISPR loci across all genomes and found widespread presence of this system across Thermus genomes. Additionally, we performed a comparative analysis of competence loci across Thermus genomes and found evidence for recent horizontal acquisition of the locus and continued dispersal among members, reflecting that natural competence is a beneficial survival trait among Thermus members and that its acquisition depicts unending evolution in order to accomplish optimal fitness. PMID:28798737
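The core, pan and singleton categories mentioned above can be sketched with plain set operations. This is a minimal illustration that assumes genes have already been clustered into shared ortholog names; the strain labels and gene names are hypothetical, and real pan-genome pipelines work on ortholog clusters rather than exact name matches.

```python
from collections import Counter


def summarize_pan_genome(genomes):
    """Split gene names into core, pan and singleton sets.

    genomes maps a strain label to the set of gene (cluster) names it carries.
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    core = set.intersection(*gene_sets)  # present in every genome
    pan = set.union(*gene_sets)          # present in at least one genome
    counts = Counter(gene for genes in gene_sets for gene in genes)
    singletons = {gene for gene, n in counts.items() if n == 1}
    return core, pan, singletons


# Hypothetical toy input: three strains with partly shared gene content.
toy_genomes = {
    "strain_A": {"recA", "gyrB", "pilA"},
    "strain_B": {"recA", "gyrB", "cas1"},
    "strain_C": {"recA", "gyrB", "pilA", "tnpA"},
}
core, pan, singletons = summarize_pan_genome(toy_genomes)
```

Here the core set holds the two genes shared by all three toy strains, while the genes found in only one strain form the singleton set.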
RNA Dependent RNA Polymerases: Insights from Structure, Function and Evolution

PubMed Central

Venkataraman, Sangita; Prasad, Burra V L S; Selvarajan, Ramasamy

2018-01-01

RNA dependent RNA polymerase (RdRp) is one of the most versatile enzymes of RNA viruses that is indispensable for replicating the genome as well as for carrying out transcription. The core structural features of RdRps are conserved, despite the divergence in their sequences. The structure of RdRp resembles that of a cupped right hand and consists of fingers, palm and thumb subdomains. The catalysis involves the participation of conserved aspartates and divalent metal ions. Complexes of RdRps with substrates, inhibitors and metal ions provide a comprehensive view of their functional mechanism and offer valuable insights regarding the development of antivirals. In this article, we provide an overview of the structural aspects of RdRps and their complexes from the Group III, IV and V viruses and their structure-based phylogeny. PMID:29439438
Comparative sequence analysis of the X-inactivation center region in mouse, human, and bovine.

PubMed

Chureau, Corinne; Prissette, Marine; Bourdet, Agnès; Barbe, Valérie; Cattolico, Laurence; Jones, Louis; Eggen, André; Avner, Philip; Duret, Laurent

2002-06-01

We have sequenced to high levels of accuracy 714-kb and 233-kb regions of the mouse and bovine X-inactivation centers (Xic), respectively, centered on the Xist gene. This has provided the basis for a fully annotated comparative analysis of the mouse Xic with the 2.3-Mb orthologous region in human and has allowed a three-way species comparison of the core central region, including the Xist gene. These comparisons have revealed conserved genes, both coding and noncoding, conserved CpG islands and, more surprisingly, conserved pseudogenes. The distribution of repeated elements, especially LINE repeats, in the mouse Xic region when compared to the rest of the genome does not support the hypothesis of a role for these repeat elements in the spreading of X inactivation. Interestingly, an asymmetric distribution of LINE elements on the two DNA strands was observed in the three species, not only within introns but also in intergenic regions. This feature is suggestive of important transcriptional activity within these intergenic regions. In silico prediction followed by experimental analysis has allowed four new genes, Cnbp2, Ftx, Jpx, and Ppnx, to be identified and novel, widespread, complex, and apparently noncoding transcriptional activity to be characterized in a region 5' of Xist that was recently shown to attract histone modification early after the onset of X inactivation.
Measuring the effectiveness of conservation: a novel framework to quantify the benefits of sage-grouse conservation policy and easements in Wyoming.

PubMed

Copeland, Holly E; Pocewicz, Amy; Naugle, David E; Griffiths, Tim; Keinath, Doug; Evans, Jeffrey; Platt, James

2013-01-01

Increasing energy and housing demands are impacting wildlife populations throughout western North America. Greater sage-grouse (Centrocercus urophasianus), a species known for its sensitivity to landscape-scale disturbance, inhabits the same low elevation sage-steppe in which much of this development is occurring. Wyoming has committed to maintain sage-grouse populations through conservation easements and policy changes that conserve high bird abundance "core" habitat and encourage development in less sensitive landscapes. In this study, we built new predictive models of oil and gas, wind, and residential development and applied build-out scenarios to simulate future development and measure the efficacy of conservation actions for maintaining sage-grouse populations. Our approach predicts sage-grouse population losses averted through conservation action and quantifies return on investment for different conservation strategies. We estimate that without conservation, sage-grouse populations in Wyoming will decrease under our long-term scenario by 14-29% (95% CI: 4-46%). However, a conservation strategy that includes the "core area" policy and $250 million in targeted easements could reduce these losses to 9-15% (95% CI: 3-32%), cutting anticipated losses by roughly half statewide and nearly two-thirds within sage-grouse core breeding areas. Core area policy is the single most important component, and targeted easements are complementary to the overall strategy. There is considerable uncertainty around the magnitude of our estimates; however, the relative benefit of different conservation scenarios remains comparable because potential biases and assumptions are consistently applied regardless of the strategy. There is early evidence based on a 40% reduction in leased hectares inside core areas that Wyoming policy is reducing potential for future fragmentation inside core areas. Our framework using build-out scenarios to anticipate species declines provides estimates that could be used by decision makers to determine if expected population losses warrant ESA listing.
A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3

PubMed Central

Dietmann, Sabine; Park, Jong; Notredame, Cedric; Heger, Andreas; Lappe, Michael; Holm, Liisa

2001-01-01

The Dali Domain Dictionary (http://www.ebi.ac.uk/dali/domain) is a numerical taxonomy of all known structures in the Protein Data Bank (PDB). The taxonomy is derived fully automatically from measurements of structural, functional and sequence similarities. Here, we report the extension of the classification to match the traditional four hierarchical levels corresponding to: (i) supersecondary structural motifs (attractors in fold space), (ii) the topology of globular domains (fold types), (iii) remote homologues (functional families) and (iv) homologues with sequence identity above 25% (sequence families). The computational definitions of attractors and functional families are new. In September 2000, the Dali classification contained 10 531 PDB entries comprising 17 101 chains, which were partitioned into five attractor regions, 1375 fold types, 2582 functional families and 3724 domain sequence families. Sequence families were further associated with 99 582 unique homologous sequences in the HSSP database, which increases the number of effectively known structures several-fold. The resulting database contains the description of protein domain architecture, the definition of structural neighbours around each known structure, the definition of structurally conserved cores and a comprehensive library of explicit multiple alignments of distantly related protein families. PMID:11125048
Core element characterization of Rhodococcus promoters and development of a promoter-RBS mini-pool with different activity levels for efficient gene expression.

PubMed

Jiao, Song; Yu, Huimin; Shen, Zhongyao

2018-09-25

To satisfy the urgent demand for promoter engineering that can accurately regulate the metabolic circuits and expression of specific genes in the Rhodococcus microbial platform, a promoter-ribosome binding site (RBS) coupled mini-pool with fine-tuning of different activity levels was successfully established. Transcriptome analyses of R. ruber TH revealed several representative promoters with different activity levels, e.g., Pami, Pcs, Pnh, P50sl36, PcbiM, PgroE and Pniami. β-Galactosidase (LacZ) reporter measurement demonstrated that different gene expression levels could be obtained with these natural promoters combined with an optimal RBS of ami. Further use of these promoters to overexpress the nitrile hydratase (NHase) gene with RBSami in R. ruber THdAdN produced different expression levels consistent with the transcription analyses. The -35 and -10 core elements of different promoters were further analyzed, and the conserved sequences were revealed to be TTGNNN and (T/C)GNNA(A/C)AAT. By mutating the core elements of the strong promoters, Pnh and Pami, into the above consensus sequence, two even stronger promoters, PnhM and PamiM, were obtained with 2.2-fold and 7.7-fold improvements in transcription, respectively. Integrating several strategies, including transcriptome promoter screening, -35 and -10 core element identification, core element point-mutation, RBS optimization and diverse reporter verification, a fine-tuning promoter-RBS combination mini-pool with different activity levels in Rhodococcus strains was successfully established. This development is significant for broad applications of the Rhodococcus genus as a microbial platform. Copyright © 2018 Elsevier B.V. All rights reserved.
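The two consensus elements reported above, TTGNNN for the -35 region and (T/C)GNNA(A/C)AAT for the -10 region, can be turned into regular expressions for scanning a sequence. The sketch below is illustrative only: it rewrites the -10 element with the IUPAC codes Y (C or T) and M (A or C), and the test sequence, including its spacer length, is hypothetical rather than a real Rhodococcus promoter.

```python
import re

# IUPAC nucleotide codes needed for the two consensus elements.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "Y": "[CT]", "M": "[AC]"}

MINUS_35 = "TTGNNN"
MINUS_10 = "YGNNAMAAT"  # (T/C)GNNA(A/C)AAT written with IUPAC codes


def find_sites(consensus, sequence):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    pattern = "".join(IUPAC[base] for base in consensus)
    # A lookahead lets overlapping candidate sites all be reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence)]


# Hypothetical toy sequence: a -35 match, an arbitrary spacer, a -10 match.
toy_promoter = "GG" + "TTGACA" + "C" * 17 + "TGCTAAAAT" + "GG"
sites_35 = find_sites(MINUS_35, toy_promoter)
sites_10 = find_sites(MINUS_10, toy_promoter)
```

A scan like this only flags candidate sites; whether a match is an active promoter element still has to be tested experimentally, as the reporter assays in the abstract did.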
Conservation and diversification of Msx protein in metazoan evolution.

PubMed

Takahashi, Hirokazu; Kamiya, Akiko; Ishiguro, Akira; Suzuki, Atsushi C; Saitou, Naruya; Toyoda, Atsushi; Aruga, Jun

2008-01-01

Msx (/msh) family genes encode homeodomain (HD) proteins that control ontogeny in many animal species. We compared the structures of Msx genes from a wide range of Metazoa (Porifera, Cnidaria, Nematoda, Arthropoda, Tardigrada, Platyhelminthes, Mollusca, Brachiopoda, Annelida, Echiura, Echinodermata, Hemichordata, and Chordata) to gain an understanding of the role of these genes in phylogeny. Exon-intron boundary analysis suggested that the position of the intron located N-terminally to the HDs was widely conserved in all the genes examined, including those of cnidarians. Amino acid (aa) sequence comparison revealed 3 new evolutionarily conserved domains, as well as very strong conservation of the HDs. Two of the three domains were associated with Groucho-like protein binding in both a vertebrate and a cnidarian Msx homolog, suggesting that the interaction between Groucho-like proteins and Msx proteins was established in eumetazoan ancestors. Pairwise comparison among the collected HDs and their C-flanking aa sequences revealed that the degree of sequence conservation varied depending on the animal taxa from which the sequences were derived. Highly conserved Msx genes were identified in the Vertebrata, Cephalochordata, Hemichordata, Echinodermata, Mollusca, Brachiopoda, and Anthozoa. The wide distribution of the conserved sequences in the animal phylogenetic tree suggested that metazoan ancestors had already acquired a set of conserved domains of the current Msx family genes. Interestingly, although strongly conserved sequences were recovered from the Vertebrata, Cephalochordata, and Anthozoa, the sequences from the Urochordata and Hydrozoa showed weak conservation. Because the Vertebrata-Cephalochordata-Urochordata and Anthozoa-Hydrozoa represent sister groups in the Chordata and Cnidaria, respectively, Msx sequence diversification may have occurred differentially in the course of evolution. We speculate that selective loss of the conserved domains in Msx family proteins contributed to the diversification of animal body organization.
IFLA General Conference, 1986. Management and Technology Division. Section: Conservation. Papers.

ERIC Educational Resources Information Center

International Federation of Library Associations and Institutions, The Hague (Netherlands).

This document contains three papers on conservation which were presented at the 1986 International Federation of Library Associations (IFLA) conference. In "The IFLA Conservation Section and the Core Programme for Preservation (PAC)," David W. G. Clements of the United Kingdom outlines the background of the Core Programme on Preservation…
Draft genome of the American Eel (Anguilla rostrata).

PubMed

Pavey, Scott A; Laporte, Martin; Normandeau, Eric; Gaudin, Jérémy; Letourneau, Louis; Boisvert, Sébastien; Corbeil, Jacques; Audet, Céline; Bernatchez, Louis

2017-07-01

Freshwater eels (Anguilla sp.) have large economic, cultural, ecological and aesthetic importance worldwide, but they have suffered a decline of more than 90% in global stocks over the past few decades. Proper genetic resources, such as sequenced, assembled and annotated genomes, are essential to help plan sustainable recoveries by identifying physiological, biochemical and genetic mechanisms that caused the declines or that may lead to recoveries. Here, we present the first sequenced genome of the American eel. This genome contained 305 043 contigs (N50 = 7397) and 79 209 scaffolds (N50 = 86 641) for a total size of 1.41 Gb, which is in the middle of the range of previous estimations for this species. In addition, protein-coding regions, including introns and flanking regions, are very well represented in the genome, as 95.2% of the 458 core eukaryotic genes and 98.8% of the 248 ultra-conserved subset were represented in the assembly, and a total of 26 564 genes were annotated for future functional genomics studies. We performed a candidate gene analysis to compare three genes among all three freshwater eel species and, congruent with the phylogenetic relationships, Japanese eel (A. japonica) exhibited the most divergence. Overall, the sequenced genome presented in this study is a crucial addition to the presently available genetic tools to help guide future conservation efforts of freshwater eels. © 2016 John Wiley & Sons Ltd.
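The contig and scaffold N50 values quoted above follow the standard assembly statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal sketch follows; the toy lengths are hypothetical and unrelated to the eel assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical toy assembly of five contigs (total length 300).
toy_n50 = n50([100, 80, 60, 40, 20])
```

For the toy input the two longest contigs (100 and 80) already cover more than half of the total, so the N50 is 80.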

Cloning, expression, purification and characterization of lipase from Bacillus licheniformis, isolated from hot spring of Himachal Pradesh, India.

PubMed

Kaur, Gagandeep; Singh, Amninder; Sharma, Rohit; Sharma, Vinay; Verma, Swati; Sharma, Pushpender K

2016-06-01

In the present investigation, a gene encoding extracellular lipase was cloned from Bacillus licheniformis. The recombinant protein containing a His-tag was expressed as inclusion bodies in Escherichia coli BL21DE3 cells, using pET-23a as expression vector. Expressed protein purified from the inclusion bodies demonstrated a ~22 kDa protein band on 12 % SDS-PAGE. It exhibited specific activity of 0.49 U mg⁻¹ and % yield of 8.58. Interestingly, the lipase displayed activity at a wide range of pH and temperature, i.e., 9.0-14.0 pH and 30-80 °C, respectively. It further demonstrated ~100 % enzyme activity in the presence of various organic solvents. Enzyme activity was strongly inhibited in the presence of β-ME. Additionally, the serine and histidine modifiers also inhibited the enzyme activities strongly at all concentrations, which suggests their role in the catalytic center. The enzyme could retain its activity in the presence of various detergents (Triton X-100, Tween 20, Tween 40, SDS). Sequence and structural analysis employing in silico tools revealed that the lipase contained two highly conserved sequences consisting of ITITGCGNDL and NLYNP, arranged as parallel β-sheet in the core of the 3D structure. The function of these conserved sequences is not yet fully understood.
Molecular and Mutational Analysis of a Gelsolin-Family Member Encoded by the Flightless I Gene of Drosophila Melanogaster

PubMed Central

de-Couet, H. G.; Fong, KSK.; Weeds, A. G.; McLaughlin, P. J.; Miklos, GLG.

1995-01-01

The flightless locus of Drosophila melanogaster has been analyzed at the genetic, molecular, ultrastructural and comparative crystallographic levels. The gene encodes a single transcript encoding a protein consisting of a leucine-rich amino terminal half and a carboxyterminal half with high sequence similarity to gelsolin. We determined the genomic sequence of the flightless landscape, the breakpoints of four chromosomal rearrangements, and the molecular lesions in two lethal and two viable alleles of the gene. The two alleles that lead to flight muscle abnormalities encode mutant proteins exhibiting amino acid replacements within the S1-like domain of their gelsolin-like region. Furthermore, the deduced intron-exon structure of the D. melanogaster gene has been compared with that of the Caenorhabditis elegans homologue. In addition, the sequence similarities of the flightless protein with gelsolin allow it to be evaluated in the context of the published crystallographic structure of the S1 domain of gelsolin. Amino acids considered essential for the structural integrity of the core are found to be highly conserved in the predicted flightless protein. Some of the residues considered essential for actin and calcium binding in gelsolin S1 and villin V1 are also well conserved. These data are discussed in light of the phenotypic characteristics of the mutants and the putative functions of the protein. PMID:8582612
Disease-Causing 7.4 kb Cis-Regulatory Deletion Disrupting Conserved Non-Coding Sequences and Their Interaction with the FOXL2 Promotor: Implications for Mutation Screening

PubMed Central

Dostie, Josée; Lemire, Edmond; Bouchard, Philippe; Field, Michael; Jones, Kristie; Lorenz, Birgit; Menten, Björn; Buysse, Karen; Pattyn, Filip; Friedli, Marc; Ucla, Catherine; Rossier, Colette; Wyss, Carine; Speleman, Frank; De Paepe, Anne; Dekker, Job; Antonarakis, Stylianos E.; De Baere, Elfride

2009-01-01

To date, the contribution of disrupted potentially cis-regulatory conserved non-coding sequences (CNCs) to human disease is most likely underestimated, as no systematic screens for putative deleterious variations in CNCs have been conducted. As a model for monogenic disease we studied the involvement of genetic changes of CNCs in the cis-regulatory domain of FOXL2 in blepharophimosis syndrome (BPES). Fifty-seven molecularly unsolved BPES patients underwent high-resolution copy number screening and targeted sequencing of CNCs. Apart from three larger distant deletions, a de novo deletion as small as 7.4 kb was found at 283 kb 5′ to FOXL2. The deletion appeared to be triggered by an H-DNA-induced double-stranded break (DSB). In addition, it disrupts a novel long non-coding RNA (ncRNA) PISRT1 and 8 CNCs. The regulatory potential of the deleted CNCs was substantiated by in vitro luciferase assays. Interestingly, Chromosome Conformation Capture (3C) of a 625 kb region surrounding FOXL2 in expressing cellular systems revealed physical interactions of three upstream fragments and the FOXL2 core promoter. Importantly, one of these contains the 7.4 kb deleted fragment. Overall, this study revealed the smallest distant deletion causing monogenic disease and impacts upon the concept of mutation screening in human disease and developmental disorders in particular. PMID:19543368
Functionally conserved enhancers with divergent sequences in distant vertebrates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, Song; Oksenberg, Nir; Takayama, Sachiko

To examine the contributions of sequence and function conservation in the evolution of enhancers, we systematically identified enhancers whose sequences are not conserved among distant groups of vertebrate species, but have homologous function and are likely to be derived from a common ancestral sequence. In conclusion, our approach combined comparative genomics and epigenomics to identify potential enhancer sequences in the genomes of three groups of distantly related vertebrate species.
Functionally conserved enhancers with divergent sequences in distant vertebrates

DOE PAGES

Yang, Song; Oksenberg, Nir; Takayama, Sachiko; ...

2015-10-30

To examine the contributions of sequence and function conservation in the evolution of enhancers, we systematically identified enhancers whose sequences are not conserved among distant groups of vertebrate species, but have homologous function and are likely to be derived from a common ancestral sequence. In conclusion, our approach combined comparative genomics and epigenomics to identify potential enhancer sequences in the genomes of three groups of distantly related vertebrate species.
Comparative Sequence Analysis of the X-Inactivation Center Region in Mouse, Human, and Bovine

PubMed Central

Chureau, Corinne; Prissette, Marine; Bourdet, Agnès; Barbe, Valérie; Cattolico, Laurence; Jones, Louis; Eggen, André; Avner, Philip; Duret, Laurent

2002-01-01

We have sequenced to high levels of accuracy 714-kb and 233-kb regions of the mouse and bovine X-inactivation centers (Xic), respectively, centered on the Xist gene. This has provided the basis for a fully annotated comparative analysis of the mouse Xic with the 2.3-Mb orthologous region in human and has allowed a three-way species comparison of the core central region, including the Xist gene. These comparisons have revealed conserved genes, both coding and noncoding, conserved CpG islands and, more surprisingly, conserved pseudogenes. The distribution of repeated elements, especially LINE repeats, in the mouse Xic region when compared to the rest of the genome does not support the hypothesis of a role for these repeat elements in the spreading of X inactivation. Interestingly, an asymmetric distribution of LINE elements on the two DNA strands was observed in the three species, not only within introns but also in intergenic regions. This feature is suggestive of important transcriptional activity within these intergenic regions. In silico prediction followed by experimental analysis has allowed four new genes, Cnbp2, Ftx, Jpx, and Ppnx, to be identified and novel, widespread, complex, and apparently noncoding transcriptional activity to be characterized in a region 5′ of Xist that was recently shown to attract histone modification early after the onset of X inactivation. [The sequence data described in this paper have been submitted to the EMBL data library under accession nos. AJ421478, AJ421479, AJ421480, and AJ421481. Online supplemental data are available at http://pbil.univ-lyon1.fr/datasets/Xic2002/data.html and www.genome.org.] PMID:12045143
Plastoglobules in algae: A comprehensive comparative study of the presence of major structural and functional components in complex plastids.

PubMed

Lohscheider, Jens N; Río Bártulos, Carolina

2016-08-01

Plastoglobules (PG) are lipophilic droplets attached to thylakoid membranes in higher plants and green algae and are implicated in prenyl lipid biosynthesis. They might also represent a central hub for integration of plastid signals under stress and therefore the adaptation of the thylakoid membrane under such conditions. In Arabidopsis thaliana, PG contain around 30 specific proteins of which Fibrillins (FBN) and Activity of bc1 complex kinases (ABC1K) represent the majority with respect to both number and protein mass. However, nothing is known about the presence of PG in most algal species, which are responsible for about 50% of global primary production. Therefore, we searched publicly available algal genomes for components of PG and the associated functional network in order to predict their presence and potential evolutionary conservation of physiological functions. We could identify homologous sequences for core components of PG, like FBN and ABC1K, in most investigated algal species. Furthermore, proteins at central and interesting positions within the PG functional coexpression network were identified. Phylogenetic sequence analysis revealed diversity within FBN and ABC1K sequences among algal species with complex plastids of the red lineage and large differences compared with green lineage species. Two types of FBN were detected that differ in their isoelectric point, which seems to correlate with subcellular localization. Subgroups of FBN were shared between many investigated species and modeling of their 3D-structure implied a conserved structure. FBN and ABC1K are essential structural and functional components of PG. Their occurrence in investigated algal species suggests presence of PG therein and functions in prenyl lipid metabolism and adaptation of the thylakoid membrane that are conserved during evolution. Copyright © 2016 Elsevier B.V. All rights reserved.
Variola virus topoisomerase: DNA cleavage specificity and distribution of sites in Poxvirus genomes.

PubMed

Minkah, Nana; Hwang, Young; Perry, Kay; Van Duyne, Gregory D; Hendrickson, Robert; Lefkowitz, Elliot J; Hannenhalli, Sridhar; Bushman, Frederic D

2007-08-15

Topoisomerase enzymes regulate superhelical tension in DNA resulting from transcription, replication, repair, and other molecular transactions. Poxviruses encode an unusual type IB topoisomerase that acts only at conserved DNA sequences containing the core pentanucleotide 5'-(T/C)CCTT-3'. In X-ray structures of the variola virus topoisomerase bound to DNA, protein-DNA contacts were found to extend beyond the core pentanucleotide, indicating that the full recognition site has not yet been fully defined in functional studies. Here we report quantitation of DNA cleavage rates for an optimized 13 bp site and for all possible single base substitutions (40 total sites), with the goals of understanding the molecular mechanism of recognition and mapping topoisomerase sites in poxvirus genome sequences. The data allow a precise definition of enzyme-DNA interactions and the energetic contributions of each. We then used the resulting "action matrix" to show that favorable topoisomerase sites are distributed all along the length of poxvirus DNA sequences, consistent with a requirement for local release of superhelical tension in constrained topological domains. In orthopox genomes, an additional central cluster of sites was also evident. A negative correlation of predicted topoisomerase sites was seen relative to early terminators, but no correlation was seen with early or late promoters. These data define the full variola virus topoisomerase recognition site and provide a new window on topoisomerase function in vivo.
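The abstract above describes a core pentanucleotide, 5'-(T/C)CCTT-3', recognized by poxvirus topoisomerase. The following is a minimal sketch of scanning both strands of a DNA string for that core motif only. It does not implement the authors' 13 bp site or their "action matrix"; the function names and the coordinate convention (0-based forward-strand position of the leftmost base) are assumptions made for illustration.

```python
import re

# Core pentanucleotide 5'-(T/C)CCTT-3'; the lookahead lets overlapping hits be found.
CORE = re.compile(r"(?=([TC]CCTT))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_core_sites(seq):
    """List (start, strand, motif) for every core site on either strand.

    `start` is the 0-based forward-strand index of the leftmost base
    covered by the site (an illustrative convention, not from the paper).
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group(1)) for m in CORE.finditer(seq)]
    for m in CORE.finditer(reverse_complement(seq)):
        # Map the position on the reverse complement back to forward coordinates.
        hits.append((n - (m.start() + 5), "-", m.group(1)))
    return sorted(hits)


print(find_core_sites("AACCCTTGG"))  # one site on the + strand
print(find_core_sites("AAGGGAT"))    # one site on the - strand
```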
Structural analysis of Bacillus pumilus phenolic acid decarboxylase, a lipocalin-fold enzyme

DOE Office of Scientific and Technical Information (OSTI.GOV)

Matte, Allan; Grosse, Stephan; Bergeron, Hélène

The decarboxylation of phenolic acids, including ferulic and p-coumaric acids, to their corresponding vinyl derivatives is of importance in the flavoring and polymer industries. Here, the crystal structure of phenolic acid decarboxylase (PAD) from Bacillus pumilus strain UI-670 is reported. The enzyme is a 161-residue polypeptide that forms dimers both in the crystal and in solution. The structure of PAD as determined by X-ray crystallography revealed a β-barrel structure and two α-helices, with a cleft formed at one edge of the barrel. The PAD structure resembles those of the lipocalin-fold proteins, which often bind hydrophobic ligands. Superposition of structurally related proteins bound to their cognate ligands shows that they and PAD bind their ligands in a conserved location within the β-barrel. Analysis of the residue-conservation pattern for PAD-related sequences mapped onto the PAD structure reveals that the conservation mainly includes residues found within the hydrophobic core of the protein, defining a common lipocalin-like fold for this enzyme family. A narrow cleft containing several conserved amino acids was observed as a structural feature and a potential ligand-binding site.
Analysis of conserved noncoding DNA in Drosophila reveals similar constraints in intergenic and intronic sequences.

PubMed

Bergman, C M; Kreitman, M

2001-08-01

Comparative genomic approaches to gene and cis-regulatory prediction are based on the principle that differential DNA sequence conservation reflects variation in functional constraint. Using this principle, we analyze noncoding sequence conservation in Drosophila for 40 loci with known or suspected cis-regulatory function encompassing >100 kb of DNA. We estimate the fraction of noncoding DNA conserved in both intergenic and intronic regions and describe the length distribution of ungapped conserved noncoding blocks. On average, 22%-26% of noncoding sequences surveyed are conserved in Drosophila, with median block length approximately 19 bp. We show that point substitution in conserved noncoding blocks exhibits transition bias as well as lineage effects in base composition, and occurs more than an order of magnitude more frequently than insertion/deletion (indel) substitution. Overall, patterns of noncoding DNA structure and evolution differ remarkably little between intergenic and intronic conserved blocks, suggesting that the effects of transcription per se contribute minimally to the constraints operating on these sequences. The results of this study have implications for the development of alignment and prediction algorithms specific to noncoding DNA, as well as for models of cis-regulatory DNA sequence evolution.
RNAcode: Robust discrimination of coding and noncoding regions in comparative sequence data

PubMed Central

Washietl, Stefan; Findeiß, Sven; Müller, Stephan A.; Kalkhof, Stefan; von Bergen, Martin; Hofacker, Ivo L.; Stadler, Peter F.; Goldman, Nick

2011-01-01

With the availability of genome-wide transcription data and massive comparative sequencing, the discrimination of coding from noncoding RNAs and the assessment of coding potential in evolutionarily conserved regions arose as a core analysis task. Here we present RNAcode, a program to detect coding regions in multiple sequence alignments that is optimized for emerging applications not covered by current protein gene-finding software. Our algorithm combines information from nucleotide substitution and gap patterns in a unified framework and also deals with real-life issues such as alignment and sequencing errors. It uses an explicit statistical model with no machine learning component and can therefore be applied “out of the box,” without any training, to data from all domains of life. We describe the RNAcode method and apply it in combination with mass spectrometry experiments to predict and confirm seven novel short peptides in Escherichia coli and to analyze the coding potential of RNAs previously annotated as “noncoding.” RNAcode is open source software and available for all major platforms at http://wash.github.com/rnacode. PMID:21357752
RNAcode: robust discrimination of coding and noncoding regions in comparative sequence data.

PubMed

Washietl, Stefan; Findeiss, Sven; Müller, Stephan A; Kalkhof, Stefan; von Bergen, Martin; Hofacker, Ivo L; Stadler, Peter F; Goldman, Nick

2011-04-01

With the availability of genome-wide transcription data and massive comparative sequencing, the discrimination of coding from noncoding RNAs and the assessment of coding potential in evolutionarily conserved regions arose as a core analysis task. Here we present RNAcode, a program to detect coding regions in multiple sequence alignments that is optimized for emerging applications not covered by current protein gene-finding software. Our algorithm combines information from nucleotide substitution and gap patterns in a unified framework and also deals with real-life issues such as alignment and sequencing errors. It uses an explicit statistical model with no machine learning component and can therefore be applied "out of the box," without any training, to data from all domains of life. We describe the RNAcode method and apply it in combination with mass spectrometry experiments to predict and confirm seven novel short peptides in Escherichia coli and to analyze the coding potential of RNAs previously annotated as "noncoding." RNAcode is open source software and available for all major platforms at http://wash.github.com/rnacode.
Biological function in the twilight zone of sequence conservation.

PubMed

Ponting, Chris P

2017-08-16

Strong DNA conservation among divergent species is an indicator of enduring functionality. With weaker sequence conservation we enter a vast 'twilight zone' in which sequence subject to transient or lower constraint cannot be distinguished easily from neutrally evolving, non-functional sequence. Twilight zone functional sequence is illuminated instead by principles of selective constraint and positive selection using genomic data acquired from within a species' population. Application of these principles reveals that despite being biochemically active, most twilight zone sequence is not functional.
Genotype diversity of hepatitis C virus (HCV) in HCV-associated liver disease patients in Indonesia.

PubMed

Utama, Andi; Tania, Navessa Padma; Dhenni, Rama; Gani, Rino Alvani; Hasan, Irsan; Sanityoso, Andri; Lelosutan, Syafruddin A R; Martamala, Ruswhandi; Lesmana, Laurentius Adrianus; Sulaiman, Ali; Tai, Susan

2010-09-01

Hepatitis C virus (HCV) genotype distribution in Indonesia has been reported. However, the identification of HCV genotype was based on 5'-UTR or NS5B sequence. This study was aimed to observe HCV core sequence variation among HCV-associated liver disease patients in Jakarta, and to analyse the HCV genotype diversity based on the core sequence. Sixty-eight chronic hepatitis (CH), 48 liver cirrhosis (LC) and 34 hepatocellular carcinoma (HCC) were included in this study. HCV core variation was analysed by direct sequencing. Alignment of HCV core sequences demonstrated that the core sequence was relatively varied among the genotype. Indeed, 237 bases of the core sequence could classify the HCV subtype; however, 236 bases failed to differentiate several subtypes. Based on 237 bases of the core sequences, the HCV strains were classified into genotypes 1 (subtypes 1a, 1b and 1c), 2 (subtypes 2a, 2e and 2f) and 3 (subtypes 3a and 3k). The HCV 1b (47.3%) was the most prevalent, followed by subtypes 1c (18.7%), 3k (10.7%), 2a (10.0%), 1a (6.7%), 2e (5.3%), 2f (0.7%) and 3a (0.7%). HCV 1b was the most common in all patients, and the prevalence increased with the severity of liver disease (36.8% in CH, 54.2% in LC and 58.8% in HCC). These results were similar to a previous report based on NS5B sequence analysis. Hepatitis C virus core sequence (237 bases) could identify the HCV subtype and the prevalence of HCV subtype based on core sequence was similar to those based on the NS5B region.
Similarity in Shape Dictates Signature Intrinsic Dynamics Despite No Functional Conservation in TIM Barrel Enzymes

PubMed Central

Tiwari, Sandhya P.; Reuter, Nathalie

2016-01-01

The conservation of the intrinsic dynamics of proteins emerges as we attempt to understand the relationship between sequence, structure and functional conservation. We characterise the conservation of such dynamics in a case where the structure is conserved but function differs greatly. The triosephosphate isomerase barrel fold (TBF), renowned for its 8 β-strand-α-helix repeats that close to form a barrel, is one of the most diverse and abundant folds found in known protein structures. Proteins with this fold have diverse enzymatic functions spanning five of six Enzyme Commission classes, and we have picked five different superfamily candidates for our analysis using elastic network models. We find that the overall shape is a large determinant in the similarity of the intrinsic dynamics, regardless of function. In particular, the β-barrel core is highly rigid, while the α-helices that flank the β-strands have greater relative mobility, allowing for the many possibilities for placement of catalytic residues. We find that these elements correlate with each other via the loops that link them, as opposed to being directly correlated. We are also able to analyse the types of motions encoded by the normal mode vectors of the α-helices. We suggest that the global conservation of the intrinsic dynamics in the TBF contributes greatly to its success as an enzymatic scaffold both through evolution and enzyme design. PMID:27015412
Phylogenomics and taxonomy of Lecomtelleae (Poaceae), an isolated panicoid lineage from Madagascar

PubMed Central

Besnard, Guillaume; Christin, Pascal-Antoine; Malé, Pierre-Jean G.; Coissac, Eric; Ralimanana, Hélène; Vorontsova, Maria S.

2013-01-01

Background and Aims An accurate characterization of biodiversity requires analyses of DNA sequences in addition to classical morphological descriptions. New methods based on high-throughput sequencing may allow investigation of specimens with a large set of genetic markers to infer their evolutionary history. In the grass family, the phylogenetic position of the monotypic genus Lecomtella, a rare bamboo-like endemic from Madagascar, has never been appropriately evaluated. Until now its taxonomic treatment has remained controversial, indicating the need for re-evaluation based on a combination of molecular and morphological data. Methods The phylogenetic position of Lecomtella in Poaceae was evaluated based on sequences from the nuclear and plastid genomes generated by next-generation sequencing (NGS). In addition, a detailed morphological description of L. madagascariensis was produced, and its distribution and habit were investigated in order to assess its conservation status. Key Results The complete plastid sequence, a ribosomal DNA unit and fragments of low-copy nuclear genes (phyB and ppc) were obtained. All phylogenetic analyses place Lecomtella as an isolated member of the core panicoids, which last shared a common ancestor with other species >20 million years ago. Although Lecomtella exhibits morphological characters typical of Panicoideae, an unusual combination of traits supports its treatment as a separate group. Conclusions The study showed that NGS can be used to generate abundant phylogenetic information rapidly, opening new avenues for grass phylogenetics. These data clearly showed that Lecomtella forms an isolated lineage, which, in combination with its morphological peculiarities, justifies its treatment as a separate tribe: Lecomtelleae. New descriptions of the tribe, genus and species are presented with a typification, a distribution map and an IUCN conservation assessment. PMID:23985988
Consensus-Degenerate Hybrid Oligonucleotide Primers for Amplification of Priming Glycosyltransferase Genes of the Exopolysaccharide Locus in Strains of the Lactobacillus casei Group

PubMed Central

Provencher, Cathy; LaPointe, Gisèle; Sirois, Stéphane; Van Calsteren, Marie-Rose; Roy, Denis

2003-01-01

A primer design strategy named CODEHOP (consensus-degenerate hybrid oligonucleotide primer) for amplification of distantly related sequences was used to detect the priming glycosyltransferase (GT) gene in strains of the Lactobacillus casei group. Each hybrid primer consisted of a short 3′ degenerate core based on four highly conserved amino acids and a longer 5′ consensus clamp region based on six sequences of the priming GT gene products from exopolysaccharide (EPS)-producing bacteria. The hybrid primers were used to detect the priming GT gene of 44 commercial isolates and reference strains of Lactobacillus rhamnosus, L. casei, Lactobacillus zeae, and Streptococcus thermophilus. The priming GT gene was detected in the genome of both non-EPS-producing (EPS−) and EPS-producing (EPS+) strains of L. rhamnosus. The sequences of the cloned PCR products were similar to those of the priming GT gene of various gram-negative and gram-positive EPS+ bacteria. Specific primers designed from the L. rhamnosus RW-9595M GT gene were used to sequence the end of the priming GT gene in selected EPS+ strains of L. rhamnosus. Phylogenetic analysis revealed that Lactobacillus spp. form a distinctive group apart from other lactic acid bacteria for which GT genes have been characterized to date. Moreover, the sequences show a divergence existing among strains of L. rhamnosus with respect to the terminal region of the priming GT gene. Thus, the PCR approach with consensus-degenerate hybrid primers designed with CODEHOP is a practical approach for the detection of similar genes containing conserved motifs in different bacterial genomes. PMID:12788729
Phylogenomics and taxonomy of Lecomtelleae (Poaceae), an isolated panicoid lineage from Madagascar.

PubMed

Besnard, Guillaume; Christin, Pascal-Antoine; Malé, Pierre-Jean G; Coissac, Eric; Ralimanana, Hélène; Vorontsova, Maria S

2013-10-01

An accurate characterization of biodiversity requires analyses of DNA sequences in addition to classical morphological descriptions. New methods based on high-throughput sequencing may allow investigation of specimens with a large set of genetic markers to infer their evolutionary history. In the grass family, the phylogenetic position of the monotypic genus Lecomtella, a rare bamboo-like endemic from Madagascar, has never been appropriately evaluated. Until now its taxonomic treatment has remained controversial, indicating the need for re-evaluation based on a combination of molecular and morphological data. The phylogenetic position of Lecomtella in Poaceae was evaluated based on sequences from the nuclear and plastid genomes generated by next-generation sequencing (NGS). In addition, a detailed morphological description of L. madagascariensis was produced, and its distribution and habit were investigated in order to assess its conservation status. The complete plastid sequence, a ribosomal DNA unit and fragments of low-copy nuclear genes (phyB and ppc) were obtained. All phylogenetic analyses place Lecomtella as an isolated member of the core panicoids, which last shared a common ancestor with other species >20 million years ago. Although Lecomtella exhibits morphological characters typical of Panicoideae, an unusual combination of traits supports its treatment as a separate group. The study showed that NGS can be used to generate abundant phylogenetic information rapidly, opening new avenues for grass phylogenetics. These data clearly showed that Lecomtella forms an isolated lineage, which, in combination with its morphological peculiarities, justifies its treatment as a separate tribe: Lecomtelleae. New descriptions of the tribe, genus and species are presented with a typification, a distribution map and an IUCN conservation assessment.
RNA 3D Structural Motifs: Definition, Identification, Annotation, and Database Searching

NASA Astrophysics Data System (ADS)

Nasalean, Lorena; Stombaugh, Jesse; Zirbel, Craig L.; Leontis, Neocles B.

Structured RNA molecules resemble proteins in the hierarchical organization of their global structures, folding and broad range of functions. Structured RNAs are composed of recurrent modular motifs that play specific functional roles. Some motifs direct the folding of the RNA or stabilize the folded structure through tertiary interactions. Others bind ligands or proteins or catalyze chemical reactions. Therefore, it is desirable, starting from the RNA sequence, to be able to predict the locations of recurrent motifs in RNA molecules. Conversely, the potential occurrence of one or more known 3D RNA motifs may indicate that a genomic sequence codes for a structured RNA molecule. To identify known RNA structural motifs in new RNA sequences, precise structure-based definitions are needed that specify the core nucleotides of each motif and their conserved interactions. By comparing instances of each recurrent motif and applying base pair isostericity relations, one can identify neutral mutations that preserve its structure and function in the contexts in which it occurs.
A strategy for detecting the conservation of folding-nucleus residues in protein superfamilies.

PubMed

Michnick, S W; Shakhnovich, E

1998-01-01

Nucleation-growth theory predicts that fast-folding peptide sequences fold to their native structure via structures in a transition-state ensemble that share a small number of native contacts (the folding nucleus). Experimental and theoretical studies of proteins suggest that residues participating in folding nuclei are conserved among homologs. We attempted to determine if this is true in proteins with highly diverged sequences but identical folds (superfamilies). We describe a strategy based on comparisons of residue conservation in natural superfamily sequences with simulated sequences (generated with a Monte-Carlo sequence design strategy) for the same proteins. The basic assumptions of the strategy were that natural sequences will conserve residues needed for folding and stability plus function, the simulated sequences contain no functional conservation, and nucleus residues make native contacts with each other. Based on these assumptions, we identified seven potential nucleus residues in ubiquitin superfamily members. Non-nucleus conserved residues were also identified; these are proposed to be involved in stabilizing native interactions. We found that all superfamily members conserved the same potential nucleus residue positions, except those for which the structural topology is significantly different. Our results suggest that the conservation of the nucleus of a specific fold can be predicted by comparing designed simulated sequences with natural highly diverged sequences that fold to the same structure. We suggest that such a strategy could be used to help plan protein folding and design experiments, to identify new superfamily members, and to subdivide superfamilies further into classes having a similar folding mechanism.

Fine-tuning structural RNA alignments in the twilight zone.

PubMed

Bremges, Andreas; Schirmer, Stefanie; Giegerich, Robert

2010-04-30

A widely used method to find conserved secondary structure in RNA is to first construct a multiple sequence alignment, and then fold the alignment, optimizing a score based on thermodynamics and covariance. This method works best around 75% sequence similarity. However, in a "twilight zone" below 55% similarity, the sequence alignment tends to obscure the covariance signal used in the second phase. Therefore, while the overall shape of the consensus structure may still be found, the degree of conservation cannot be estimated reliably. Based on a combination of available methods, we present a method named planACstar for improving structure conservation in structural alignments in the twilight zone. After constructing a consensus structure by alignment folding, planACstar abandons the original sequence alignment, refolds the sequences individually, but consistent with the consensus, aligns the structures, irrespective of sequence, by a pure structure alignment method, and derives an improved sequence alignment from the alignment of structures, to be re-submitted to alignment folding, etc. This cycle may be iterated as long as structural conservation improves, but normally, one step suffices. Employing the tools ClustalW, RNAalifold, and RNAforester, we find that for sequences with 30-55% sequence identity, structural conservation can be improved by 10% on average, with a large variation, measured in terms of RNAalifold's own criterion, the structure conservation index.
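The abstract above places the "twilight zone" below roughly 55% similarity. As a rough illustration only, the sketch below computes mean pairwise percent identity over an existing alignment and compares it with that threshold. It is not planACstar and does not call ClustalW, RNAalifold, or RNAforester. The function names, the choice to skip columns where both sequences have a gap, and the use of identity rather than similarity are all assumptions.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Percent identity of two aligned sequences of equal length.

    Columns where both sequences have a gap ('-') are ignored; a gap
    opposite a base counts as a mismatch (an illustrative convention).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    cols = [(x, y) for x, y in zip(a.upper(), b.upper())
            if not (x == "-" and y == "-")]
    if not cols:
        return 0.0
    matches = sum(x == y for x, y in cols)
    return 100.0 * matches / len(cols)


def mean_pairwise_identity(alignment):
    """Average percent identity over all sequence pairs in the alignment."""
    pairs = list(combinations(alignment, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


def in_twilight_zone(alignment, threshold=55.0):
    """True if mean pairwise identity falls below the given threshold."""
    return mean_pairwise_identity(alignment) < threshold


# Toy alignment (hypothetical sequences):
print(mean_pairwise_identity(["ACGU", "AC-U", "ACGU"]))
print(in_twilight_zone(["ACGU", "AGCA"]))
```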
Magnetic braking of stellar cores in red giants and supergiants

DOE Office of Scientific and Technical Information (OSTI.GOV)

Maeder, André; Meynet, Georges, E-mail: andre.maeder@unige, E-mail: georges.meynet@unige.ch

2014-10-01

Magnetic configurations, stable on the long term, appear to exist in various evolutionary phases, from main-sequence stars to white dwarfs and neutron stars. The large-scale ordered nature of these fields, often approximately dipolar, and their scaling according to the flux conservation scenario favor a fossil field model. We make some first estimates of the magnetic coupling between the stellar cores and the outer layers in red giants and supergiants. Analytical expressions of the truncation radius of the field coupling are established for a convective envelope and for a rotating radiative zone with horizontal turbulence. The timescales of the internal exchanges of angular momentum are considered. Numerical estimates are made on the basis of recent model grids. The direct magnetic coupling of the core to the extended convective envelope of red giants and supergiants appears unlikely. However, we find that the intermediate radiative zone is fully coupled to the core during the He-burning and later phases. This coupling is able to produce a strong spin down of the core of red giants and supergiants, also leading to relatively slowly rotating stellar remnants such as white dwarfs and pulsars. Some angular momentum is also transferred to the outer convective envelope of red giants and supergiants during the He-burning phase and later.
The Gam protein of bacteriophage Mu is an orthologue of eukaryotic Ku

PubMed Central

di Fagagna, Fabrizio d'Adda; Weller, Geoffrey R.; Doherty, Aidan J.; Jackson, Stephen P.

2003-01-01

Mu bacteriophage inserts its DNA into the genome of host bacteria and is used as a model for DNA transposition events in other systems. The eukaryotic Ku protein has key roles in DNA repair and in certain transposition events. Here we show that the Gam protein of phage Mu is conserved in bacteria, has sequence homology with both subunits of Ku, and has the potential to adopt a similar architecture to the core DNA-binding region of Ku. Through biochemical studies, we demonstrate that Gam and the related protein of Haemophilus influenzae display DNA binding characteristics remarkably similar to those of human Ku. In addition, we show that Gam can interfere with Ty1 retrotransposition in Saccharomyces cerevisiae. These data reveal structural and functional parallels between bacteriophage Gam and eukaryotic Ku and suggest that their functions have been evolutionarily conserved. PMID:12524520
Gene family innovation, conservation and loss on the animal stem lineage.

PubMed

Richter, Daniel J; Fozouni, Parinaz; Eisen, Michael; King, Nicole

2018-05-31

Choanoflagellates, the closest living relatives of animals, can provide unique insights into the changes in gene content that preceded the origin of animals. However, only two choanoflagellate genomes are currently available, providing poor coverage of their diversity. We sequenced transcriptomes of 19 additional choanoflagellate species to produce a comprehensive reconstruction of the gains and losses that shaped the ancestral animal gene repertoire. We identified ~1,944 gene families that originated on the animal stem lineage, of which only 39 are conserved across all animals in our study. In addition, ~372 gene families previously thought to be animal-specific, including Notch, Delta, and homologs of the animal Toll-like receptor genes, instead evolved prior to the animal-choanoflagellate divergence. Our findings contribute to an increasingly detailed portrait of the gene families that defined the biology of the Urmetazoan and that may underpin core features of extant animals. © 2018, Richter et al.
Some conservation issues for the dynamical cores of NWP and climate models

NASA Astrophysics Data System (ADS)

Thuburn, J.

2008-03-01

The rationale for designing atmospheric numerical model dynamical cores with certain conservation properties is reviewed. The conceptual difficulties associated with the multiscale nature of realistic atmospheric flow, and its lack of time-reversibility, are highlighted. A distinction is made between robust invariants, which are conserved or nearly conserved in the adiabatic and frictionless limit, and non-robust invariants, which are not conserved in the limit even though they are conserved by exactly adiabatic frictionless flow. For non-robust invariants, a further distinction is made between processes that directly transfer some quantity from large to small scales, and processes involving a cascade through a continuous range of scales; such cascades may either be explicitly parameterized, or handled implicitly by the dynamical core numerics, accepting the implied non-conservation. An attempt is made to estimate the relative importance of different conservation laws. It is argued that satisfactory model performance requires spurious sources of a conservable quantity to be much smaller than any true physical sources; for several conservable quantities the magnitudes of the physical sources are estimated in order to provide benchmarks against which any spurious sources may be measured.
Divergence and evolution of homologous regions of Bombyx mori nuclear polyhedrosis virus.

PubMed Central

Majima, K; Kobara, R; Maeda, S

1993-01-01

Homologous regions (hrs) (hr1, hr2-left, hr2-right, hr3, hr4-left, hr4-right, and hr5) similar to those found in the Autographa californica nuclear polyhedrosis virus (AcNPV) genome were found in the Bombyx mori NPV (BmNPV) genome. The BmNPV hrs contained two to eight repeats of a homologous nucleotide sequence which were on average about 75 bp long. All of these homologous sequence repeats contained a 26-bp-long palindrome motif with an EcoRI or EcoRI-like site at its core. The consensus sequence of the BmNPV hrs showed 95% conservation with respect to those found in AcNPV. Nucleotide sequence analysis indicated that hr2-left and hr2-right of BmNPV evolved from an ancestor similar to hr2 of AcNPV by inversion, cleavage, and ligation. The polarities of the BmNPV and AcNPV hrs were conserved except for that of hr4-left. Within hr4-right of BmNPV, four repeats of a previously undescribed palindrome motif were found. Bmhr5D, a BmNPV mutant which lacked hr5, replicated at a rate similar to that of wild-type BmNPV in BmN cells and silkworm larvae, indicating that hr5 was not essential for viral replication. After ten passages of Bmhr5D in BmN cells, no detectable changes in its genome were observed by restriction endonuclease analysis. The evolution and divergence of the BmNPV genome are also discussed. PMID:8230471
Aging increases cell-to-cell transcriptional variability upon immune stimulation.

PubMed

Martinez-Jimenez, Celia Pilar; Eling, Nils; Chen, Hung-Chang; Vallejos, Catalina A; Kolodziejczyk, Aleksandra A; Connor, Frances; Stojic, Lovorka; Rayner, Timothy F; Stubbington, Michael J T; Teichmann, Sarah A; de la Roche, Maike; Marioni, John C; Odom, Duncan T

2017-03-31

Aging is characterized by progressive loss of physiological and cellular functions, but the molecular basis of this decline remains unclear. We explored how aging affects transcriptional dynamics using single-cell RNA sequencing of unstimulated and stimulated naïve and effector memory CD4+ T cells from young and old mice from two divergent species. In young animals, immunological activation drives a conserved transcriptomic switch, resulting in tightly controlled gene expression characterized by a strong up-regulation of a core activation program, coupled with a decrease in cell-to-cell variability. Aging perturbed the activation of this core program and increased expression heterogeneity across populations of cells in both species. These discoveries suggest that increased cell-to-cell transcriptional variability will be a hallmark feature of aging across most, if not all, mammalian tissues. Copyright © 2017, American Association for the Advancement of Science.
Modeling coding-sequence evolution within the context of residue solvent accessibility.

PubMed

Scherrer, Michael P; Meyer, Austin G; Wilke, Claus O

2012-09-12

Protein structure mediates site-specific patterns of sequence divergence. In particular, residues in the core of a protein (solvent-inaccessible residues) tend to be more evolutionarily conserved than residues on the surface (solvent-accessible residues). Here, we present a model of sequence evolution that explicitly accounts for the relative solvent accessibility of each residue in a protein. Our model is a variant of the Goldman-Yang 1994 (GY94) model in which all model parameters can be functions of the relative solvent accessibility (RSA) of a residue. We apply this model to a data set comprising nearly 600 yeast genes, and find that an evolutionary-rate ratio ω that varies linearly with RSA provides a better model fit than an RSA-independent ω or an ω that is estimated separately in individual RSA bins. We further show that the branch length t and the transition-transversion ratio κ also vary with RSA. The RSA-dependent GY94 model performs better than an RSA-dependent Muse-Gaut 1994 (MG94) model in which the synonymous and non-synonymous rates individually are linear functions of RSA. Finally, protein core size affects the slope of the linear relationship between ω and RSA, and gene expression level affects both the intercept and the slope. Structure-aware models of sequence evolution provide a significantly better fit than traditional models that neglect structure. The linear relationship between ω and RSA implies that genes are better characterized by their ω slope and intercept than by just their mean ω.
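The closing point of this abstract, that slope and intercept carry more information than mean ω, can be illustrated with a small sketch. It assumes only the linear form ω(RSA) = intercept + slope × RSA stated in the abstract; the function names and all numeric values are hypothetical and are not taken from the paper.

```python
def omega_linear(rsa: float, intercept: float, slope: float) -> float:
    """Evolutionary-rate ratio as a linear function of relative solvent accessibility."""
    if not 0.0 <= rsa <= 1.0:
        raise ValueError("RSA is expected to lie in [0, 1]")
    return intercept + slope * rsa


def mean_omega(rsa_values, intercept: float, slope: float) -> float:
    """Average omega over the residues of one gene."""
    values = [omega_linear(r, intercept, slope) for r in rsa_values]
    return sum(values) / len(values)


# Two made-up genes with the same residue RSA values (illustrative only).
rsa_values = [0.0, 0.5, 1.0]
gene_a = mean_omega(rsa_values, intercept=0.05, slope=0.20)  # buried sites constrained
gene_b = mean_omega(rsa_values, intercept=0.15, slope=0.00)  # no RSA dependence
```

Both hypothetical genes end up with essentially the same mean ω even though their buried and exposed sites evolve very differently, which is the distinction a single mean cannot express.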
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment

PubMed Central

2013-01-01

Background Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. Results In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Conclusion Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA. PMID:24564200
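The alignment-free comparison at the heart of this method can be sketched as below. This is a minimal illustration that assumes fixed-length, non-overlapping segments and uses the Manhattan distance between p-mer count profiles as a stand-in measure; the abstract does not specify the distance or the segmentation, and the clustering stage is omitted. All function names are hypothetical.

```python
from collections import Counter


def pmer_counts(segment: str, p: int) -> Counter:
    """Count every overlapping p-mer (substring of length p) in a segment."""
    return Counter(segment[i:i + p] for i in range(len(segment) - p + 1))


def profile_distance(a: Counter, b: Counter) -> int:
    """Manhattan distance between two p-mer count profiles (assumed measure)."""
    return sum(abs(a[k] - b[k]) for k in set(a) | set(b))


def split_segments(sequence: str, length: int) -> list[str]:
    """Cut a sequence into consecutive, non-overlapping segments."""
    return [sequence[i:i + length] for i in range(0, len(sequence), length)]
```

A single substitution alters at most p overlapping p-mers, so it changes this distance by at most 2p. Near-identical segments therefore stay close without any alignment step, and each segment is processed in time linear in its length.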
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.

PubMed

Nagar, Anurag; Hahsler, Michael

2013-01-01

Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA.
The Most Deeply Conserved Noncoding Sequences in Plants Serve Similar Functions to Those in Vertebrates Despite Large Differences in Evolutionary Rates

PubMed Central

Burgess, Diane; Freeling, Michael

2014-01-01

In vertebrates, conserved noncoding elements (CNEs) are functionally constrained sequences that can show striking conservation over >400 million years of evolutionary distance and frequently are located megabases away from target developmental genes. Conserved noncoding sequences (CNSs) in plants are much shorter, and it has been difficult to detect conservation among distantly related genomes. In this article, we show not only that CNS sequences can be detected throughout the eudicot clade of flowering plants, but also that a subset of 37 CNSs can be found in all flowering plants (diverging ∼170 million years ago). These CNSs are functionally similar to vertebrate CNEs, being highly associated with transcription factor and development genes and enriched in transcription factor binding sites. Some of the most highly conserved sequences occur in genes encoding RNA binding proteins, particularly the RNA splicing–associated SR genes. Differences in sequence conservation between plants and animals are likely to reflect differences in the biology of the organisms, with plants being much more able to tolerate genomic deletions and whole-genome duplication events due, in part, to their far greater fecundity compared with vertebrates. PMID:24681619
Measuring the Effectiveness of Conservation: A Novel Framework to Quantify the Benefits of Sage-Grouse Conservation Policy and Easements in Wyoming

PubMed Central

Copeland, Holly E.; Pocewicz, Amy; Naugle, David E.; Griffiths, Tim; Keinath, Doug; Evans, Jeffrey; Platt, James

2013-01-01

Increasing energy and housing demands are impacting wildlife populations throughout western North America. Greater sage-grouse (Centrocercus urophasianus), a species known for its sensitivity to landscape-scale disturbance, inhabits the same low elevation sage-steppe in which much of this development is occurring. Wyoming has committed to maintain sage-grouse populations through conservation easements and policy changes that conserve high bird abundance “core” habitat and encourage development in less sensitive landscapes. In this study, we built new predictive models of oil and gas, wind, and residential development and applied build-out scenarios to simulate future development and measure the efficacy of conservation actions for maintaining sage-grouse populations. Our approach predicts sage-grouse population losses averted through conservation action and quantifies return on investment for different conservation strategies. We estimate that without conservation, sage-grouse populations in Wyoming will decrease under our long-term scenario by 14–29% (95% CI: 4–46%). However, a conservation strategy that includes the “core area” policy and $250 million in targeted easements could reduce these losses to 9–15% (95% CI: 3–32%), cutting anticipated losses by roughly half statewide and nearly two-thirds within sage-grouse core breeding areas. Core area policy is the single most important component, and targeted easements are complementary to the overall strategy. There is considerable uncertainty around the magnitude of our estimates; however, the relative benefit of different conservation scenarios remains comparable because potential biases and assumptions are consistently applied regardless of the strategy. There is early evidence based on a 40% reduction in leased hectares inside core areas that Wyoming policy is reducing potential for future fragmentation inside core areas. 
Our framework using build-out scenarios to anticipate species declines provides estimates that could be used by decision makers to determine if expected population losses warrant ESA listing. PMID:23826250
PUTATIVE GENE PROMOTER SEQUENCES IN THE CHLORELLA VIRUSES

PubMed Central

Fitzgerald, Lisa A.; Boucher, Philip T.; Yanai-Balser, Giane; Suhre, Karsten; Graves, Michael V.; Van Etten, James L.

2008-01-01

Three short (7 to 9 nucleotides) highly conserved nucleotide sequences were identified in the putative promoter regions (150 bp upstream and 50 bp downstream of the ATG translation start site) of three members of the genus Chlorovirus, family Phycodnaviridae. Most of these sequences occurred in similar locations within the defined promoter regions. The sequence and location of the motifs were often conserved among homologous ORFs within the Chlorovirus family. One of these conserved sequences (AATGACA) is predominately associated with genes expressed early in virus replication. PMID:18768195
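A motif scan over the promoter window defined in this abstract might look like the sketch below. It makes several assumptions that the abstract does not settle: the 50 bp downstream stretch is counted from the A of the ATG, coordinates are 0-based, and only the forward strand is searched. The function names are hypothetical; the motif AATGACA is the one quoted in the abstract.

```python
def promoter_window(genome: str, atg_index: int,
                    upstream: int = 150, downstream: int = 50) -> tuple[str, int]:
    """Return the window around a start codon and its start offset in the genome."""
    start = max(0, atg_index - upstream)
    end = min(len(genome), atg_index + downstream)
    return genome[start:end], start


def find_motif(window: str, motif: str) -> list[int]:
    """List every (possibly overlapping) occurrence of motif in window."""
    hits = []
    pos = window.find(motif)
    while pos != -1:
        hits.append(pos)
        pos = window.find(motif, pos + 1)
    return hits


def motif_positions_relative_to_atg(genome: str, atg_index: int,
                                    motif: str = "AATGACA") -> list[int]:
    """Motif start positions relative to the A of the ATG (negative = upstream)."""
    window, start = promoter_window(genome, atg_index)
    return [start + hit - atg_index for hit in find_motif(window, motif)]
```

Reporting positions relative to the start codon makes it straightforward to compare motif locations across homologous ORFs, which is the kind of positional conservation the abstract describes.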
Fusion activation through attachment protein stalk domains indicates a conserved core mechanism of paramyxovirus entry into cells.

PubMed

Bose, Sayantan; Song, Albert S; Jardetzky, Theodore S; Lamb, Robert A

2014-04-01

Paramyxoviruses are a large family of membrane-enveloped negative-stranded RNA viruses causing important diseases in humans and animals. Two viral integral membrane glycoproteins (fusion [F] and attachment [HN, H, or G]) mediate a concerted process of host receptor recognition, followed by the fusion of viral and cellular membranes, resulting in viral nucleocapsid entry into the cytoplasm. However, the sequence of events that closely links the timing of receptor recognition by HN, H, or G and the "triggering" interaction of the attachment protein with F is unclear. F activation results in F undergoing a series of irreversible conformational rearrangements to bring about membrane merger and virus entry. By extensive study of properties of multiple paramyxovirus HN proteins, we show that key features of F activation, including the F-activating regions of HN proteins, flexibility within this F-activating region, and changes in globular head-stalk interactions are highly conserved. These results, together with functionally active "headless" mumps and Newcastle disease virus HN proteins, provide insights into the F-triggering process. Based on these data and very recently published data for morbillivirus H and henipavirus G proteins, we extend our recently proposed "stalk exposure model" to other paramyxoviruses and propose an "induced fit" hypothesis for F-HN/H/G interactions as conserved core mechanisms of paramyxovirus-mediated membrane fusion. Paramyxoviruses are a large family of membrane-enveloped negative-stranded RNA viruses causing important diseases in humans and animals. Two viral integral membrane glycoproteins (fusion [F] and attachment [HN, H, or G]) mediate a concerted process of host receptor recognition, followed by the fusion of viral and cellular membranes. We describe here the molecular mechanism by which HN activates the F protein such that virus-cell fusion is controlled and occurs at the right time and the right place. 
We extend our recently proposed "stalk exposure model" first proposed for parainfluenza virus 5 to other paramyxoviruses and propose an "induced fit" hypothesis for F-HN/H/G interactions as conserved core mechanisms of paramyxovirus-mediated membrane fusion.
Fusion Activation through Attachment Protein Stalk Domains Indicates a Conserved Core Mechanism of Paramyxovirus Entry into Cells

PubMed Central

Bose, Sayantan; Song, Albert S.; Jardetzky, Theodore S.

2014-01-01

ABSTRACT Paramyxoviruses are a large family of membrane-enveloped negative-stranded RNA viruses causing important diseases in humans and animals. Two viral integral membrane glycoproteins (fusion [F] and attachment [HN, H, or G]) mediate a concerted process of host receptor recognition, followed by the fusion of viral and cellular membranes, resulting in viral nucleocapsid entry into the cytoplasm. However, the sequence of events that closely links the timing of receptor recognition by HN, H, or G and the “triggering” interaction of the attachment protein with F is unclear. F activation results in F undergoing a series of irreversible conformational rearrangements to bring about membrane merger and virus entry. By extensive study of properties of multiple paramyxovirus HN proteins, we show that key features of F activation, including the F-activating regions of HN proteins, flexibility within this F-activating region, and changes in globular head-stalk interactions are highly conserved. These results, together with functionally active “headless” mumps and Newcastle disease virus HN proteins, provide insights into the F-triggering process. Based on these data and very recently published data for morbillivirus H and henipavirus G proteins, we extend our recently proposed “stalk exposure model” to other paramyxoviruses and propose an “induced fit” hypothesis for F-HN/H/G interactions as conserved core mechanisms of paramyxovirus-mediated membrane fusion. IMPORTANCE Paramyxoviruses are a large family of membrane-enveloped negative-stranded RNA viruses causing important diseases in humans and animals. Two viral integral membrane glycoproteins (fusion [F] and attachment [HN, H, or G]) mediate a concerted process of host receptor recognition, followed by the fusion of viral and cellular membranes. We describe here the molecular mechanism by which HN activates the F protein such that virus-cell fusion is controlled and occurs at the right time and the right place. 
We extend our recently proposed “stalk exposure model” first proposed for parainfluenza virus 5 to other paramyxoviruses and propose an “induced fit” hypothesis for F-HN/H/G interactions as conserved core mechanisms of paramyxovirus-mediated membrane fusion. PMID:24453369
BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results.

PubMed

Worley, K C; Wiese, B A; Smith, R F

1995-09-01

BEAUTY (BLAST enhanced alignment utility) is an enhanced version of the NCBI's BLAST data base search tool that facilitates identification of the functions of matched sequences. We have created new data bases of conserved regions and functional domains for protein sequences in NCBI's Entrez data base, and BEAUTY allows this information to be incorporated directly into BLAST search results. A Conserved Regions Data Base, containing the locations of conserved regions within Entrez protein sequences, was constructed by (1) clustering the entire data base into families, (2) aligning each family using our PIMA multiple sequence alignment program, and (3) scanning the multiple alignments to locate the conserved regions within each aligned sequence. A separate Annotated Domains Data Base was constructed by extracting the locations of all annotated domains and sites from sequences represented in the Entrez, PROSITE, BLOCKS, and PRINTS data bases. BEAUTY performs a BLAST search of those Entrez sequences with conserved regions and/or annotated domains. BEAUTY then uses the information from the Conserved Regions and Annotated Domains data bases to generate, for each matched sequence, a schematic display that allows one to directly compare the relative locations of (1) the conserved regions, (2) annotated domains and sites, and (3) the locally aligned regions matched in the BLAST search. In addition, BEAUTY search results include World-Wide Web hypertext links to a number of external data bases that provide a variety of additional types of information on the function of matched sequences. This convenient integration of protein families, conserved regions, annotated domains, alignment displays, and World-Wide Web resources greatly enhances the biological informativeness of sequence similarity searches. 
BEAUTY searches can be performed remotely on our system using the "BCM Search Launcher" World-Wide Web pages (URL is <http://gc.bcm.tmc.edu:8088/search-launcher/launcher.html>).
A conserved mechanism for replication origin recognition and binding in archaea.

PubMed

Majerník, Alan I; Chong, James P J

2008-01-15

To date, methanogens are the only group within the archaea where firing DNA replication origins have not been demonstrated in vivo. In the present study we show that a previously identified cluster of ORB (origin recognition box) sequences do indeed function as an origin of replication in vivo in the archaeon Methanothermobacter thermautotrophicus. Although the consensus sequence of ORBs in M. thermautotrophicus is somewhat conserved when compared with ORB sequences in other archaea, the Cdc6-1 protein from M. thermautotrophicus (termed MthCdc6-1) displays sequence-specific binding that is selective for the MthORB sequence and does not recognize ORBs from other archaeal species. Stabilization of in vitro MthORB DNA binding by MthCdc6-1 requires additional conserved sequences 3' to those originally described for M. thermautotrophicus. By testing synthetic sequences bearing mutations in the MthORB consensus sequence, we show that Cdc6/ORB binding is critically dependent on the presence of an invariant guanine found in all archaeal ORB sequences. Mutation of a universally conserved arginine residue in the recognition helix of the winged helix domain of archaeal Cdc6-1 shows that specific origin sequence recognition is dependent on the interaction of this arginine residue with the invariant guanine. Recognition of a mutated origin sequence can be achieved by mutation of the conserved arginine residue to a lysine or glutamine residue. Thus despite a number of differences in protein and DNA sequences between species, the mechanism of origin recognition and binding appears to be conserved throughout the archaea.
Proliferating cell nuclear antigen (Pcna) as a direct downstream target gene of Hoxc8

DOE Office of Scientific and Technical Information (OSTI.GOV)

Min, Hyehyun; Lee, Ji-Yeon; Bok, Jinwoong

2010-02-19

Hoxc8 is a member of Hox family transcription factors that play crucial roles in spatiotemporal body patterning during embryogenesis. Hox proteins contain a conserved 61 amino acid homeodomain, which is responsible for recognition and binding of the proteins onto Hox-specific DNA binding motifs and regulates expression of their target genes. Previously, using proteome analysis, we identified Proliferating cell nuclear antigen (Pcna) as one of the putative target genes of Hoxc8. Here, we asked whether Hoxc8 regulates Pcna expression by directly binding to the regulatory sequence of Pcna. In mouse embryos at embryonic day 11.5, the expression pattern of Pcna was similar to that of Hoxc8 along the anteroposterior body axis. Moreover, Pcna transcript levels as well as cell proliferation rate were increased by overexpression of Hoxc8 in C3H10T1/2 mouse embryonic fibroblast cells. Characterization of 2.3 kb genomic sequence upstream of Pcna coding region revealed that the upstream sequence contains several Hox core binding sequences and one Hox-Pbx binding sequence. Direct binding of Hoxc8 proteins to the Pcna regulatory sequence was verified by chromatin immunoprecipitation assay. Taken together, our data suggest that Pcna is a direct downstream target of Hoxc8.
Interaction of MYC with host cell factor-1 is mediated by the evolutionarily conserved Myc box IV motif.

PubMed

Thomas, L R; Foshage, A M; Weissmiller, A M; Popay, T M; Grieb, B C; Qualls, S J; Ng, V; Carboneau, B; Lorey, S; Eischen, C M; Tansey, W P

2016-07-07

The MYC family of oncogenes encodes a set of three related transcription factors that are overexpressed in many human tumors and contribute to the cancer-related deaths of more than 70,000 Americans every year. MYC proteins drive tumorigenesis by interacting with co-factors that enable them to regulate the expression of thousands of genes linked to cell growth, proliferation, metabolism and genome stability. One effective way to identify critical co-factors required for MYC function has been to focus on sequence motifs within MYC that are conserved throughout evolution, on the assumption that their conservation is driven by protein-protein interactions that are vital for MYC activity. In addition to their DNA-binding domains, MYC proteins carry five regions of high sequence conservation known as Myc boxes (Mb). To date, four of the Mb motifs (MbI, MbII, MbIIIa and MbIIIb) have had a molecular function assigned to them, but the precise role of the remaining Mb, MbIV, and the reason for its preservation in vertebrate Myc proteins, is unknown. Here, we show that MbIV is required for the association of MYC with the abundant transcriptional coregulator host cell factor-1 (HCF-1). We show that the invariant core of MbIV resembles the tetrapeptide HCF-binding motif (HBM) found in many HCF-interaction partners, and demonstrate that MYC interacts with HCF-1 in a manner indistinguishable from the prototypical HBM-containing protein VP16. Finally, we show that rationalized point mutations in MYC that disrupt interaction with HCF-1 attenuate the ability of MYC to drive tumorigenesis in mice. Together, these data expose a molecular function for MbIV and indicate that HCF-1 is an important co-factor for MYC.
Genome-based identification of spliceosomal proteins in the silk moth Bombyx mori.

PubMed

Somarelli, Jason A; Mesa, Annia; Fuller, Myron E; Torres, Jacqueline O; Rodriguez, Carol E; Ferrer, Christina M; Herrera, Rene J

2010-12-01

Pre-messenger RNA splicing is a highly conserved eukaryotic cellular function that takes place by way of a large, RNA-protein assembly known as the spliceosome. In the mammalian system, nearly 300 proteins associate with uridine-rich small nuclear (sn)RNAs to form this complex. Some of these splicing factors are ubiquitously present in the spliceosome, whereas others are involved only in the processing of specific transcripts. Several proteomics analyses have delineated the proteins of the spliceosome in several species. In this study, we mine multiple sequence data sets of the silk moth Bombyx mori in an attempt to identify the entire set of known spliceosomal proteins. Five data sets were utilized, including the 3X, 6X, and Build 2.0 genomic contigs as well as the expressed sequence tag and protein libraries. While homologs for 88% of vertebrate splicing factors were delineated in the Bombyx mori genome, there appear to be several spliceosomal polypeptides absent in Bombyx mori and seven additional insect species. This apparent increase in spliceosomal complexity in vertebrates may reflect the tissue-specific and developmental stage-specific alternative pre-mRNA splicing requirements in vertebrates. Phylogenetic analyses of 15 eukaryotic taxa using the core splicing factors suggest that the essential functional units of the pre-mRNA processing machinery have remained highly conserved from yeast to humans. The Sm and LSm proteins are the most conserved, whereas proteins of the U1 small nuclear ribonucleoprotein particle are the most divergent. These data highlight both the differential conservation and relative phylogenetic signals of the essential spliceosomal components throughout evolution. © 2010 Wiley Periodicals, Inc.

Expression and Sequence Evolution of Aromatase cyp19a1 and Other Sexual Development Genes in East African Cichlid Fishes

PubMed Central

Böhne, Astrid; Heule, Corina; Boileau, Nicolas; Salzburger, Walter

2013-01-01

Sex determination mechanisms are highly variable across teleost fishes and sexual development is often plastic. Nevertheless, downstream factors establishing the two sexes are presumably conserved. Here, we study sequence evolution and gene expression of core genes of sexual development in a prime model system in evolutionary biology, the East African cichlid fishes. Using the five available cichlid genomes, we test for signs of positive selection in 28 genes including duplicates from the teleost whole-genome duplication, and examine the expression of these candidate genes in three cichlid species. We then focus on a particularly striking case, the A- and B-copies of the aromatase cyp19a1, and detect different evolutionary trajectories: cyp19a1A evolved under strong positive selection, whereas cyp19a1B remained conserved at the protein level, yet is subject to regulatory changes at its transcription start sites. Importantly, we find shifts in gene expression in both copies. Cyp19a1 is considered the most conserved ovary-factor in vertebrates, and in all teleosts investigated so far, cyp19a1A and cyp19a1B are expressed in ovaries and the brain, respectively. This is not the case in cichlids, where we find new expression patterns in two derived lineages: the A-copy gained a novel testis-function in the Ectodine lineage, whereas the B-copy is overexpressed in the testis of the most species-rich cichlid group, the Haplochromini. This suggests that even key factors of sexual development, including the sex steroid pathway, are not conserved in fish, supporting the idea that flexibility in sexual determination and differentiation may be a driving force of speciation. PMID:23883521
Conservation of the egg envelope digestion mechanism of hatching enzyme in euteleostean fishes.

PubMed

Kawaguchi, Mari; Yasumasu, Shigeki; Shimizu, Akio; Sano, Kaori; Iuchi, Ichiro; Nishida, Mutsumi

2010-12-01

We purified two hatching enzymes, namely high choriolytic enzyme (HCE; EC 3.4.24.67) and low choriolytic enzyme (LCE; EC 3.4.24.66), from the hatching liquid of Fundulus heteroclitus, which were named Fundulus HCE (FHCE) and Fundulus LCE (FLCE). FHCE swelled the inner layer of egg envelope, and FLCE completely digested the FHCE-swollen envelope. In addition, we cloned three Fundulus cDNAs orthologous to cDNAs for the medaka precursors of egg envelope subunit proteins (i.e. choriogenins H, H minor and L) from the female liver. Cleavage sites of FHCE and FLCE on egg envelope subunit proteins were determined by comparing the N-terminal amino acid sequences of digests with the sequences deduced from the cDNAs for egg envelope subunit proteins. FHCE and FLCE cleaved different sites of the subunit proteins. FHCE efficiently cleaved the Pro-X-Y repeat regions into tripeptides to dodecapeptides to swell the envelope, whereas FLCE cleaved the inside of the zona pellucida domain, the core structure of egg envelope subunit protein, to completely digest the FHCE-swollen envelope. A comparison showed that the positions of hatching enzyme cleavage sites on egg envelope subunit proteins were strictly conserved between Fundulus and medaka. Finally, we extended such a comparison to three other euteleosts (i.e. three-spined stickleback, spotted halibut and rainbow trout) and found that the egg envelope digestion mechanism was well conserved among them. During evolution, the egg envelope digestion by HCE and LCE orthologs was established in the lineage of euteleosts, and the mechanism is suggested to be conserved. © 2010 The Authors Journal compilation © 2010 FEBS.
Fine-tuning structural RNA alignments in the twilight zone

PubMed Central

2010-01-01

Background A widely used method to find conserved secondary structure in RNA is to first construct a multiple sequence alignment, and then fold the alignment, optimizing a score based on thermodynamics and covariance. This method works best around 75% sequence similarity. However, in a "twilight zone" below 55% similarity, the sequence alignment tends to obscure the covariance signal used in the second phase. Therefore, while the overall shape of the consensus structure may still be found, the degree of conservation cannot be estimated reliably. Results Based on a combination of available methods, we present a method named planACstar for improving structure conservation in structural alignments in the twilight zone. After constructing a consensus structure by alignment folding, planACstar abandons the original sequence alignment and refolds the sequences individually, but consistently with the consensus. It then aligns the resulting structures, irrespective of sequence, by a pure structure alignment method, and derives an improved sequence alignment from the alignment of structures, which can be re-submitted to alignment folding, and so on. This cycle may be iterated as long as structural conservation improves, but normally one step suffices. Conclusions Employing the tools ClustalW, RNAalifold, and RNAforester, we find that for sequences with 30-55% sequence identity, structural conservation can be improved by 10% on average, with a large variation, measured in terms of RNAalifold's own criterion, the structure conservation index. PMID:20433706
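The thresholds quoted in the abstract above (about 75%, and a "twilight zone" below 55%) refer to how similar the aligned sequences are. As a rough illustration only, the sketch below computes pairwise percent identity from two already-aligned sequences and flags pairs that fall under the 55% mark. The function names, the choice of denominator (columns where at least one sequence has a residue) and the toy sequences are assumptions made for this example; they are not taken from planACstar or the tools it uses.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; every other
    column counts toward the denominator (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def in_twilight_zone(identity, upper=55.0):
    """True when the identity falls below the 55% mark quoted in the abstract."""
    return identity < upper


# Hypothetical toy alignments, not real RNA family members.
well_aligned = percent_identity("GGCAU-CC", "GGAAUACC")
divergent = percent_identity("GCAU", "GAUU")
```

With these toy inputs the first pair sits at 75% identity and the second at 50%, so only the second would be treated as a twilight-zone case under the assumed threshold.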
Conserved structure and inferred evolutionary history of long terminal repeats (LTRs)

PubMed Central

2013-01-01

Background Long terminal repeats (LTRs, consisting of U3-R-U5 portions) are important elements of retroviruses and related retrotransposons. They are difficult to analyse due to their variability. The aim was to obtain a more comprehensive view of structure, diversity and phylogeny of LTRs than hitherto possible. Results Hidden Markov models (HMM) were created for 11 clades of LTRs belonging to Retroviridae (class III retroviruses), animal Metaviridae (Gypsy/Ty3) elements and plant Pseudoviridae (Copia/Ty1) elements, complementing our work with Orthoretrovirus HMMs. The great variation in LTR length of plant Metaviridae and the few divergent animal Pseudoviridae prevented building HMMs from both of these groups. Animal Metaviridae LTRs had the same conserved motifs as retroviral LTRs, confirming that the two groups are closely related. The conserved motifs were the short inverted repeats (SIRs), integrase recognition signals (5´TGTTRNR…YNYAACA 3´); the polyadenylation signal or AATAAA motif; a GT-rich stretch downstream of the polyadenylation signal; and a less conserved AT-rich stretch corresponding to the core promoter element, the TATA box. Plant Pseudoviridae LTRs differed slightly in having a conserved TATA-box, TATATA, but no conserved polyadenylation signal, plus a much shorter R region. The sensitivity of the HMMs for detection in genomic sequences was around 50% for most models, at a relatively high specificity, suitable for genome screening. The HMMs yielded consensus sequences, which were aligned by creating an HMM model (a ‘Superviterbi’ alignment). This yielded a phylogenetic tree that was compared with a Pol-based tree. Both LTR and Pol trees supported monophyly of retroviruses. In both, Pseudoviridae was ancestral to all other LTR retrotransposons. However, the LTR trees showed the chromovirus portion of Metaviridae clustering together with Pseudoviridae, dividing Metaviridae into two portions with distinct phylogeny. 
Conclusion The HMMs clearly demonstrated a unitary conserved structure of LTRs, supporting the view that they arose once during evolution. We attempted to follow the evolution of LTRs by tracing their functional foundations, that is, acquisition of RNase H, a combined promoter/polyadenylation site, integrase, hairpin priming and the primer binding site (PBS). Available information did not support a simple evolutionary chain of events. PMID:23369192
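The integrase recognition signals quoted in the abstract above (5´TGTTRNR…YNYAACA 3´) are written with IUPAC ambiguity codes, and the right-hand motif is the reverse complement of the left-hand one. The following is a minimal sketch, not the authors' HMM-based method, of how such a terminal motif pair could be checked with regular expressions. The helper names and the test sequences are hypothetical.

```python
import re

# Only the IUPAC codes needed for the motifs quoted in the abstract.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}


def iupac_to_regex(motif):
    """Translate an IUPAC motif into a regular expression string."""
    return "".join(IUPAC[base] for base in motif)


def has_terminal_sir(ltr, left="TGTTRNR", right="YNYAACA"):
    """True when the sequence starts with `left` and ends with `right`."""
    ltr = ltr.upper()
    starts = re.match(iupac_to_regex(left), ltr) is not None
    ends = re.search(iupac_to_regex(right) + "$", ltr) is not None
    return starts and ends


# Hypothetical sequences made up for illustration.
candidate = has_terminal_sir("TGTTGAGCCCCCTCAACA")
non_candidate = has_terminal_sir("ACGTACGTACGTACGT")
```

A plain pattern match like this is far less sensitive than a profile HMM, because it tolerates no mismatches; it is shown only to make the motif notation concrete.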
Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

PubMed

Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M

2017-03-27

Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focussed elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.
Common fold in helix–hairpin–helix proteins

PubMed Central

Shao, Xuguang; Grishin, Nick V.

2000-01-01

Helix–hairpin–helix (HhH) is a widespread motif involved in non-sequence-specific DNA binding. The majority of HhH motifs function as DNA-binding modules; however, some of them are used to mediate protein–protein interactions or have acquired enzymatic activity by incorporating catalytic residues (DNA glycosylases). From sequence and structural analysis of HhH-containing proteins we conclude that most HhH motifs are integrated as a part of a five-helical domain, termed (HhH)2 domain here. It typically consists of two consecutive HhH motifs that are linked by a connector helix and displays pseudo-2-fold symmetry. (HhH)2 domains show clear structural integrity and a conserved hydrophobic core composed of seven residues, one residue from each α-helix and each hairpin, and deserve recognition as a distinct protein fold. In addition to known HhH in the structures of RuvA, RadA, MutY and DNA-polymerases, we have detected new HhH motifs in sterile alpha motif and barrier-to-autointegration factor domains, the α-subunit of Escherichia coli RNA-polymerase, DNA-helicase PcrA and DNA glycosylases. Statistically significant sequence similarity of HhH motifs and pronounced structural conservation argue for homology between (HhH)2 domains in different protein families. Our analysis helps to clarify how non-symmetric protein motifs bind to the double helix of DNA through the formation of a pseudo-2-fold symmetric (HhH)2 functional unit. PMID:10908318
The genome and transcriptome of the enteric parasite Entamoeba invadens, a model for encystation

PubMed Central

2013-01-01

Background Several eukaryotic parasites form cysts that transmit infection. The process is found in diverse organisms such as Toxoplasma, Giardia, and nematodes. In Entamoeba histolytica this process cannot be induced in vitro, making it difficult to study. In Entamoeba invadens, stage conversion can be induced, but its utility as a model system to study developmental biology has been limited by a lack of genomic resources. We carried out genome and transcriptome sequencing of E. invadens to identify molecular processes involved in stage conversion. Results We report the sequencing and assembly of the E. invadens genome and use whole transcriptome sequencing to characterize changes in gene expression during encystation and excystation. The E. invadens genome is larger than that of E. histolytica, apparently largely due to expansion of intergenic regions; overall gene number and the machinery for gene regulation are conserved between the species. Over half the genes are regulated during the switch between morphological forms and a key signaling molecule, phospholipase D, appears to regulate encystation. We provide evidence for the occurrence of meiosis during encystation, suggesting that stage conversion may play a key role in recombination between strains. Conclusions Our analysis demonstrates that a number of core processes are common to encystation between distantly related parasites, including meiosis, lipid signaling and RNA modification. These data provide a foundation for understanding the developmental cascade in the important human pathogen E. histolytica and highlight conserved processes more widely relevant in enteric pathogens. PMID:23889909
Plasmid integration in a wide range of bacteria mediated by the integrase of Lactobacillus delbrueckii bacteriophage mv4.

PubMed Central

Auvray, F; Coddeville, M; Ritzenthaler, P; Dupont, L

1997-01-01

Bacteriophage mv4 is a temperate phage infecting Lactobacillus delbrueckii subsp. bulgaricus. During lysogenization, the phage integrates its genome into the host chromosome at the 3' end of a tRNA(Ser) gene through a site-specific recombination process (L. Dupont et al., J. Bacteriol., 177:586-595, 1995). A nonreplicative vector (pMC1) based on the mv4 integrative elements (attP site and integrase-coding int gene) is able to integrate into the chromosome of a wide range of bacterial hosts, including Lactobacillus plantarum, Lactobacillus casei (two strains), Lactococcus lactis subsp. cremoris, Enterococcus faecalis, and Streptococcus pneumoniae. Integrative recombination of pMC1 into the chromosomes of all of these species is dependent on the int gene product and occurs specifically at the pMC1 attP site. The isolation and sequencing of pMC1 integration sites from these bacteria showed that in lactobacilli, pMC1 integrated into the conserved tRNA(Ser) gene. In the other bacterial species, where this tRNA gene is less conserved or not conserved, secondary integration sites, either in potential protein-coding regions or in intergenic DNA, were used. A consensus sequence was deduced from the analysis of the different integration sites. The comparison of these sequences demonstrated the flexibility of the integrase for the bacterial integration site and suggested the importance of the trinucleotide CCT at the 5' end of the core in the strand exchange reaction. PMID:9068626
Strong minor groove base conservation in sequence logos implies DNA distortion or base flipping during replication and transcription initiation.

PubMed

Schneider, T D

2001-12-01

The sequence logo for DNA binding sites of the bacteriophage P1 replication protein RepA shows unusually high sequence conservation (approximately 2 bits) at a minor groove that faces RepA. However, B-form DNA can support only 1 bit of sequence conservation via contacts into the minor groove. The high conservation in RepA sites therefore implies a distorted DNA helix with direct or indirect contacts to the protein. Here I show that a high minor groove conservation signature also appears in sequence logos of sites for other replication origin binding proteins (Rts1, DnaA, P4 alpha, EBNA1, ORC) and promoter binding proteins (sigma(70), sigma(D) factors). This finding implies that DNA binding proteins generally use non-B-form DNA distortion such as base flipping to initiate replication and transcription.
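The bit values in the abstract above are sequence-logo information contents. For DNA with four equally likely bases, the conservation at one aligned position can be written as R = 2 − H, where H is the Shannon entropy of the observed base frequencies. The sketch below is a simplified illustration under that assumption: it omits the small-sample correction that logo software normally applies, and the function name and example columns are invented for this example.

```python
import math
from collections import Counter


def column_information(column):
    """Information content (bits) of one aligned DNA column.

    Computed as log2(4) minus the Shannon entropy of the observed base
    frequencies. No small-sample correction is applied (a simplification).
    """
    counts = Counter(column.upper())
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 2.0 - entropy


# Made-up columns: one fully conserved, one split evenly between two bases.
fully_conserved = column_information("AAAAAAAA")
two_way_choice = column_information("ATATATAT")
```

A fully conserved position reaches the 2-bit maximum, whereas a position restricted to two equally frequent bases carries 1 bit, which matches the scale of the two values the abstract contrasts.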
Comparative Genomics of the Listeria monocytogenes ST204 Subgroup

PubMed Central

Fox, Edward M.; Allnutt, Theodore; Bradbury, Mark I.; Fanning, Séamus; Chandry, P. Scott

2016-01-01

The ST204 subgroup of Listeria monocytogenes is among the most frequently isolated in Australia from a range of environmental niches. In this study we provide a comparative genomics analysis of food and food environment isolates from geographically diverse sources. Analysis of the ST204 genomes showed a highly conserved core genome with the majority of variation seen in mobile genetic elements such as plasmids, transposons and phage insertions. Most strains (13/15) harbored plasmids, which, although varying in size, contained highly conserved sequences. Interestingly, 4 isolates contained a conserved plasmid of 91,396 bp. The strains examined were isolated over a period of 12 years and from different geographic locations, suggesting plasmids are an important component of the genetic repertoire of this subgroup and may provide a range of stress tolerance mechanisms. In addition to this, 4 phage insertion sites and 2 transposons were identified among isolates, including a novel transposon. These genetic elements were highly conserved across isolates that harbored them, and also contained a range of genetic markers linked to stress tolerance and virulence. The maintenance of conserved mobile genetic elements in the ST204 population suggests these elements may contribute to the diverse range of niches colonized by ST204 isolates. Environmental stress selection may contribute to maintaining these genetic features, which in turn may be co-selecting for virulence markers relevant to clinical infection with ST204 isolates. PMID:28066377
Comprehensively Surveying Structure and Function of RING Domains from Drosophila melanogaster

PubMed Central

Wu, Yuehao; Wan, Fusheng; Huang, Chunhong; Jie, Kemin

2011-01-01

Using a complete set of RING domains from Drosophila melanogaster, all the solved RING domains and cocrystal structures of RING-containing ubiquitin-ligases (RING-E3) and ubiquitin-conjugating enzyme (E2) pairs, we analyzed RING domain structures from their primary to quaternary structures. The results showed that: i) putative orthologs of RING domains between Drosophila melanogaster and human are common (118/139, 84.9%); ii) of the 118 orthologous pairs from Drosophila melanogaster and human, 117 pairs (117/118, 99.2%) were found to retain entirely uniform domain architectures, and only Iap2/Diap2 experienced evolutionary expansion of domain architecture; iii) 4 evolutionarily and structurally conserved regions (SCRs) are responsible for homologous folding of RING domains at the superfamily level; iv) besides the conserved Cys/His residues chelating zinc ions, 6 equivalent residues (4 hydrophobic and 2 polar residues) in the SCRs show good consensus and conservation, and these 4 SCRs function in the structural positioning of the 6 equivalent residues as determinants for RING-E3 catalysis; v) the numbers of these RING proteins located in the nucleus, in multiple subcellular compartments, in membranes (membrane proteins) and in the mitochondrion are 42 (42/139, 30.2%), 71 (71/139, 51.1%), 22 (22/139, 15.8%) and 4 (4/139, 2.9%), respectively; vi) CG15104 (Topors) and CG1134 (Mul1) in C3HC4, and CG3929 (Deltex) in C3H2C3 seem to display broader E2-binding profiles than other RING-E3s; vii) analysis of the intermolecular interfaces of E2/RING-E3 complexes indicates that residues directly interacting with E2s are all from the SCRs in RING domains. Of the 6 residues, 2 hydrophobic ones contribute to constructing the conserved hydrophobic core, while the 2 hydrophobic and 2 polar residues directly participate in E2/RING-E3 interactions. Based on sequence and structural data, SCRs, conserved equivalent residues and features of intermolecular interfaces were extracted, highlighting the presence of a nucleus for RING domain fold and formation of catalytic core in which related residues and regions exhibit preferential evolutionary conservation. PMID:21912646
Conformational Occlusion of Blockade Antibody Epitopes, a Novel Mechanism of GII.4 Human Norovirus Immune Evasion.

PubMed

Lindesmith, Lisa C; Mallory, Michael L; Debbink, Kari; Donaldson, Eric F; Brewer-Jensen, Paul D; Swann, Excel W; Sheahan, Timothy P; Graham, Rachel L; Beltramello, Martina; Corti, Davide; Lanzavecchia, Antonio; Baric, Ralph S

2018-01-01

Extensive antigenic diversity within the GII.4 genotype of human norovirus is a major driver of pandemic emergence and a significant obstacle to development of cross-protective immunity after natural infection and vaccination. However, human and mouse monoclonal antibody studies indicate that, although rare, antibodies to conserved GII.4 blockade epitopes are generated. The mechanisms by which these epitopes evade immune surveillance are uncertain. Here, we developed a new approach for identifying conserved GII.4 norovirus epitopes. Utilizing a unique set of virus-like particles (VLPs) representing the in vivo-evolved sequence diversity within an immunocompromised person, we identify key residues within epitope F, a conserved GII.4 blockade antibody epitope. The residues critical for antibody binding are proximal to evolving blockade epitope E. Like epitope F, antibody blockade of epitope E was temperature sensitive, indicating that particle conformation regulates antibody access not only to the conserved GII.4 blockade epitope F but also to the evolving epitope E. These data highlight novel GII.4 mechanisms to protect blockade antibody epitopes, map essential residues of a GII.4 conserved epitope, and expand our understanding of how viral particle dynamics may drive antigenicity and antibody-mediated protection by effectively shielding blockade epitopes. Our data support the notion that GII.4 particle breathing may well represent a major mechanism of humoral immune evasion supporting cyclic pandemic virus persistence and spread in human populations. IMPORTANCE In this study, we use norovirus virus-like particles to identify key residues of a conserved GII.4 blockade antibody epitope. Further, we identify an additional GII.4 blockade antibody epitope to be occluded, with antibody access governed by temperature and particle dynamics. These findings provide additional support for particle conformation-based presentation of binding residues mediated by a particle "breathing core." Together, these data suggest that limiting antibody access to blockade antibody epitopes may be a frequent mechanism of immune evasion for GII.4 human noroviruses. Mapping blockade antibody epitopes, the interaction between adjacent epitopes on the particle, and the breathing core that mediates antibody access to epitopes provides greater mechanistic understanding of epitope camouflage strategies utilized by human viral pathogens to evade immunity.
A Belated Green Revolution for Cannabis: Virtual Genetic Resources to Fast-Track Cultivar Development

PubMed Central

Welling, Matthew T.; Shapter, Tim; Rose, Terry J.; Liu, Lei; Stanger, Rhia; King, Graham J.

2016-01-01

Cannabis is a predominantly dioecious, phenotypically diverse, domesticated genus with few if any extant natural populations. International narcotics conventions and associated legislation have constrained the establishment, characterization, and use of Cannabis genetic resource collections. This has resulted in the underutilization of genepool variability in cultivar development and has limited the inclusion of secondary genepools associated with genetic improvement strategies of the Green Revolution. The structured screening of ex situ germplasm and the exploitation of locally-adapted intraspecific traits is expected to facilitate the genetic improvement of Cannabis. However, limited attempts have been made to establish the full extent of genetic resources available for pre-breeding. We present a thorough critical review of Cannabis ex situ genetic resources, and discuss recommendations for conservation, pre-breeding characterization, and genetic analysis that will underpin future cultivar development. We consider East Asian germplasm to be a priority for conservation based on the prolonged historical cultivation of Cannabis in this region over a range of latitudes, along with the apparent high levels of genetic diversity and relatively low representation in published genetic resource collections. Seed cryopreservation could improve conservation by reducing hybridization and genetic drift that may occur during Cannabis germplasm regeneration. Given the unique legal status of Cannabis, we propose the establishment of a global virtual core collection based on the collation of consistent and comprehensive provenance meta-data and the adoption of high-throughput DNA sequencing technologies. This would enable representative core collections to be used for systematic phenotyping, and so underpin breeding strategies for the genetic improvement of Cannabis. PMID:27524992
G-quadruplex prediction in E. coli genome reveals a conserved putative G-quadruplex-Hairpin-Duplex switch.

PubMed

Kaplan, Oktay I; Berber, Burak; Hekim, Nezih; Doluca, Osman

2016-11-02

Many studies show that short non-coding sequences are widely conserved among regulatory elements. More and more conserved sequences are being discovered since the development of next generation sequencing technology. A common approach to identify conserved sequences with regulatory roles relies on topological changes such as hairpin formation at the DNA or RNA level. G-quadruplexes, non-canonical nucleic acid topologies with few established biological roles, are increasingly considered for conserved regulatory element discovery. Since the tertiary structure of G-quadruplexes is strongly dependent on the loop sequence, which is disregarded by the generally accepted algorithm, we hypothesized that G-quadruplexes with similar topology and, indirectly, similar interaction patterns, can be determined using phylogenetic clustering based on differences in the loop sequences. Phylogenetic analysis of 52 G-quadruplex forming sequences in the Escherichia coli genome revealed two conserved G-quadruplex motifs with a potential regulatory role. Further analysis revealed that both motifs tend to form hairpins and G-quadruplexes, as supported by circular dichroism studies. The phylogenetic analysis as described in this work can greatly improve the discovery of functional G-quadruplex structures and may explain unknown regulatory patterns. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
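The abstract above does not spell out which prediction algorithm it means. A commonly used rule, assumed here purely for illustration, looks for four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The sketch below applies that assumed pattern to one strand and returns the three loop sequences of each hit, which is the part of the motif the abstract proposes to cluster on. The pattern, the function name and the test sequence are all hypothetical choices for this example.

```python
import re

# Assumed pattern: four G-runs (>= 3 G) separated by loops of 1-7 nt.
# Loops are matched lazily so that each loop is as short as possible.
G4_PATTERN = re.compile(
    r"(G{3,})([ACGT]{1,7}?)(G{3,})([ACGT]{1,7}?)(G{3,})([ACGT]{1,7}?)(G{3,})"
)


def find_g4_loops(sequence):
    """Return (start, (loop1, loop2, loop3)) for each non-overlapping hit.

    Only the given strand is scanned; a full search would also scan the
    reverse complement.
    """
    hits = []
    for match in G4_PATTERN.finditer(sequence.upper()):
        loops = (match.group(2), match.group(4), match.group(6))
        hits.append((match.start(), loops))
    return hits


# Made-up sequence containing one candidate motif.
hits = find_g4_loops("ttGGGAGGGTGGGCGGGtt")
no_hits = find_g4_loops("ACGTACGTACGT")
```

Collecting the loop tuples from many hits would give the raw material for a loop-based comparison, although the distance measure and clustering used in the study are not described in the abstract and are not reproduced here.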
Functional Analysis of RNA Interference-Related Soybean Pod Borer (Lepidoptera) Genes Based on Transcriptome Sequences.

PubMed

Meng, Fanli; Yang, Mingyu; Li, Yang; Li, Tianyu; Liu, Xinxin; Wang, Guoyue; Wang, Zhanchun; Jin, Xianhao; Li, Wenbin

2018-01-01

RNA interference (RNAi) is useful for controlling pests of agriculturally important crops. The soybean pod borer (SPB) is the most important soybean pest in Northeastern Asia. In an earlier study, we confirmed that the SPB could be controlled via transgenic plant-mediated RNAi. Here, the SPB transcriptome was sequenced to identify RNAi-related genes, and also to establish an RNAi-of-RNAi assay system for evaluating genes involved in the SPB systemic RNAi response. The core RNAi genes, as well as genes potentially involved in double-stranded RNA (dsRNA) uptake were identified based on SPB transcriptome sequences. A phylogenetic analysis and the characterization of these core components as well as dsRNA uptake related genes revealed that they contain conserved domains essential for the RNAi pathway. The results of the RNAi-of-RNAi assay involving Laccase 2 (a critical cuticle pigmentation gene) as a marker showed that genes encoding the sid-like (Sil1), scavenger receptor class C (Src), and scavenger receptor class B (Srb3 and Srb4) proteins of the endocytic pathway were required for SPB cellular uptake of dsRNA. The SPB response was inferred to contain three functional small RNA pathways (i.e., miRNA, siRNA, and piRNA pathways). Additionally, the SPB systemic RNA response may rely on systemic RNA interference deficient transmembrane channel-mediated and receptor-mediated endocytic pathways. The results presented herein may be useful for developing RNAi-mediated methods to control SPB infestations in soybean.
Functional Analysis of RNA Interference-Related Soybean Pod Borer (Lepidoptera) Genes Based on Transcriptome Sequences

PubMed Central

Meng, Fanli; Yang, Mingyu; Li, Yang; Li, Tianyu; Liu, Xinxin; Wang, Guoyue; Wang, Zhanchun; Jin, Xianhao; Li, Wenbin

2018-01-01

RNA interference (RNAi) is useful for controlling pests of agriculturally important crops. The soybean pod borer (SPB) is the most important soybean pest in Northeastern Asia. In an earlier study, we confirmed that the SPB could be controlled via transgenic plant-mediated RNAi. Here, the SPB transcriptome was sequenced to identify RNAi-related genes, and also to establish an RNAi-of-RNAi assay system for evaluating genes involved in the SPB systemic RNAi response. The core RNAi genes, as well as genes potentially involved in double-stranded RNA (dsRNA) uptake were identified based on SPB transcriptome sequences. A phylogenetic analysis and the characterization of these core components as well as dsRNA uptake related genes revealed that they contain conserved domains essential for the RNAi pathway. The results of the RNAi-of-RNAi assay involving Laccase 2 (a critical cuticle pigmentation gene) as a marker showed that genes encoding the sid-like (Sil1), scavenger receptor class C (Src), and scavenger receptor class B (Srb3 and Srb4) proteins of the endocytic pathway were required for SPB cellular uptake of dsRNA. The SPB response was inferred to contain three functional small RNA pathways (i.e., miRNA, siRNA, and piRNA pathways). Additionally, the SPB systemic RNA response may rely on systemic RNA interference deficient transmembrane channel-mediated and receptor-mediated endocytic pathways. The results presented herein may be useful for developing RNAi-mediated methods to control SPB infestations in soybean. PMID:29773992
Towards understanding the first genome sequence of a crenarchaeon by genome annotation using clusters of orthologous groups of proteins (COGs).

PubMed

Natale, D A; Shankavaram, U T; Galperin, M Y; Wolf, Y I; Aravind, L; Koonin, E V

2000-01-01

Standard archival sequence databases have not been designed as tools for genome annotation and are far from being optimal for this purpose. We used the database of Clusters of Orthologous Groups of proteins (COGs) to reannotate the genomes of two archaea, Aeropyrum pernix, the first member of the Crenarchaea to be sequenced, and Pyrococcus abyssi. A. pernix and P. abyssi proteins were assigned to COGs using the COGNITOR program; the results were verified on a case-by-case basis and augmented by additional database searches using the PSI-BLAST and TBLASTN programs. Functions were predicted for over 300 proteins from A. pernix, which could not be assigned a function using conventional methods with a conservative sequence similarity threshold, an approximately 50% increase compared to the original annotation. A. pernix shares most of the conserved core of proteins that were previously identified in the Euryarchaeota. Cluster analysis or distance matrix tree construction based on the co-occurrence of genomes in COGs showed that A. pernix forms a distinct group within the archaea, although grouping with the two species of Pyrococci, indicative of similar repertoires of conserved genes, was observed. No indication of a specific relationship between Crenarchaeota and eukaryotes was obtained in these analyses. Several proteins that are conserved in Euryarchaeota and most bacteria are unexpectedly missing in A. pernix, including the entire set of de novo purine biosynthesis enzymes, the GTPase FtsZ (a key component of the bacterial and euryarchaeal cell-division machinery), and the tRNA-specific pseudouridine synthase, previously considered universal. A. pernix is represented in 48 COGs that do not contain any euryarchaeal members. Many of these proteins are TCA cycle and electron transport chain enzymes, reflecting the aerobic lifestyle of A. pernix. 
Special-purpose databases organized on the basis of phylogenetic analysis and carefully curated with respect to known and predicted protein functions provide for a significant improvement in genome annotation. A differential genome display approach helps in a systematic investigation of common and distinct features of gene repertoires and in some cases reveals unexpected connections that may be indicative of functional similarities between phylogenetically distant organisms and of lateral gene exchange.
Towards understanding the first genome sequence of a crenarchaeon by genome annotation using clusters of orthologous groups of proteins (COGs)

PubMed Central

Natale, Darren A; Shankavaram, Uma T; Galperin, Michael Y; Wolf, Yuri I; Aravind, L; Koonin, Eugene V

2000-01-01

Background: Standard archival sequence databases have not been designed as tools for genome annotation and are far from being optimal for this purpose. We used the database of Clusters of Orthologous Groups of proteins (COGs) to reannotate the genomes of two archaea, Aeropyrum pernix, the first member of the Crenarchaea to be sequenced, and Pyrococcus abyssi. Results: A. pernix and P. abyssi proteins were assigned to COGs using the COGNITOR program; the results were verified on a case-by-case basis and augmented by additional database searches using the PSI-BLAST and TBLASTN programs. Functions were predicted for over 300 proteins from A. pernix, which could not be assigned a function using conventional methods with a conservative sequence similarity threshold, an approximately 50% increase compared to the original annotation. A. pernix shares most of the conserved core of proteins that were previously identified in the Euryarchaeota. Cluster analysis or distance matrix tree construction based on the co-occurrence of genomes in COGs showed that A. pernix forms a distinct group within the archaea, although grouping with the two species of Pyrococci, indicative of similar repertoires of conserved genes, was observed. No indication of a specific relationship between Crenarchaeota and eukaryotes was obtained in these analyses. Several proteins that are conserved in Euryarchaeota and most bacteria are unexpectedly missing in A. pernix, including the entire set of de novo purine biosynthesis enzymes, the GTPase FtsZ (a key component of the bacterial and euryarchaeal cell-division machinery), and the tRNA-specific pseudouridine synthase, previously considered universal. A. pernix is represented in 48 COGs that do not contain any euryarchaeal members. Many of these proteins are TCA cycle and electron transport chain enzymes, reflecting the aerobic lifestyle of A. pernix. 
Conclusions: Special-purpose databases organized on the basis of phylogenetic analysis and carefully curated with respect to known and predicted protein functions provide for a significant improvement in genome annotation. A differential genome display approach helps in a systematic investigation of common and distinct features of gene repertoires and in some cases reveals unexpected connections that may be indicative of functional similarities between phylogenetically distant organisms and of lateral gene exchange. PMID:11178258

Comparative genomic and functional analysis reveal conservation of plant growth promoting traits in Paenibacillus polymyxa and its closely related species

PubMed Central

Xie, Jianbo; Shi, Haowen; Du, Zhenglin; Wang, Tianshu; Liu, Xiaomeng; Chen, Sanfeng

2016-01-01

Paenibacillus polymyxa has been widely studied as a model of plant-growth promoting rhizobacteria (PGPR). Here, the genome sequences of 9 P. polymyxa strains, together with 26 other sequenced Paenibacillus spp., were comparatively studied. Phylogenetic analysis of the concatenated 244 single-copy core genes suggests that the 9 P. polymyxa strains and 5 other Paenibacillus spp., isolated from diverse geographic regions and ecological niches, formed a closely related clade (here called the Poly-clade). Analysis of single nucleotide polymorphisms (SNPs) reveals local diversification of the 14 Poly-clade genomes. SNPs were not evenly distributed throughout the 14 genomes and the regions with high SNP density contain the genes related to secondary metabolism, including genes coding for polyketide. Recombination played an important role in the genetic diversity of this clade, although the rate of recombination was clearly lower than that of mutation. Some genes relevant to plant-growth promoting traits, i.e. phosphate solubilization and IAA production, are well conserved, while some genes relevant to nitrogen fixation and antibiotics synthesis have diversified within this Poly-clade. This study reveals that both P. polymyxa and its closely related species have plant growth promoting traits and they have great potential uses in agriculture and horticulture as PGPR. PMID:26856413
Conservation and variability of West Nile virus proteins.

PubMed

Koo, Qi Ying; Khan, Asif M; Jung, Keun-Ok; Ramdas, Shweta; Miotto, Olivo; Tan, Tin Wee; Brusic, Vladimir; Salmon, Jerome; August, J Thomas

2009-01-01

West Nile virus (WNV) has emerged globally as an increasingly important pathogen for humans and domestic animals. Studies of the evolutionary diversity of the virus over its known history will help to elucidate conserved sites, and characterize their correspondence to other pathogens and their relevance to the immune system. We describe a large-scale analysis of the entire WNV proteome, aimed at identifying and characterizing evolutionarily conserved amino acid sequences. This study, which used 2,746 WNV protein sequences collected from the NCBI GenPept database, focused on analysis of peptides of length 9 amino acids or more, which are immunologically relevant as potential T-cell epitopes. Entropy-based analysis of the diversity of WNV sequences revealed the presence of numerous evolutionarily stable nonamer positions across the proteome (entropy value of ≤ 1). The representation (frequency) of nonamers variant to the predominant peptide at these stable positions was, generally, low (≤ 10% of the WNV sequences analyzed). Eighty-eight fragments of length 9-29 amino acids, representing approximately 34% of the WNV polyprotein length, were identified to be identical and evolutionarily stable in all analyzed WNV sequences. Of the 88 completely conserved sequences, 67 are also present in other flaviviruses, and several have been associated with the functional and structural properties of viral proteins. Immunoinformatic analysis revealed that the majority (78/88) of conserved sequences are potentially immunogenic, while 44 contained experimentally confirmed human T-cell epitopes. This study identified a comprehensive catalogue of completely conserved WNV sequences, many of which are shared by other flaviviruses, and the majority are potential epitopes.
The complete conservation of these immunologically relevant sequences through the entire recorded WNV history suggests they will be valuable as components of peptide-specific vaccines or other therapeutic applications, for sequence-specific diagnosis of a wide-range of Flavivirus infections, and for studies of homologous sequences among other flaviviruses.
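The entropy-based screening of nonamer positions described in this record can be illustrated with a short sketch. It assumes the protein sequences are already aligned, gap-free and of equal length, and it uses plain Shannon entropy in bits. Any correction for sample size or alignment gaps that the study may have applied is not described in the abstract and is not reproduced here. The function name `nonamer_entropy` is hypothetical.

```python
import math
from collections import Counter

def nonamer_entropy(aligned, k=9):
    """Shannon entropy (bits) of the k-mers seen at each start position.

    `aligned` is a list of equal-length, gap-free sequences (an assumption
    made for this sketch). Entropy 0 means every sequence carries the same
    k-mer at that position.
    """
    length = len(aligned[0])
    entropies = []
    for start in range(length - k + 1):
        counts = Counter(seq[start:start + k] for seq in aligned)
        total = sum(counts.values())
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies

if __name__ == "__main__":
    # Toy alignment: three identical sequences and one with a changed last residue.
    toy = ["MKTAYIAKQR"] * 3 + ["MKTAYIAKQK"]
    print(nonamer_entropy(toy))
```

In this toy input the first nonamer window is identical in all four sequences, while the second window has one variant among four, so only the first window would count as completely conserved.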
A Fast Alignment-Free Approach for De Novo Detection of Protein Conserved Regions

PubMed Central

Abnousi, Armen; Broschat, Shira L.; Kalyanaraman, Ananth

2016-01-01

Background: Identifying conserved regions in protein sequences is a fundamental operation, occurring in numerous sequence-driven analysis pipelines. It is used as a way to decode domain-rich regions within proteins, to compute protein clusters, to annotate sequence function, and to compute evolutionary relationships among protein sequences. A number of approaches exist for identifying and characterizing protein families based on their domains, and because domains represent conserved portions of a protein sequence, the primary computation involved in protein family characterization is identification of such conserved regions. However, identifying conserved regions from large collections (millions) of protein sequences presents significant challenges. Methods: In this paper we present a new, alignment-free method for detecting conserved regions in protein sequences called NADDA (No-Alignment Domain Detection Algorithm). Our method exploits the abundance of exact matching short subsequences (k-mers) to quickly detect conserved regions, and the power of machine learning is used to improve the prediction accuracy of detection. We present a parallel implementation of NADDA using the MapReduce framework and show that our method is highly scalable. Results: We have compared NADDA with Pfam and InterPro databases. For known domains annotated by Pfam, accuracy is 83%, sensitivity 96%, and specificity 44%. For sequences with new domains not present in the training set an average accuracy of 63% is achieved when compared to Pfam. A boost in results in comparison with InterPro demonstrates the ability of NADDA to capture conserved regions beyond those present in Pfam. We have also compared NADDA with ADDA and MKDOM2, assuming Pfam as ground-truth. On average NADDA shows comparable accuracy, more balanced sensitivity and specificity, and being alignment-free, is significantly faster.
Excluding the one-time cost of training, runtimes on a single processor were 49s, 10,566s, and 456s for NADDA, ADDA, and MKDOM2, respectively, for a data set comprised of approximately 2500 sequences. PMID:27552220
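The core idea of using exact k-mer matches as an alignment-free conservation signal can be sketched in a few lines. This is not the NADDA implementation: it omits the machine-learning step and the MapReduce parallelization mentioned in the abstract, and the function name `kmer_support_profile` and the choice of k = 3 are assumptions made only for illustration. For every position in every sequence, it reports how many sequences in the collection contain the k-mer starting there.

```python
from collections import Counter

def kmer_support_profile(seqs, k=3):
    """Per-position k-mer support across a collection of sequences.

    For each sequence, returns a list whose i-th entry is the number of
    sequences (including itself) that contain the k-mer starting at i.
    Runs of high values hint at a shared, possibly conserved, region.
    """
    doc_freq = Counter()
    for s in seqs:
        # Use a set so each sequence contributes at most once per k-mer.
        doc_freq.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return [
        [doc_freq[s[i:i + k]] for i in range(len(s) - k + 1)]
        for s in seqs
    ]

if __name__ == "__main__":
    # Toy sequences sharing the segment "HCDE".
    for profile in kmer_support_profile(["AAHCDEFG", "QQHCDEWW", "LLHCDEKK"]):
        print(profile)
```

A profile like this could serve as an input feature for a classifier that decides which positions belong to a conserved region, which is roughly the role the abstract assigns to its learning step.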
Two rapidly evolving genes contribute to male fitness in Drosophila

PubMed Central

Reinhardt, Josephine A; Jones, Corbin D

2013-01-01

Purifying selection often results in conservation of gene sequence and function. The most functionally conserved genes are also thought to be among the most biologically essential. These observations have led to the use of sequence conservation as a proxy for functional conservation. Here we describe two genes that are exceptions to this pattern. We show that lack of sequence conservation among orthologs of CG15460 and CG15323 – herein named jean-baptiste (jb) and karr respectively – does not necessarily predict lack of functional conservation. These two Drosophila melanogaster genes are among the most rapidly evolving protein-coding genes in this species, being nearly as diverged from their D. yakuba orthologs as random sequences are. jb and karr are both expressed at an elevated level in larval males and adult testes, but they are not accessory gland proteins and their loss does not affect male fertility. Instead, knockdown of these genes in D. melanogaster via RNA interference caused male-biased viability defects. These viability effects occur prior to the third instar for jb and during late pupation for karr. We show that putative orthologs to jb and karr are also expressed strongly in the testes of other Drosophila species and have similar gene structure across species despite low levels of sequence conservation. While standard molecular evolution tests could not reject neutrality, other data hint at a role for natural selection. Together these data provide a clear case where a lack of sequence conservation does not imply a lack of conservation of expression or function. PMID:24221639
Sequencing and De Novo Assembly of the Toxicodendron radicans (Poison Ivy) Transcriptome

PubMed Central

Kim, Gunjune

2017-01-01

Contact with poison ivy plants is widely dreaded because they produce a natural product called urushiol that is responsible for allergenic contact delayed-dermatitis symptoms lasting for weeks. For this reason, the catchphrase most associated with poison ivy is “leaves of three, let it be”, which serves the purpose of both identification and an appeal for avoidance. Ironically, despite this notoriety, there is a dearth of specific knowledge about nearly all other aspects of poison ivy physiology and ecology. As a means of gaining a more molecular-oriented understanding of poison ivy physiology and ecology, Next Generation DNA sequencing technology was used to develop poison ivy root and leaf RNA-seq transcriptome resources. De novo assembled transcriptomes were analyzed to generate a core set of high quality expressed transcripts present in poison ivy tissue. The predicted protein sequences were evaluated for similarity to SwissProt homologs and InterProScan domains, as well as assigned both GO terms and KEGG annotations. Over 23,000 simple sequence repeats were identified in the transcriptome, and corresponding oligo nucleotide primer pairs were designed. A pan-transcriptome analysis of existing Anacardiaceae transcriptomes revealed conserved and unique transcripts among these species. PMID:29125533
Sequencing and De Novo Assembly of the Toxicodendron radicans (Poison Ivy) Transcriptome.

PubMed

Weisberg, Alexandra J; Kim, Gunjune; Westwood, James H; Jelesko, John G

2017-11-10

Contact with poison ivy plants is widely dreaded because they produce a natural product called urushiol that is responsible for allergenic contact delayed-dermatitis symptoms lasting for weeks. For this reason, the catchphrase most associated with poison ivy is "leaves of three, let it be", which serves the purpose of both identification and an appeal for avoidance. Ironically, despite this notoriety, there is a dearth of specific knowledge about nearly all other aspects of poison ivy physiology and ecology. As a means of gaining a more molecular-oriented understanding of poison ivy physiology and ecology, Next Generation DNA sequencing technology was used to develop poison ivy root and leaf RNA-seq transcriptome resources. De novo assembled transcriptomes were analyzed to generate a core set of high quality expressed transcripts present in poison ivy tissue. The predicted protein sequences were evaluated for similarity to SwissProt homologs and InterProScan domains, as well as assigned both GO terms and KEGG annotations. Over 23,000 simple sequence repeats were identified in the transcriptome, and corresponding oligo nucleotide primer pairs were designed. A pan-transcriptome analysis of existing Anacardiaceae transcriptomes revealed conserved and unique transcripts among these species.
Comparative analysis of complete orthologous centromeres from two subspecies of rice reveals rapid variation of centromere organization and structure.

PubMed

Wu, Jianzhong; Fujisawa, Masaki; Tian, Zhixi; Yamagata, Harumi; Kamiya, Kozue; Shibata, Michie; Hosokawa, Satomi; Ito, Yukiyo; Hamada, Masao; Katagiri, Satoshi; Kurita, Kanako; Yamamoto, Mayu; Kikuta, Ari; Machita, Kayo; Karasawa, Wataru; Kanamori, Hiroyuki; Namiki, Nobukazu; Mizuno, Hiroshi; Ma, Jianxin; Sasaki, Takuji; Matsumoto, Takashi

2009-12-01

Centromeres are sites for assembly of the chromosomal structures that mediate faithful segregation at mitosis and meiosis. This function is conserved across species, but the DNA components that are involved in kinetochore formation differ greatly, even between closely related species. To shed light on the nature, evolutionary timing and evolutionary dynamics of rice centromeres, we decoded a 2.25-Mb DNA sequence covering the centromeric region of chromosome 8 of an indica rice variety, 'Kasalath' (Kas-Cen8). Analysis of repetitive sequences in Kas-Cen8 led to the identification of 222 long terminal repeat (LTR)-retrotransposon elements and 584 CentO satellite monomers, which account for 59.2% of the region. A comparison of the Kas-Cen8 sequence with that of japonica rice 'Nipponbare' (Nip-Cen8) revealed that about 66.8% of the Kas-Cen8 sequence was collinear with that of Nip-Cen8. Although the 27 putative genes are conserved between the two subspecies, only 55.4% of the total LTR-retrotransposon elements in 'Kasalath' had orthologs in 'Nipponbare', thus reflecting recent proliferation of a considerable number of LTR-retrotransposons since the divergence of two rice subspecies of indica and japonica within Oryza sativa. Comparative analysis of the subfamilies, time of insertion, and organization patterns of inserted LTR-retrotransposons between the two Cen8 regions revealed variations between 'Kasalath' and 'Nipponbare' in the preferential accumulation of CRR elements, and the expansion of CentO satellite repeats within the core domain of Cen8. Together, the results provide insights into the recent proliferation of LTR-retrotransposons, and the rapid expansion of CentO satellite repeats, underlying the dynamic variation and plasticity of plant centromeres.
CSTminer: a web tool for the identification of coding and noncoding conserved sequence tags through cross-species genome comparison

PubMed Central

Castrignanò, Tiziana; Canali, Alessandro; Grillo, Giorgio; Liuni, Sabino; Mignone, Flavio; Pesole, Graziano

2004-01-01

The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score. The web tool, available at http://www.caspur.it/CSTminer/, is dynamically interconnected with the Ensembl genome resources and produces a graphical output showing a map of detected conserved sequences and annotated gene features. PMID:15215464
Conservation of the fourth gene among rotaviruses recovered from asymptomatic newborn infants and its possible role in attenuation.

PubMed Central

Flores, J; Midthun, K; Hoshino, Y; Green, K; Gorziglia, M; Kapikian, A Z; Chanock, R M

1986-01-01

RNA-RNA hybridization was performed to assess the extent of genetic relatedness among human rotaviruses isolated from children with gastroenteritis and from asymptomatic newborn infants. 32P-labeled single-stranded RNAs produced by in vitro transcription from viral cores of the different strains tested were used as probes in two different hybridization assays: undenatured genomic RNAs were resolved by polyacrylamide gel electrophoresis, denatured in situ, electrophoretically transferred to diazobenzyloxymethyl-paper (Northern blots), and then hybridized to the probes under two different conditions of stringency; and denatured genomic double-stranded RNAs were hybridized to the probes in solution and the hybrids which formed were identified by polyacrylamide gel electrophoresis. When analyzed by Northern blot hybridization at a low level of stringency, all genes from the strains tested cross-hybridized, providing evidence for some sequence homology in each of the corresponding genes. However, when hybridization stringency was increased, a difference in gene 4 sequence was detected between strains recovered from asymptomatic newborn infants ("nursery strains") and strains recovered from infants and young children with diarrhea. Although the nursery strains exhibited serotypic diversity (i.e., each of the four strains tested belonged to a different serotype), the fourth gene appeared to be highly conserved. Similarly, each of the virulent strains tested belonged to a different serotype; nonetheless, there was significant conservation of sequence among the fourth genes of three of these viruses. Significantly, the conserved fourth genes of the nursery strains were distinct from the fourth gene of each of the virulent viruses. These results were confirmed and extended during experiments in which the RNA-RNA hybridization was carried out in solution and the resulting hybrids were analyzed by polyacrylamide gel electrophoresis. 
Under these conditions, the fourth genes of the nursery strains were closely related to each other but not to the fourth genes of the virulent viruses. Full-length hybrids did not form between the fourth genes from the nursery strains and the corresponding genes from the strains recovered from symptomatic infants and young children. PMID:3023685
Analysis of hepatitis C virus RNA dimerization and core-RNA interactions.

PubMed

Ivanyi-Nagy, Roland; Kanevsky, Igor; Gabus, Caroline; Lavergne, Jean-Pierre; Ficheux, Damien; Penin, François; Fossé, Philippe; Darlix, Jean-Luc

2006-01-01

The core protein of hepatitis C virus (HCV) has been shown previously to act as a potent nucleic acid chaperone in vitro, promoting the dimerization of the 3'-untranslated region (3'-UTR) of the HCV genomic RNA, a process probably mediated by a small, highly conserved palindromic RNA motif, named DLS (dimer linkage sequence) [G. Cristofari, R. Ivanyi-Nagy, C. Gabus, S. Boulant, J. P. Lavergne, F. Penin and J. L. Darlix (2004) Nucleic Acids Res., 32, 2623-2631]. To investigate in depth HCV RNA dimerization, we generated a series of point mutations in the DLS region. We find that both the plus-strand 3'-UTR and the complementary minus-strand RNA can dimerize in the presence of core protein, while mutations in the DLS (among them a single point mutation that abolished RNA replication in a HCV subgenomic replicon system) completely abrogate dimerization. Structural probing of plus- and minus-strand RNAs, in their monomeric and dimeric forms, indicates that the DLS is the major if not the sole determinant of UTR RNA dimerization. Furthermore, the N-terminal basic amino acid clusters of core protein were found to be sufficient to induce dimerization, suggesting that they retain full RNA chaperone activity. These findings may have important consequences for understanding the HCV replicative cycle and the genetic variability of the virus.
Antibiotics reduce genetic diversity of core species in the honeybee gut microbiome.

PubMed

Raymann, Kasie; Bobay, Louis-Marie; Moran, Nancy A

2018-04-01

The gut microbiome plays a key role in animal health, and perturbing it can have detrimental effects. One major source of perturbation to microbiomes, in humans and human-associated animals, is exposure to antibiotics. Most studies of how antibiotics affect the microbiome have used amplicon sequencing of highly conserved 16S rRNA sequences, as in a recent study showing that antibiotic treatment severely alters the species-level composition of the honeybee gut microbiome. But because the standard 16S rRNA-based methods cannot resolve closely related strains, strain-level changes could not be evaluated. To address this gap, we used amplicon sequencing of protein-coding genes to assess effects of antibiotics on fine-scale genetic diversity of the honeybee gut microbiota. We followed the population dynamics of alleles within two dominant core species of the bee gut community, Gilliamella apicola and Snodgrassella alvi, following antibiotic perturbation. Whereas we observed a large reduction in genetic diversity in G. apicola, S. alvi diversity was mostly unaffected. The reduction in G. apicola diversity accompanied an increase in the frequency of several alleles, suggesting resistance to antibiotic treatment. We find that antibiotic perturbation can cause major shifts in diversity and that the extent of these shifts can vary substantially across species. Thus, antibiotics impact not only species composition, but also allelic diversity within species, potentially affecting hosts if variants with particular functions are reduced or eliminated. Overall, we show that amplicon sequencing of protein-coding genes, without clustering into operational taxonomic units, provides an accurate picture of the fine-scale dynamics of microbial communities over time. © 2017 John Wiley & Sons Ltd.
Natural Resources. Ohio's Competency Analysis Profile. Forest Industry Worker. Resource Conservation.

ERIC Educational Resources Information Center

Ohio State Univ., Columbus. Vocational Instructional Materials Lab.

This competency analysis profile lists 155 competencies that have been identified by employers as core competencies for inclusion in programs to train forest industry and resource conservation workers. The core competencies are organized into 10 units dealing with the following: general safety precautions, natural resource industry operations, soil…
CodonLogo: a sequence logo-based viewer for codon patterns.

PubMed

Sharma, Virag; Murphy, David P; Provan, Gregory; Baranov, Pavel V

2012-07-15

Conserved patterns across a multiple sequence alignment can be visualized by generating sequence logos. Sequence logos show each column in the alignment as stacks of symbol(s) where the height of a stack is proportional to its informational content, whereas the height of each symbol within the stack is proportional to its frequency in the column. Sequence logos use symbols of either nucleotide or amino acid alphabets. However, certain regulatory signals in messenger RNA (mRNA) act as combinations of codons. Yet no tool is available for visualization of conserved codon patterns. We present the first application which allows visualization of conserved regions in a multiple sequence alignment in the context of codons. CodonLogo is based on WebLogo3 and uses the same heuristics but treats codons as inseparable units of a 64-letter alphabet. CodonLogo can discriminate patterns of codon conservation from patterns of nucleotide conservation that appear indistinguishable in standard sequence logos. The CodonLogo source code and its implementation (in a local version of the Galaxy Browser) are available at http://recode.ucc.ie/CodonLogo and through the Galaxy Tool Shed at http://toolshed.g2.bx.psu.edu/.
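The stack and symbol heights described in this record follow from the usual sequence-logo definition: the stack height is the information content of the column, and each symbol's height is its frequency times that value. The sketch below applies this to codon columns with a 64-letter alphabet. It assumes in-frame, gap-free sequences of equal length and ignores any small-sample correction a logo tool may apply. The function name `codon_logo_heights` is hypothetical and is not part of CodonLogo or WebLogo3.

```python
import math
from collections import Counter

def codon_logo_heights(aligned, alphabet_size=64):
    """Per codon column, map each observed codon to its logo height in bits.

    Stack height = log2(alphabet_size) - Shannon entropy of the column;
    symbol height = codon frequency * stack height. Sequences are assumed
    to be in frame, gap-free and of equal length (a sketch assumption).
    """
    n_cols = len(aligned[0]) // 3
    columns = []
    for c in range(n_cols):
        counts = Counter(seq[3 * c:3 * c + 3] for seq in aligned)
        total = sum(counts.values())
        freqs = {codon: n / total for codon, n in counts.items()}
        entropy = -sum(p * math.log2(p) for p in freqs.values())
        info = math.log2(alphabet_size) - entropy
        columns.append({codon: p * info for codon, p in freqs.items()})
    return columns

if __name__ == "__main__":
    # Two toy sequences: identical first codon, synonymous second codon.
    print(codon_logo_heights(["ATGAAA", "ATGAAG"]))
```

Treating each triplet as one symbol is what separates the two views: in a nucleotide logo the AAA/AAG column would show two fully conserved A positions and one variable position, whereas here it is a single column split between two codons.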
A novel strategy for the determination of a rhabdovirus genome and its application to sequencing of Eggplant mottled dwarf virus.

PubMed

Pappi, Polyxeni G; Dovas, Chrysostomos I; Efthimiou, Konstantinos E; Maliogka, Varvara I; Katis, Nikolaos I

2013-08-01

A novel strategy employing the rhabdovirus untranslated conserved intergenic regions was developed and applied successfully for the determination of the complete nucleotide sequence of Eggplant mottled dwarf virus (EMDV). The EMDV genome contains seven open reading frames with the same organization as Potato yellow dwarf virus (PYDV), the type species of the genus Nucleorhabdovirus. These two species encode five core genes [nucleocapsid (N), phosphoprotein (P), matrix (M), glycoprotein (G), and the polymerase (L)] like other viruses of the genus and an additional one (X), located between N and P, giving rise to a protein with currently unknown function. Furthermore, both EMDV and PYDV contain a gene (Y), inserted between P and M, which probably encodes the virus movement protein, in concordance with the rest of the plant-infecting rhabdoviruses. Phylogenetic analysis of the polymerase gene confirmed the classification of EMDV within the genus Nucleorhabdovirus and showed a close evolutionary relationship to PYDV. The novel sequencing strategy developed is a useful tool for the genome determination of yet uncharacterized rhabdoviruses.
New families of site-specific repetitive DNA sequences that comprise constitutive heterochromatin of the Syrian hamster (Mesocricetus auratus, Cricetinae, Rodentia).

PubMed

Yamada, Kazuhiko; Kamimura, Eikichi; Kondo, Mariko; Tsuchiya, Kimiyuki; Nishida-Umehara, Chizuko; Matsuda, Yoichi

2006-02-01

We molecularly cloned new families of site-specific repetitive DNA sequences from BglII- and EcoRI-digested genomic DNA of the Syrian hamster (Mesocricetus auratus, Cricetinae, Rodentia) and characterized them by chromosome in situ hybridization and filter hybridization. They were classified into six different types of repetitive DNA sequence families according to chromosomal distribution and genome organization. The hybridization patterns of the sequences were consistent with the distribution of C-positive bands and/or Hoechst-stained heterochromatin. The centromeric major satellite DNA and sex chromosome-specific and telomeric region-specific repetitive sequences were conserved in the same genus (Mesocricetus) but divergent in different genera. The chromosome-2-specific sequence was conserved in two genera, Mesocricetus and Cricetulus, and a low copy number of repetitive sequences on the heterochromatic chromosome arms were conserved in the subfamily Cricetinae but not in the subfamily Calomyscinae. By contrast, the other type of repetitive sequences on the heterochromatic chromosome arms, which had sequence similarities to a LINE sequence of rodents, was conserved through the three subfamilies, Cricetinae, Calomyscinae and Murinae. The nucleotide divergence of the repetitive sequences of heterochromatin was well correlated with the phylogenetic relationships of the Cricetinae species, and each sequence has been independently amplified and diverged in the same genome.
SEPT9 Mutations and a Conserved 17q25 Sequence in Sporadic and Hereditary Brachial Plexus Neuropathy

PubMed Central

Klein, Christopher J.; Wu, Yanhong; Cunningham, Julie M.; Windebank, Anthony J.; Dyck, P. James B.; Friedenberg, Scott M.; Klein, Diane M.; Dyck, Peter J.

2009-01-01

Background The clinical characteristics of sporadic brachial plexus neuropathy (S-BPN) and hereditary brachial plexus neuropathy (H-BPN) are similar. At times of attack inflammation in brachial plexus nerves has been identified in both conditions. SEPT-9 mutations (Arg88Trp, Ser93Phe, 5UTR-131G to C) occur in some families with H-BPN. These mutations were not found in American H-BPN kindreds with a conserved 500 Kb sequence of DNA at 17q25 (the location of SEPT-9) where a founder mutation has been suggested. Objective To study 17q25 and SEPT-9 in S-BPN (56 patients) and H-BPN (13 kindreds). Methods Allele analysis at 17q25, SEPT-9 DNA sequencing and mRNA analysis from lymphoblast cultures. Results A conserved 17q25 sequence was found in 5 of 13 H-BPN kindreds and one S-BPN patient. This conserved sequence was not found in the family with a SEPT-9 mutation (Arg88Trp) or controls (182). SEPT-9 mRNA expression did not differ between forms of H-BPN and controls. No known mutations of SEPT-9 were found in S-BPN. Conclusions/Relevance Rare S-BPN patients have the same conserved 17q25 sequence found in many American H-BPN kindreds. BPN patients with this conserved sequence do not appear to have SEPT-9 mutations or alterations of its mRNA expression levels in lymphoblast cultures. BPN patients with this conserved sequence may have the most common genetic cause in the Americas by a founder effect mutation. PMID:19204161
Assessing the genetic diversity of Cu resistance in mine tailings through high-throughput recovery of full-length copA genes

PubMed Central

Li, Xiaofang; Zhu, Yong-Guan; Shaban, Babak; Bruxner, Timothy J. C.; Bond, Philip L.; Huang, Longbin

2015-01-01

Characterizing the genetic diversity of microbial copper (Cu) resistance at the community level remains challenging, mainly due to the polymorphism of the core functional gene copA. In this study, a local BLASTN method using a copA database built in this study was developed to recover full-length putative copA sequences from an assembled tailings metagenome; these sequences were then screened for potentially functioning CopA using conserved metal-binding motifs, inferred by evolutionary trace analysis of CopA sequences from known Cu resistant microorganisms. In total, 99 putative copA sequences were recovered from the tailings metagenome, out of which 70 were found with high potential to be functioning in Cu resistance. Phylogenetic analysis of selected copA sequences detected in the tailings metagenome showed that topology of the copA phylogeny is largely congruent with that of the 16S-based phylogeny of the tailings microbial community obtained in our previous study, indicating that the development of copA diversity in the tailings might be mainly through vertical descent with few lateral gene transfer events. The method established here can be used to explore copA (and potentially other metal resistance genes) diversity in any metagenome and has the potential to exhaust the full-length gene sequences for downstream analyses. PMID:26286020
Single-molecule DNA unzipping reveals asymmetric modulation of a transcription factor by its binding site sequence and context

PubMed Central

Rudnizky, Sergei; Khamis, Hadeel; Malik, Omri; Squires, Allison H; Meller, Amit; Melamed, Philippa

2018-01-01

Abstract Most functional transcription factor (TF) binding sites deviate from their ‘consensus’ recognition motif, although their sites and flanking sequences are often conserved across species. Here, we used single-molecule DNA unzipping with optical tweezers to study how Egr-1, a TF harboring three zinc fingers (ZF1, ZF2 and ZF3), is modulated by the sequence and context of its functional sites in the Lhb gene promoter. We find that both the core 9 bp bound to Egr-1 in each of the sites, and the base pairs flanking them, modulate the affinity and structure of the protein–DNA complex. The effect of the flanking sequences is asymmetric, with a stronger effect for the sequence flanking ZF3. Characterization of the dissociation time of Egr-1 revealed that a local, mechanical perturbation of the interactions of ZF3 destabilizes the complex more effectively than a perturbation of the ZF1 interactions. Our results reveal a novel role for ZF3 in the interaction of Egr-1 with other proteins and the DNA, providing insight on the regulation of Lhb and other genes by Egr-1. Moreover, our findings reveal the potential of small changes in DNA sequence to alter transcriptional regulation, and may shed light on the organization of regulatory elements at promoters. PMID:29253225
Diversity of Antisense and Other Non-Coding RNAs in Archaea Revealed by Comparative Small RNA Sequencing in Four Pyrobaculum Species

PubMed Central

Bernick, David L.; Dennis, Patrick P.; Lui, Lauren M.; Lowe, Todd M.

2012-01-01

A great diversity of small, non-coding RNA (ncRNA) molecules with roles in gene regulation and RNA processing have been intensely studied in eukaryotic and bacterial model organisms, yet our knowledge of possible parallel roles for small RNAs (sRNA) in archaea is limited. We employed RNA-seq to identify novel sRNA across multiple species of the hyperthermophilic genus Pyrobaculum, known for unusual RNA gene characteristics. By comparing transcriptional data collected in parallel among four species, we were able to identify conserved RNA genes fitting into known and novel families. Among our findings, we highlight three novel cis-antisense sRNAs encoded opposite to key regulatory (ferric uptake regulator), metabolic (triose-phosphate isomerase), and core transcriptional apparatus genes (transcription factor B). We also found a large increase in the number of conserved C/D box sRNA genes over what had been previously recognized; many of these genes are encoded antisense to protein coding genes. The conserved opposition to orthologous genes across the Pyrobaculum genus suggests similarities to other cis-antisense regulatory systems. Furthermore, the genus-specific nature of these sRNAs indicates they are relatively recent, stable adaptations. PMID:22783241
Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

2003-12-31

Due to their high degree of conservation, comparisons of DNA sequences among evolutionarily distantly related genomes permit the identification of functional regions in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.

Evolutionary divergence of core and post-translational circadian clock genes in the pitcher-plant mosquito, Wyeomyia smithii.

PubMed

Tormey, Duncan; Colbourne, John K; Mockaitis, Keithanne; Choi, Jeong-Hyeon; Lopez, Jacqueline; Burkhart, Joshua; Bradshaw, William; Holzapfel, Christina

2015-10-06

Internal circadian (circa, about; dies, day) clocks enable organisms to maintain adaptive timing of their daily behavioral activities and physiological functions. Eukaryotic clocks consist of core transcription-translation feedback loops that generate a cycle and post-translational modifiers that maintain that cycle at about 24 h. We use the pitcher-plant mosquito, Wyeomyia smithii (subfamily Culicini, tribe Sabethini), to test whether evolutionary divergence of the circadian clock genes in this species, relative to other insects, has involved primarily genes in the core feedback loops or the post-translational modifiers. Heretofore, there is no reference transcriptome or genome sequence for any mosquito in the tribe Sabethini, which includes over 375 mainly circumtropical species. We sequenced, assembled and annotated the transcriptome of W. smithii containing nearly 95 % of conserved single-copy orthologs in animal genomes. We used the translated contigs and singletons to determine the average rates of circadian clock-gene divergence in W. smithii relative to three other mosquito genera, to Drosophila, to the butterfly, Danaus, and to the wasp, Nasonia. Over 1.08 million cDNA sequence reads were obtained consisting of 432.5 million nucleotides. Their assembly produced 25,904 contigs and 54,418 singletons of which 62 % and 28 % are annotated as protein-coding genes, respectively, sharing homology with other animal proteomes. The W. smithii transcriptome includes all nine circadian transcription-translation feedback-loop genes and all eight post-translational modifier genes we sought to identify (Fig. 1). After aligning translated W. smithii contigs and singletons from this transcriptome with other insects, we determined that there was no significant difference in the average divergence of W. smithii from the six other taxa between the core feedback-loop genes and post-translational modifiers. 
The characterized transcriptome is sufficiently complete and of sufficient quality to have uncovered all of the insect circadian clock genes we sought to identify (Fig. 1). Relative divergence does not differ between core feedback-loop genes and post-translational modifiers of those genes in a Sabethine species (W. smithii) that has experienced a continual northward dispersal into temperate regions of progressively longer summer day lengths as compared with six other insect taxa. An associated microarray platform derived from this work will enable the investigation of functional genomics of circadian rhythmicity, photoperiodic time measurement, and diapause along a photic and seasonal geographic gradient.
Structure of CPV17 polyhedrin determined by the improved analysis of serial femtosecond crystallographic data

DOE PAGES

Ginn, Helen M.; Messerschmidt, Marc; Ji, Xiaoyun; ...

2015-03-09

The X-ray free-electron laser (XFEL) allows the analysis of small weakly diffracting protein crystals, but has required very many crystals to obtain good data. Here we use an XFEL to determine the room temperature atomic structure for the smallest cytoplasmic polyhedrosis virus polyhedra yet characterized, which we failed to solve at a synchrotron. These protein microcrystals, roughly a micron across, accrue within infected cells. We use a new physical model for XFEL diffraction, which better estimates the experimental signal, delivering a high-resolution XFEL structure (1.75 Å), using fewer crystals than previously required for this resolution. The crystal lattice and protein core are conserved compared with a polyhedrin with less than 10% sequence identity. We explain how the conserved biological phenotype, the crystal lattice, is maintained in the face of extreme environmental challenge and massive evolutionary divergence. Our improved methods should open up more challenging biological samples to XFEL analysis.
De Novo Regulatory Motif Discovery Identifies Significant Motifs in Promoters of Five Classes of Plant Dehydrin Genes.

PubMed

Zolotarov, Yevgen; Strömvik, Martina

2015-01-01

Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved.
A conserved post-transcriptional BMP2 switch in lung cells.

PubMed

Jiang, Shan; Fritz, David T; Rogers, Melissa B

2010-05-15

An ultra-conserved sequence in the bone morphogenetic protein 2 (BMP2) 3' untranslated region (UTR) markedly represses BMP2 expression in non-transformed lung cells. In contrast, the ultra-conserved sequence stimulates BMP2 expression in transformed lung cells. The ultra-conserved sequence functions as a post-transcriptional cis-regulatory switch. A common single-nucleotide polymorphism (SNP, rs15705, +A1123C), which has been shown to influence human morphology, disrupts a conserved element within the ultra-conserved sequence and altered reporter gene activity in non-transformed lung cells. This polymorphism changed the affinity of the BMP2 RNA for several proteins including nucleolin, which has an increased affinity for the C allele. Elevated BMP2 synthesis is associated with increased malignancy in mouse models of lung cancer and poor lung cancer patient prognosis. Understanding the cis- and trans-regulatory factors that control BMP2 synthesis is relevant to the initiation or progression of pathologies associated with abnormal BMP2 levels. (c) 2010 Wiley-Liss, Inc.
Evaluation of the Abbott RealTime HCV genotype II plus RUO (PLUS) assay with reference to core and NS5B sequencing.

PubMed

Mallory, Melanie A; Lucic, Danijela; Ebbert, Mark T W; Cloherty, Gavin A; Toolsie, Dan; Hillyard, David R

2017-05-01

HCV genotyping remains a critical tool for guiding initiation of therapy and selecting the most appropriate treatment regimen. Current commercial genotyping assays may have difficulty identifying 1a, 1b and genotype 6. The objective was to evaluate the concordance for identifying 1a, 1b, and genotype 6 between two methods: the PLUS assay and core/NS5B sequencing. This study included 236 plasma and serum samples previously genotyped by core/NS5B sequencing. Of these, 25 samples were also previously tested by the Abbott RealTime HCV GT II Research Use Only (RUO) assay and yielded ambiguous results. The remaining 211 samples were routine genotype 1 (n=169) and genotype 6 (n=42). Genotypes obtained from sequence data were determined using a laboratory-developed HCV sequence analysis tool and the NCBI non-redundant database. Agreement between the PLUS assay and core/NS5B sequencing for genotype 1 samples was 95.8% (162/169), with 96% (127/132) and 95% (35/37) agreement for 1a and 1b samples respectively. PLUS results agreed with core/NS5B sequencing for 83% (35/42) of unselected genotype 6 samples, with the remaining seven "not detected" by the PLUS assay. Among the 25 samples with ambiguous GT II results, 15 were concordant by PLUS and core/NS5B sequencing, nine were not detected by PLUS, and one sample had an internal control failure. The PLUS assay is an automated method that identifies 1a, 1b and genotype 6 with good agreement with gold-standard core/NS5B sequencing and can aid in the resolution of certain genotype samples with ambiguous GT II results. Copyright © 2017 Elsevier B.V. All rights reserved.
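The agreement figures quoted in this abstract are simple proportions of concordant calls. A short sketch of that arithmetic follows, using only the counts given in the abstract; the function name is hypothetical, and rounding in the last digit may differ slightly from the published values.

```python
def percent_agreement(concordant, total):
    """Percentage of samples on which both methods gave the same call."""
    return 100.0 * concordant / total

# (concordant, total) pairs as reported in the abstract.
reported = {
    "genotype 1 (all)": (162, 169),
    "subtype 1a": (127, 132),
    "subtype 1b": (35, 37),
    "genotype 6": (35, 42),
}

for label, (concordant, total) in reported.items():
    print(f"{label}: {concordant}/{total} = {percent_agreement(concordant, total):.1f}%")

# The subtype counts add up to the genotype 1 total, and the three outcomes
# for the ambiguous GT II samples (concordant, not detected, control failure)
# add up to 25.
assert 127 + 35 == 162 and 132 + 37 == 169
assert 15 + 9 + 1 == 25
```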
RNA localization in Xenopus oocytes uses a core group of trans-acting factors irrespective of destination.

PubMed

Snedden, Donald D; Bertke, Michelle M; Vernon, Dominic; Huber, Paul W

2013-07-01

The 3' untranslated region of mRNA encoding PHAX, a phosphoprotein required for nuclear export of U-type snRNAs, contains cis-acting sequence motifs E2 and VM1 that are required for localization of RNAs to the vegetal hemisphere of Xenopus oocytes. However, we have found that PHAX mRNA is transported to the opposite, animal, hemisphere. A set of proteins that cross-link to the localization elements of vegetally localized RNAs are also cross-linked to PHAX and An1 mRNAs, demonstrating that the composition of RNP complexes that form on these localization elements is highly conserved irrespective of the final destination of the RNA. The ability of RNAs to bind this core group of proteins is correlated with localization activity. Staufen1, which binds to Vg1 and VegT mRNAs, is not associated with RNAs localized to the animal hemisphere and may determine, at least in part, the direction of RNA movement in Xenopus oocytes.
The haloarchaeal MCM proteins: bioinformatic analysis and targeted mutagenesis of the β7-β8 and β9-β10 hairpin loops and conserved zinc binding domain cysteines.

PubMed

Kristensen, Tatjana P; Maria Cherian, Reeja; Gray, Fiona C; MacNeill, Stuart A

2014-01-01

The hexameric MCM complex is the catalytic core of the replicative helicase in eukaryotic and archaeal cells. Here we describe the first in vivo analysis of archaeal MCM protein structure and function relationships using the genetically tractable haloarchaeon Haloferax volcanii as a model system. Hfx. volcanii encodes a single MCM protein that is part of the previously identified core group of haloarchaeal MCM proteins. Three structural features of the N-terminal domain of the Hfx. volcanii MCM protein were targeted for mutagenesis: the β7-β8 and β9-β10 β-hairpin loops and putative zinc binding domain. Five strains carrying single point mutations in the β7-β8 β-hairpin loop were constructed, none of which displayed impaired cell growth under normal conditions or when treated with the DNA damaging agent mitomycin C. However, short sequence deletions within the β7-β8 β-hairpin were not tolerated and neither was replacement of the highly conserved residue glutamate 187 with alanine. Six strains carrying paired alanine substitutions within the β9-β10 β-hairpin loop were constructed, leading to the conclusion that no individual amino acid within that hairpin loop is absolutely required for MCM function, although one of the mutant strains displays greatly enhanced sensitivity to mitomycin C. Deletions of two or four amino acids from the β9-β10 β-hairpin were tolerated but mutants carrying larger deletions were inviable. Similarly, it was not possible to construct mutants in which any of the conserved zinc binding cysteines was replaced with alanine, underlining the likely importance of zinc binding for MCM function. The results of these studies demonstrate the feasibility of using Hfx. volcanii as a model system for reverse genetic analysis of archaeal MCM protein function and provide important confirmation of the in vivo importance of conserved structural features identified by previous bioinformatic, biochemical and structural studies.
Function-based classification of carbohydrate-active enzymes by recognition of short, conserved peptide motifs.

PubMed

Busk, Peter Kamp; Lange, Lene

2013-06-01

Functional prediction of carbohydrate-active enzymes is difficult due to low sequence identity. However, similar enzymes often share a few short motifs, e.g., around the active site, even when the overall sequences are very different. To exploit this notion for functional prediction of carbohydrate-active enzymes, we developed a simple algorithm, peptide pattern recognition (PPR), that can divide proteins into groups of sequences that share a set of short conserved sequences. When this method was used on 118 glycoside hydrolase 5 proteins with 9% average pairwise identity and representing four characterized enzymatic functions, 97% of the proteins were sorted into groups correlating with their enzymatic activity. Furthermore, we analyzed 8,138 glycoside hydrolase 13 proteins including 204 experimentally characterized enzymes with 28 different functions. There was a 91% correlation between group and enzyme activity. These results indicate that the function of carbohydrate-active enzymes can be predicted with high precision by finding short, conserved motifs in their sequences. The glycoside hydrolase 61 family is important for fungal biomass conversion, but only a few proteins of this family have been functionally characterized. Interestingly, PPR divided 743 glycoside hydrolase 61 proteins into 16 subfamilies useful for targeted investigation of the function of these proteins and pinpointed three conserved motifs with putative importance for enzyme activity. Furthermore, the conserved sequences were useful for cloning of new, subfamily-specific glycoside hydrolase 61 proteins from 14 fungi. In conclusion, identification of conserved sequence motifs is a new approach to sequence analysis that can predict carbohydrate-active enzyme functions with high precision.
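The idea of grouping proteins by shared short conserved peptides can be sketched as follows. This is a deliberately simplified, greedy illustration and not the published PPR algorithm: the function names, the peptide length `k`, the `min_shared` threshold and the toy sequences are all assumptions made for the example.

```python
def kmers(seq, k):
    """Set of all overlapping peptides of length k in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def group_by_shared_peptides(seqs, k=4, min_shared=2):
    """Greedy grouping: a sequence joins the first group whose seed
    sequence shares at least min_shared k-mers with it; otherwise it
    seeds a new group."""
    groups = []  # list of (seed k-mer set, member names)
    for name, seq in seqs.items():
        peptides = kmers(seq, k)
        for seed, members in groups:
            if len(seed & peptides) >= min_shared:
                members.append(name)
                break
        else:
            groups.append((peptides, [name]))
    return [members for _, members in groups]

# Invented toy sequences: a/b share one short block, c/d share another,
# while the rest of each sequence is unrelated.
toy = {
    "a": "MKTAYIAKQRNEWG",
    "b": "GGSAYIAKQRPLLV",
    "c": "MPLWDHHFVTTSCE",
    "d": "QQLWDHHFVAAKRE",
}
print(group_by_shared_peptides(toy))
```

On the toy input the sequences pair up by their shared blocks even though overall identity is low, which is the property the abstract relies on for functional prediction.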
Functional region prediction with a set of appropriate homologous sequences-an index for sequence selection by integrating structure and sequence information with spatial statistics

PubMed Central

2012-01-01

Background The detection of conserved residue clusters on a protein structure is one of the effective strategies for the prediction of functional protein regions. Various methods, such as Evolutionary Trace, have been developed based on this strategy. In such approaches, the conserved residues are identified through comparisons of homologous amino acid sequences. Therefore, the selection of homologous sequences is a critical step. It is empirically known that a certain degree of sequence divergence in the set of homologous sequences is required for the identification of conserved residues. However, the development of a method to select homologous sequences appropriate for the identification of conserved residues has not been sufficiently addressed. An objective and general method to select appropriate homologous sequences is desired for the efficient prediction of functional regions. Results We have developed a novel index to select the sequences appropriate for the identification of conserved residues, and implemented the index within our method to predict the functional regions of a protein. The implementation of the index improved the performance of the functional region prediction. The index represents the degree of conserved residue clustering on the tertiary structure of the protein. For this purpose, the structure and sequence information were integrated within the index by the application of spatial statistics. Spatial statistics is a field of statistics in which not only the attributes but also the geometrical coordinates of the data are considered simultaneously. Higher degrees of clustering generate larger index scores. We adopted the set of homologous sequences with the highest index score, under the assumption that the best prediction accuracy is obtained when the degree of clustering is the maximum. 
The set of sequences selected by the index led to higher functional region prediction performance than the sets of sequences selected by other sequence-based methods. Conclusions Appropriate homologous sequences are selected automatically and objectively by the index. Such sequence selection improved the performance of functional region prediction. As far as we know, this is the first approach in which spatial statistics have been applied to protein analyses. Such integration of structure and sequence information would be useful for other bioinformatics problems. PMID:22643026
Does Wyoming's Core Area Policy Protect Winter Habitats for Greater Sage-Grouse?

PubMed

Smith, Kurt T; Beck, Jeffrey L; Pratt, Aaron C

2016-10-01

Conservation reserves established to protect important habitat for wildlife species are used world-wide as a wildlife conservation measure. Effective reserves must adequately protect year-round habitats to maintain wildlife populations. Wyoming's Sage-Grouse Core Area policy was established to protect breeding habitats for greater sage-grouse (Centrocercus urophasianus). Protecting only one important seasonal habitat could result in loss or degradation of other important habitats and potential declines in local populations. The purpose of our study was to identify the timing of winter habitat use and the extent to which individuals breeding in Core Areas used winter habitats, and to develop resource selection functions to assess effectiveness of Core Areas in conserving sage-grouse winter habitats in portions of 5 Core Areas in central and north-central Wyoming during winters 2011-2015. We found that use of winter habitats occurred over a longer period than current Core Area winter timing stipulations and a substantial amount of winter habitat outside of Core Areas was used by individuals that bred in Core Areas, particularly in smaller Core Areas. Resource selection functions for each study area indicated that sage-grouse were selecting habitats in response to landscapes dominated by big sagebrush and flatter topography similar to other research on sage-grouse winter habitat selection. The substantial portion of sage-grouse locations and predicted probability of selection during winter outside small Core Areas illustrate that winter requirements for sage-grouse are not adequately met by existing Core Areas. Consequently, further considerations for identifying and managing important winter sage-grouse habitats under Wyoming's Core Area Policy are warranted.
Comparative Sequence Analysis of Multidrug-Resistant IncA/C Plasmids from Salmonella enterica.

PubMed

Hoffmann, Maria; Pettengill, James B; Gonzalez-Escalona, Narjol; Miller, John; Ayers, Sherry L; Zhao, Shaohua; Allard, Marc W; McDermott, Patrick F; Brown, Eric W; Monday, Steven R

2017-01-01

Determinants of multidrug resistance (MDR) are often encoded on mobile elements, such as plasmids, transposons, and integrons, which have the potential to transfer among foodborne pathogens, as well as to other virulent pathogens, increasing the threats these traits pose to human and veterinary health. Our understanding of MDR among Salmonella has been limited by the lack of closed plasmid genomes for comparisons across resistance phenotypes, due to difficulties in effectively separating the DNA of these high-molecular weight, low-copy-number plasmids from chromosomal DNA. To resolve this problem, we demonstrate an efficient protocol for isolating, sequencing and closing IncA/C plasmids from Salmonella sp. using single molecule real-time sequencing on a Pacific Biosciences (PacBio) RS II Sequencer. We obtained six Salmonella enterica isolates from poultry, representing six different serovars, each exhibiting the MDR-AmpC resistance profile. Salmonella plasmids were obtained using a modified mini preparation and transformed with Escherichia coli DH10Br. A Qiagen Large-Construct kit™ was used to recover highly concentrated and purified plasmid DNA that was sequenced using PacBio technology. These six closed IncA/C plasmids ranged in size from 104 to 191 kb and shared a stable, conserved backbone containing 98 core genes, with only six differences among those core genes. The plasmids encoded a number of antimicrobial resistance genes, including those for quaternary ammonium compounds and mercury. We then compared our six IncA/C plasmid sequences: first with 14 IncA/C plasmids derived from S. enterica available at the National Center for Biotechnology Information (NCBI), and then with an additional 38 IncA/C plasmids derived from different taxa. These comparisons allowed us to build an evolutionary picture of how antimicrobial resistance may be mediated by this common plasmid backbone. 
Our project provides detailed genetic information about resistance genes in plasmids, advances in plasmid sequencing, and phylogenetic analyses, and important insights about how MDR evolution occurs across diverse serotypes from different animal sources, particularly in agricultural settings where antimicrobial drug use practices vary.
Repeat-Associated Fission Yeast-Like Regional Centromeres in the Ascomycetous Budding Yeast Candida tropicalis

PubMed Central

Chatterjee, Gautam; Sankaranarayanan, Sundar Ram; Guin, Krishnendu; Thattikota, Yogitha; Padmanabhan, Sreedevi; Siddharthan, Rahul; Sanyal, Kaustuv

2016-01-01

The centromere, on which kinetochore proteins assemble, ensures precise chromosome segregation. Centromeres are largely specified by the histone H3 variant CENP-A (also known as Cse4 in yeasts). Structurally, centromere DNA sequences are highly diverse in nature. However, the evolutionary consequence of these structural diversities on de novo CENP-A chromatin formation remains elusive. Here, we report the identification of centromeres, as the binding sites of four evolutionarily conserved kinetochore proteins, in the human pathogenic budding yeast Candida tropicalis. Each of the seven centromeres comprises a 2 to 5 kb non-repetitive mid core flanked by 2 to 5 kb inverted repeats. The repeat-associated centromeres of C. tropicalis all share a high degree of sequence conservation with each other and are strikingly diverged from the unique and mostly non-repetitive centromeres of related Candida species: Candida albicans, Candida dubliniensis, and Candida lusitaniae. Using a plasmid-based assay, we further demonstrate that pericentric inverted repeats and the underlying DNA sequence provide a structural determinant in CENP-A recruitment in C. tropicalis, as opposed to epigenetically regulated CENP-A loading at centromeres in C. albicans. Thus, the centromere structure and its influence on de novo CENP-A recruitment have been significantly rewired in closely related Candida species. Strikingly, the centromere structural properties along with the role of pericentric repeats in de novo CENP-A loading in C. tropicalis are more reminiscent of those of the distantly related fission yeast Schizosaccharomyces pombe. Taken together, we demonstrate, for the first time, fission yeast-like repeat-associated centromeres in an ascomycetous budding yeast. PMID:26845548
Whole genome investigation of a divergent clade of the pathogen Streptococcus suis

PubMed Central

Baig, Abiyad; Weinert, Lucy A.; Peters, Sarah E.; Howell, Kate J.; Chaudhuri, Roy R.; Wang, Jinhong; Holden, Matthew T. G.; Parkhill, Julian; Langford, Paul R.; Rycroft, Andrew N.; Wren, Brendan W.; Tucker, Alexander W.; Maskell, Duncan J.

2015-01-01

Streptococcus suis is a major porcine and zoonotic pathogen responsible for significant economic losses in the pig industry and an increasing number of human cases. Multiple isolates of S. suis show marked genomic diversity. Here, we report the analysis of whole genome sequences of nine pig isolates that caused disease typical of S. suis and had phenotypic characteristics of S. suis, but their genomes were divergent from those of many other S. suis isolates. Comparison of protein sequences predicted from divergent genomes with those from normal S. suis reduced the size of core genome from 793 to only 397 genes. Divergence was clear if phylogenetic analysis was performed on reduced core genes and MLST alleles. Phylogenies based on certain other genes (16S rRNA, sodA, recN, and cpn60) did not show divergence for all isolates, suggesting recombination between some divergent isolates with normal S. suis for these genes. Indeed, there is evidence of recent recombination between the divergent and normal S. suis genomes for 249 of 397 core genes. In addition, phylogenetic analysis based on the 16S rRNA gene and 132 genes that were conserved between the divergent isolates and representatives of the broader Streptococcus genus showed that divergent isolates were more closely related to S. suis. Six out of nine divergent isolates possessed a S. suis-like capsule region with variation in capsular gene sequences but the remaining three did not have a discrete capsule locus. The majority (40/70) of virulence-associated genes in normal S. suis were present in the divergent genomes. Overall, the divergent isolates extend the current diversity of S. suis species but the phenotypic similarities and the large amount of gene exchange with normal S. suis give insufficient evidence to assign these isolates to a new species or subspecies. Further sampling and whole genome analysis of more isolates is warranted to understand the diversity of the species. PMID:26583006
RAG1 Core and V(D)J Recombination Signal Sequences Were Derived from Transib Transposons

PubMed Central

2005-01-01

The V(D)J recombination reaction in jawed vertebrates is catalyzed by the RAG1 and RAG2 proteins, which are believed to have emerged approximately 500 million years ago from transposon-encoded proteins. Yet no transposase sequence similar to RAG1 or RAG2 has been found. Here we show that the approximately 600-amino acid “core” region of RAG1 required for its catalytic activity is significantly similar to the transposase encoded by DNA transposons that belong to the Transib superfamily. This superfamily was discovered recently based on computational analysis of the fruit fly and African malaria mosquito genomes. Transib transposons also are present in the genomes of sea urchin, yellow fever mosquito, silkworm, dog hookworm, hydra, and soybean rust. We demonstrate that recombination signal sequences (RSSs) were derived from terminal inverted repeats of an ancient Transib transposon. Furthermore, the critical DDE catalytic triad of RAG1 is shared with the Transib transposase as part of conserved motifs. We also studied several divergent proteins encoded by the sea urchin and lancelet genomes that are 25%−30% identical to the RAG1 N-terminal domain and the RAG1 core. Our results provide the first direct evidence linking RAG1 and RSSs to a specific superfamily of DNA transposons and indicate that the V(D)J machinery evolved from transposons. We propose that only the RAG1 core was derived from the Transib transposase, whereas the N-terminal domain was assembled from separate proteins of unknown function that may still be active in sea urchin, lancelet, hydra, and starlet sea anemone. We also suggest that the RAG2 protein was not encoded by ancient Transib transposons but emerged in jawed vertebrates as a counterpart of RAG1 necessary for the V(D)J recombination reaction. PMID:15898832
Transcriptional Activation Signals Found in the Epstein-Barr Virus (EBV) Latency C Promoter Are Conserved in the Latency C Promoter Sequences from Baboon and Rhesus Monkey EBV-Like Lymphocryptoviruses (Cercopithicine Herpesviruses 12 and 15)

PubMed Central

Fuentes-Pananá, Ezequiel M.; Swaminathan, Sankar; Ling, Paul D.

1999-01-01

The Epstein-Barr virus (EBV) EBNA2 protein is a transcriptional activator that controls viral latent gene expression and is essential for EBV-driven B-cell immortalization. EBNA2 is expressed from the viral C promoter (Cp) and regulates its own expression by activating Cp through interaction with the cellular DNA binding protein CBF1. Through regulation of Cp and EBNA2 expression, EBV controls the pattern of latent protein expression and the type of latency established. To gain further insight into the important regulatory elements that modulate Cp usage, we isolated and sequenced the Cp regions corresponding to nucleotides 10251 to 11479 of the EBV genome (−1079 to +144 relative to the transcription initiation site) from the EBV-like lymphocryptoviruses found in baboons (herpesvirus papio; HVP) and Rhesus macaques (RhEBV). Sequence comparison of the approximately 1,230-bp Cp regions from these primate viruses revealed that EBV and HVP Cp sequences are 64% conserved, EBV and RhEBV Cp sequences are 66% conserved, and HVP and RhEBV Cp sequences are 65% conserved relative to each other. Approximately 50% of the residues are conserved among all three sequences, yet all three viruses have retained response elements for glucocorticoids, two positionally conserved CCAAT boxes, and positionally conserved TATA boxes. The putative EBNA2 100-bp enhancers within these promoters contain 54 conserved residues, and the binding sites for CBF1 and CBF2 are well conserved. Cp usage in the HVP- and RhEBV-transformed cell lines was detected by S1 nuclease protection analysis. Transient-transfection analysis showed that promoters of both HVP and RhEBV are responsive to EBNA2 and that they bind CBF1 and CBF2 in gel mobility shift assays. These results suggest that similar mechanisms for regulation of latent gene expression are conserved among the EBV-related lymphocryptoviruses found in nonhuman primates. PMID:9847397
Transcriptional activation signals found in the Epstein-Barr virus (EBV) latency C promoter are conserved in the latency C promoter sequences from baboon and Rhesus monkey EBV-like lymphocryptoviruses (cercopithicine herpesviruses 12 and 15).

PubMed

Fuentes-Pananá, E M; Swaminathan, S; Ling, P D

1999-01-01

The Epstein-Barr virus (EBV) EBNA2 protein is a transcriptional activator that controls viral latent gene expression and is essential for EBV-driven B-cell immortalization. EBNA2 is expressed from the viral C promoter (Cp) and regulates its own expression by activating Cp through interaction with the cellular DNA binding protein CBF1. Through regulation of Cp and EBNA2 expression, EBV controls the pattern of latent protein expression and the type of latency established. To gain further insight into the important regulatory elements that modulate Cp usage, we isolated and sequenced the Cp regions corresponding to nucleotides 10251 to 11479 of the EBV genome (-1079 to +144 relative to the transcription initiation site) from the EBV-like lymphocryptoviruses found in baboons (herpesvirus papio; HVP) and Rhesus macaques (RhEBV). Sequence comparison of the approximately 1,230-bp Cp regions from these primate viruses revealed that EBV and HVP Cp sequences are 64% conserved, EBV and RhEBV Cp sequences are 66% conserved, and HVP and RhEBV Cp sequences are 65% conserved relative to each other. Approximately 50% of the residues are conserved among all three sequences, yet all three viruses have retained response elements for glucocorticoids, two positionally conserved CCAAT boxes, and positionally conserved TATA boxes. The putative EBNA2 100-bp enhancers within these promoters contain 54 conserved residues, and the binding sites for CBF1 and CBF2 are well conserved. Cp usage in the HVP- and RhEBV-transformed cell lines was detected by S1 nuclease protection analysis. Transient-transfection analysis showed that promoters of both HVP and RhEBV are responsive to EBNA2 and that they bind CBF1 and CBF2 in gel mobility shift assays. These results suggest that similar mechanisms for regulation of latent gene expression are conserved among the EBV-related lymphocryptoviruses found in nonhuman primates.
[Conserved motifs in voltage sensing proteins].

PubMed

Wang, Chang-He; Xie, Zhen-Li; Lv, Jian-Wei; Yu, Zhi-Dan; Shao, Shu-Li

2012-08-25

This paper aimed to study conserved motifs of voltage sensing proteins (VSPs) and establish a voltage sensing model. All VSPs were collected from the Uniprot database using a comprehensive keyword search followed by manual curation, and the results indicated that there are only two types of known VSPs, voltage gated ion channels and voltage dependent phosphatases. All the VSPs have a common domain of four helical transmembrane segments (TMS, S1-S4), which constitute the voltage sensing module of the VSPs. The S1 segment was shown to be responsible for membrane targeting and insertion of these proteins, while S2-S4 segments, which can sense membrane potential, for protein properties. Conserved motifs/residues of each TMS and their functional significance were identified using profile-to-profile sequence alignments. Conserved motifs in these four segments are strikingly similar for all VSPs; in particular, the conserved motif [RK]-X(2)-R-X(2)-R-X(2)-[RK] was present in all the S4 segments, with positively charged arginine (R) alternating with two hydrophobic or uncharged residues. Movement of these arginines across the membrane electric field is the core mechanism by which the VSPs detect changes in membrane potential. The negatively charged aspartate (D) in the S3 segment is universally conserved in all the VSPs, suggesting that the aspartate residue may be involved in voltage sensing properties of VSPs as well as the electrostatic interactions with the positively charged residues in the S4 segment, which may enhance the thermodynamic stability of the S4 segments in plasma membrane.
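As an illustration only, the S4 motif [RK]-X(2)-R-X(2)-R-X(2)-[RK] quoted in the abstract above can be written as a regular expression and scanned against a protein sequence. The sketch below is not the authors' method (they report profile-to-profile alignments); the function name `find_s4_motifs`, the treatment of X as any uppercase letter, and the toy sequence are all assumptions made for the example.

```python
import re

# PROSITE-style pattern [RK]-X(2)-R-X(2)-R-X(2)-[RK] rewritten as a regex.
# Assumption: "X" (any residue) is approximated by any uppercase letter.
# The lookahead lets overlapping occurrences be reported separately.
S4_MOTIF = re.compile(r"(?=([RK][A-Z]{2}R[A-Z]{2}R[A-Z]{2}[RK]))")


def find_s4_motifs(sequence):
    """Return (start_index, matched_text) for every hit, overlaps included."""
    return [(m.start(), m.group(1)) for m in S4_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence, not a real S4 segment.
toy = "AAARVIRLARVLRIFKAAA"
print(find_s4_motifs(toy))
```

A plain regex only reports exact pattern matches, so it would miss divergent segments that an alignment-based search could still recover.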
Combining protein sequence, structure, and dynamics: A novel approach for functional evolution analysis of PAS domain superfamily.

PubMed

Dong, Zheng; Zhou, Hongyu; Tao, Peng

2018-02-01

PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore functional evolutionary relationship among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from RCSB Protein Data Bank of the PAS domain superfamily belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The result showed that the proteins with same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant difference in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.
Quantifying the relationship between sequence and three-dimensional structure conservation in RNA

PubMed Central

2010-01-01

Background: In recent years, the number of available RNA structures has rapidly grown, reflecting the increased interest in RNA biology. Similarly to the studies carried out two decades ago for proteins, which gave the fundamental grounds for developing comparative protein structure prediction methods, we are now able to quantify the relationship between sequence and structure conservation in RNA. Results: Here we introduce an all-against-all sequence- and three-dimensional (3D) structure-based comparison of a representative set of RNA structures, which has allowed us to quantitatively confirm that: (i) there is a measurable relationship between sequence and structure conservation that weakens for alignments resulting in below 60% sequence identity, (ii) evolution tends to conserve more RNA structure than sequence, and (iii) there is a twilight zone for RNA homology detection. Discussion: The computational analysis here presented quantitatively describes the relationship between sequence and structure for RNA molecules and defines a twilight zone region for detecting RNA homology. Our work could represent the theoretical basis and limitations for future developments in comparative RNA 3D structure prediction. PMID:20550657
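The 60% sequence identity threshold mentioned in the abstract above depends on how identity is computed from an alignment. The following is a minimal sketch under one common convention, assumed here and not stated in the source: identical positions divided by the number of alignment columns that have no gap in either sequence. The function name `percent_identity` and the example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: columns with a gap in either sequence are
    excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned RNA fragments: 6 matches over 7 gap-free columns.
print(round(percent_identity("GGAUCC-A", "GGACCCUA"), 1))
```

Other conventions, such as dividing by the full alignment length or by the shorter sequence length, give different values for the same alignment, so a threshold is only comparable across studies that use the same definition.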
Differential sequence diversity at merozoite surface protein-1 locus of Plasmodium knowlesi from humans and macaques in Thailand.

PubMed

Putaporntip, Chaturong; Thongaree, Siriporn; Jongwutiwes, Somchai

2013-08-01

To determine the genetic diversity and potential transmission routes of Plasmodium knowlesi, we analyzed the complete nucleotide sequence of the gene encoding the merozoite surface protein-1 of this simian malaria (Pkmsp-1), an asexual blood-stage vaccine candidate, from naturally infected humans and macaques in Thailand. Analysis of Pkmsp-1 sequences from humans (n=12) and monkeys (n=12) reveals five conserved and four variable domains. Most nucleotide substitutions in conserved domains were dimorphic whereas three of four variable domains contained complex repeats with extensive sequence and size variation. Besides purifying selection in conserved domains, evidence of intragenic recombination scattering across Pkmsp-1 was detected. The number of haplotypes, haplotype diversity, nucleotide diversity and recombination sites of human-derived sequences exceeded that of monkey-derived sequences. Phylogenetic networks based on concatenated conserved sequences of Pkmsp-1 displayed a character pattern that could have arisen from sampling process or the presence of two independent routes of P. knowlesi transmission, i.e. from macaques to human and from human to humans in Thailand. Copyright © 2013 Elsevier B.V. All rights reserved.

Airway and Feeding Outcomes of Mandibular Distraction, Tongue-Lip Adhesion, and Conservative Management in Pierre Robin Sequence: A Prospective Study.

PubMed

Khansa, Ibrahim; Hall, Courtney; Madhoun, Lauren L; Splaingard, Mark; Baylis, Adriane; Kirschner, Richard E; Pearson, Gregory D

2017-04-01

Pierre Robin sequence is characterized by mandibular retrognathia and glossoptosis resulting in airway obstruction and feeding difficulties. When conservative management fails, mandibular distraction osteogenesis or tongue-lip adhesion may be required to avoid tracheostomy. The authors' goal was to prospectively evaluate the airway and feeding outcomes of their comprehensive approach to Pierre Robin sequence, which includes conservative management, mandibular distraction osteogenesis, and tongue-lip adhesion. A longitudinal study of newborns with Pierre Robin sequence treated at a pediatric academic medical center between 2010 and 2015 was performed. Baseline feeding and respiratory data were collected. Patients underwent conservative management if they demonstrated sustainable weight gain without tube feeds, and if their airway was stable with positioning alone. Patients who required surgery underwent tongue-lip adhesion or mandibular distraction osteogenesis based on family and surgeon preference. Postoperative airway and feeding data were collected. Twenty-eight patients with Pierre Robin sequence were followed prospectively. Thirty-two percent had a syndrome. Ten underwent mandibular distraction osteogenesis, eight underwent tongue-lip adhesion, and 10 were treated conservatively. There were no differences in days to extubation or discharge, change in weight percentile, requirement for gastrostomy tube, or residual obstructive sleep apnea between the three groups. No patients required tracheostomy. The greatest reduction in apnea-hypopnea index occurred with mandibular distraction osteogenesis, followed by tongue-lip adhesion and conservative management. Careful selection of which patients with Pierre Robin sequence need surgery, and of the most appropriate surgical procedure for each patient, can minimize the need for postprocedure tracheostomy. 
A comprehensive approach to Pierre Robin sequence that includes conservative management, mandibular distraction osteogenesis, and tongue-lip adhesion can result in excellent airway and feeding outcomes. Therapeutic, II.
Comparative analysis of programmed cell death pathways in filamentous fungi.

PubMed

Fedorova, Natalie D; Badger, Jonathan H; Robson, Geoff D; Wortman, Jennifer R; Nierman, William C

2005-12-08

Fungi can undergo autophagic- or apoptotic-type programmed cell death (PCD) on exposure to antifungal agents, developmental signals, and stress factors. Filamentous fungi can also exhibit a form of cell death called heterokaryon incompatibility (HI) triggered by fusion between two genetically incompatible individuals. With the availability of recently sequenced genomes of Aspergillus fumigatus and several related species, we were able to define putative components of fungi-specific death pathways and the ancestral core apoptotic machinery shared by all fungi and metazoa. Phylogenetic profiling of HI-associated proteins from four Aspergilli and seven other fungal species revealed lineage-specific protein families, orphan genes, and core genes conserved across all fungi and metazoa. The Aspergilli-specific domain architectures include NACHT family NTPases, which may function as key integrators of stress and nutrient availability signals. They are often found fused to putative effector domains such as Pfs, SesB/LipA, and a newly identified domain, HET-s/LopB. Many putative HI inducers and mediators are specific to filamentous fungi and not found in unicellular yeasts. In addition to their role in HI, several of them appear to be involved in regulation of cell cycle, development and sexual differentiation. Finally, the Aspergilli possess many putative downstream components of the mammalian apoptotic machinery including several proteins not found in the model yeast, Saccharomyces cerevisiae. Our analysis identified more than 100 putative PCD associated genes in the Aspergilli, which may help expand the range of currently available treatments for aspergillosis and other invasive fungal diseases. The list includes species-specific protein families as well as conserved core components of the ancestral PCD machinery shared by fungi and metazoa.
The Number, Organization, and Size of Polymorphic Membrane Protein Coding Sequences as well as the Most Conserved Pmp Protein Differ within and across Chlamydia Species.

PubMed

Van Lent, Sarah; Creasy, Heather Huot; Myers, Garry S A; Vanrompay, Daisy

2016-01-01

Variation is a central trait of the polymorphic membrane protein (Pmp) family. The number of pmp coding sequences differs between Chlamydia species, but it is unknown whether the number of pmp coding sequences is constant within a Chlamydia species. The level of conservation of the Pmp proteins has previously only been determined for Chlamydia trachomatis. As different Pmp proteins might be indispensable for the pathogenesis of different Chlamydia species, this study investigated the conservation of Pmp proteins both within and across C. trachomatis, C. pneumoniae, C. abortus, and C. psittaci. The pmp coding sequences were annotated in 16 C. trachomatis, 6 C. pneumoniae, 2 C. abortus, and 16 C. psittaci genomes. The number and organization of polymorphic membrane coding sequences differed within and across the analyzed Chlamydia species. The length of coding sequences of pmpA, pmpB, and pmpH was conserved among all analyzed genomes, while the length of pmpE/F and pmpG, and remarkably also of the subtype pmpD, differed among the analyzed genomes. PmpD, PmpA, PmpH, and PmpA were the most conserved Pmp in C. trachomatis, C. pneumoniae, C. abortus, and C. psittaci, respectively. PmpB was the most conserved Pmp across the 4 analyzed Chlamydia species. © 2016 S. Karger AG, Basel.
Evaluation of the Abbott realtime HCV genotype II RUO (GT II) assay with reference to 5'UTR, core and NS5B sequencing.

PubMed

Mallory, Melanie A; Lucic, Danijela X; Sears, Mitchell T; Cloherty, Gavin A; Hillyard, David R

2014-05-01

HCV genotyping is a critical tool for guiding initiation of therapy and selecting the most appropriate treatment regimen. To evaluate the concordance between the Abbott GT II assay and genotyping by sequencing subregions of the HCV 5'UTR, core and NS5B. The Abbott assay was used to genotype 127 routine patient specimens and 35 patient specimens with unusual subtypes and mixed infection. Abbott results were compared to genotyping by 5'UTR, core and NS5B sequencing. Sequences were genotyped using the NCBI non-redundant database and the online genotyping tool COMET. Among routine specimens, core/NS5B sequencing identified 93 genotype 1s, 13 genotype 2s, 15 genotype 3s, three genotype 4s, two genotype 6s and one recombinant specimen. Genotype calls by 5'UTR, core, NS5B sequencing and the Abbott assay were 97.6% concordant. Core/NS5B sequencing identified two discrepant samples as genotype 6 (subtypes 6l and 6u) while Abbott and 5'UTR sequencing identified these samples as genotype 1 with no subtype. The Abbott assay subtyped 91.4% of genotype 1 specimens. Among the 35 rare specimens, the Abbott assay inaccurately genotyped 3k, 6e, 6o, 6q and one genotype 4 variant; gave indeterminate results for 3g, 3h, 4r, 6m, 6n, and 6q specimens; and agreed with core/NS5B sequencing for mixed specimens. The Abbott assay is an automated HCV genotyping method with improved accuracy over 5'UTR sequencing. Samples identified by the Abbott assay as genotype 1 with no subtype may be rare subtypes of other genotypes and thus require confirmation by another method. Copyright © 2014 Elsevier B.V. All rights reserved.
Evolutionarily conserved regions and hydrophobic contacts at the superfamily level: The case of the fold-type I, pyridoxal-5′-phosphate-dependent enzymes

PubMed Central

Paiardini, Alessandro; Bossa, Francesco; Pascarella, Stefano

2004-01-01

The wealth of biological information provided by structural and genomic projects opens new prospects of understanding life and evolution at the molecular level. In this work, it is shown how computational approaches can be exploited to pinpoint protein structural features that remain invariant upon long evolutionary periods in the fold-type I, PLP-dependent enzymes. A nonredundant set of 23 superposed crystallographic structures belonging to this superfamily was built. Members of this family typically display high-structural conservation despite low-sequence identity. For each structure, a multiple-sequence alignment of orthologous sequences was obtained, and the 23 alignments were merged using the structural information to obtain a comprehensive multiple alignment of 921 sequences of fold-type I enzymes. The structurally conserved regions (SCRs), the evolutionarily conserved residues, and the conserved hydrophobic contacts (CHCs) were extracted from this data set, using both sequence and structural information. The results of this study identified a structural pattern of hydrophobic contacts shared by all of the superfamily members of fold-type I enzymes and involved in native interactions. This profile highlights the presence of a nucleus for this fold, in which residues participating in the most conserved native interactions exhibit preferential evolutionary conservation, that correlates significantly (r = 0.70) with the extent of mean hydrophobic contact value of their apolar fraction. PMID:15498941
Next Generation Sequencing at the University of Chicago Genomics Core

DOE Office of Scientific and Technical Information (OSTI.GOV)

Faber, Pieter

2013-04-24

The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to State-of-the-Art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and micro-arrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.
Discussion and Reflection on Several Core Issues in the Grand Canal Heritage Conservation Planning Under the Background of Application for World Heritage

NASA Astrophysics Data System (ADS)

Yao, D.; Dai, D. S.; Tang, Y. Z.; Zhu, G. Y.; Chen, X.

2015-08-01

At the turn of the century, a series of new heritage concepts appeared in the area of international cultural heritage protection, such as cultural landscape, cultural route, heritage corridor, and heritage canal, reflecting the development of people's recognition of cultural heritage. According to The Operational Guidelines for the Implementation of the World Heritage Convention, management planning must be contained in the material used to apply for world heritage. The State Administration of Cultural Heritage designed the mission and work schedule of China's Grand Canal conservation planning in 2008. This research will introduce the working system of China's Grand Canal conservation planning on three levels: city, province and nation. It will also summarize the characteristics of the core technologies in China's Grand Canal conservation planning, including key issues like the identification of the core characteristic of China's Grand Canal, value assessment and determination of the protection scope. Through reviewing and analyzing the previous accomplishments, the research will offer some advice for similar world heritage conservation planning in the future.
Polymerase Chain Reaction (PCR)-based methods for detection and identification of mycotoxigenic Penicillium species using conserved genes

USDA-ARS's Scientific Manuscript database

Polymerase chain reaction amplification of conserved genes and sequence analysis provides a very powerful tool for the identification of toxigenic as well as non-toxigenic Penicillium species. Sequences are obtained by amplification of the gene fragment, sequencing via capillary electrophoresis of d...
RNA expression in a cartilaginous fish cell line reveals ancient 3′ noncoding regions highly conserved in vertebrates

PubMed Central

Forest, David; Nishikawa, Ryuhei; Kobayashi, Hiroshi; Parton, Angela; Bayne, Christopher J.; Barnes, David W.

2007-01-01

We have established a cartilaginous fish cell line [Squalus acanthias embryo cell line (SAE)], a mesenchymal stem cell line derived from the embryo of an elasmobranch, the spiny dogfish shark S. acanthias. Elasmobranchs (sharks and rays) first appeared >400 million years ago, and existing species provide useful models for comparative vertebrate cell biology, physiology, and genomics. Comparative vertebrate genomics among evolutionarily distant organisms can provide sequence conservation information that facilitates identification of critical coding and noncoding regions. Although these genomic analyses are informative, experimental verification of functions of genomic sequences depends heavily on cell culture approaches. Using ESTs defining mRNAs derived from the SAE cell line, we identified lengthy and highly conserved gene-specific nucleotide sequences in the noncoding 3′ UTRs of eight genes involved in the regulation of cell growth and proliferation. Conserved noncoding 3′ mRNA regions detected by using the shark nucleotide sequences as a starting point were found in a range of other vertebrate orders, including bony fish, birds, amphibians, and mammals. Nucleotide identity of shark and human in these regions was remarkably well conserved. Our results indicate that highly conserved gene sequences dating from the appearance of jawed vertebrates and representing potential cis-regulatory elements can be identified through the use of cartilaginous fish as a baseline. Because the expression of genes in the SAE cell line was prerequisite for their identification, this cartilaginous fish culture system also provides a physiologically valid tool to test functional hypotheses on the role of these ancient conserved sequences in comparative cell biology. PMID:17227856
Sequence conservation on the Y chromosome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gibson, L.H.; Yang-Feng, L.; Lau, C.

The Y chromosome is present in all mammals and is considered to be essential to sex determination. Despite intense genomic research, only a few genes have been identified and mapped to this chromosome in humans. Several of them, such as SRY and ZFY, have been demonstrated to be conserved and Y-located in other mammals. In order to address the issue of sequence conservation on the Y chromosome, we performed fluorescence in situ hybridization (FISH) with DNA from a human Y cosmid library as a probe to study the Y chromosomes from other mammalian species. Total DNA from 3,000-4,500 cosmid pools were labeled with biotinylated-dUTP and hybridized to metaphase chromosomes. For human and primate preparations, human cot1 DNA was included in the hybridization mixture to suppress the hybridization from repeat sequences. FISH signals were detected on the Y chromosomes of human, gorilla, orangutan and baboon (Old World monkey) and were absent on those of squirrel monkey (New World monkey), Indian muntjac, wood lemming, Chinese hamster, rat and mouse. Since sequence analysis suggested that specific genes, e.g. SRY and ZFY, are conserved between these two groups, the lack of detectable hybridization in the latter group implies either that conservation of the human Y sequences is limited to the Y chromosomes of the great apes and Old World monkeys, or that the size of the syntenic segment is too small to be detected under the resolution of FISH, or that homologous sequences have undergone considerable divergence. Further studies with reduced hybridization stringency are currently being conducted. Our results provide some clues as to Y-sequence conservation across species and demonstrate the limitations of FISH across species with total DNA sequences from a particular chromosome.
HMMerThread: detecting remote, functional conserved domains in entire genomes by combining relaxed sequence-database searches with fold recognition.

PubMed

Bradshaw, Charles Richard; Surendranath, Vineeth; Henschel, Robert; Mueller, Matthias Stefan; Habermann, Bianca Hermine

2011-03-10

Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. 
The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de.
HMMerThread: Detecting Remote, Functional Conserved Domains in Entire Genomes by Combining Relaxed Sequence-Database Searches with Fold Recognition

PubMed Central

Bradshaw, Charles Richard; Surendranath, Vineeth; Henschel, Robert; Mueller, Matthias Stefan; Habermann, Bianca Hermine

2011-01-01

Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. 
The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de. PMID:21423752
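The two-stage idea in the abstract above (a relaxed sequence-based domain search whose candidates are kept only when fold recognition agrees) can be illustrated with a minimal sketch. This is not the HMMerThread implementation: the `DomainHit` record, the `fold_supported` flag, and both E-value cutoffs are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    protein: str
    domain: str
    evalue: float         # E-value from a relaxed sequence-profile search (assumed input)
    fold_supported: bool  # True if an independent fold-recognition step agrees (assumed input)


def filter_remote_domains(hits, strict_evalue=1e-3, relaxed_evalue=10.0):
    """Keep weak hits that fold recognition confirms.

    Both cutoffs are illustrative assumptions, not values from the paper.
    """
    kept = []
    for hit in hits:
        if hit.evalue <= strict_evalue:
            # Already significant by sequence alone, so not a "remote" prediction.
            continue
        if hit.evalue <= relaxed_evalue and hit.fold_supported:
            kept.append(hit)
    return kept


hits = [
    DomainHit("P1", "HLH", 1e-20, True),   # conventional hit, skipped
    DomainHit("P2", "HLH", 0.5, True),     # weak hit confirmed by fold recognition
    DomainHit("P3", "HLH", 0.5, False),    # weak hit rejected as a likely false positive
    DomainHit("P4", "HLH", 50.0, True),    # too weak even for the relaxed search
]
remote = filter_remote_domains(hits)
```

In this sketch only `P2` survives, which mirrors the abstract's description of using fold recognition to eliminate false-positive, sequence-based identifications.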
The Reverse Transcriptase/RNA Maturase Protein MatR Is Required for the Splicing of Various Group II Introns in Brassicaceae Mitochondria

PubMed Central

Sultan, Laure D.; Grewe, Felix; Rolle, Katarzyna; Abudraham, Sivan; Shevtsov, Sofia; Klipcan, Liron; Barciszewski, Jan; Dietrich, André

2016-01-01

Group II introns are large catalytic RNAs that are ancestrally related to nuclear spliceosomal introns. Sequences corresponding to group II RNAs are found in many prokaryotes and are particularly prevalent within plant organellar genomes. Proteins encoded within the introns themselves (maturases) facilitate the splicing of their own host pre-RNAs. Mitochondrial introns in plants have diverged considerably in sequence and have lost their maturases. In angiosperms, only a single maturase has been retained in the mitochondrial DNA: the matR gene found within NADH dehydrogenase 1 (nad1) intron 4. Its conservation across land plants and RNA editing events, which restore conserved amino acids, indicate that matR encodes a functional protein. However, the biological role of MatR remains unclear. Here, we performed an in vivo investigation of the roles of MatR in Brassicaceae. Directed knockdown of matR expression via synthetically designed ribozymes altered the processing of various introns, including nad1 i4. Pull-down experiments further indicated that MatR is associated with nad1 i4 and several other intron-containing pre-mRNAs. MatR may thus represent an intermediate link in the gradual evolutionary transition from the intron-specific maturases in bacteria into their versatile spliceosomal descendants in the nucleus. The similarity between maturases and the core spliceosomal Prp8 protein further supports this intriguing theory. PMID:27760804
A short region of the promoter of the breast cancer associated PLU-1 gene can regulate transcription in vitro and in vivo.

PubMed

Catteau, Aurélie; Rosewell, Ian; Solomon, Ellen; Taylor-Papadimitriou, Joyce

2004-07-01

The recently cloned gene PLU-1 shows restricted expression in adult tissues, with high expression being found in testis, and transiently in the pregnant mammary gland. However, both the gene and the protein product are specifically up-regulated in breast cancer. To investigate the control of expression of the PLU-1 gene, we have cloned and functionally characterised the 5' flanking region of the gene, which was found to contain another putative gene. Two transcription start sites of the PLU-1 gene were mapped by 5' RACE. A short proximal 249 bp region was defined using reporter gene assays, which encompasses the major transcription start site and exhibits a strong constitutive promoter activity in all cell lines tested. However, regions upstream of this sequence repress transcription more effectively in a non-malignant breast cell line as compared to breast cancer cell lines. The 249 bp region is GC-rich and includes consensus Sp1 sites, GC boxes, cAMP-responsive element (CRE) and other putative cis-elements. Mutational analysis showed that two intact conserved Sp1 binding sites (shown here to bind Sp1 and/or Sp3) are critical for constitutive promoter activity, while a negative role for a neighbouring GC box is indicated. The sequence of the core promoter is highly conserved in the mouse and Plu-1 expression in the mouse embryo has been documented. Using transgenesis, we therefore examined the ability of the 249 bp fragment to control expression of a reporter gene during embryogenesis. We found that not only is the core promoter sufficient to activate transcription in vivo, but that the expression of the reporter gene coincides both temporally and spatially with regions where endogenous Plu-1 is highly expressed. This suggests that tissue specific controlling elements are found within the short fragment and are functional in the embryonic environment.
A comparative genomics strategy for targeted discovery of single-nucleotide polymorphisms and conserved-noncoding sequences in orphan crops.

PubMed

Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H

2006-04-01

Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species.
A Comparative Genomics Strategy for Targeted Discovery of Single-Nucleotide Polymorphisms and Conserved-Noncoding Sequences in Orphan Crops

PubMed Central

Feltus, F.A.; Singh, H.P.; Lohithaswa, H.C.; Schulze, S.R.; Silva, T.D.; Paterson, A.H.

2006-01-01

Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031
Conservation of coevolving protein interfaces bridges prokaryote–eukaryote homologies in the twilight zone

PubMed Central

Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

2016-01-01

Protein–protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein–protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein–protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein–protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach. PMID:27965389
Conservation of coevolving protein interfaces bridges prokaryote-eukaryote homologies in the twilight zone.

PubMed

Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

2016-12-27

Protein-protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein-protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein-protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein-protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.
Targeting Conserved Genes in Penicillium Species.

PubMed

Peterson, Stephen W

2017-01-01

Polymerase chain reaction amplification of conserved genes and sequence analysis provides a very powerful tool for the identification of toxigenic as well as non-toxigenic Penicillium species. Sequences are obtained by amplification of the gene fragment, sequencing via capillary electrophoresis of dideoxynucleotide-labeled fragments or NGS. The sequences are compared to a database of validated isolates. Identification of species indicates the potential of the fungus to make particular mycotoxins.
Listeria costaricensis sp. nov.

PubMed

Núñez-Montero, Kattia; Leclercq, Alexandre; Moura, Alexandra; Vales, Guillaume; Peraza, Johnny; Pizarro-Cerdá, Javier; Lecuit, Marc

2018-03-01

A bacterial strain isolated from a food processing drainage system in Costa Rica fulfilled the criteria as belonging to the genus Listeria, but could not be assigned to any of the known species. Phylogenetic analysis based on the 16S rRNA gene revealed highest sequence similarity with the type strain of Listeria floridensis (98.7 %). Phylogenetic analysis based on Listeria core genomes placed the novel taxon within the Listeria fleischmannii, L. floridensis and Listeria aquatica clade (Listeria sensu lato). Whole-genome sequence analyses based on the average nucleotide blast identity (ANI<80 %) indicated that this isolate belonged to a novel species. Results of pairwise amino acid identity (AAI>70 %) and percentage of conserved proteins (POCP>68 %) with currently known Listeria species, as well as of biochemical characterization, confirmed that the strain constituted a novel species within the genus Listeria. The name Listeria costaricensis sp. nov. is proposed for the novel species, and is represented by the type strain CLIP 2016/00682T (=CIP 111400T=DSM 105474T).
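The percentage of conserved proteins (POCP) cited in the abstract above is commonly computed as 100 × (C1 + C2) / (T1 + T2), where C1 and C2 are the numbers of conserved proteins found in each genome of a pair and T1 and T2 are the total protein counts. The sketch below assumes that definition; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def pocp(conserved_1, conserved_2, total_1, total_2):
    """Percentage of conserved proteins between two genomes.

    Assumes the commonly used definition 100 * (C1 + C2) / (T1 + T2).
    """
    if total_1 <= 0 or total_2 <= 0:
        raise ValueError("total protein counts must be positive")
    return 100.0 * (conserved_1 + conserved_2) / (total_1 + total_2)


# Hypothetical counts, for illustration only.
example = pocp(conserved_1=2100, conserved_2=2150, total_1=2900, total_2=3000)
```

How a protein is scored as "conserved" (alignment identity, coverage, and E-value cutoffs) is a separate choice that this sketch leaves to the caller.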

Identifcation of a Novel Mutation p.I240T in the FRMD7 gene in a Family with Congenital Nystagmus

NASA Astrophysics Data System (ADS)

Zhu, Yihua; Zhuang, Jianfu; Ge, Xianglian; Zhang, Xiao; Wang, Zheng; Sun, Ji; Yang, Juhua; Gu, Feng

2013-10-01

Congenital Nystagmus (CN) is a genetically heterogeneous ocular disease, which causes a significant proportion of childhood visual impairment. To identify the underlying genetic defect of a CN family, twenty-two members were recruited. Genotype analysis showed that affected individuals shared a common haplotype with markers flanking the FRMD7 locus. Sequencing FRMD7 revealed a T > C transition in exon 8, causing a conservative substitution of Isoleucine to Threonine at codon 240. By protein structural modeling, we found the mutation may disrupt the hydrophobic core and destabilize the protein structure. We reviewed the literature and found that exons 2, 8, and 9 (11.4% of the sequence of FRMD7 mRNA) represent the majority (55.3%) of the reported FRMD7 mutations. In summary, we identified a novel mutation in FRMD7, showed its molecular consequence, and revealed the mutation-rich exons of the FRMD7 gene. Collectively, this provides molecular insights for future CN clinical genetic diagnosis and treatment.
Identifcation of a novel mutation p.I240T in the FRMD7 gene in a family with congenital nystagmus.

PubMed

Zhu, Yihua; Zhuang, Jianfu; Ge, Xianglian; Zhang, Xiao; Wang, Zheng; Sun, Ji; Yang, Juhua; Gu, Feng

2013-10-30

Congenital Nystagmus (CN) is a genetically heterogeneous ocular disease, which causes a significant proportion of childhood visual impairment. To identify the underlying genetic defect of a CN family, twenty-two members were recruited. Genotype analysis showed that affected individuals shared a common haplotype with markers flanking the FRMD7 locus. Sequencing FRMD7 revealed a T > C transition in exon 8, causing a conservative substitution of Isoleucine to Threonine at codon 240. By protein structural modeling, we found the mutation may disrupt the hydrophobic core and destabilize the protein structure. We reviewed the literature and found that exons 2, 8, and 9 (11.4% of the sequence of FRMD7 mRNA) represent the majority (55.3%) of the reported FRMD7 mutations. In summary, we identified a novel mutation in FRMD7, showed its molecular consequence, and revealed the mutation-rich exons of the FRMD7 gene. Collectively, this provides molecular insights for future CN clinical genetic diagnosis and treatment.
Identifcation of a Novel Mutation p.I240T in the FRMD7 gene in a Family with Congenital Nystagmus

PubMed Central

Zhu, Yihua; Zhuang, Jianfu; Ge, Xianglian; Zhang, Xiao; Wang, Zheng; Sun, Ji; Yang, Juhua; Gu, Feng

2013-01-01

Congenital Nystagmus (CN) is a genetically heterogeneous ocular disease, which causes a significant proportion of childhood visual impairment. To identify the underlying genetic defect of a CN family, twenty-two members were recruited. Genotype analysis showed that affected individuals shared a common haplotype with markers flanking the FRMD7 locus. Sequencing FRMD7 revealed a T > C transition in exon 8, causing a conservative substitution of Isoleucine to Threonine at codon 240. By protein structural modeling, we found the mutation may disrupt the hydrophobic core and destabilize the protein structure. We reviewed the literature and found that exons 2, 8, and 9 (11.4% of the sequence of FRMD7 mRNA) represent the majority (55.3%) of the reported FRMD7 mutations. In summary, we identified a novel mutation in FRMD7, showed its molecular consequence, and revealed the mutation-rich exons of the FRMD7 gene. Collectively, this provides molecular insights for future CN clinical genetic diagnosis and treatment. PMID:24169426
Integrated sequence stratigraphy of the postimpact sediments from the Eyreville core holes, Chesapeake Bay impact structure inner basin

USGS Publications Warehouse

Browning, J.V.; Miller, K.G.; McLaughlin, P.P.; Edwards, L.E.; Kulpecz, A.A.; Powars, D.S.; Wade, B.S.; Feigenson, M.D.; Wright, J.D.

2009-01-01

The Eyreville core holes provide the first continuously cored record of postimpact sequences from within the deepest part of the central Chesapeake Bay impact crater. We analyzed the upper Eocene to Pliocene postimpact sediments from the Eyreville A and C core holes for lithology (semiquantitative measurements of grain size and composition), sequence stratigraphy, and chronostratigraphy. Age is based primarily on Sr isotope stratigraphy supplemented by biostratigraphy (dinocysts, nannofossils, and planktonic foraminifers); age resolution is approximately ±0.5 Ma for early Miocene sequences and approximately ±1.0 Ma for younger and older sequences. Eocene-lower Miocene sequences are subtle, upper middle to lower upper Miocene sequences are more clearly distinguished, and upper Miocene-Pliocene sequences display a distinct facies pattern within sequences. We recognize two upper Eocene, two Oligocene, nine Miocene, three Pliocene, and one Pleistocene sequence and correlate them with those in New Jersey and Delaware. The upper Eocene through Pleistocene strata at Eyreville record changes from: (1) rapidly deposited, extremely fine-grained Eocene strata that probably represent two sequences deposited in a deep (>200 m) basin; to (2) highly dissected Oligocene (two very thin sequences) to lower Miocene (three thin sequences) with a long hiatus; to (3) a thick, rapidly deposited (43-73 m/Ma), very fine-grained, biosiliceous middle Miocene (16.5-14 Ma) section divided into three sequences (V5-V3) deposited in middle neritic paleoenvironments; to (4) a 4.5-Ma-long hiatus (12.8-8.3 Ma); to (5) sandy, shelly upper Miocene to Pliocene strata (8.3-2.0 Ma) divided into six sequences deposited in shelf and shoreface environments; and, last, to (6) a sandy middle Pleistocene paralic sequence (~400 ka). The Eyreville cores thus record the filling of a deep impact-generated basin where the timing of sequence boundaries is heavily influenced by eustasy. © 2009 The Geological Society of America.
Nucleotide sequence of the ribosomal RNA gene of Physarum polycephalum: intron 2 and its flanking regions of the 26S rRNA gene.

PubMed Central

Nomiyama, H; Kuhara, S; Kukita, T; Otsuka, T; Sakaki, Y

1981-01-01

The 26S ribosomal RNA gene of Physarum polycephalum is interrupted by two introns, and we have previously determined the sequence of one of them (intron 1) (Nomiyama et al. Proc. Natl. Acad. Sci. USA 78, 1376-1380, 1981). In this study we sequenced the second intron (intron 2) of about 0.5 kb length and its flanking regions, and found that one nucleotide at each junction is identical in intron 1 and intron 2, though the junction regions share no other sequence homology. Comparison of the flanking exon sequences to E. coli 23S rRNA sequences shows that conserved sequences are interspersed with tracts having little homology. In particular, the region encompassing the intron 2 interruption site is highly conserved. The E. coli ribosomal protein L1 binding region is also conserved. PMID:6171776
Nucleotide sequence determination of guinea-pig casein B mRNA reveals homology with bovine and rat alpha s1 caseins and conservation of the non-coding regions of the mRNA.

PubMed Central

Hall, L; Laird, J E; Craig, R K

1984-01-01

Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species. PMID:6548375
RNA connectivity requirements between conserved elements in the core of the yeast telomerase RNP

PubMed Central

Mefford, Melissa A; Rafiq, Qundeel; Zappulla, David C

2013-01-01

Telomerase is a specialized chromosome end-replicating enzyme required for genome duplication in many eukaryotes. An RNA and reverse transcriptase protein subunit comprise its enzymatic core. Telomerase is evolving rapidly, particularly its RNA component. Nevertheless, nearly all telomerase RNAs, including those of H. sapiens and S. cerevisiae, share four conserved structural elements: a core-enclosing helix (CEH), template-boundary element, template, and pseudoknot, in this order along the RNA. It is not clear how these elements coordinate telomerase activity. We find that although rearranging the order of the four conserved elements in the yeast telomerase RNA subunit, TLC1, disrupts activity, the RNA ends can be moved between the template and pseudoknot in vitro and in vivo. However, the ends disrupt activity when inserted between the other structured elements, defining an Area of Required Connectivity (ARC). Within the ARC, we find that only the junction nucleotides between the pseudoknot and CEH are essential. Integrating all of our findings provides a basic map of functional connections in the core of the yeast telomerase RNP and a framework to understand conserved element coordination in telomerase mechanism. PMID:24129512
ElemeNT: a computational tool for detecting core promoter elements.

PubMed

Sloutskin, Anna; Danino, Yehuda M; Orenstein, Yaron; Zehavi, Yonathan; Doniger, Tirza; Shamir, Ron; Juven-Gershon, Tamar

2015-01-01

Core promoter elements play a pivotal role in the transcriptional output, yet they are often detected manually within sequences of interest. Here, we present 2 contributions to the detection and curation of core promoter elements within given sequences. First, the Elements Navigation Tool (ElemeNT) is a user-friendly web-based, interactive tool for prediction and display of putative core promoter elements and their biologically-relevant combinations. Second, the CORE database summarizes ElemeNT-predicted core promoter elements near CAGE and RNA-seq-defined Drosophila melanogaster transcription start sites (TSSs). ElemeNT's predictions are based on biologically-functional core promoter elements, and can be used to infer core promoter compositions. ElemeNT does not assume prior knowledge of the actual TSS position, and can therefore assist in annotation of any given sequence. These resources, freely accessible at http://lifefaculty.biu.ac.il/gershon-tamar/index.php/resources, facilitate the identification of core promoter elements as active contributors to gene expression.
Transcriptome de novo assembly sequencing and analysis of the toxic dinoflagellate Alexandrium catenella using the Illumina platform.

PubMed

Zhang, Shu; Sui, Zhenghong; Chang, Lianpeng; Kang, Kyoungho; Ma, Jinhua; Kong, Fanna; Zhou, Wei; Wang, Jinguo; Guo, Liliang; Geng, Huili; Zhong, Jie; Ma, Qingxia

2014-03-10

In this article, high-throughput de novo transcriptomic sequencing was performed in Alexandrium catenella, which provided the first view of the gene repertoire in this dinoflagellate based on next-generation sequencing (NGS) technologies. A total of 118,304 unigenes were identified with an average length of 673 bp. Of these unigenes, 77,936 (65.9%) were annotated with known proteins based on sequence similarities, among which 24,149 and 22,956 unigenes were assigned to gene ontology categories (GO) and clusters of orthologous groups (COGs), respectively. Furthermore, 16,467 unigenes were mapped onto 322 pathways using the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG). We also detected 1143 simple sequence repeats (SSRs), in which the tri-nucleotide repeat motif (69.3%) was the most abundant. The genetic facts and significance derived from the transcriptome dataset were suggested and discussed. All four core nucleosomal histones and linker histones were detected, in addition to the unigenes involved in histone modifications. A total of 190 unigenes were identified as being involved in the endocytosis pathway, and clathrin-dependent endocytosis was suggested to play a role in the heterotrophy of A. catenella. A conserved 22-nt spliced leader (SL) was identified in 21 unigenes, which suggested the existence of trans-splicing processing of mRNA in A. catenella. Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.
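The abstract above reports simple sequence repeats (SSRs) without naming the detection method. As a minimal sketch of how perfect SSRs can be located in a transcript, the following uses a regular-expression back-reference; the function name, the minimum repeat count, and the motif length range are assumptions for illustration and are not the parameters used in the study.

```python
import re


def find_ssrs(seq, min_repeats=5, motif_len_range=(2, 6)):
    """Return (start, motif, copies) for perfect tandem repeats in seq.

    Defaults are illustrative assumptions; real SSR tools usually apply
    a different minimum copy number for each motif length.
    """
    seq = seq.upper()
    shortest, longest = motif_len_range
    results = []
    for k in range(shortest, longest + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "ATAT" or "AAA"), so each repeat is reported once.
            if motif in (motif + motif)[1:-1]:
                continue
            results.append((match.start(), motif, len(match.group(0)) // k))
    return results


ssrs = find_ssrs("GGC" + "CAG" * 6 + "TTA")
```

Here `ssrs` holds a single tri-nucleotide repeat, `(3, "CAG", 6)`, the motif class the abstract reports as most abundant in A. catenella.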
Pleurochrysome: A Web Database of Pleurochrysis Transcripts and Orthologs Among Heterogeneous Algae

PubMed Central

Fujiwara, Shoko; Takatsuka, Yukiko; Hirokawa, Yasutaka; Tsuzuki, Mikio; Takano, Tomoyuki; Kobayashi, Masaaki; Suda, Kunihiro; Asamizu, Erika; Yokoyama, Koji; Shibata, Daisuke; Tabata, Satoshi; Yano, Kentaro

2016-01-01

Pleurochrysis is a coccolithophorid genus, which belongs to the Coccolithales in the Haptophyta. The genus has been used extensively for biological research, together with Emiliania in the Isochrysidales, to understand distinctive features between the two coccolithophorid-including orders. However, molecular biological research on Pleurochrysis such as elucidation of the molecular mechanism behind coccolith formation has not made great progress at least in part because of lack of comprehensive gene information. To provide such information to the research community, we built an open web database, the Pleurochrysome (http://bioinf.mind.meiji.ac.jp/phapt/), which currently stores 9,023 unique gene sequences (designated as UNIGENEs) assembled from expressed sequence tag sequences of P. haptonemofera as core information. The UNIGENEs were annotated with gene sequences sharing significant homology, conserved domains, Gene Ontology, KEGG Orthology, predicted subcellular localization, open reading frames and orthologous relationship with genes of 10 other algal species, a cyanobacterium and the yeast Saccharomyces cerevisiae. This sequence and annotation information can be easily accessed via several search functions. Besides fundamental functions such as BLAST and keyword searches, this database also offers search functions to explore orthologous genes in the 12 organisms and to seek novel genes. The Pleurochrysome will promote molecular biological and phylogenetic research on coccolithophorids and other haptophytes by helping scientists mine data from the primary transcriptome of P. haptonemofera. PMID:26746174
Activation of Adhesion G Protein-coupled Receptors: AGONIST SPECIFICITY OF STACHEL SEQUENCE-DERIVED PEPTIDES.

PubMed

Demberg, Lilian M; Winkler, Jana; Wilde, Caroline; Simon, Kay-Uwe; Schön, Julia; Rothemund, Sven; Schöneberg, Torsten; Prömel, Simone; Liebscher, Ines

2017-03-17

Members of the adhesion G protein-coupled receptor (aGPCR) family carry an agonistic sequence within their large ectodomains. Peptides derived from this region, called the Stachel sequence, can activate the respective receptor. As the conserved core region of the Stachel sequence is highly similar between aGPCRs, the agonist specificity of Stachel sequence-derived peptides was tested between family members using cell culture-based second messenger assays. Stachel peptides derived from aGPCRs of subfamily VI (GPR110/ADGRF1, GPR116/ADGRF5) and subfamily VIII (GPR64/ADGRG2, GPR126/ADGRG6) are able to activate more than one member of the respective subfamily supporting their evolutionary relationship and defining them as pharmacological receptor subtypes. Extended functional analyses of the Stachel sequences and derived peptides revealed agonist promiscuity, not only within, but also between aGPCR subfamilies. For example, the Stachel-derived peptide of GPR110 (subfamily VI) can activate GPR64 and GPR126 (both subfamily VIII). Our results indicate that key residues in the Stachel sequence are very similar between aGPCRs allowing for agonist promiscuity of several Stachel-derived peptides. Therefore, aGPCRs appear to be pharmacologically more closely related than previously thought. Our findings have direct implications for many aGPCR studies, as potential functional overlap has to be considered for in vitro and in vivo studies. However, it also offers the possibility of a broader use of more potent peptides when the original Stachel sequence is less effective. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Cryo-EM near-atomic structure of a dsRNA fungal virus shows ancient structural motifs preserved in the dsRNA viral lineage

PubMed Central

Luque, Daniel; Gómez-Blanco, Josué; Garriga, Damiá; Brilot, Axel F.; González, José M.; Havens, Wendy M.; Carrascosa, José L.; Trus, Benes L.; Verdaguer, Nuria; Ghabrial, Said A.; Castón, José R.

2014-01-01

Viruses evolve so rapidly that sequence-based comparison is not suitable for detecting relatedness among distant viruses. Structure-based comparisons suggest that evolution led to a small number of viral classes or lineages that can be grouped by capsid protein (CP) folds. Here, we report that the CP structure of the fungal dsRNA Penicillium chrysogenum virus (PcV) shows the progenitor fold of the dsRNA virus lineage and suggests a relationship between lineages. Cryo-EM structure at near-atomic resolution showed that the 982-aa PcV CP is formed by a repeated α-helical core, indicative of gene duplication despite lack of sequence similarity between the two halves. Superimposition of secondary structure elements identified a single “hotspot” at which variation is introduced by insertion of peptide segments. Structural comparison of PcV and other distantly related dsRNA viruses detected preferential insertion sites at which the complexity of the conserved α-helical core, made up of ancestral structural motifs that have acted as a skeleton, might have increased, leading to evolution of the highly varied current structures. Analyses of structural motifs only apparent after systematic structural comparisons indicated that the hallmark fold preserved in the dsRNA virus lineage shares a long (spinal) α-helix tangential to the capsid surface with the head-tailed phage and herpesvirus viral lineage. PMID:24821769
Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

PubMed Central

Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

2015-01-01

Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930
Identification of MicroRNAs in the Coral Stylophora pistillata

PubMed Central

Liew, Yi Jin; Aranda, Manuel; Carr, Adrian; Baumgarten, Sebastian; Zoccola, Didier; Tambutté, Sylvie; Allemand, Denis; Micklem, Gos; Voolstra, Christian R.

2014-01-01

Coral reefs are major contributors to marine biodiversity. However, they are in rapid decline due to global environmental changes such as rising sea surface temperatures, ocean acidification, and pollution. Genomic and transcriptomic analyses have broadened our understanding of coral biology, but a study of the microRNA (miRNA) repertoire of corals is missing. miRNAs constitute a class of small non-coding RNAs of ∼22 nt in size that play crucial roles in development, metabolism, and stress response in plants and animals alike. In this study, we examined the coral Stylophora pistillata for the presence of miRNAs and the corresponding core protein machinery required for their processing and function. Based on small RNA sequencing, we present evidence for 31 bona fide microRNAs, 5 of which (miR-100, miR-2022, miR-2023, miR-2030, and miR-2036) are conserved in other metazoans. Homologues of Argonaute, Piwi, Dicer, Drosha, Pasha, and HEN1 were identified in the transcriptome of S. pistillata based on strong sequence conservation with known RNAi proteins, with additional support derived from phylogenetic trees. Examination of putative miRNA gene targets indicates potential roles in development, metabolism, immunity, and biomineralisation for several of the microRNAs. Here, we present first evidence of a functional RNAi machinery and five conserved miRNAs in S. pistillata, implying that miRNAs play a role in organismal biology of scleractinian corals. Analysis of predicted miRNA target genes in S. pistillata suggests potential roles of miRNAs in symbiosis and coral calcification. Given the importance of miRNAs in regulating gene expression in other metazoans, further expression analyses of small non-coding RNAs in transcriptional studies of corals should be informative about miRNA-affected processes and pathways. PMID:24658574
Scop3D: three-dimensional visualization of sequence conservation.

PubMed

Vermeire, Tessa; Vermaere, Stijn; Schepens, Bert; Saelens, Xavier; Van Gucht, Steven; Martens, Lennart; Vandermarliere, Elien

2015-04-01

The integration of a protein's structure with its known sequence variation provides insight on how that protein evolves, for instance in terms of (changing) function or immunogenicity. Yet, collating the corresponding sequence variants into a multiple sequence alignment, calculating each position's conservation, and mapping this information back onto a relevant structure is not straightforward. We therefore built the Sequence Conservation on Protein 3D structure (scop3D) tool to perform these tasks automatically. The output consists of two modified PDB files in which the B-values for each position are replaced by the percentage sequence conservation, or the information entropy for each position, respectively. Furthermore, text files with absolute and relative amino acid occurrences for each position are also provided, along with snapshots of the protein from six distinct directions in space. The visualization provided by scop3D can for instance be used as an aid in vaccine development or to identify antigenic hotspots, which we here demonstrate based on an analysis of the fusion proteins of human respiratory syncytial virus and mumps virus. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
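The two per-position quantities mentioned in this abstract (percentage sequence conservation and information entropy) can be illustrated with a minimal sketch. This is not the scop3D implementation: the function name `column_stats`, the toy alignment, and the choice to treat every character (including gaps) as an ordinary symbol are assumptions made for illustration only.

```python
import math
from collections import Counter


def column_stats(alignment):
    """Return a (percent_conservation, entropy_bits) pair for each alignment column.

    Hypothetical helper: conservation is the share of the most common
    symbol in the column; entropy is Shannon entropy in bits.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    stats = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        total = sum(counts.values())
        most_common_count = counts.most_common(1)[0][1]
        conservation = 100.0 * most_common_count / total
        # Shannon entropy: sum over symbols of p * log2(1 / p)
        entropy = sum((n / total) * math.log2(total / n) for n in counts.values())
        stats.append((conservation, entropy))
    return stats


# Toy alignment (made-up sequences, not data from the paper).
toy_alignment = ["MKTA", "MKSA", "MRTA", "MKTG"]
for pos, (cons, ent) in enumerate(column_stats(toy_alignment), start=1):
    print(f"position {pos}: {cons:.0f}% conserved, entropy {ent:.3f} bits")
```

A fully conserved column gives 100% conservation and zero entropy; values like these are what the abstract describes as being written into the B-value field of the output PDB files.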
Sequencing Needs for Viral Diagnostics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gardner, S N; Lam, M; Mulakken, N J

2004-01-26

We built a system to guide decisions regarding the amount of genomic sequencing required to develop diagnostic DNA signatures, which are short sequences that are sufficient to uniquely identify a viral species. We used our existing DNA diagnostic signature prediction pipeline, which selects regions of a target species genome that are conserved among strains of the target (for reliability, to prevent false negatives) and unique relative to other species (for specificity, to avoid false positives). We performed simulations, based on existing sequence data, to assess the number of genome sequences of a target species and of close phylogenetic relatives ("near neighbors") that are required to predict diagnostic signature regions that are conserved among strains of the target species and unique relative to other bacterial and viral species. For DNA viruses such as variola (smallpox), three target genomes provide sufficient guidance for selecting species-wide signatures. Three near neighbor genomes are critical for species specificity. In contrast, most RNA viruses require four target genomes and no near neighbor genomes, since lack of conservation among strains is more limiting than uniqueness. SARS and Ebola Zaire are exceptional, as additional target genomes currently do not improve predictions, but near neighbor sequences are urgently needed. Our results also indicate that double-stranded DNA viruses are more conserved among strains than are RNA viruses, since in most cases there was at least one conserved signature candidate for the DNA viruses and zero conserved signature candidates for the RNA viruses.
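The selection criterion described here (conserved among target strains, unique relative to other species) can be sketched with exact k-mer matching. This is a simplified illustration, not the authors' pipeline: the names `kmers` and `signature_candidates`, the toy sequences, and the value of `k` are assumptions, and the sketch ignores reverse complements, mismatches, and primer design constraints.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_candidates(targets, near_neighbors, k):
    """k-mers present in every target genome and absent from every near neighbor.

    Hypothetical helper illustrating the 'conserved and unique' idea.
    """
    if not targets:
        raise ValueError("at least one target genome is required")
    # Conserved: shared by all target strains (guards against false negatives).
    conserved = set.intersection(*(kmers(t, k) for t in targets))
    # Background: anything seen in a near neighbor (guards against false positives).
    background = set().union(*(kmers(n, k) for n in near_neighbors))
    return sorted(conserved - background)


# Toy sequences (made up for illustration).
targets = ["ACGTTGCA", "TTACGTTG", "GACGTTGA"]
near_neighbors = ["CCGTTGCC"]

print(signature_candidates(targets, [], k=5))              # conserved only
print(signature_candidates(targets, near_neighbors, k=5))  # conserved and unique
```

In this toy case, adding the near neighbor removes one of the two conserved candidates, which mirrors the abstract's point that near neighbor genomes matter for specificity while additional target genomes matter for conservation.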
Cloning and sequence analysis of a cDNA encoding the alpha-subunit of mouse beta-N-acetylhexosaminidase and comparison with the human enzyme.

PubMed Central

Beccari, T; Hoade, J; Orlacchio, A; Stirling, J L

1992-01-01

cDNAs encoding the mouse beta-N-acetylhexosaminidase alpha-subunit were isolated from a mouse testis library. The longest of these (1.7 kb) was sequenced and showed 83% similarity with the human alpha-subunit cDNA sequence. The 5' end of the coding sequence was obtained from a genomic DNA clone. Alignment of the human and mouse sequences showed that all three putative N-glycosylation sites are conserved, but that the mouse alpha-subunit has an additional site towards the C-terminus. All eight cysteines in the human sequence are conserved in the mouse. There are an additional two cysteines in the mouse alpha-subunit signal peptide. All amino acids affected in Tay-Sachs-disease mutations are conserved in the mouse. PMID:1379046
Increasing Sequence Diversity with Flexible Backbone Protein Design: The Complete Redesign of a Protein Hydrophobic Core

DOE Office of Scientific and Technical Information (OSTI.GOV)

Murphy, Grant S.; Mills, Jeffrey L.; Miley, Michael J.

2015-10-15

Protein design tests our understanding of protein stability and structure. Successful design methods should allow the exploration of sequence space not found in nature. However, when redesigning naturally occurring protein structures, most fixed backbone design algorithms return amino acid sequences that share strong sequence identity with wild-type sequences, especially in the protein core. This behavior places a restriction on functional space that can be explored and is not consistent with observations from nature, where sequences of low identity have similar structures. Here, we allow backbone flexibility during design to mutate every position in the core (38 residues) of a four-helix bundle protein. Only small perturbations to the backbone, 1-2 Å, were needed to entirely mutate the core. The redesigned protein, DRNN, is exceptionally stable (melting point >140 °C). An NMR and X-ray crystal structure show that the side chains and backbone were accurately modeled (all-atom RMSD = 1.3 Å).
Genome-wide identification of conserved microRNA and their response to drought stress in Dongxiang wild rice (Oryza rufipogon Griff.).

PubMed

Zhang, Fantao; Luo, Xiangdong; Zhou, Yi; Xie, Jiankun

2016-04-01

To identify drought stress-responsive conserved microRNA (miRNA) from Dongxiang wild rice (Oryza rufipogon Griff., DXWR) on a genome-wide scale, high-throughput sequencing technology was used to sequence libraries of DXWR samples treated with and without drought stress. A total of 505 conserved miRNAs corresponding to 215 families were identified. Of these, 17 were significantly down-regulated and 16 were up-regulated under drought stress. Stem-loop qRT-PCR revealed the same expression patterns as high-throughput sequencing, suggesting that the accuracy of the sequencing result was high. Potential target genes of the drought-responsive miRNA were predicted to be involved in diverse biological processes. Furthermore, 16 miRNA families were identified for the first time as being involved in the drought stress response of plants. These results present a comprehensive view of the conserved miRNA and their expression patterns under drought stress for DXWR, which will provide valuable information and sequence resources for future basic studies.
Conserved intergenic sequences revealed by CTAG-profiling in Salmonella: thermodynamic modeling for function prediction

NASA Astrophysics Data System (ADS)

Tang, Le; Zhu, Songling; Mastriani, Emilio; Fang, Xin; Zhou, Yu-Jie; Li, Yong-Guo; Johnston, Randal N.; Guo, Zheng; Liu, Gui-Rong; Liu, Shu-Lin

2017-03-01

Highly conserved short sequences help identify functional genomic regions and facilitate genomic annotation. We used Salmonella as the model to search the genome for evolutionarily conserved regions and focused on the tetranucleotide sequence CTAG for its potentially important functions. In Salmonella, CTAG is highly conserved across the lineages and large numbers of CTAG-containing short sequences fall in intergenic regions, strongly indicating their biological importance. Computer modeling demonstrated stable stem-loop structures in some of the CTAG-containing intergenic regions, and substitution of a nucleotide of the CTAG sequence would radically rearrange the free energy and disrupt the structure. The postulated degeneration of CTAG takes distinct patterns among Salmonella lineages and provides novel information about genomic divergence and evolution of these bacterial pathogens. Comparison of the vertically and horizontally transmitted genomic segments showed different CTAG distribution landscapes, with the genome amelioration process to remove CTAG taking place inward from both terminals of the horizontally acquired segment.
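As a rough illustration of what profiling a tetranucleotide involves, the sketch below locates every CTAG occurrence in a sequence and separates hits that fall inside annotated genes from intergenic ones. It is not the authors' CTAG-profiling method: the function names, the toy sequence, and the gene coordinates (0-based, half-open intervals) are hypothetical. Because CTAG is its own reverse complement, scanning one strand is sufficient for this motif.

```python
def find_motif(seq, motif="CTAG"):
    """Return 0-based start positions of all (possibly overlapping) motif hits."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def split_by_region(positions, genes, motif_len=4):
    """Split hit positions into (genic, intergenic) lists.

    genes: list of (start, end) intervals, 0-based and half-open.
    A hit counts as genic only if it lies entirely within one gene.
    """
    genic, intergenic = [], []
    for p in positions:
        inside = any(s <= p and p + motif_len <= e for s, e in genes)
        (genic if inside else intergenic).append(p)
    return genic, intergenic


# Toy sequence and a single made-up gene interval.
seq = "AACTAGGGCCCTAGTTTTCTAGA"
genes = [(8, 16)]

hits = find_motif(seq)
genic, intergenic = split_by_region(hits, genes)
print("all hits:", hits)
print("genic:", genic, "intergenic:", intergenic)
```

Comparing the genic and intergenic counts (normalised by region length) across genomes is one simple way to ask whether a motif is over- or under-represented between genes.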

Sequencing of mitochondrial genomes of nine Aspergillus and Penicillium species identifies mobile introns and accessory genes as main sources of genome size variability.

PubMed

Joardar, Vinita; Abrams, Natalie F; Hostetler, Jessica; Paukstelis, Paul J; Pakala, Suchitra; Pakala, Suman B; Zafar, Nikhat; Abolude, Olukemi O; Payne, Gary; Andrianopoulos, Alex; Denning, David W; Nierman, William C

2012-12-12

The genera Aspergillus and Penicillium include some of the most beneficial as well as the most harmful fungal species such as the penicillin-producer Penicillium chrysogenum and the human pathogen Aspergillus fumigatus, respectively. Their mitochondrial genomic sequences may hold vital clues into the mechanisms of their evolution, population genetics, and biology, yet only a handful of these genomes have been fully sequenced and annotated. Here we report the complete sequence and annotation of the mitochondrial genomes of six Aspergillus and three Penicillium species: A. fumigatus, A. clavatus, A. oryzae, A. flavus, Neosartorya fischeri (A. fischerianus), A. terreus, P. chrysogenum, P. marneffei, and Talaromyces stipitatus (P. stipitatum). The accompanying comparative analysis of these and related publicly available mitochondrial genomes reveals wide variation in size (25-36 Kb) among these closely related fungi. The sources of genome expansion include group I introns and accessory genes encoding putative homing endonucleases, DNA and RNA polymerases (presumed to be of plasmid origin) and hypothetical proteins. The two smallest sequenced genomes (A. terreus and P. chrysogenum) do not contain introns in protein-coding genes, whereas the largest genome (T. stipitatus), contains a total of eleven introns. All of the sequenced genomes have a group I intron in the large ribosomal subunit RNA gene, suggesting that this intron is fixed in these species. Subsequent analysis of several A. fumigatus strains showed low intraspecies variation. This study also includes a phylogenetic analysis based on 14 concatenated core mitochondrial proteins. The phylogenetic tree has a different topology from published multilocus trees, highlighting the challenges still facing the Aspergillus systematics. 
The study expands the genomic resources available to fungal biologists by providing mitochondrial genomes with consistent annotations for future genetic, evolutionary and population studies. Despite the conservation of the core genes, the mitochondrial genomes of Aspergillus and Penicillium species examined here exhibit significant amount of interspecies variation. Most of this variation can be attributed to accessory genes and mobile introns, presumably acquired by horizontal gene transfer of mitochondrial plasmids and intron homing.
A core microbiome associated with the peritoneal tumors of pseudomyxoma peritonei

PubMed Central

2013-01-01

Background: Pseudomyxoma peritonei (PMP) is a malignancy characterized by dissemination of mucus-secreting cells throughout the peritoneum. This disease is associated with significant morbidity and mortality and despite effective treatment options for early-stage disease, patients with PMP often relapse. Thus, there is a need for additional treatment options to reduce relapse rate and increase long-term survival. A previous study identified the presence of both typed and non-culturable bacteria associated with PMP tissue and determined that increased bacterial density was associated with more severe disease. These findings highlighted the possible role for bacteria in PMP disease. Methods: To more clearly define the bacterial communities associated with PMP disease, we employed a sequence-based analysis to profile the bacterial populations found in PMP tumor and mucin tissue in 11 patients. Sequencing data were confirmed by in situ hybridization at multiple taxonomic depths and by culturing. A pilot clinical study was initiated to determine whether the addition of antibiotic therapy affected PMP patient outcome. Main results: We determined that the types of bacteria present are highly conserved in all PMP patients; the dominant phyla are the Proteobacteria, Actinobacteria, Firmicutes and Bacteroidetes. A core set of taxon-specific sequences was found in all 11 patients; many of these sequences were classified into taxonomic groups that also contain known human pathogens. In situ hybridization directly confirmed the presence of bacteria in PMP at multiple taxonomic depths and supported our sequence-based analysis. Furthermore, culturing of PMP tissue samples allowed us to isolate 11 different bacterial strains from eight independent patients, and in vitro analysis of a subset of these isolates suggests that at least some of these strains may interact with the PMP-associated mucin MUC2. Finally, we provide evidence suggesting that targeting these bacteria with antibiotic treatment may increase the survival of PMP patients. Conclusions: Using 16S amplicon-based sequencing, direct in situ hybridization analysis and culturing methods, we have identified numerous bacterial taxa that are consistently present in all PMP patients tested. Combined with data from a pilot clinical study, these data support the hypothesis that adding antimicrobials to the standard PMP treatment could improve PMP patient survival. PMID:23844722
Specific Internalisation of Gold Nanoparticles into Engineered Porous Protein Cages via Affinity Binding

PubMed Central

Peng, Tao; Free, Paul; Fernig, David G.; Lim, Sierin; Tomczak, Nikodem

2016-01-01

Porous protein cages are supramolecular protein self-assemblies presenting pores that allow the access of surrounding molecules and ions into their core in order to store and transport them in biological environments. Protein cages’ pores are attractive channels for the internalisation of inorganic nanoparticles and an alternative for the preparation of hybrid bioinspired nanoparticles. However, strategies based on nanoparticle transport through the pores are largely unexplored, due to the difficulty of tailoring nanoparticles that have diameters commensurate with the pores size and simultaneously displaying specific affinity to the cages’ core and low non-specific binding to the cages’ outer surface. We evaluated the specific internalisation of single small gold nanoparticles, 3.9 nm in diameter, into porous protein cages via affinity binding. The E2 protein cage derived from the Geobacillus stearothermophilus presents 12 pores, 6 nm in diameter, and an empty core of 13 nm in diameter. We engineered the E2 protein by site-directed mutagenesis with oligohistidine sequences exposing them into the cage’s core. Dynamic light scattering and electron microscopy analysis show that the structures of E2 protein cages mutated with bis- or penta-histidine sequences are well conserved. The surface of the gold nanoparticles was passivated with a self-assembled monolayer made of a mixture of short peptidols and thiolated alkane ethylene glycol ligands. Such monolayers are found to provide thin coatings preventing non-specific binding to proteins. Further functionalisation of the peptide coated gold nanoparticles with Ni2+ nitrilotriacetic moieties enabled the specific binding to oligohistidine tagged cages. The internalisation via affinity binding was evaluated by electron microscopy analysis. From the various mutations tested, only the penta-histidine mutated E2 protein cage showed repeatable and stable internalisation. 
The present work overcomes the limitations of currently available approaches and provides a new route to design tailored and well-controlled hybrid nanoparticles. PMID:27622533
Sequence and structural implications of a bovine corneal keratan sulfate proteoglycan core protein. Protein 37B represents bovine lumican and proteins 37A and 25 are unique

NASA Technical Reports Server (NTRS)

Funderburgh, J. L.; Funderburgh, M. L.; Brown, S. J.; Vergnes, J. P.; Hassell, J. R.; Mann, M. M.; Conrad, G. W.; Spooner, B. S. (Principal Investigator)

1993-01-01

Amino acid sequence from tryptic peptides of three different bovine corneal keratan sulfate proteoglycan (KSPG) core proteins (designated 37A, 37B, and 25) showed similarities to the sequence of a chicken KSPG core protein lumican. Bovine lumican cDNA was isolated from a bovine corneal expression library by screening with chicken lumican cDNA. The bovine cDNA codes for a 342-amino acid protein, M(r) 38,712, containing amino acid sequences identified in the 37B KSPG core protein. The bovine lumican is 68% identical to chicken lumican, with an 83% identity excluding the N-terminal 40 amino acids. Location of 6 cysteine and 4 consensus N-glycosylation sites in the bovine sequence were identical to those in chicken lumican. Bovine lumican had about 50% identity to bovine fibromodulin and 20% identity to bovine decorin and biglycan. About two-thirds of the lumican protein consists of a series of 10 amino acid leucine-rich repeats that occur in regions of calculated high beta-hydrophobic moment, suggesting that the leucine-rich repeats contribute to beta-sheet formation in these proteins. Sequences obtained from 37A and 25 core proteins were absent in bovine lumican, thus predicting a unique primary structure and separate mRNA for each of the three bovine KSPG core proteins.
Solving the Problem: Genome Annotation Standards before the Data Deluge.

PubMed

Klimke, William; O'Donovan, Claire; White, Owen; Brister, J Rodney; Clark, Karen; Fedorov, Boris; Mizrachi, Ilene; Pruitt, Kim D; Tatusova, Tatiana

2011-10-15

The promise of genome sequencing was that the vast undiscovered country would be mapped out by comparison of the multitude of sequences available and would aid researchers in deciphering the role of each gene in every organism. Researchers recognize that there is a need for high quality data. However, different annotation procedures, numerous databases, and a diminishing percentage of experimentally determined gene functions have resulted in a spectrum of annotation quality. NCBI in collaboration with sequencing centers, archival databases, and researchers, has developed the first international annotation standards, a fundamental step in ensuring that high quality complete prokaryotic genomes are available as gold standard references. Highlights include the development of annotation assessment tools, community acceptance of protein naming standards, comparison of annotation resources to provide consistent annotation, and improved tracking of the evidence used to generate a particular annotation. The development of a set of minimal standards, including the requirement for annotated complete prokaryotic genomes to contain a full set of ribosomal RNAs, transfer RNAs, and proteins encoding core conserved functions, is an historic milestone. The use of these standards in existing genomes and future submissions will increase the quality of databases, enabling researchers to make accurate biological discoveries.
The Ditylenchus destructor genome provides new insights into the evolution of plant parasitic nematodes

PubMed Central

Zheng, Jinshui; Peng, Donghai; Chen, Ling; Liu, Hualin; Chen, Feng; Xu, Mengci; Ju, Shouyong; Ruan, Lifang

2016-01-01

Plant-parasitic nematodes were found in 4 of the 12 clades of phylum Nematoda. These nematodes in different clades may have originated independently from their free-living fungivorous ancestors. However, the exact evolutionary process of these parasites is unclear. Here, we sequenced the genome of a migratory plant nematode, Ditylenchus destructor. We performed comparative genomics among the free-living nematode Caenorhabditis elegans and all the plant nematodes with genome sequences available. We found that, compared with C. elegans, the core developmental control processes underwent heavy reduction, though most signal transduction pathways were conserved. We also found that D. destructor contained more homologues of the key genes in the above processes than the other plant nematodes. We suggest that Ditylenchus spp. may represent an intermediate evolutionary stage between free-living nematodes that feed on fungi and obligate plant-parasitic nematodes. Based on the facts that D. destructor can feed on fungi and has a relatively short life cycle, and that it has similar features to both C. elegans and sedentary plant-parasitic nematodes from clade 12, we propose it as a new model to study the biology and biocontrol of plant nematodes and the interaction between nematodes and plants. PMID:27466450
The gene coding for small ribosomal subunit RNA in the basidiomycete Ustilago maydis contains a group I intron.

PubMed Central

De Wachter, R; Neefs, J M; Goris, A; Van de Peer, Y

1992-01-01

The nucleotide sequence of the gene coding for small ribosomal subunit RNA in the basidiomycete Ustilago maydis was determined. It revealed the presence of a group I intron with a length of 411 nucleotides. This is the third occurrence of such an intron discovered in a small subunit rRNA gene encoded by a eukaryotic nuclear genome. The other two occurrences are in Pneumocystis carinii, a fungus of uncertain taxonomic status, and Ankistrodesmus stipitatus, a green alga. The nucleotides of the conserved core structure of 101 group I intron sequences present in different genes and genome types were aligned and their evolutionary relatedness was examined. This revealed a cluster including all group I introns hitherto found in eukaryotic nuclear genes coding for small and large subunit rRNAs. A secondary structure model was designed for the area of the Ustilago maydis small ribosomal subunit RNA precursor where the intron is situated. It shows that the internal guide sequence pairing with the intron boundaries fits between two helices of the small subunit rRNA, and that minimal rearrangement of base pairs suffices to achieve the definitive secondary structure of the 18S rRNA upon splicing. PMID:1561081
Solving the Problem: Genome Annotation Standards before the Data Deluge

PubMed Central

Klimke, William; O'Donovan, Claire; White, Owen; Brister, J. Rodney; Clark, Karen; Fedorov, Boris; Mizrachi, Ilene; Pruitt, Kim D.; Tatusova, Tatiana

2011-01-01

The promise of genome sequencing was that the vast undiscovered country would be mapped out by comparison of the multitude of sequences available and would aid researchers in deciphering the role of each gene in every organism. Researchers recognize that there is a need for high quality data. However, different annotation procedures, numerous databases, and a diminishing percentage of experimentally determined gene functions have resulted in a spectrum of annotation quality. NCBI in collaboration with sequencing centers, archival databases, and researchers, has developed the first international annotation standards, a fundamental step in ensuring that high quality complete prokaryotic genomes are available as gold standard references. Highlights include the development of annotation assessment tools, community acceptance of protein naming standards, comparison of annotation resources to provide consistent annotation, and improved tracking of the evidence used to generate a particular annotation. The development of a set of minimal standards, including the requirement for annotated complete prokaryotic genomes to contain a full set of ribosomal RNAs, transfer RNAs, and proteins encoding core conserved functions, is an historic milestone. The use of these standards in existing genomes and future submissions will increase the quality of databases, enabling researchers to make accurate biological discoveries. PMID:22180819
Hallmarks of Hepatitis C Virus in Equine Hepacivirus

PubMed Central

Tanaka, Tomohisa; Kasai, Hirotake; Yamashita, Atsuya; Okuyama-Dobashi, Kaori; Yasumoto, Jun; Maekawa, Shinya; Enomoto, Nobuyuki; Okamoto, Toru; Matsuura, Yoshiharu; Morimatsu, Masami; Manabe, Noboru; Ochiai, Kazuhiko; Yamashita, Kazuto

2014-01-01

ABSTRACT Equine hepacivirus (EHcV) has been identified as a closely related homologue of hepatitis C virus (HCV) in the United States, the United Kingdom, and Germany, but not in Asian countries. In this study, we genetically and serologically screened 31 serum samples obtained from Japanese-born domestic horses for EHcV infection and subsequently identified 11 PCR-positive and 7 seropositive serum samples. We determined the full sequence of the EHcV genome, including the 3′ untranslated region (UTR), which had previously not been completely revealed. The polyprotein of a Japanese EHcV strain showed approximately 95% homology to those of the reported strains. HCV-like cis-acting RNA elements, including the stem-loop structures of the 3′ UTR and kissing-loop interaction were deduced from regions around both UTRs of the EHcV genome. A comparison of the EHcV and HCV core proteins revealed that Ile190 and Phe191 of the EHcV core protein could be important for cleavage of the core protein by signal peptide peptidase (SPP) and were replaced with Ala and Leu, respectively, which inhibited intramembrane cleavage of the EHcV core protein. The loss-of-function mutant of SPP abrogated intramembrane cleavage of the EHcV core protein and bound EHcV core protein, suggesting that the EHcV core protein may be cleaved by SPP to become a mature form. The wild-type EHcV core protein, but not the SPP-resistant mutant, was localized on lipid droplets and partially on the lipid raft-like membrane in a manner similar to that of the HCV core protein. These results suggest that EHcV may conserve the genetic and biological properties of HCV. IMPORTANCE EHcV, which shows the highest amino acid or nucleotide homology to HCV among hepaciviruses, was previously reported to infect horses from Western, but not Asian, countries. We herein report EHcV infection in Japanese-born horses. 
In this study, HCV-like RNA secondary structures around both UTRs were predicted by determining the whole-genome sequence of EHcV. Our results also suggest that the EHcV core protein is cleaved by SPP to become a mature form and then is localized on lipid droplets and partially on lipid raft-like membranes in a manner similar to that of the HCV core protein. Hence, EHcV was identified as a closely related homologue of HCV based on its genetic structure as well as its biological properties. A clearer understanding of the epidemiology, genetic structure, and infection mechanism of EHcV will assist in elucidating the evolution of hepaciviruses as well as the development of surrogate models for the study of HCV. PMID:25210167
Hallmarks of hepatitis C virus in equine hepacivirus.

PubMed

Tanaka, Tomohisa; Kasai, Hirotake; Yamashita, Atsuya; Okuyama-Dobashi, Kaori; Yasumoto, Jun; Maekawa, Shinya; Enomoto, Nobuyuki; Okamoto, Toru; Matsuura, Yoshiharu; Morimatsu, Masami; Manabe, Noboru; Ochiai, Kazuhiko; Yamashita, Kazuto; Moriishi, Kohji

2014-11-01

Equine hepacivirus (EHcV) has been identified as a closely related homologue of hepatitis C virus (HCV) in the United States, the United Kingdom, and Germany, but not in Asian countries. In this study, we genetically and serologically screened 31 serum samples obtained from Japanese-born domestic horses for EHcV infection and subsequently identified 11 PCR-positive and 7 seropositive serum samples. We determined the full sequence of the EHcV genome, including the 3' untranslated region (UTR), which had previously not been completely revealed. The polyprotein of a Japanese EHcV strain showed approximately 95% homology to those of the reported strains. HCV-like cis-acting RNA elements, including the stem-loop structures of the 3' UTR and kissing-loop interaction were deduced from regions around both UTRs of the EHcV genome. A comparison of the EHcV and HCV core proteins revealed that Ile(190) and Phe(191) of the EHcV core protein could be important for cleavage of the core protein by signal peptide peptidase (SPP) and were replaced with Ala and Leu, respectively, which inhibited intramembrane cleavage of the EHcV core protein. The loss-of-function mutant of SPP abrogated intramembrane cleavage of the EHcV core protein and bound EHcV core protein, suggesting that the EHcV core protein may be cleaved by SPP to become a mature form. The wild-type EHcV core protein, but not the SPP-resistant mutant, was localized on lipid droplets and partially on the lipid raft-like membrane in a manner similar to that of the HCV core protein. These results suggest that EHcV may conserve the genetic and biological properties of HCV. EHcV, which shows the highest amino acid or nucleotide homology to HCV among hepaciviruses, was previously reported to infect horses from Western, but not Asian, countries. We herein report EHcV infection in Japanese-born horses. In this study, HCV-like RNA secondary structures around both UTRs were predicted by determining the whole-genome sequence of EHcV. 
Our results also suggest that the EHcV core protein is cleaved by SPP to become a mature form and then is localized on lipid droplets and partially on lipid raft-like membranes in a manner similar to that of the HCV core protein. Hence, EHcV was identified as a closely related homologue of HCV based on its genetic structure as well as its biological properties. A clearer understanding of the epidemiology, genetic structure, and infection mechanism of EHcV will assist in elucidating the evolution of hepaciviruses as well as the development of surrogate models for the study of HCV. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Two Novel Rab2 Interactors Regulate Dense-core Vesicle Maturation

PubMed Central

Ailion, Michael; Hannemann, Mandy; Dalton, Susan; Pappas, Andrea; Watanabe, Shigeki; Hegermann, Jan; Liu, Qiang; Han, Hsiao-Fen; Gu, Mingyu; Goulding, Morgan Q.; Sasidharan, Nikhil; Schuske, Kim; Hullett, Patrick; Eimer, Stefan; Jorgensen, Erik M.

2014-01-01

Summary: Peptide neuromodulators are released from a unique organelle: the dense-core vesicle. Dense-core vesicles are generated at the trans-Golgi, and then sort cargo during maturation before being secreted. To identify proteins that act in this pathway, we performed a genetic screen in Caenorhabditis elegans for mutants defective in dense-core vesicle function. We identified two conserved Rab2-binding proteins: RUND-1, a RUN domain protein, and CCCP-1, a coiled-coil protein. RUND-1 and CCCP-1 colocalize with RAB-2 at the Golgi, and rab-2, rund-1 and cccp-1 mutants have similar defects in sorting soluble and transmembrane dense-core vesicle cargos. RUND-1 also interacts with the Rab2 GAP protein TBC-8 and the BAR domain protein RIC-19, a RAB-2 effector. In summary, a new pathway of conserved proteins controls the maturation of dense-core vesicles at the trans-Golgi network. PMID:24698274
Core-SINE blocks comprise a large fraction of monotreme genomes; implications for vertebrate chromosome evolution.

PubMed

Kirby, Patrick J; Greaves, Ian K; Koina, Edda; Waters, Paul D; Marshall Graves, Jennifer A

2007-01-01

The genomes of the egg-laying platypus and echidna are of particular interest because monotremes are the most basal mammal group. The chromosomal distribution of an ancient family of short interspersed repeats (SINEs), the core-SINEs, was investigated to better understand monotreme genome organization and evolution. Previous studies have identified the core-SINE as the predominant SINE in the platypus genome, and in this study we quantified, characterized and localized subfamilies. Dot blot analysis suggested that a very large fraction (32% of the platypus and 16% of the echidna genome) is composed of Mon core-SINEs. Core-SINE-specific primers were used to amplify PCR products from platypus and echidna genomic DNA. Sequence analysis suggests a common consensus sequence Mon 1-B, shared by platypus and echidna, as well as platypus-specific Mon 1-C and echidna specific Mon 1-D consensus sequences. FISH mapping of the Mon core-SINE products to platypus metaphase spreads demonstrates that the Mon-1C subfamily is responsible for the striking Mon core-SINE accumulation in the distal regions of the six large autosomal pairs and the largest X chromosome. This unusual distribution highlights the dichotomy between the seven large chromosome pairs and the 19 smaller pairs in the monotreme karyotype, which has some similarity to the macro- and micro-chromosomes of birds and reptiles, and suggests that accumulation of repetitive sequences may have enlarged small chromosomes in an ancestral vertebrate. In the forthcoming sequence of the platypus genome there are still large gaps, and the extensive Mon core-SINE accumulation on the distal regions of the six large autosomal pairs may provide one explanation for this missing sequence.
CORE-SINEs: eukaryotic short interspersed retroposing elements with common sequence motifs.

PubMed

Gilbert, N; Labuda, D

1999-03-16

A 65-bp "core" sequence is dispersed in hundreds of thousands of copies in the human genome. This sequence was found to constitute the central segment of a group of short interspersed elements (SINEs), referred to as mammalian-wide interspersed repeats, that proliferated before the radiation of placental mammals. Here, we propose that the core identifies an ancient tRNA-like SINE element, which survived in different lineages such as mammals, reptiles, birds, and fish, as well as mollusks, presumably for >550 million years. This element gave rise to a number of sequence families (CORE-SINEs), including mammalian-wide interspersed repeats, whose distinct 3' ends are shared with different families of long interspersed elements (LINEs). The evolutionary success of the generic CORE-SINE element can be related to the recruitment of the internal promoter from highly transcribed host RNA as well as to its capacity to adapt to changing retropositional opportunities by sequence exchange with actively amplifying LINEs. It reinforces the notion that the very existence of SINEs depends on the cohabitation with both LINEs and the host genome.
Functionally conserved cis-regulatory elements of COL18A1 identified through zebrafish transgenesis.

PubMed

Kague, Erika; Bessling, Seneca L; Lee, Josephine; Hu, Gui; Passos-Bueno, Maria Rita; Fisher, Shannon

2010-01-15

Type XVIII collagen is a component of basement membranes, and expressed prominently in the eye, blood vessels, liver, and the central nervous system. Homozygous mutations in COL18A1 lead to Knobloch Syndrome, characterized by ocular defects and occipital encephalocele. However, relatively little has been described on the role of type XVIII collagen in development, and nothing is known about the regulation of its tissue-specific expression pattern. We have used zebrafish transgenesis to identify and characterize cis-regulatory sequences controlling expression of the human gene. Candidate enhancers were selected from non-coding sequence associated with COL18A1 based on sequence conservation among mammals. Although these displayed no overt conservation with orthologous zebrafish sequences, four regions nonetheless acted as tissue-specific transcriptional enhancers in the zebrafish embryo, and together recapitulated the major aspects of col18a1 expression. Additional post-hoc computational analysis on positive enhancer sequences revealed alignments between mammalian and teleost sequences, which we hypothesize predict the corresponding zebrafish enhancers; for one of these, we demonstrate functional overlap with the orthologous human enhancer sequence. Our results provide important insight into the biological function and regulation of COL18A1, and point to additional sequences that may contribute to complex diseases involving COL18A1. More generally, we show that combining functional data with targeted analyses for phylogenetic conservation can reveal conserved cis-regulatory elements in the large number of cases where computational alignment alone falls short. Copyright 2009 Elsevier Inc. All rights reserved.
Coiled-coil length: Size does matter.

PubMed

Surkont, Jaroslaw; Diekmann, Yoan; Ryder, Pearl V; Pereira-Leal, Jose B

2015-12-01

Protein evolution is governed by processes that alter primary sequence but also the length of proteins. Protein length may change in different ways, but insertions, deletions and duplications are the most common. An optimal protein size is a trade-off between sequence extension, which may change protein stability or lead to acquisition of a new function, and shrinkage that decreases metabolic cost of protein synthesis. Despite the general tendency for length conservation across orthologous proteins, the propensity to accept insertions and deletions is heterogeneous along the sequence. For example, protein regions rich in repetitive peptide motifs are well known to extensively vary their length across species. Here, we analyze length conservation of coiled-coils, domains formed by an ubiquitous, repetitive peptide motif present in all domains of life, that frequently plays a structural role in the cell. We observed that, despite the repetitive nature, the length of coiled-coil domains is generally highly conserved throughout the tree of life, even when the remaining parts of the protein change, including globular domains. Length conservation is independent of primary amino acid sequence variation, and represents a conservation of domain physical size. This suggests that the conservation of domain size is due to functional constraints. © 2015 Wiley Periodicals, Inc.
Protein Sectors: Statistical Coupling Analysis versus Conservation

PubMed Central

Teşileanu, Tiberiu; Colwell, Lucy J.; Leibler, Stanislas

2015-01-01

Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that was used to identify groups of coevolving residues termed “sectors”. The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that it involves almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominating factor in SCA, and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation. PMID:25723535
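The abstract above argues that, for single-sector proteins, per-position sequence conservation alone carries most of the signal. As a rough illustration of what "conservation" means for an alignment column, here is a minimal sketch, not the authors' method and not SCA itself. It assumes a toy alignment of equal-length protein sequences, a 20-letter amino acid alphabet, and a simple entropy-based score; the function name `column_conservation` and the example sequences are hypothetical.

```python
import math
from collections import Counter

ALPHABET_SIZE = 20  # assumption: standard amino acid alphabet


def column_conservation(alignment):
    """Return one conservation score in [0, 1] per alignment column.

    Score = 1 - H / log2(20), where H is the Shannon entropy of the
    non-gap residues in the column. Columns that contain only gaps
    score 0.0.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_entropy = math.log2(ALPHABET_SIZE)
    scores = []
    for col in range(length):
        residues = [seq[col] for seq in alignment if seq[col] != "-"]
        if not residues:
            scores.append(0.0)
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        # Clamp in case non-standard symbols push the entropy above log2(20).
        scores.append(max(0.0, 1.0 - entropy / max_entropy))
    return scores


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["MKVLA", "MKILA", "MRVLG", "MKVLA"]
scores = column_conservation(toy_alignment)
```

Fully invariant columns score 1.0 and variable columns score lower. SCA additionally weights pairwise correlations between columns, which this sketch deliberately leaves out.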
Mapping the transcription start points of the Staphylococcus aureus eap, emp, and vwb promoters reveals a conserved octanucleotide sequence that is essential for expression of these genes.

PubMed

Harraghy, Niamh; Homerova, Dagmar; Herrmann, Mathias; Kormanec, Jan

2008-01-01

Mapping the transcription start points of the eap, emp, and vwb promoters revealed a conserved octanucleotide sequence (COS). Deleting this sequence abolished the expression of eap, emp, and vwb. However, electrophoretic mobility shift assays gave no evidence that this sequence was a binding site for SarA or SaeR, known regulators of eap and emp.
Genome Sequencing of Ralstonia solanacearum CQPS-1, a Phylotype I Strain Collected from a Highland Area with Continuous Cropping of Tobacco

PubMed Central

Liu, Ying; Tang, Yuanman; Qin, Xiyun; Yang, Liang; Jiang, Gaofei; Li, Shili; Ding, Wei

2017-01-01

Ralstonia solanacearum, an agent of bacterial wilt, is a highly variable species with a broad host range and wide geographic distribution. As a species complex, it has extensive genetic diversity and its living environment is polymorphic like the lowland and the highland area, so more genomes are needed for studying population evolution and environment adaptation. In this paper, we reported the genome sequencing of R. solanacearum strain CQPS-1 isolated from wilted tobacco in Pengshui, Chongqing, China, a highland area with severely acidified soil and continuous cropping of tobacco more than 20 years. The comparative genomic analysis among different R. solanacearum strains was also performed. The completed genome size of CQPS-1 was 5.89 Mb and contained the chromosome (3.83 Mb) and the megaplasmid (2.06 Mb). A total of 5229 coding sequences were predicted (the chromosome and megaplasmid encoded 3573 and 1656 genes, respectively). A comparative analysis with eight strains from four phylotypes showed that there was some variation among the species, e.g., a large set of specific genes in CQPS-1. Type III secretion system gene cluster (hrp gene cluster) was conserved in CQPS-1 compared with the reference strain GMI1000. In addition, most genes coding core type III effectors were also conserved with GMI1000, but significant gene variation was found in the gene ripAA: the identity compared with strain GMI1000 was 75% and the hrpII box promoter in the upstream had significantly mutated. This study provided a potential resource for further understanding of the relationship between variation of pathogenicity factors and adaptation to the host environment. PMID:28620361
Genome-Wide Profiling of Small RNAs and Degradome Revealed Conserved Regulations of miRNAs on Auxin-Responsive Genes during Fruit Enlargement in Peaches

PubMed Central

Shi, Mengya; Hu, Xiao; Wei, Yu; Hou, Xu; Yuan, Xue; Liu, Jun; Liu, Yueping

2017-01-01

Auxin has long been known as a critical phytohormone that regulates fruit development in plants. However, due to the lack of an enlarged ovary wall in the model plants Arabidopsis and rice, the molecular regulatory mechanisms of fruit division and enlargement remain unclear. In this study, we performed small RNA sequencing and degradome sequencing analyses to systematically explore post-transcriptional regulation in the mesocarp at the hard core stage following treatment of the peach (Prunus persica L.) fruit with the synthetic auxin α-naphthylacetic acid (NAA). Our analyses identified 24 evolutionarily conserved miRNA genes as well as 16 predicted genes. Experimental verification showed that the expression levels of miR398 and miR408b were significantly upregulated after NAA treatment, whereas those of miR156, miR160, miR166, miR167, miR390, miR393, miR482, miR535 and miR2118 were significantly downregulated. Degradome sequencing coupled with miRNA target prediction analyses detected 119 significant cleavage sites on several mRNA targets, including SQUAMOSA promoter binding protein–like (SPL), ARF, (NAM, ATAF1/2 and CUC2) NAC, Arabidopsis thaliana homeobox protein (ATHB), the homeodomain-leucine zipper transcription factor revoluta(REV), (teosinte-like1, cycloidea and proliferating cell factor1) TCP and auxin signaling F-box protein (AFB) family genes. Our systematic profiling of miRNAs and the degradome in peach fruit suggests the existence of a post-transcriptional regulation network of miRNAs that target auxin pathway genes in fruit development. PMID:29236054
Clustering of Pan- and Core-genome of Lactobacillus provides Novel Evolutionary Insights for Differentiation.

PubMed

Inglin, Raffael C; Meile, Leo; Stevens, Marc J A

2018-04-24

Bacterial taxonomy aims to classify bacteria based on true evolutionary events and relies on a polyphasic approach that includes phenotypic, genotypic and chemotaxonomic analyses. Until now, complete genomes are largely ignored in taxonomy. The genus Lactobacillus consists of 173 species and many genomes are available to study taxonomy and evolutionary events. We analyzed and clustered 98 completely sequenced genomes of the genus Lactobacillus and 234 draft genomes of 5 different Lactobacillus species, i.e. L. reuteri, L. delbrueckii, L. plantarum, L. rhamnosus and L. helveticus. The core-genome of the genus Lactobacillus contains 266 genes and the pan-genome 20'800 genes. Clustering of the Lactobacillus pan- and core-genome resulted in two highly similar trees. This shows that evolutionary history is traceable in the core-genome and that clustering of the core-genome is sufficient to explore relationships. Clustering of core- and pan-genomes at species' level resulted in similar trees as well. Detailed analyses of the core-genomes showed that the functional class "genetic information processing" is conserved in the core-genome but that "signaling and cellular processes" is not. The latter class encodes functions that are involved in environmental interactions. Evolution of lactobacilli seems therefore directed by the environment. The type species L. delbrueckii was analyzed in detail and its pan-genome based tree contained two major clades whose members contained different genes yet identical functions. In addition, evidence for horizontal gene transfer between strains of L. delbrueckii, L. plantarum, and L. rhamnosus, and between species of the genus Lactobacillus is presented. Our data provide evidence for evolution of some lactobacilli according to a parapatric-like model for species differentiation. Core-genome trees are useful to detect evolutionary relationships in lactobacilli and might be useful in taxonomic analyses. 
Lactobacillus' evolution is directed by the environment and HGT.
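The core-genome and pan-genome figures quoted above correspond to two set operations over per-genome gene content: genes found in every genome (core) and genes found in at least one genome (pan). The sketch below illustrates only that definition. It assumes genes have already been assigned to gene families, which in practice requires an orthology clustering step not shown here; the strain names and gene labels are hypothetical placeholders, not data from the study.

```python
def core_and_pan(genomes):
    """Return (core, pan) gene-family sets for a mapping of genome -> gene families.

    core: families present in every genome (set intersection)
    pan:  families present in at least one genome (set union)
    """
    gene_sets = [set(families) for families in genomes.values()]
    if not gene_sets:
        return set(), set()
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan


# Hypothetical gene-family content for three strains, for illustration only.
toy_genomes = {
    "strain_A": ["rpoB", "gyrA", "lacZ", "bsh"],
    "strain_B": ["rpoB", "gyrA", "lacZ"],
    "strain_C": ["rpoB", "gyrA", "mucBP"],
}
core, pan = core_and_pan(toy_genomes)
```

Here `core` contains the two families shared by all three strains and `pan` contains all five families. The accessory genome, the part that differs between strains, is `pan - core`.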

Interactions of RadB, a DNA repair protein in archaea, with DNA and ATP.

PubMed

Guy, Colin P; Haldenby, Sam; Brindley, Amanda; Walsh, David A; Briggs, Geoffrey S; Warren, Martin J; Allers, Thorsten; Bolt, Edward L

2006-04-21

The RecA family of recombinases (RecA, Rad51, RadA and UvsX) catalyse strand-exchange between homologous DNA molecules by utilising conserved DNA-binding modules and a common core ATPase domain. RadB was identified in archaea as a Rad51-like protein on the basis of conserved ATPase sequences. However, RadB does not catalyse strand exchange and does not turn over ATP efficiently. RadB does bind DNA, and here we report a triplet of residues (Lys-His-Arg) that is highly conserved at the RadB C terminus, and is crucial for DNA binding. This is consistent with the motif forming a "basic patch" of highly conserved residues identified in an atomic structure of RadB from Thermococcus kodakaraensis. As the triplet motif is conserved at the C terminus of XRCC2 also, a mammalian Rad51-paralogue, we present a phylogenetic analysis that clarifies the relationship between RadB, Rad51-paralogues and recombinases. We investigate interactions between RadB and ATP using genetics and biochemistry; ATP binding by RadB is needed to promote survival of Haloferax volcanii after UV irradiation, and ATP, but not other NTPs, induces pronounced conformational change in RadB. This is the first genetic analysis of radB, and establishes its importance for maintaining genome stability in archaea. ATP-induced conformational change in RadB may explain previous reports that RadB controls Holliday junction resolution by Hjc, depending on the presence or the absence of ATP.
Conserved miRNAs Are Candidate Post-Transcriptional Regulators of Developmental Arrest in Free-Living and Parasitic Nematodes

PubMed Central

Ahmed, Rina; Chang, Zisong; Younis, Abuelhassan Elshazly; Langnick, Claudia; Li, Na; Chen, Wei; Brattig, Norbert; Dieterich, Christoph

2013-01-01

Animal development is complex yet surprisingly robust. Animals may develop alternative phenotypes conditional on environmental changes. Under unfavorable conditions, Caenorhabditis elegans larvae enter the dauer stage, a developmentally arrested, long-lived, and stress-resistant state. Dauer larvae of free-living nematodes and infective larvae of parasitic nematodes share many traits including a conserved endocrine signaling module (DA/DAF-12), which is essential for the formation of dauer and infective larvae. We speculated that conserved post-transcriptional regulatory mechanism might also be involved in executing the dauer and infective larvae fate. We used an unbiased sequencing strategy to characterize the microRNA (miRNA) gene complement in C. elegans, Pristionchus pacificus, and Strongyloides ratti. Our study raised the number of described miRNA genes to 257 for C. elegans, tripled the known gene set for P. pacificus to 362 miRNAs, and is the first to describe miRNAs in a Strongyloides parasite. Moreover, we found a limited core set of 24 conserved miRNA families in all three species. Interestingly, our estimated expression fold changes between dauer versus nondauer stages and infective larvae versus free-living stages reveal that despite the speed of miRNA gene set evolution in nematodes, homologous gene families with conserved “dauer-infective” expression signatures are present. These findings suggest that common post-transcriptional regulatory mechanisms are at work and that the same miRNA families play important roles in developmental arrest and long-term survival in free-living and parasitic nematodes. PMID:23729632
Elucidating the composition and conservation of the autophagy pathway in photosynthetic eukaryotes

PubMed Central

Shemi, Adva; Ben-Dor, Shifra; Vardi, Assaf

2015-01-01

Aquatic photosynthetic eukaryotes represent highly diverse groups (green, red, and chromalveolate algae) derived from multiple endosymbiosis events, covering a wide spectrum of the tree of life. They are responsible for about 50% of the global photosynthesis and serve as the foundation for oceanic and fresh water food webs. Although the ecophysiology and molecular ecology of some algal species are extensively studied, some basic aspects of algal cell biology are still underexplored. The recent wealth of genomic resources from algae has opened new frontiers to decipher the role of cell signaling pathways and their function in an ecological and biotechnological context. Here, we took a bioinformatic approach to explore the distribution and conservation of TOR and autophagy-related (ATG) proteins (Atg in yeast) in diverse algal groups. Our genomic analysis demonstrates conservation of TOR and ATG proteins in green algae. In contrast, in all 5 available red algal genomes, we could not detect the sequences that encode for any of the 17 core ATG proteins examined, albeit TOR and its interacting proteins are conserved. This intriguing data suggests that the autophagy pathway is not conserved in red algae as it is in the entire eukaryote domain. In contrast, chromalveolates, despite being derived from the red-plastid lineage, retain and express ATG genes, which raises a fundamental question regarding the acquisition of ATG genes during algal evolution. Among chromalveolates, Emiliania huxleyi (Haptophyta), a bloom-forming coccolithophore, possesses the most complete set of ATG genes, and may serve as a model organism to study autophagy in marine protists with great ecological significance. PMID:25915714
Antigenic potential of a highly conserved Neisseria meningitidis lipopolysaccharide inner core structure defined by chemical synthesis.

PubMed

Reinhardt, Anika; Yang, You; Claus, Heike; Pereira, Claney L; Cox, Andrew D; Vogel, Ulrich; Anish, Chakkumkal; Seeberger, Peter H

2015-01-22

Neisseria meningitidis is a leading cause of bacterial meningitis worldwide. We studied the potential of synthetic lipopolysaccharide (LPS) inner core structures as broadly protective antigens against N. meningitidis. Based on the specific reactivity of human serum antibodies to synthetic LPS cores, we selected a highly conserved LPS core tetrasaccharide as a promising antigen. This LPS inner core tetrasaccharide induced a robust IgG response in mice when formulated as an immunogenic glycoconjugate. Binding of raised mouse serum to a broad collection of N. meningitidis strains demonstrated the accessibility of the LPS core on viable bacteria. The distal trisaccharide was identified as the crucial epitope, whereas the proximal Kdo moiety was immunodominant and induced mainly nonprotective antibodies that are responsible for lack of functional protection in polyclonal serum. Our results identified key antigenic determinants of LPS core glycan and, hence, may aid the design of a broadly protective immunization against N. meningitidis. Copyright © 2015 Elsevier Ltd. All rights reserved.
Biophysical and structural considerations for protein sequence evolution

PubMed Central

2011-01-01

Background: Protein sequence evolution is constrained by the biophysics of folding and function, causing interdependence between interacting sites in the sequence. However, current site-independent models of sequence evolutions do not take this into account. Recent attempts to integrate the influence of structure and biophysics into phylogenetic models via statistical/informational approaches have not resulted in expected improvements in model performance. This suggests that further innovations are needed for progress in this field. Results: Here we develop a coarse-grained physics-based model of protein folding and binding function, and compare it to a popular informational model. We find that both models violate the assumption of the native sequence being close to a thermodynamic optimum, causing directional selection away from the native state. Sampling and simulation show that the physics-based model is more specific for fold-defining interactions that vary less among residue type. The informational model diffuses further in sequence space with fewer barriers and tends to provide less support for an invariant sites model, although amino acid substitutions are generally conservative. Both approaches produce sequences with natural features like dN/dS < 1 and gamma-distributed rates across sites. Conclusions: Simple coarse-grained models of protein folding can describe some natural features of evolving proteins but are currently not accurate enough to use in evolutionary inference. This is partly due to improper packing of the hydrophobic core. We suggest possible improvements on the representation of structure, folding energy, and binding function, as regards both native and non-native conformations, and describe a large number of possible applications for such a model. PMID:22171550
Wyoming Basin Rapid Ecoregional Assessment

USGS Publications Warehouse

Carr, Natasha B.; Melcher, Cynthia P.

2015-08-28

We evaluated Management Questions (Core and Integrated) for each species and community for the Wyoming Basin REA. Core Management Questions address primary management issues, including (1) where is the Conservation Element, and what are its key ecological attributes (characteristics of species and communities that may affect their long-term persistence or viability); (2) what and where are the Change Agents; and (3) how do the Change Agents affect the key ecological attributes? Integrated Management Questions synthesize the Core Management Questions as follows: (1) where are the areas with high landscape-level ecological values; (2) where are the areas with high landscape-level risks; and (3) where are the potential areas for conservation, restoration, and development? The associated maps and key findings for each Management Question are summarized for each Conservation Element in individual chapters. Additional chapters on landscape intactness and an REA synthesis are included.
Discovery and profiling of novel and conserved microRNAs during flower development in Carya cathayensis via deep sequencing.

PubMed

Wang, Zheng Jia; Huang, Jian Qin; Huang, You Jun; Li, Zheng; Zheng, Bing Song

2012-08-01

Hickory (Carya cathayensis Sarg.) is an economically important woody plant in China, but its long juvenile phase delays yield. MicroRNAs (miRNAs) are critical regulators of genes and important for normal plant development and physiology, including flower development. We used Solexa technology to sequence two small RNA libraries from two floral differentiation stages in hickory to identify miRNAs related to flower development. We identified 39 conserved miRNA sequences from 114 loci belonging to 23 families as well as two novel and ten potential novel miRNAs belonging to nine families. Moreover, 35 conserved miRNA*s and two novel miRNA*s were detected. Twenty miRNA sequences from 49 loci belonging to 11 families were differentially expressed; all were up-regulated at the later stage of flower development in hickory. Quantitative real-time PCR of 12 conserved miRNA sequences, five novel miRNA families, and two novel miRNA*s validated that all were expressed during hickory flower development, and the expression patterns were similar to those detected with Solexa sequencing. Finally, a total of 146 targets of the novel and conserved miRNAs were predicted. This study identified a diverse set of miRNAs that were closely related to hickory flower development and that could help in plant floral induction.
Evolutionary growth process of highly conserved sequences in vertebrate genomes.

PubMed

Ishibashi, Minaka; Noda, Akiko Ogura; Sakate, Ryuichi; Imanishi, Tadashi

2012-08-01

Genome sequence comparison between evolutionarily distant species revealed ultraconserved elements (UCEs) among mammals under strong purifying selection. Most of them were also conserved among vertebrates. Because they tend to be located in the flanking regions of developmental genes, they would have fundamental roles in creating vertebrate body plans. However, the evolutionary origin and selection mechanism of these UCEs remain unclear. Here we report that UCEs arose in primitive vertebrates, and gradually grew in vertebrate evolution. We searched for UCEs in two teleost fishes, Tetraodon nigroviridis and Oryzias latipes, and found 554 UCEs with 100% identity over 100 bps. Comparison of teleost and mammalian UCEs revealed 43 pairs of common, jawed-vertebrate UCEs (jUCE) with high sequence identities, ranging from 83.1% to 99.2%. Ten of them retain lower similarities to the Petromyzon marinus genome, and the substitution rates of four non-exonic jUCEs were reduced after the teleost-mammal divergence, suggesting that robust conservation had been acquired in the jawed vertebrate lineage. Our results indicate that prototypical UCEs originated before the divergence of jawed and jawless vertebrates and have been frozen as perfect conserved sequences in the jawed vertebrate lineage. In addition, our comparative sequence analyses of UCEs and neighboring regions resulted in a discovery of lineage-specific conserved sequences. They were added progressively to prototypical UCEs, suggesting step-wise acquisition of novel regulatory roles. Our results indicate that conserved non-coding elements (CNEs) consist of blocks with distinct evolutionary history, each having been frozen since different evolutionary era along the vertebrate lineage. Copyright © 2012 Elsevier B.V. All rights reserved.
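The UCE criterion quoted above, 100% identity over at least 100 bp between two genomes, can be pictured as a search for maximal exact matches. The sketch below is a toy version for short strings, not the pipeline used in the study: it indexes every `min_len`-mer of one sequence and extends matching seeds in the other. The function name, the parameter `min_len`, and the example sequences are assumptions made for illustration; genome-scale searches normally rely on suffix-array or alignment-based tools instead.

```python
from collections import defaultdict


def shared_exact_segments(a, b, min_len=100):
    """Return maximal exact matches between a and b of length >= min_len.

    Each hit is (start_in_a, start_in_b, length). Only forward-strand
    matches are considered; reverse complements are ignored.
    """
    # Index every window of length min_len in sequence a.
    index = defaultdict(list)
    for i in range(len(a) - min_len + 1):
        index[a[i:i + min_len]].append(i)

    hits = []
    for j in range(len(b) - min_len + 1):
        for i in index.get(b[j:j + min_len], ()):
            # Skip seeds that lie inside a longer match already reported
            # from an earlier position (the match can be extended leftwards).
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend the seed to the right while bases keep matching.
            length = min_len
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            hits.append((i, j, length))
    return hits


# Hypothetical short sequences; min_len is lowered so the toy example fits.
shared = "ACGTACGGTCAT"
seq_a = "TTT" + shared + "GG"
seq_b = "CC" + shared + "AAA"
hits = shared_exact_segments(seq_a, seq_b, min_len=8)
```

With these inputs the only hit is the 12-base shared block, reported once with its start position in each sequence. Relaxing the 100% identity requirement, as in the 83.1% to 99.2% jUCE pairs, would need an alignment step rather than exact matching.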
Two-dimensional solitons in conservative and parity-time-symmetric triple-core waveguides with cubic-quintic nonlinearity

NASA Astrophysics Data System (ADS)

Feijoo, David; Zezyulin, Dmitry A.; Konotop, Vladimir V.

2015-12-01

We analyze a system of three two-dimensional nonlinear Schrödinger equations coupled by linear terms and with the cubic-quintic (focusing-defocusing) nonlinearity. We consider two versions of the model: conservative and parity-time (PT) symmetric. These models describe triple-core nonlinear optical waveguides, with balanced gain and losses in the PT-symmetric case. We obtain families of soliton solutions and discuss their stability. The latter study is performed using a linear stability analysis and checked with direct numerical simulations of the evolutional system of equations. Stable solitons are found in the conservative and PT-symmetric cases. Interactions and collisions between the conservative and PT-symmetric solitons are briefly investigated, as well.
Conserving migratory mule deer through the umbrella of sage-grouse

USGS Publications Warehouse

Copeland, H. E.; Sawyer, H.; Monteith, K. L.; Naugle, D.E.; Pocewicz, Amy; Graf, N.; Kauffman, Matthew J.

2014-01-01

Conserving migratory ungulates in increasingly human-dominated landscapes presents a difficult challenge to land managers and conservation practitioners. Nevertheless, ungulates may receive ancillary benefits from conservation actions designed to protect species of greater conservation priority where their ranges are sympatric. Greater Sage-Grouse (Centrocercus urophasianus), for example, have been proposed as an umbrella species for other sagebrush (Artemisia spp.)-dependent fauna. We examined a landscape where conservation efforts for sage-grouse overlap spatially with mule deer (Odocoileus hemionus) to determine whether sage-grouse conservation measures also might protect important mule deer migration routes and seasonal ranges. We conducted a spatial analysis to determine what proportion of migration routes, stopover areas, and winter ranges used by mule deer were located in areas managed for sage-grouse conservation. Conservation measures overlapped with 66–70% of migration corridors, 74–75% of stopovers, and 52–91% of wintering areas for two mule deer populations in the upper Green River Basin of Wyoming. Of those proportions, conservation actions targeted towards sage-grouse accounted for approximately half of the overlap in corridors and stopover areas, and nearly all overlap on winter ranges, indicating that sage-grouse conservation efforts represent an important step in conserving migratory mule deer. Conservation of migratory species presents unique challenges because although overlap with conserved lands may be high, connectivity of the entire route must be maintained as barriers to movement anywhere within the migration corridor could render it unviable.
Where mule deer habitats overlap with sage-grouse core areas, our results indicate that increased protection is afforded to winter ranges and migration routes within the umbrella of sage-grouse conservation, but this protection is contingent on concentrated developments within core areas not intersecting with high-priority stopovers or corridors, and that the policy in turn does not encourage development on deer ranges outside of core areas. With the goal of protecting entire migration routes, our analysis highlights areas of potential conservation focus for mule deer, which are characterized by high exposure to residential development and use by a large proportion of migrating deer.
A chondroitin sulfate chain attached to the bone dentin matrix protein 1 NH2-terminal fragment.

PubMed

Qin, Chunlin; Huang, Bingzhen; Wygant, James N; McIntyre, Bradley W; McDonald, Charles H; Cook, Richard G; Butler, William T

2006-03-24

Dentin matrix protein 1 (DMP1) is an acidic noncollagenous protein shown by gene ablations to be critical for the proper mineralization of bone and dentin. In the extracellular matrix of these tissues DMP1 is present as fragments representing the NH2-terminal (37 kDa) and COOH-terminal (57 kDa) portions of the cDNA-deduced amino acid sequence. During our separation of bone noncollagenous proteins, we observed a high molecular weight, DMP1-related component (designated DMP1-PG). We purified DMP1-PG with a monoclonal anti-DMP1 antibody affinity column. Amino acid analysis and Edman degradation of tryptic peptides proved that the core protein for DMP1-PG is the 37-kDa fragment of DMP1. Chondroitinase treatments demonstrated that the slower migration rate of DMP1-PG is due to the presence of glycosaminoglycan. Quantitative disaccharide analysis indicated that the glycosaminoglycan is made predominantly of chondroitin 4-sulfate. Further analysis on tryptic peptides led us to conclude that a single glycosaminoglycan chain is linked to the core protein via Ser74, located in the Ser74-Gly75 dipeptide, an amino acid sequence specific for the attachment of glycosaminoglycans. Our findings show that in addition to its existence as a phosphoprotein, the NH2-terminal fragment from DMP1 occurs as a proteoglycan. Amino acid sequence alignment analysis showed that the Ser74-Gly75 dipeptide and its flanking regions are highly conserved among a wide range of species from caiman to Homo sapiens, indicating that this glycosaminoglycan attachment domain has survived an extremely long period of evolutionary pressure and suggesting that the glycosaminoglycan may be critical for the basic biological functions of DMP1.
Exploring the Genomic Traits of Non-toxigenic Vibrio parahaemolyticus Strains Isolated in Southern Chile

PubMed Central

Castillo, Daniel; Pérez-Reytor, Diliana; Plaza, Nicolás; Ramírez-Araya, Sebastián; Blondel, Carlos J.; Corsini, Gino; Bastías, Roberto; Loyola, David E.; Jaña, Víctor; Pavez, Leonardo; García, Katherine

2018-01-01

Vibrio parahaemolyticus is the leading cause of seafood-borne gastroenteritis worldwide. As reported in other countries, after the rise and fall of the pandemic strain in Chile, other post-pandemic strains have been associated with clinical cases, including strains lacking the major toxins TDH and TRH. Since the presence or absence of tdh and trh genes has been used for diagnostic purposes and as a proxy of the virulence of V. parahaemolyticus isolates, understanding virulence in V. parahaemolyticus strains lacking these toxins is essential for detecting such strains in water and marine products and avoiding possible food-borne infection. In this study, we characterized the genome of four environmental and two clinical non-toxigenic strains (tdh-, trh-, and T3SS2-). Using whole-genome sequencing, phylogenetic, and comparative genome analysis, we identified the core and pan-genome of V. parahaemolyticus strains from southern Chile. The phylogenetic tree based on the core genome showed low genetic diversity but the analysis of the pan-genome revealed that all strains harbored genomic islands carrying diverse virulence and fitness factors or prophage-like elements that encode toxins like Zot and RTX. Interestingly, the three strains carrying Zot-like toxin have a different sequence, although the alignment showed some conserved areas with the zot sequence found in V. cholerae. In addition, we identified an unexpected diversity in the genetic architecture of the T3SS1 gene cluster and the presence of the T3SS2 gene cluster in a non-pandemic environmental strain. Our study sheds light on the diversity of V. parahaemolyticus strains from the southern Pacific which increases our current knowledge regarding the global diversity of this organism. PMID:29472910
Crystal structure analysis of a bacterial aryl acylamidase belonging to the amidase signature enzyme family

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lee, Saeyoung; Park, Eun-Hye; Ko, Hyeok-Jin

2015-11-13

The atomic structure of a bacterial aryl acylamidase (EC 3.5.1.13; AAA) is reported and structural features are investigated to better understand the catalytic profile of this enzyme. Structures of AAA were determined in its native form and in complex with the analgesic acetanilide, p-acetaminophenol, at 1.70 Å and 1.73 Å resolutions, respectively. The overall structural fold of AAA was identified as an α/β fold class, exhibiting an open twisted β-sheet core surrounded by α-helices. The asymmetric unit contains one AAA molecule and the monomeric form is functionally active. The core structure enclosing the signature sequence region, including the canonical Ser-cisSer-Lys catalytic triad, is conserved in all members of the Amidase Signature enzyme family. The structure of AAA in a complex with its ligand reveals a unique organization in the substrate-binding pocket. The binding pocket consists of two loops (loop1 and loop2) in the amidase signature sequence and one helix (α10) in the non-amidase signature sequence. We identified two residues (Tyr136 and Thr330) that interact with the ligand via water molecules, and a hydrogen-bonding network that explains the catalytic affinity over various aryl acyl compounds. The optimum activity of AAA at pH > 10 suggests that the reaction mechanism employs Lys84 as the catalytic base to polarize the Ser187 nucleophile in the catalytic triad. Highlights: • We determined the first structure of a bacterial aryl acylamidase (EC 3.5.1.13). • Structure revealed spatially distinct architecture of the substrate-binding pocket. • Hydrogen-bonding with Tyr136 and Thr330 mediates ligand-binding and substrate.
Genome Sequencing of Sulfolobus sp. A20 from Costa Rica and Comparative Analyses of the Putative Pathways of Carbon, Nitrogen, and Sulfur Metabolism in Various Sulfolobus Strains

PubMed Central

Dai, Xin; Wang, Haina; Zhang, Zhenfeng; Li, Kuan; Zhang, Xiaoling; Mora-López, Marielos; Jiang, Chengying; Liu, Chang; Wang, Li; Zhu, Yaxin; Hernández-Ascencio, Walter; Dong, Zhiyang; Huang, Li

2016-01-01

The genome of Sulfolobus sp. A20 isolated from a hot spring in Costa Rica was sequenced. This circular genome of the strain is 2,688,317 bp in size and 34.8% in G+C content, and contains 2591 open reading frames (ORFs). Strain A20 shares ~95.6% identity at the 16S rRNA gene sequence level and <30% DNA-DNA hybridization (DDH) values with the most closely related known Sulfolobus species (i.e., Sulfolobus islandicus and Sulfolobus solfataricus), suggesting that it represents a novel Sulfolobus species. Comparison of the genome of strain A20 with those of the type strains of S. solfataricus, Sulfolobus acidocaldarius, S. islandicus, and Sulfolobus tokodaii, which were isolated from geographically separated areas, identified 1801 genes conserved among all Sulfolobus species analyzed (core genes). Comparative genome analyses show that central carbon metabolism in Sulfolobus is highly conserved, and enzymes involved in the Entner-Doudoroff pathway, the tricarboxylic acid cycle and the CO2 fixation pathways are predominantly encoded by the core genes. All Sulfolobus species encode genes required for the conversion of ammonium into glutamate/glutamine. Some Sulfolobus strains have gained the ability to utilize additional nitrogen source such as nitrate (i.e., S. islandicus strain REY15A, LAL14/1, M14.25, and M16.27) or urea (i.e., S. islandicus HEV10/4, S. tokodaii strain7, and S. metallicus DSM 6482). The strategies for sulfur metabolism are most diverse and least understood. S. tokodaii encodes sulfur oxygenase/reductase (SOR), whereas both S. islandicus and S. solfataricus contain genes for sulfur reductase (SRE). However, neither SOR nor SRE genes exist in the genome of strain A20, raising the possibility that an unknown pathway for the utilization of elemental sulfur may be present in the strain. The ability of Sulfolobus to utilize nitrate or sulfur is encoded by a gene cluster flanked by IS elements or their remnants. 
These clusters appear to have become fixed at a specific genomic site in some strains and lost in other strains during the course of evolution. The versatility in nitrogen and sulfur metabolism may represent adaptation of Sulfolobus to thriving in different habitats. PMID:27965637
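The notion of "core genes" used in the record above (genes conserved among all genomes analyzed) reduces, once orthologous gene families have been assigned, to a set intersection. The sketch below assumes each genome is already represented as a set of gene-family identifiers; the function name and the toy identifiers are hypothetical and are not taken from the study.

```python
def core_and_pan_genome(gene_families_by_genome):
    """Given {genome_name: set_of_gene_family_ids}, return (core, pan).

    core = families present in every genome (set intersection)
    pan  = families present in at least one genome (set union)
    """
    if not gene_families_by_genome:
        raise ValueError("at least one genome is required")
    gene_sets = [set(families) for families in gene_families_by_genome.values()]
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan


if __name__ == "__main__":
    # Hypothetical family identifiers, for illustration only.
    genomes = {
        "strain_1": {"famA", "famB", "famC"},
        "strain_2": {"famA", "famB", "famD"},
        "strain_3": {"famA", "famB", "famC", "famE"},
    }
    core, pan = core_and_pan_genome(genomes)
    print(sorted(core), len(pan))
```

The hard part in practice is the ortholog clustering that produces the family identifiers; the intersection itself is the easy final step.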
Probability of lek collapse is lower inside sage-grouse Core Areas: Effectiveness of conservation policy for a landscape species.

PubMed

Spence, Emma Suzuki; Beck, Jeffrey L; Gregory, Andrew J

2017-01-01

Greater sage-grouse (Centrocercus urophasianus) occupy sagebrush (Artemisia spp.) habitats in 11 western states and 2 Canadian provinces. In September 2015, the U.S. Fish and Wildlife Service announced the listing status for sage-grouse had changed from warranted but precluded to not warranted. The primary reason cited for this change of status was that the enactment of new regulatory mechanisms was sufficient to protect sage-grouse populations. One such plan is the 2008 Wyoming Sage Grouse Executive Order (SGEO), enacted by Governor Freudenthal. The SGEO identifies "Core Areas" that are to be protected by keeping them relatively free from further energy development and limiting other forms of anthropogenic disturbances near active sage-grouse leks. Using the Wyoming Game and Fish Department's sage-grouse lek count database and the Wyoming Oil and Gas Conservation Commission database of oil and gas well locations, we investigated the effectiveness of Wyoming's Core Areas, specifically: 1) how well Core Areas encompass the distribution of sage-grouse in Wyoming, 2) whether Core Area leks have a reduced probability of lek collapse, and 3) what, if any, edge effects intensification of oil and gas development adjacent to Core Areas may be having on Core Area populations. Core Areas contained 77% of male sage-grouse attending leks and 64% of active leks. Using Bayesian binomial probability analysis, we found an average 10.9% probability of lek collapse in Core Areas and an average 20.4% probability of lek collapse outside Core Areas. Using linear regression, we found development density outside Core Areas was related to the probability of lek collapse inside Core Areas. Specifically, probability of collapse among leks >4.83 km from inside Core Area boundaries was significantly related to well density within 1.61 km (1-mi) and 4.83 km (3-mi) outside of Core Area boundaries.
Collectively, these data suggest that the Wyoming Core Area Strategy has benefited sage-grouse and sage-grouse habitat conservation; however, additional guidelines limiting development densities adjacent to Core Areas may be necessary to effectively protect Core Area populations.
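The abstract above does not state which prior or model structure its "Bayesian binomial probability analysis" used. As a generic illustration only, the sketch below shows the standard conjugate Beta-binomial update: with a Beta(a, b) prior on the collapse probability and k collapses observed among n leks, the posterior is Beta(a + k, b + n - k). The uniform Beta(1, 1) prior and the counts in the usage example are assumptions, not values from the study.

```python
def posterior_collapse_probability(collapsed, total, prior_a=1.0, prior_b=1.0):
    """Posterior mean of a binomial probability under a Beta(prior_a, prior_b) prior.

    Returns (post_a, post_b, posterior_mean), where the posterior is
    Beta(post_a, post_b) and the mean is post_a / (post_a + post_b).
    """
    if not 0 <= collapsed <= total:
        raise ValueError("collapsed must be between 0 and total")
    post_a = prior_a + collapsed
    post_b = prior_b + (total - collapsed)
    return post_a, post_b, post_a / (post_a + post_b)


if __name__ == "__main__":
    # Hypothetical counts: 3 collapsed leks out of 8 monitored.
    a, b, mean = posterior_collapse_probability(3, 8)
    print(a, b, round(mean, 3))
```

Comparing posterior means (or full posterior distributions) computed separately for leks inside and outside Core Areas is one way such an analysis could be framed.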
Core RNAi machinery and gene knockdown in the emerald ash borer (Agrilus planipennis).

PubMed

Zhao, Chaoyang; Alvarez Gonzales, Miguel A; Poland, Therese M; Mittapalli, Omprakash

2015-01-01

The RNA interference (RNAi) technology has been widely used in insect functional genomics research and provides an alternative approach for insect pest management. To understand whether the emerald ash borer (Agrilus planipennis), an invasive and destructive coleopteran insect pest of ash tree (Fraxinus spp.), possesses a strong RNAi machinery that is capable of degrading target mRNA as a response to exogenous double-stranded RNA (dsRNA) induction, we identified three RNAi pathway core component genes, Dicer-2, Argonaute-2 and R2D2, from the A. planipennis genome sequence. Characterization of these core components revealed that they contain conserved domains essential for the proteins to function in the RNAi pathway. Phylogenetic analyses showed that they are closely related to homologs derived from other coleopteran species. We also delivered the dsRNA fragment of AplaScrB-2, a β-fructofuranosidase-encoding gene horizontally acquired by A. planipennis as we reported previously, into A. planipennis adults through microinjection. Quantitative real-time PCR analysis on the dsRNA-treated beetles demonstrated a significantly decreased gene expression level of AplaScrB-2 appearing on day 2 and lasting until at least day 6. This study is the first record of RNAi applied in A. planipennis. Copyright © 2015 Elsevier Ltd. All rights reserved.
cisprimertool: software to implement a comparative genomics strategy for the development of conserved intron scanning (CIS) markers.

PubMed

Jayashree, B; Jagadeesh, V T; Hoisington, D

2008-05-01

The availability of complete, annotated genomic sequence information in model organisms is a rich resource that can be extended to understudied orphan crops through comparative genomic approaches. We report here a software tool (cisprimertool) for the identification of conserved intron scanning regions using expressed sequence tag alignments to a completely sequenced model crop genome. The method used is based on earlier studies reporting the assessment of conserved intron scanning primers (called CISP) within relatively conserved exons located near exon-intron boundaries from onion, banana, sorghum and pearl millet alignments with rice. The tool is freely available to academic users at http://www.icrisat.org/gt-bt/CISPTool.htm. © 2007 ICRISAT.
Conserved structures formed by heterogeneous RNA sequences drive silencing of an inflammation responsive post-transcriptional operon

PubMed Central

Basu, Abhijit; Jain, Niyati; Tolbert, Blanton S.; Komar, Anton A.

2017-01-01

RNA–protein interactions with physiological outcomes usually rely on conserved sequences within the RNA element. By contrast, activity of the diverse gamma-interferon-activated inhibitor of translation (GAIT)-elements relies on the conserved RNA folding motifs rather than the conserved sequence motifs. These elements drive the translational silencing of a group of chemokine (CC/CXC) and chemokine receptor (CCR) mRNAs, thereby helping to resolve physiological inflammation. Despite sequence dissimilarity, these RNA elements adopt common secondary structures (as revealed by 2D-1H NMR spectroscopy), providing a basis for their interaction with the RNA-binding GAIT complex. However, many of these elements (e.g. those derived from CCL22, CXCL13, CCR4 and ceruloplasmin (Cp) mRNAs) have substantially different affinities for GAIT complex binding. Toeprinting analysis shows that different positions within the overall conserved GAIT element structure contribute to differential affinities of the GAIT protein complex towards the elements. Thus, heterogeneity of GAIT elements may provide hierarchical fine-tuning of the resolution of inflammation. PMID:29069516
Phylogenetic analysis reveals conservation and diversification of micro RNA166 genes among diverse plant species.

PubMed

Barik, Suvakanta; SarkarDas, Shabari; Singh, Archita; Gautam, Vibhav; Kumar, Pramod; Majee, Manoj; Sarkar, Ananda K

2014-01-01

Similar to the majority of the microRNAs, mature miR166s are derived from multiple members of MIR166 genes (precursors) and regulate various aspects of plant development by negatively regulating their target genes (Class III HD-ZIP). The evolutionary conservation or functional diversification of miRNA166 family members remains elusive. Here, we show the phylogenetic relationships among MIR166 precursor and mature sequences from three diverse model plant species. Despite strong conservation, some mature miR166 sequences, such as ppt-miR166m, have undergone sequence variation. Critical sequence variation in ppt-miR166m has led to functional diversification, as it targets non-HD-ZIPIII gene transcript(s). MIR166 precursor sequences have diverged in a lineage-specific manner, and both precursors and mature osa-miR166i/j are highly conserved. Interestingly, polycistronic MIR166s were present in Physcomitrella and Oryza but not in Arabidopsis. The nature of cis-regulatory motifs on the upstream promoter sequences of MIR166 genes indicates their possible contribution to the functional variation observed among miR166 species. Copyright © 2013 Elsevier Inc. All rights reserved.

SeqFIRE: a web application for automated extraction of indel regions and conserved blocks from protein multiple sequence alignments.

PubMed

Ajawatanawong, Pravech; Atkinson, Gemma C; Watson-Haigh, Nathan S; Mackenzie, Bryony; Baldauf, Sandra L

2012-07-01

Analyses of multiple sequence alignments generally focus on well-defined conserved sequence blocks, while the rest of the alignment is largely ignored or discarded. This is especially true in phylogenomics, where large multigene datasets are produced through automated pipelines. However, some of the most powerful phylogenetic markers have been found in the variable length regions of multiple alignments, particularly insertions/deletions (indels) in protein sequences. We have developed Sequence Feature and Indel Region Extractor (SeqFIRE) to enable the automated identification and extraction of indels from protein sequence alignments. The program can also extract conserved blocks and identify fast evolving sites using a combination of conservation and entropy. All major variables can be adjusted by the user, allowing them to identify the sets of variables most suited to a particular analysis or dataset. Thus, all major tasks in preparing an alignment for further analysis are combined in a single flexible and user-friendly program. The output includes a numbered list of indels, alignments in NEXUS format with indels annotated or removed and indel-only matrices. SeqFIRE is a user-friendly web application, freely available online at www.seqfire.org/.
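The record above mentions identifying fast-evolving sites "using a combination of conservation and entropy". SeqFIRE's actual scoring is not described here, so the following is only a sketch of the entropy half: per-column Shannon entropy of an aligned protein block, ignoring gap characters. The function names are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column.

    Gap characters ('-') are ignored; an all-gap column scores 0.0.
    """
    counts = Counter(residue for residue in column if residue != "-")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(aligned_sequences):
    """Entropy of every column in a list of equal-length aligned sequences."""
    return [column_entropy(column) for column in zip(*aligned_sequences)]


if __name__ == "__main__":
    toy_alignment = ["MKV", "MRV", "MKI", "MRL"]
    print(alignment_entropies(toy_alignment))
```

Columns with entropy near zero are candidates for conserved blocks, while high-entropy columns are candidates for fast-evolving sites; any cutoff would be a user-chosen parameter.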
The conserved baculovirus protein p33 (Ac92) is a flavin adenine dinucleotide-linked sulfhydryl oxidase.

PubMed

Long, C M; Rohrmann, G F; Merrill, G F

2009-06-05

Open reading frame 92 of the Autographa californica baculovirus (Ac92) is one of about 30 core genes present in all sequenced baculovirus genomes. Computer analyses predicted that the Ac92 encoded protein (called p33) and several of its baculovirus orthologs were related to a family of flavin adenine dinucleotide (FAD)-linked sulfhydryl oxidases. Alignment of these proteins indicated that, although they were highly diverse, a number of amino acids in common with the Erv1p/Alrp family of sulfhydryl oxidases are present. Some of these conserved amino acids are predicted to stack against the isoalloxazine and adenine components of FAD, whereas others are involved in electron transfer. To investigate this relationship, Ac92 was expressed in bacteria as a His-tagged fusion protein, purified, and characterized both spectrophotometrically and for its enzymatic activity. The purified protein was found to have the color (yellow) and absorption spectrum consistent with it being a FAD-containing protein. Furthermore, it was demonstrated to have sulfhydryl oxidase activity using dithiothreitol and thioredoxin as substrates.
The D1 and D2 proteins of dinoflagellates: unusually accumulated mutations which influence on PSII photoreaction.

PubMed

Iida, Satoko; Kobiyama, Atsushi; Ogata, Takehiko; Murakami, Akio

2008-01-01

Plastid-encoded genes of the dinoflagellates are rapidly evolving and highly divergent. The importance of unusually accumulated mutations on the structure of the PSII core proteins and on photosynthetic function was examined in the dinoflagellates Symbiodinium sp. and Alexandrium tamarense. Full-length cDNA sequences of psbA (D1 protein) and psbD (D2 protein) were obtained and compared with those of other oxygen-evolving photoautotrophs. Twenty-three amino acid positions (7%) for the D1 protein and 34 positions (10%) for the D2 protein were mutated in the dinoflagellates, although amino acid residues at these positions were conserved in cyanobacteria, other algae, and plants. Many mutations were distributed in the N-terminus and the D-E interhelical loop of the D1 protein and helix B of the D2 protein, while the remaining regions were well conserved. The different structural properties in these mutated regions were supported by hydropathy profiles. The chlorophyll fluorescence kinetics of the dinoflagellates was compared with that of Synechocystis sp. PCC6803 in relation to the altered protein structure.
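The abstract above refers to hydropathy profiles without naming the scale or window used. As an illustration under that caveat, the sketch below computes a sliding-window average with the widely used Kyte-Doolittle scale; the choice of scale, the default window of 9, and the function name are assumptions, not details from the study.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Mean hydropathy of each full window along a protein sequence.

    Returns an empty list if the window does not fit; raises KeyError
    for residues outside the 20 standard amino acids.
    """
    if window < 1 or window > len(protein):
        return []
    scores = [KYTE_DOOLITTLE[residue] for residue in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Toy peptide: a hydrophobic stretch followed by a charged stretch.
    print(hydropathy_profile("IIIRRR", window=3))
```

Plotting such profiles for homologous proteins side by side is a common way to see whether substitutions change the hydrophobic character of a region.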
A proposed OB-fold with a protein-interaction surface in Candida albicans telomerase protein Est3

PubMed Central

Yu, Eun Young; Wang, Feng; Lei, Ming; Lue, Neal F

2008-01-01

Ever shorter telomeres 3 (Est3) is an essential telomerase regulatory subunit thought to be unique to budding yeasts. Here we use multiple sequence alignment and hidden Markov model–hidden Markov model (HMM-HMM) comparison to uncover potential similarities between Est3 and the mammalian telomeric protein Tpp1. Analysis of site-specific mutants of Candida albicans Est3 revealed functional distinctions between residues that are conserved between Est3 and Tpp1 and those that are unique to Est3. Although both types of residues are important for telomere maintenance in vivo, only the former contributes to telomerase activity in vitro and facilitates the association of Est3 with telomerase core components. Consistent with a function in protein-protein interaction, the residues common to Est3 and Tpp1 map to one face of an OB-fold model structure, away from the canonical nucleic acid binding surface. We propose that Est3 and the OB-fold domain of Tpp1 mediate a conserved function in telomerase regulation. PMID:19172753
Highly designable phenotypes and mutational buffers emerge from a systematic mapping between network topology and dynamic output.

PubMed

Nochomovitz, Yigal D; Li, Hao

2006-03-14

Deciphering the design principles for regulatory networks is fundamental to an understanding of biological systems. We have explored the mapping from the space of network topologies to the space of dynamical phenotypes for small networks. Using exhaustive enumeration of a simple model of three- and four-node networks, we demonstrate that certain dynamical phenotypes can be generated by an atypically broad spectrum of network topologies. Such dynamical outputs are highly designable, much like certain protein structures can be designed by an unusually broad spectrum of sequences. The network topologies that encode a highly designable dynamical phenotype possess two classes of connections: a fully conserved core of dedicated connections that encodes the stable dynamical phenotype and a partially conserved set of variable connections that controls the transient dynamical flow. By comparing the topologies and dynamics of the three- and four-node network ensembles, we observe a large number of instances of the phenomenon of "mutational buffering," whereby addition of a fourth node suppresses phenotypic variation amongst a set of three-node networks.
Identification and characterization of the autophagy-related genes Atg12 and Atg5 in hydra.

PubMed

Dixit, Nishikant S; Shravage, Bhupendra V; Ghaskadbi, Surendra

2017-01-01

Autophagy is an evolutionarily conserved process in eukaryotic cells that is involved in the degradation of cytoplasmic contents including organelles via the lysosome. Hydra is an early metazoan which exhibits simple tissue grade organization, a primitive nervous system, and is one of the classical non-bilaterian models extensively used in evo-devo research. Here, we describe the characterization of two core autophagy genes, Atg12 and Atg5, from hydra. In silico analyses including sequence similarity, domain analysis, and phylogenetic analysis demonstrate the conservation of these genes across eukaryotes. The predicted 3D structure of hydra Atg12 showed very little variance when compared to human Atg12 and yeast Atg12, whereas the hydra Atg5 predicted 3D structure was found to be variable, when compared with its human and yeast homologs. Strikingly, whole mount in situ hybridization showed high expression of Atg12 transcripts specifically in nematoblasts, whereas Atg5 transcripts were found to be expressed strongly in budding region and growing buds. This study may provide a framework to understand the evolution of autophagy networks in higher eukaryotes.
Multicore and GPU algorithms for Nussinov RNA folding

PubMed Central

2014-01-01

Background: One segment of an RNA sequence might be paired with another segment of the same RNA sequence due to the force of hydrogen bonds. This two-dimensional structure is called the RNA sequence's secondary structure. Several algorithms have been proposed to predict an RNA sequence's secondary structure. These algorithms are referred to as RNA folding algorithms. Results: We develop cache efficient, multicore, and GPU algorithms for RNA folding using Nussinov's algorithm. Conclusions: Our cache efficient algorithm provides a speedup between 1.6 and 3.0 relative to a naive straightforward single core code. The multicore version of the cache efficient single core algorithm provides a speedup, relative to the naive single core algorithm, between 7.5 and 14.0 on a 6 core hyperthreaded CPU. Our GPU algorithm for the NVIDIA C2050 is up to 1582 times as fast as the naive single core algorithm and between 5.1 and 11.2 times as fast as the fastest previously known GPU algorithm for Nussinov RNA folding. PMID:25082539
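For orientation, the kind of naive single-core Nussinov recurrence that the reported speedups are measured against can be sketched as below. This is an illustrative sketch only, not the authors' code: the function name `nussinov`, the `min_loop` parameter, and the choice to allow G-U wobble pairs are assumptions made for this example, and none of the cache-efficient, multicore, or GPU optimizations from the abstract are shown.

```python
def nussinov(seq, min_loop=0):
    """Return the maximum number of nested base pairs in an RNA sequence.

    Naive O(n^3) dynamic program. `min_loop` (an assumption of this sketch)
    is the minimum number of unpaired bases enclosed by any pair.
    """
    # Watson-Crick pairs plus G-U wobble pairs (an assumption of this sketch).
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best score for the subsequence seq[i..j] inclusive.
    dp = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some base k, splitting the interval
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    inner = dp[k + 1][j - 1] if k + 1 <= j - 1 else 0
                    best = max(best, left + 1 + inner)
            dp[i][j] = best
    return dp[0][n - 1]
```

The table is filled in order of increasing subsequence length, so every entry read on the right-hand side has already been computed; this fill order is what cache-aware and parallel variants typically reorganize.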
Defining the Estimated Core Genome of Bacterial Populations Using a Bayesian Decision Model

PubMed Central

van Tonder, Andries J.; Mistry, Shilan; Bray, James E.; Hill, Dorothea M. C.; Cody, Alison J.; Farmer, Chris L.; Klugman, Keith P.; von Gottberg, Anne; Bentley, Stephen D.; Parkhill, Julian; Jolley, Keith A.; Maiden, Martin C. J.; Brueggemann, Angela B.

2014-01-01

The bacterial core genome is of intense interest and the volume of whole genome sequence data in the public domain available to investigate it has increased dramatically. The aim of our study was to develop a model to estimate the bacterial core genome from next-generation whole genome sequencing data and use this model to identify novel genes associated with important biological functions. Five bacterial datasets were analysed, comprising 2096 genomes in total. We developed a Bayesian decision model to estimate the number of core genes, calculated pairwise evolutionary distances (p-distances) based on nucleotide sequence diversity, and plotted the median p-distance for each core gene relative to its genome location. We designed visually-informative genome diagrams to depict areas of interest in genomes. Case studies demonstrated how the model could identify areas for further study, e.g. 25% of the core genes with higher sequence diversity in the Campylobacter jejuni and Neisseria meningitidis genomes encoded hypothetical proteins. The core gene with the highest p-distance value in C. jejuni was annotated in the reference genome as a putative hydrolase, but further work revealed that it shared sequence homology with beta-lactamase/metallo-beta-lactamases (enzymes that provide resistance to a range of broad-spectrum antibiotics) and thioredoxin reductase genes (which reduce oxidative stress and are essential for DNA replication) in other C. jejuni genomes. Our Bayesian model of estimating the core genome is principled, easy to use and can be applied to large genome datasets. This study also highlighted the lack of knowledge currently available for many core genes in bacterial genomes of significant global public health importance. PMID:25144616
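The p-distance used above is the proportion of aligned nucleotide sites at which two sequences differ. A minimal sketch of that calculation, and of the per-gene median over all pairs of alleles, might look like the following. The function names are hypothetical, and skipping sites that contain a gap or `N` is an assumption of this example rather than something stated in the abstract.

```python
from itertools import combinations
from statistics import median


def p_distance(a, b):
    """Proportion of compared sites that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        # Assumption of this sketch: ignore sites with a gap or ambiguous base.
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def median_pairwise_p_distance(alleles):
    """Median p-distance over all unordered pairs of aligned alleles of one gene."""
    return median(p_distance(a, b) for a, b in combinations(alleles, 2))
```

Computing one such median per core gene and plotting it against the gene's position in a reference genome would give the kind of diversity profile the abstract describes.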
Principles of regulatory information conservation between mouse and human.

PubMed

Cheng, Yong; Ma, Zhihai; Kim, Bong-Hyun; Wu, Weisheng; Cayting, Philip; Boyle, Alan P; Sundaram, Vasavi; Xing, Xiaoyun; Dogan, Nergiz; Li, Jingjing; Euskirchen, Ghia; Lin, Shin; Lin, Yiing; Visel, Axel; Kawli, Trupti; Yang, Xinqiong; Patacsil, Dorrelyn; Keller, Cheryl A; Giardine, Belinda; Kundaje, Anshul; Wang, Ting; Pennacchio, Len A; Weng, Zhiping; Hardison, Ross C; Snyder, Michael P

2014-11-20

To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human-mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.
The Variable Regions of Lactobacillus rhamnosus Genomes Reveal the Dynamic Evolution of Metabolic and Host-Adaptation Repertoires

PubMed Central

Ceapa, Corina; Davids, Mark; Ritari, Jarmo; Lambert, Jolanda; Wels, Michiel; Douillard, François P.; Smokvina, Tamara; de Vos, Willem M.; Knol, Jan; Kleerebezem, Michiel

2016-01-01

Lactobacillus rhamnosus is a diverse Gram-positive species with strains isolated from different ecological niches. Here, we report the genome sequence analysis of 40 diverse strains of L. rhamnosus and their genomic comparison, with a focus on the variable genome. Genomic comparison of 40 L. rhamnosus strains discriminated the conserved genes (core genome) and regions of plasticity involving frequent rearrangements and horizontal transfer (variome). The L. rhamnosus core genome encompasses 2,164 genes, out of 4,711 genes in total (the pan-genome). The accessory genome is dominated by genes encoding carbohydrate transport and metabolism, extracellular polysaccharides (EPS) biosynthesis, bacteriocin production, pili production, the cas system, and the associated clustered regularly interspaced short palindromic repeat (CRISPR) loci, and more than 100 transporter functions and mobile genetic elements like phages, plasmid genes, and transposons. A clade distribution based on amino acid differences between core (shared) proteins matched with the clade distribution obtained from the presence–absence of variable genes. The phylogenetic and variome tree overlap indicated that frequent events of gene acquisition and loss dominated the evolutionary segregation of the strains within this species, which is paralleled by evolutionary diversification of core gene functions. The CRISPR-Cas system could have contributed to this evolutionary segregation. Lactobacillus rhamnosus strains contain the genetic and metabolic machinery with strain-specific gene functions required to adapt to a large range of environments. A remarkable congruency of the evolutionary relatedness of the strains’ core and variome functions, possibly favoring interspecies genetic exchanges, underlines the importance of gene-acquisition and loss within the L. rhamnosus strain diversification. PMID:27358423
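The core genome, pan-genome, and accessory (variable) genome distinguished above are simple set relationships once each strain has been reduced to a set of gene-family identifiers. The sketch below assumes that reduction has already been done (ortholog clustering is not shown); the function name and the strain and gene labels in the usage note are hypothetical.

```python
def core_and_pan_genome(strain_genes):
    """Return (core, pan) gene sets from a mapping of strain -> iterable of genes.

    core: genes present in every strain; pan: genes present in at least one.
    """
    gene_sets = [set(genes) for genes in strain_genes.values()]
    if not gene_sets:
        return set(), set()
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan
```

For example, with `{"strain1": ["geneA", "geneB"], "strain2": ["geneA", "geneC"]}` the core is `{"geneA"}`, the pan-genome is all three genes, and the accessory genome is `pan - core`.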
Functionally essential, invariant glutamate near the C-terminus of strand beta 5 in various (alpha/beta)8-barrel enzymes as a possible indicator of their evolutionary relatedness.

PubMed

Janecek, S; Baláz, S

1995-08-01

Twelve different (alpha/beta)8-barrel enzymes belonging to three structurally distinct families were found to contain, near the C-terminus of their strand beta 5, a conserved invariant glutamic acid residue that plays an important functional role in each of these enzymes. The search was based on the idea that a conserved sequence region of an (alpha/beta)8-barrel enzyme should be more or less conserved also in the equivalent part of the structure of the other enzymes with this folding motif owing to their mutual evolutionary relatedness. For this purpose, the sequence region around the well conserved fifth beta-strand of alpha-amylase containing catalytic glutamate (Glu230, Aspergillus oryzae alpha-amylase numbering), was used as the sequence-structural template. The isolated sequence stretches of the 12 (alpha/beta)8-barrels are discussed from both the sequence-structural and the evolutionary point of view, the invariant glutamate residue being proposed to be a joining feature of the studied group of enzymes remaining from their ancestral (alpha/beta)8-barrel.
Tertiary model of a plant cellulose synthase

PubMed Central

Sethaphong, Latsavongsakda; Haigler, Candace H.; Kubicki, James D.; Zimmer, Jochen; Bonetta, Dario; DeBolt, Seth; Yingling, Yaroslava G.

2013-01-01

A 3D atomistic model of a plant cellulose synthase (CESA) has remained elusive despite over forty years of experimental effort. Here, we report a computationally predicted 3D structure of 506 amino acids of cotton CESA within the cytosolic region. Comparison of the predicted plant CESA structure with the solved structure of a bacterial cellulose-synthesizing protein validates the overall fold of the modeled glycosyltransferase (GT) domain. The coaligned plant and bacterial GT domains share a six-stranded β-sheet, five α-helices, and conserved motifs similar to those required for catalysis in other GT-2 glycosyltransferases. Extending beyond the cross-kingdom similarities related to cellulose polymerization, the predicted structure of cotton CESA reveals that plant-specific modules (plant-conserved region and class-specific region) fold into distinct subdomains on the periphery of the catalytic region. Computational results support the importance of the plant-conserved region and/or class-specific region in CESA oligomerization to form the multimeric cellulose–synthesis complexes that are characteristic of plants. Relatively high sequence conservation between plant CESAs allowed mapping of known mutations and two previously undescribed mutations that perturb cellulose synthesis in Arabidopsis thaliana to their analogous positions in the modeled structure. Most of these mutation sites are near the predicted catalytic region, and the confluence of other mutation sites supports the existence of previously undefined functional nodes within the catalytic core of CESA. Overall, the predicted tertiary structure provides a platform for the biochemical engineering of plant CESAs. PMID:23592721
The Universally Conserved Prokaryotic GTPases

PubMed Central

Verstraeten, Natalie; Fauvart, Maarten; Versées, Wim; Michiels, Jan

2011-01-01

Summary: Members of the large superclass of P-loop GTPases share a core domain with a conserved three-dimensional structure. In eukaryotes, these proteins are implicated in various crucial cellular processes, including translation, membrane trafficking, cell cycle progression, and membrane signaling. As targets of mutation and toxins, GTPases are involved in the pathogenesis of cancer and infectious diseases. In prokaryotes also, it is hard to overestimate the importance of GTPases in cell physiology. Numerous papers have shed new light on the role of bacterial GTPases in cell cycle regulation, ribosome assembly, the stress response, and other cellular processes. Moreover, bacterial GTPases have been identified as high-potential drug targets. A key paper published over 2 decades ago stated that, “It may never again be possible to capture [GTPases] in a family portrait” (H. R. Bourne, D. A. Sanders, and F. McCormick, Nature 348:125-132, 1990) and indeed, the last 20 years have seen a tremendous increase in publications on the subject. Sequence analysis identified 13 bacterial GTPases that are conserved in at least 75% of all bacterial species. We here provide an overview of these 13 protein subfamilies, covering their cellular functions as well as cellular localization and expression levels, three-dimensional structures, biochemical properties, and gene organization. Conserved roles in eukaryotic homologs will be discussed as well. A comprehensive overview summarizing current knowledge on prokaryotic GTPases will aid in further elucidating the function of these important proteins. PMID:21885683
Rearrangement of a polar core provides a conserved mechanism for constitutive activation of class B G protein-coupled receptors

PubMed Central

Yin, Yanting; de Waal, Parker W.; He, Yuanzheng; Zhao, Li-Hua; Yang, Dehua; Cai, Xiaoqing; Jiang, Yi; Melcher, Karsten; Wang, Ming-Wei; Xu, H. Eric

2017-01-01

The glucagon receptor (GCGR) belongs to the secretin-like (class B) family of G protein-coupled receptors (GPCRs) and is activated by the peptide hormone glucagon. The structures of an activated class B GPCR have remained unsolved, preventing a mechanistic understanding of how these receptors are activated. Using a combination of structural modeling and mutagenesis studies, we present here two modes of ligand-independent activation of GCGR. First, we identified a GCGR-specific hydrophobic lock comprising Met-338 and Phe-345 within the IC3 loop and transmembrane helix 6 (TM6) and found that this lock stabilizes the TM6 helix in the inactive conformation. Disruption of this hydrophobic lock led to constitutive G protein and arrestin signaling. Second, we discovered a polar core comprising conserved residues in TM2, TM3, TM6, and TM7, and mutations that disrupt this polar core led to constitutive GCGR activity. On the basis of these results, we propose a mechanistic model of GCGR activation in which TM6 is held in an inactive conformation by the conserved polar core and the hydrophobic lock. Mutations that disrupt these inhibitory elements allow TM6 to swing outward to adopt an active TM6 conformation similar to that of the canonical β2-adrenergic receptor complexed with G protein and to that of rhodopsin complexed with arrestin. Importantly, mutations in the corresponding polar core of several other members of class B GPCRs, including PTH1R, PAC1R, VIP1R, and CRFR1, also induce constitutive G protein signaling, suggesting that the rearrangement of the polar core is a conserved mechanism for class B GPCR activation. PMID:28356352
Rearrangement of a polar core provides a conserved mechanism for constitutive activation of class B G protein-coupled receptors.

PubMed

Yin, Yanting; de Waal, Parker W; He, Yuanzheng; Zhao, Li-Hua; Yang, Dehua; Cai, Xiaoqing; Jiang, Yi; Melcher, Karsten; Wang, Ming-Wei; Xu, H Eric

2017-06-16

The glucagon receptor (GCGR) belongs to the secretin-like (class B) family of G protein-coupled receptors (GPCRs) and is activated by the peptide hormone glucagon. The structures of an activated class B GPCR have remained unsolved, preventing a mechanistic understanding of how these receptors are activated. Using a combination of structural modeling and mutagenesis studies, we present here two modes of ligand-independent activation of GCGR. First, we identified a GCGR-specific hydrophobic lock comprising Met-338 and Phe-345 within the IC3 loop and transmembrane helix 6 (TM6) and found that this lock stabilizes the TM6 helix in the inactive conformation. Disruption of this hydrophobic lock led to constitutive G protein and arrestin signaling. Second, we discovered a polar core comprising conserved residues in TM2, TM3, TM6, and TM7, and mutations that disrupt this polar core led to constitutive GCGR activity. On the basis of these results, we propose a mechanistic model of GCGR activation in which TM6 is held in an inactive conformation by the conserved polar core and the hydrophobic lock. Mutations that disrupt these inhibitory elements allow TM6 to swing outward to adopt an active TM6 conformation similar to that of the canonical β2-adrenergic receptor complexed with G protein and to that of rhodopsin complexed with arrestin. Importantly, mutations in the corresponding polar core of several other members of class B GPCRs, including PTH1R, PAC1R, VIP1R, and CRFR1, also induce constitutive G protein signaling, suggesting that the rearrangement of the polar core is a conserved mechanism for class B GPCR activation. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Secondary structure in solution of two anti-HIV-1 hammerhead ribozymes as investigated by two-dimensional 1H 500 MHz NMR spectroscopy in water

NASA Technical Reports Server (NTRS)

Sarma, R. H.; Sarma, M. H.; Rein, R.; Shibata, M.; Setlik, R. S.; Ornstein, R. L.; Kazim, A. L.; Cairo, A.; Tomasi, T. B.

1995-01-01

Two hammerhead chimeric RNA/DNA ribozymes (HRz) were synthesized in pure form. Both were 30 nucleotides long, and the sequences were such that they could be targeted to cleave the HIV-1 gag RNA. Named HRz-W and HRz-M, the former had its invariable core region conserved, the latter had a uridine in the invariable region replaced by a guanine. Their secondary structures were determined by 2D NOESY 1H 500 MHz NMR spectroscopy in 90% water and 10% D2O, following the imino protons. The data show that both HRz-M and HRz-W form identical secondary structures with stem regions consisting of continuous stacks of AT and GT pairs. An energy minimized computer model of this stem region is provided. The results suggest that the loss of catalytic activity that is known to result when an invariant core residue is replaced is not related to the secondary structure of the ribozymes in the absence of substrate.
Comparative analysis of the small RNA transcriptomes of Pinus contorta and Oryza sativa

PubMed Central

Morin, Ryan D.; Aksay, Gozde; Dolgosheina, Elena; Ebhardt, H. Alexander; Magrini, Vincent; Mardis, Elaine R.; Sahinalp, S. Cenk; Unrau, Peter J.

2008-01-01

The diversity of microRNAs and small-interfering RNAs has been extensively explored within angiosperms by focusing on a few key organisms such as Oryza sativa and Arabidopsis thaliana. A deeper division of the plants is defined by the radiation of the angiosperms and gymnosperms, with the latter comprising the commercially important conifers. The conifers are expected to provide important information regarding the evolution of highly conserved small regulatory RNAs. Deep sequencing provides the means to characterize and quantitatively profile small RNAs in understudied organisms such as these. Pyrosequencing of small RNAs from O. sativa revealed, as expected, ∼21- and ∼24-nt RNAs. The former contained known microRNAs, and the latter largely comprised intergenic-derived sequences likely representing heterochromatin siRNAs. In contrast, sequences from Pinus contorta were dominated by 21-nt small RNAs. Using a novel sequence-based clustering algorithm, we identified sequences belonging to 18 highly conserved microRNA families in P. contorta as well as numerous clusters of conserved small RNAs of unknown function. Using multiple methods, including expressed sequence folding and machine learning algorithms, we found a further 53 candidate novel microRNA families, 51 appearing specific to the P. contorta library. In addition, alignment of small RNA sequences to the O. sativa genome revealed six perfectly conserved classes of small RNA that included chloroplast transcripts and specific types of genomic repeats. The conservation of microRNAs and other small RNAs between the conifers and the angiosperms indicates that important RNA silencing processes were highly developed in the earliest spermatophytes. Genomic mapping of all sequences to the O. sativa genome can be viewed at http://microrna.bcgsc.ca/cgi-bin/gbrowse/rice_build_3/. PMID:18323537
Conservation and Accessibility of an Inner Core Lipopolysaccharide Epitope of Neisseria meningitidis

PubMed Central

Plested, Joyce S.; Makepeace, Katherine; Jennings, Michael P.; Gidney, Margaret Anne J.; Lacelle, Suzanne; Brisson, J.-R.; Cox, Andrew D.; Martin, Adele; Bird, A. Graham; Tang, Christoph M.; Mackinnon, Fiona M.; Richards, James C.; Moxon, E. Richard

1999-01-01

We investigated the conservation and antibody accessibility of inner core epitopes of Neisseria meningitidis lipopolysaccharide (LPS) because of their potential as vaccine candidates. An immunoglobulin G3 murine monoclonal antibody (MAb), designated MAb B5, was obtained by immunizing mice with a galE mutant of N. meningitidis H44/76 (B.15.P1.7,16 immunotype L3). We have shown that MAb B5 can bind to the core LPS of wild-type encapsulated MC58 (B.15.P1.7,16 immunotype L3) organisms in vitro and ex vivo. An inner core structure recognized by MAb B5 is conserved and accessible in 26 of 34 (76%) of group B and 78 of 112 (70%) of groups A, C, W, X, Y, and Z strains. N. meningitidis strains which possess this epitope are immunotypes in which phosphoethanolamine (PEtn) is linked to the 3-position of the β-chain heptose (HepII) of the inner core. In contrast, N. meningitidis strains lacking reactivity with MAb B5 have an alternative core structure in which PEtn is linked to an exocyclic position (i.e., position 6 or 7) of HepII (immunotypes L2, L4, and L6) or is absent (immunotype L5). We conclude that MAb B5 defines one or more of the major inner core glycoforms of N. meningitidis LPS. These findings support the possibility that immunogens capable of eliciting functional antibodies specific to inner core structures could be the basis of a vaccine against invasive infections caused by N. meningitidis. PMID:10496924
Plastome Sequences of Lygodium japonicum and Marsilea crenata Reveal the Genome Organization Transformation from Basal Ferns to Core Leptosporangiates

PubMed Central

Gao, Lei; Wang, Bo; Wang, Zhi-Wei; Zhou, Yuan; Su, Ying-Juan; Wang, Ting

2013-01-01

Previous studies have shown that core leptosporangiates, the most species-rich group of extant ferns (monilophytes), have a distinct plastid genome (plastome) organization pattern from basal fern lineages. However, the details of genome structure transformation from ancestral ferns to core leptosporangiates remain unclear because of limited plastome data available. Here, we have determined the complete chloroplast genome sequences of Lygodium japonicum (Lygodiaceae), a member of schizaeoid ferns (Schizaeales), and Marsilea crenata (Marsileaceae), a representative of heterosporous ferns (Salviniales). The two species represent the sister and the basal lineages of core leptosporangiates, respectively, for which the plastome sequences are currently unavailable. Comparative genomic analysis of all sequenced fern plastomes reveals that the gene order of L. japonicum plastome occupies an intermediate position between that of basal ferns and core leptosporangiates. The two exons of the fern ndhB gene have a unique pattern of intragenic copy number variances. Specifically, the substitution rate heterogeneity between the two exons is congruent with their copy number changes, confirming the constraint role that inverted repeats may play on the substitution rate of chloroplast gene sequences. PMID:23821521

Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura

DOE Office of Scientific and Technical Information (OSTI.GOV)

Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.

2004-08-06

The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Measuring conservation of sequence features closely linked to function--such as binding-site clustering--makes better use of comparative sequence data than commonly used methods that examine only sequence identity.
Compact X-ray Binary Re-creation in Core Collapse: NGC 6397

NASA Astrophysics Data System (ADS)

Grindlay, J. E.; Bogdanov, S.; van den Berg, M.; Heinke, C.

2005-12-01

We report new Chandra observations of the core-collapsed globular cluster NGC 6397. In comparison with our original Chandra observations (Grindlay et al 2001, ApJ, 563, L53), we now detect some 30 sources (vs. 20) in the cluster. A new CV is confirmed, though new HST/ACS optical observations (see Cohn et al this meeting) show that one of the original CV candidates is a background AGN. The 9 CVs (optically identified) yet only one MSP and one qLMXB suggest either a factor of 7 reduction in NSs/WDs vs. what we find in 47Tuc (see Grindlay 2005, Proc. Cefalu Conf. on Interacting Binaries) or that CVs are produced in the core collapse. The possible second MSP with main sequence companion, source U18 (see Grindlay et al 2001) is similar in its X-ray and optical properties to MSP-W in 47Tuc, which must have swapped its binary companion. Together with the one confirmed (radio) MSP in NGC 6397, with an evolved main sequence secondary, the process of enhanced partner swapping in the high stellar density of core collapse is implicated. At the same time, main sequence - main sequence binaries (active binaries) are depleted in the cluster core, presumably by "binary burning" in core collapse. These binary re-creation and destruction mechanisms in core collapse have profound implications for binary evolution and mergers in globulars that have undergone core collapse.
Expression Patterns, Activities and Carbohydrate-Metabolizing Regulation of Sucrose Phosphate Synthase, Sucrose Synthase and Neutral Invertase in Pineapple Fruit during Development and Ripening

PubMed Central

Zhang, Xiu-Mei; Wang, Wei; Du, Li-Qing; Xie, Jiang-Hui; Yao, Yan-Li; Sun, Guang-Ming

2012-01-01

Differences in carbohydrate contents and metabolizing-enzyme activities were monitored in apical, medial, basal and core sections of pineapple (Ananas comosus cv. Comte de Paris) during fruit development and ripening. Fructose and glucose of various sections in nearly equal amounts were the predominant sugars in the fruitlets, and had obvious differences until the fruit matured. The large rise of sucrose/hexose was accompanied by dramatic changes in sucrose phosphate synthase (SPS) and sucrose synthase (SuSy) activities. By contrast, neutral invertase (NI) activity may provide a mechanism to increase fruit sink strength by increasing hexose concentrations. Furthermore, two cDNAs of Ac-sps (accession no. GQ996582) and Ac-ni (accession no. GQ996581) were first isolated from pineapple fruits utilizing conserved amino-acid sequences. Homology alignment reveals that the amino acid sequences contain some conserved functional domains. Transcription expression analysis of Ac-sps, Ac-susy and Ac-ni also indicated distinct patterns related to sugar accumulation and composition of pineapple fruits. It suggests that differential expressions of multiple gene families are necessary for sugar metabolism in various parts and developmental stages of pineapple fruit. A cycle of sucrose breakdown in the cytosol of sink tissues could be mediated through both Ac-SuSy and Ac-NI, and Ac-NI could be involved in regulating crucial steps by generating sugar signals to the cells in a temporally and spatially restricted fashion. PMID:22949808
The Reverse Transcriptase/RNA Maturase Protein MatR Is Required for the Splicing of Various Group II Introns in Brassicaceae Mitochondria.

PubMed

Sultan, Laure D; Mileshina, Daria; Grewe, Felix; Rolle, Katarzyna; Abudraham, Sivan; Głodowicz, Paweł; Niazi, Adnan Khan; Keren, Ido; Shevtsov, Sofia; Klipcan, Liron; Barciszewski, Jan; Mower, Jeffrey P; Dietrich, André; Ostersetzer-Biran, Oren

2016-11-01

Group II introns are large catalytic RNAs that are ancestrally related to nuclear spliceosomal introns. Sequences corresponding to group II RNAs are found in many prokaryotes and are particularly prevalent within plant organellar genomes. Proteins encoded within the introns themselves (maturases) facilitate the splicing of their own host pre-RNAs. Mitochondrial introns in plants have diverged considerably in sequence and have lost their maturases. In angiosperms, only a single maturase has been retained in the mitochondrial DNA: the matR gene found within NADH dehydrogenase 1 (nad1) intron 4. Its conservation across land plants and RNA editing events, which restore conserved amino acids, indicates that matR encodes a functional protein. However, the biological role of MatR remains unclear. Here, we performed an in vivo investigation of the roles of MatR in Brassicaceae. Directed knockdown of matR expression via synthetically designed ribozymes altered the processing of various introns, including nad1 i4. Pull-down experiments further indicated that MatR is associated with nad1 i4 and several other intron-containing pre-mRNAs. MatR may thus represent an intermediate link in the gradual evolutionary transition from the intron-specific maturases in bacteria into their versatile spliceosomal descendants in the nucleus. The similarity between maturases and the core spliceosomal Prp8 protein further supports this intriguing theory. © 2016 American Society of Plant Biologists. All rights reserved.
Comparison of Three Different Hepatitis C Virus Genotyping Methods: 5'NCR PCR-RFLP, Core Type-Specific PCR, and NS5b Sequencing in a Tertiary Care Hospital in South India.

PubMed

Daniel, Hubert D-J; David, Joel; Raghuraman, Sukanya; Gnanamony, Manu; Chandy, George M; Sridharan, Gopalan; Abraham, Priya

2017-05-01

Based on genetic heterogeneity, hepatitis C virus (HCV) is classified into seven major genotypes and 64 subtypes. In spite of the sequence heterogeneity, all genotypes share an identical complement of colinear genes within the large open reading frame. The genetic interrelationships between these genes are consistent among genotypes. Due to this property, complete sequencing of the HCV genome is not required. HCV genotypes along with subtypes are critical for planning antiviral therapy. Certain genotypes are also associated with higher progression to liver cirrhosis. In this study, 100 blood samples were collected from individuals who came for routine HCV genotype identification. These samples were used for the comparison of two different genotyping methods (5'NCR PCR-RFLP and HCV core type-specific PCR) with NS5b sequencing. Of the 100 samples genotyped using 5'NCR PCR-RFLP and HCV core type-specific PCR, 90% (κ = 0.913, P < 0.00) and 96% (κ = 0.794, P < 0.00) correlated with NS5b sequencing, respectively. Sixty percent and 75% of discordant samples by 5'NCR PCR-RFLP and HCV core type-specific PCR, respectively, belonged to genotype 6. All the HCV genotype 1 subtypes were classified accurately by both the methods. This study shows that the 5'NCR-based PCR-RFLP and the HCV core type-specific PCR-based assays correctly identified HCV genotypes except genotype 6 from this region. Direct sequencing of the HCV core region was able to identify all the genotype 6 from this region and serves as an alternative to NS5b sequencing. © 2016 Wiley Periodicals, Inc.
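The κ values quoted above are chance-corrected agreement statistics (Cohen's kappa) between two genotyping methods applied to the same samples. A minimal sketch of the standard formula, kappa = (p_o - p_e) / (1 - p_e), is given below for readers who want to see how percent agreement and κ relate. The function name is hypothetical, no data from the study are reproduced, and how the authors handled untypeable samples is not known from the abstract.

```python
from collections import Counter


def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length lists of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(calls_a)
    # p_o: observed proportion of samples on which the two methods agree.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # p_e: agreement expected by chance from each method's marginal frequencies.
    freq_a = Counter(calls_a)
    freq_b = Counter(calls_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both methods always give the same single category; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)
```

Because the chance term depends on how the genotypes are distributed across samples, two comparisons with similar raw agreement can yield noticeably different κ values.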
Conserved noncoding sequences (CNSs) in higher plants.

PubMed

Freeling, Michael; Subramaniam, Shabarinath

2009-04-01

Plant conserved noncoding sequences (CNSs)--a specific category of phylogenetic footprint--have been shown experimentally to function. No plant CNS is conserved to the extent that ultraconserved noncoding sequences are conserved in vertebrates. Plant CNSs are enriched in known transcription factor or other cis-acting binding sites, and are usually clustered around genes. Genes that encode transcription factors and/or those that respond to stimuli are particularly CNS-rich. Only rarely could this function involve small RNA binding. Some transcribed CNSs encode short translation products as a form of negative control. Approximately 4% of Arabidopsis gene content is estimated to be both CNS-rich and to occupy a relatively long stretch of chromosome: Bigfoot genes (long phylogenetic footprints). We discuss a 'DNA-templated protein assembly' idea that might help explain Bigfoot gene CNSs.
The haloarchaeal MCM proteins: bioinformatic analysis and targeted mutagenesis of the β7-β8 and β9-β10 hairpin loops and conserved zinc binding domain cysteines

PubMed Central

Kristensen, Tatjana P.; Maria Cherian, Reeja; Gray, Fiona C.; MacNeill, Stuart A.

2014-01-01

The hexameric MCM complex is the catalytic core of the replicative helicase in eukaryotic and archaeal cells. Here we describe the first in vivo analysis of archaeal MCM protein structure and function relationships using the genetically tractable haloarchaeon Haloferax volcanii as a model system. Hfx. volcanii encodes a single MCM protein that is part of the previously identified core group of haloarchaeal MCM proteins. Three structural features of the N-terminal domain of the Hfx. volcanii MCM protein were targeted for mutagenesis: the β7-β8 and β9-β10 β-hairpin loops and putative zinc binding domain. Five strains carrying single point mutations in the β7-β8 β-hairpin loop were constructed, none of which displayed impaired cell growth under normal conditions or when treated with the DNA damaging agent mitomycin C. However, short sequence deletions within the β7-β8 β-hairpin were not tolerated and neither was replacement of the highly conserved residue glutamate 187 with alanine. Six strains carrying paired alanine substitutions within the β9-β10 β-hairpin loop were constructed, leading to the conclusion that no individual amino acid within that hairpin loop is absolutely required for MCM function, although one of the mutant strains displays greatly enhanced sensitivity to mitomycin C. Deletions of two or four amino acids from the β9-β10 β-hairpin were tolerated but mutants carrying larger deletions were inviable. Similarly, it was not possible to construct mutants in which any of the conserved zinc binding cysteines was replaced with alanine, underlining the likely importance of zinc binding for MCM function. The results of these studies demonstrate the feasibility of using Hfx. volcanii as a model system for reverse genetic analysis of archaeal MCM protein function and provide important confirmation of the in vivo importance of conserved structural features identified by previous bioinformatic, biochemical and structural studies. PMID:24723920
Genetic differences between blood- and brain-derived viral sequences from human immunodeficiency virus type 1-infected patients: evidence of conserved elements in the V3 region of the envelope protein of brain-derived sequences.

PubMed Central

Korber, B T; Kunstman, K J; Patterson, B K; Furtado, M; McEvilly, M M; Levy, R; Wolinsky, S M

1994-01-01

Human immunodeficiency virus type 1 (HIV-1) sequences were generated from blood and from brain tissue obtained by stereotactic biopsy from six patients undergoing a diagnostic neurosurgical procedure. Proviral DNA was directly amplified by nested PCR, and 8 to 36 clones from each sample were sequenced. Phylogenetic analysis of intrapatient envelope V3-V5 region HIV-1 DNA sequence sets revealed that brain viral sequences were clustered relative to the blood viral sequences, suggestive of tissue-specific compartmentalization of the virus in four of the six cases. In the other two cases, the blood and brain virus sequences were intermingled in the phylogenetic analyses, suggesting trafficking of virus between the two tissues. Slide-based PCR-driven in situ hybridization of two of the patients' brain biopsy samples confirmed our interpretation of the intrapatient phylogenetic analyses. Interpatient V3 region brain-derived sequence distances were significantly less than blood-derived sequence distances. Relative to the tip of the loop, the set of brain-derived viral sequences had a tendency towards negative or neutral charge compared with the set of blood-derived viral sequences. Entropy calculations were used as a measure of the variability at each position in alignments of blood and brain viral sequences. A relatively conserved set of positions was found, with a significantly lower entropy in the brain- than in the blood-derived viral sequences. These sites constitute a brain "signature pattern," or a noncontiguous set of amino acids in the V3 region conserved in viral sequences derived from brain tissue. This brain-derived signature pattern was also well preserved among isolates previously characterized in vitro as macrophage tropic. Macrophage-monocyte tropism may be the biological constraint that results in the conservation of the viral brain signature pattern. PMID:7933130
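The abstract above uses entropy as a per-position measure of variability in an alignment. A minimal sketch of that idea follows, assuming Shannon entropy in bits computed independently for each alignment column; the abstract does not state the exact formula or how gaps were treated, so both are assumptions here, and the sequences are invented for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the residues observed in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def alignment_entropy(sequences):
    """Per-position entropy for a list of aligned, equal-length sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be non-empty and aligned to equal length")
    # zip(*sequences) walks the alignment column by column.
    return [column_entropy(col) for col in zip(*sequences)]


# Hypothetical toy alignments: a conserved set and a more variable set.
conserved_set = ["GPGR", "GPGR", "GPGR", "GPGK"]
variable_set = ["GPGR", "GLGK", "APGQ", "GPGS"]
print(alignment_entropy(conserved_set))
print(alignment_entropy(variable_set))
```

A column with a single residue scores 0 bits, and the score rises as residues become more evenly mixed, so positions with lower entropy in one sequence set than in another are candidates for a conserved signature.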
The punctilious RNA polymerase II core promoter

PubMed Central

Vo ngoc, Long; Wang, Yuan-Liang; Kassavetis, George A.; Kadonaga, James T.

2017-01-01

The signals that direct the initiation of transcription ultimately converge at the core promoter, which is the gateway to transcription. Here we provide an overview of the RNA polymerase II core promoter in bilateria (bilaterally symmetric animals). The core promoter is diverse in terms of its composition and function yet is also punctilious, as it acts with strict rules and precision. We additionally describe an expanded view of the core promoter that comprises the classical DNA sequence motifs, sequence-specific DNA-binding transcription factors, chromatin signals, and DNA structure. This model may eventually lead to a more unified conceptual understanding of the core promoter. PMID:28808065
Utility of 17 chloroplast genes for inferring the phylogeny of the basal angiosperms.

PubMed

Graham, S W; Olmstead, R G

2000-11-01

Sequences from 14 slowly evolving chloroplast genes (including three highly conserved introns) were obtained for representative basal angiosperm and seed-plant taxa, using novel primers described here. These data were combined with published sequences from atpB, rbcL, and newly obtained sequences from ndhF. Combined data from these 17 genes permit sturdy, well-resolved inference of major aspects of basal angiosperm relationships, demonstrating that the new primers are valuable tools for sorting out the deepest events in flowering plant phylogeny. Sequences from the inverted repeat (IR) proved to be particularly reliable (low homoplasy, high retention index). Representatives of Cabomba and Illicium were the first two successive branches of the angiosperms in an initial sampling of 19 exemplar taxa. This result was strongly supported by bootstrap analysis and by two small insertion/deletion events in the slowly evolving introns. Several paleoherb groups (representatives of Piperales) formed a strongly supported clade with taxa representing core woody magnoliids (Laurales, Magnoliales, and Winteraceae). The monophyly of the sampled eudicots and monocots was also well supported. Analyses of three major partitions of the data showed many of the same clades and supported the rooting seen with all the data combined. While Amborella trichopoda was supported as the sister group of the remaining angiosperms when we added Amborella and Nymphaea odorata to the analysis, a strongly conflicting rooting was observed when Amborella alone was added.
Regional centromeres in the yeast Candida lusitaniae lack pericentromeric heterochromatin

PubMed Central

Kapoor, Shivali; Zhu, Lisha; Froyd, Cara; Liu, Tao; Rusche, Laura N.

2015-01-01

Point centromeres are specified by a short consensus sequence that seeds kinetochore formation, whereas regional centromeres lack a conserved sequence and instead are epigenetically inherited. Regional centromeres are generally flanked by heterochromatin that ensures high levels of cohesin and promotes faithful chromosome segregation. However, it is not known whether regional centromeres require pericentromeric heterochromatin. In the yeast Candida lusitaniae, we identified a distinct type of regional centromere that lacks pericentromeric heterochromatin. Centromere locations were determined by ChIP-sequencing of two key centromere proteins, Cse4 and Mif2, and are consistent with bioinformatic predictions. The centromeric DNA sequence was unique for each chromosome and spanned 4–4.5 kbp, consistent with regional epigenetically inherited centromeres. However, unlike other regional centromeres, there was no evidence of pericentromeric heterochromatin in C. lusitaniae. In particular, flanking genes were expressed at a similar level to the rest of the genome, and a URA3 reporter inserted adjacent to a centromere was not repressed. In addition, regions flanking the centromeric core were not associated with hypoacetylated histones or a sirtuin deacetylase that generates heterochromatin in other yeast. Interestingly, the centromeric chromatin had a distinct pattern of histone modifications, being enriched for methylated H3K79 and H3R2 but lacking methylation of H3K4, which is found at other regional centromeres. Thus, not all regional centromeres require flanking heterochromatin. PMID:26371315
Structural, evolutionary and genetic analysis of the histidine biosynthetic "core" in the genus Burkholderia.

PubMed

Papaleo, Maria Cristiana; Russo, Edda; Fondi, Marco; Emiliani, Giovanni; Frandi, Antonio; Brilli, Matteo; Pastorelli, Roberta; Fani, Renato

2009-12-01

In this work a detailed analysis of the structure, the expression and the organization of his genes belonging to the core of histidine biosynthesis (hisBHAF) in 40 newly determined and 13 available sequences of Burkholderia strains was carried out. Data obtained revealed a strong conservation of the structure and organization of these genes through the entire genus. The phylogenetic analysis showed the monophyletic origin of this gene cluster and indicated that it did not undergo horizontal gene transfer events. The analysis of the intergenic regions, based on the substitution rate, entropy plot and bendability, suggested the existence of a putative transcription promoter upstream of hisB, which was supported by the genetic analysis showing that this cluster was able to complement Escherichia coli hisA, hisB, and hisF mutations. Moreover, a preliminary transcriptional analysis and the analysis of microarray data revealed that the expression of the his core was constitutive. These findings are in agreement with the fact that the entire Burkholderia his operon is heterogeneous, in that it contains "alien" genes apparently not involved in histidine biosynthesis. They also support the idea that the proteobacterial his operon was assembled piecewise, i.e. through accretion of smaller units containing only some of the genes (eventually together with their own promoters) involved in this biosynthetic route. The correlation existing between the structure, organization and regulation of his "core" genes and the function(s) they perform in cellular metabolism is discussed.
Unexpected DNA affinity and sequence selectivity through core rigidity in guanidinium-based minor groove binders.

PubMed

Nagle, Padraic S; McKeever, Caitriona; Rodriguez, Fernando; Nguyen, Binh; Wilson, W David; Rozas, Isabel

2014-09-25

In this paper we report the design and biophysical evaluation of novel rigid-core symmetric and asymmetric dicationic DNA binders containing 9H-fluorene and 9,10-dihydroanthracene cores as well as the synthesis of one of these fluorene derivatives. First, the affinity toward particular DNA sequences of these compounds and flexible core derivatives was evaluated by means of surface plasmon resonance and thermal denaturation experiments, finding that the position of the cations significantly influences the binding strength. Then their affinity and mode of binding were further studied by performing circular dichroism and UV studies and the results obtained were rationalized by means of DFT calculations. We found that the fluorene derivatives prepared have the ability to bind to the minor groove of certain DNA sequences and intercalate to others, whereas the dihydroanthracene compounds bind via intercalation to all the DNA sequences studied here.
The kinetoplast DNA of the Australian trypanosome, Trypanosoma copemani, shares features with Trypanosoma cruzi and Trypanosoma lewisi.

PubMed

Botero, Adriana; Kapeller, Irit; Cooper, Crystal; Clode, Peta L; Shlomai, Joseph; Thompson, R C Andrew

2018-05-17

Kinetoplast DNA (kDNA) is the mitochondrial genome of trypanosomatids. It consists of a few dozen maxicircles and several thousand minicircles, all catenated topologically to form a two-dimensional DNA network. Minicircles are heterogeneous in size and sequence among species. They present one or several conserved regions that contain three highly conserved sequence blocks. CSB-1 (10 bp sequence) and CSB-2 (8 bp sequence) present lower interspecies homology, while CSB-3 (12 bp sequence) or the Universal Minicircle Sequence is conserved within most trypanosomatids. The Universal Minicircle Sequence is located at the replication origin of the minicircles, and is the binding site for the UMS binding protein, a protein involved in trypanosomatid survival and virulence. Here, we describe the structure and organisation of the kDNA of Trypanosoma copemani, a parasite that has been shown to infect mammalian cells and has been associated with the drastic decline of the endangered Australian marsupial, the woylie (Bettongia penicillata). Deep genomic sequencing showed that T. copemani presents two classes of minicircles that share sequence identity and organisation in the conserved sequence blocks with those of Trypanosoma cruzi and Trypanosoma lewisi. A 19,257 bp partial region of the maxicircle of T. copemani that contained the entire coding region was obtained. Comparative analysis of the T. copemani entire maxicircle coding region with the coding regions of T. cruzi and T. lewisi showed they share 71.05% and 71.28% identity, respectively. The shared features in the maxicircle/minicircle organisation and sequence between T. copemani and T. cruzi/T. lewisi suggest similarities in their process of kDNA replication, and are of significance in understanding the evolution of Australian trypanosomes. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
Conserved Sequence Preferences Contribute to Substrate Recognition by the Proteasome*

PubMed Central

Yu, Houqing; Singh Gautam, Amit K.; Wilmington, Shameika R.; Wylie, Dennis; Martinez-Fonts, Kirby; Kago, Grace; Warburton, Marie; Chavali, Sreenivas; Inobe, Tomonao; Finkelstein, Ilya J.; Babu, M. Madan

2016-01-01

The proteasome has pronounced preferences for the amino acid sequence of its substrates at the site where it initiates degradation. Here, we report that modulating these sequences can tune the steady-state abundance of proteins over 2 orders of magnitude in cells. This is the same dynamic range as seen for inducing ubiquitination through a classic N-end rule degron. The stability and abundance of His3 constructs dictated by the initiation site affect survival of yeast cells and show that variation in proteasomal initiation can affect fitness. The proteasome's sequence preferences are linked directly to the affinity of the initiation sites to their receptor on the proteasome and are conserved between Saccharomyces cerevisiae, Schizosaccharomyces pombe, and human cells. These findings establish that the sequence composition of unstructured initiation sites influences protein abundance in vivo in an evolutionarily conserved manner and can affect phenotype and fitness. PMID:27226608
Transcription of fgf8 is regulated by activating and repressive cis-elements at the midbrain-hindbrain boundary in zebrafish embryos.

PubMed

Inoue, Fumitaka; Parvin, Mst Shahnaj; Yamasu, Kyo

2008-04-15

Fgf8 is expressed in the isthmic region of the developing brain, serving an organizing function in vertebrate embryos. We previously identified S4.2 downstream to the zebrafish fgf8 gene as a regulatory region that drives transcription in the anterior hindbrain. Here, we investigated the mechanism of fgf8 regulation by the S4.2 region during development. Reporter analyses in embryos revealed that S4.2 closely recapitulates fgf8 expression in the anteriormost hindbrain during somitogenesis. This region contains a sequence highly conserved in fgf8 of diverse vertebrates. Further analyses of S4.2 revealed a 342-bp core region composed of three subregions (#2, #3, and #4). Regions #3 and #4 drove expression broadly in the brain from the midbrain to r5 of the hindbrain, whereas a 28-bp sequence in #2 repressed ectopic expression in the midbrain and in r2 to r5. The enhancer function of S4.2 was absent in pax2a mutant embryos, while it was activated ectopically by pax2a misexpression in the hindbrain. We identified two sites in the core region that are bound by Pax2a in vitro and in vivo, the disruption of which abrogated the S4.2 activity. Thus, fgf8 expression in the anteriormost hindbrain involves activation and repression, with Pax2a as a pivotal regulator.
Nosiheptide Biosynthesis Featuring a Unique Indole Side Ring Formation on the Characteristic Thiopeptide Framework

PubMed Central

Yu, Yi; Duan, Lian; Zhang, Qi; Liao, Rijing; Ding, Ying; Pan, Haixue; Wendt-Pienkowski, Evelyn; Tang, Gongli; Shen, Ben; Liu, Wen

2009-01-01

Nosiheptide (NOS), belonging to the e series of thiopeptide antibiotics that exhibit potent activity against various bacterial pathogens, bears a unique indole side ring system and regiospecific hydroxyl groups on the characteristic macrocyclic core. Here, cloning, sequencing and characterization of the nos gene cluster from Streptomyces actuosus ATCC 25421 as a model for this series of thiopeptides has unveiled new insights into their biosynthesis. Bioinformatics-based sequence analysis and in vivo investigation into the gene functions show that NOS biosynthesis shares a common strategy with recently characterized b or c series thiopeptides for forming the characteristic macrocyclic core, which features a ribosomally synthesized precursor peptide with conserved posttranslational modifications. However, it apparently proceeds via a different route for tailoring the thiopeptide framework, allowing the final product to exhibit the distinct structural characteristics of e series thiopeptides, such as the indole side ring system. Chemical complementation supports the notion that the S-adenosylmethionine (AdoMet)-dependent protein NosL may play a central role in converting Trp to the key 3-methylindole moiety by an unusual carbon side chain rearrangement, most likely via a radical-initiated mechanism. Characterization of the indole side ring-opened analog of NOS from the nosN mutant strain is consistent with the proposed methyltransferase activity of its encoded protein, shedding light into the timing of the individual steps for indole side ring biosynthesis. These results also suggest the feasibility of engineering novel thiopeptides for drug discovery by manipulating the NOS biosynthetic machinery. PMID:19678698
Domain organization, genomic structure, evolution, and regulation of expression of the aggrecan gene family.

PubMed

Schwartz, N B; Pirok, E W; Mensch, J R; Domowicz, M S

1999-01-01

Proteoglycans are complex macromolecules, consisting of a polypeptide backbone to which are covalently attached one or more glycosaminoglycan chains. Molecular cloning has allowed identification of the genes encoding the core proteins of various proteoglycans, leading to a better understanding of the diversity of proteoglycan structure and function, as well as to the evolution of a classification of proteoglycans on the basis of emerging gene families that encode the different core proteins. One such family includes several proteoglycans that have been grouped with aggrecan, the large aggregating chondroitin sulfate proteoglycan of cartilage, based on a high number of sequence similarities within the N- and C-terminal domains. Thus far these proteoglycans include versican, neurocan, and brevican. It is now apparent that these proteins, as a group, are truly a gene family with shared structural motifs on the protein and nucleotide (mRNA) levels, and with nearly identical genomic organizations. Clearly a common ancestral origin is indicated for the members of the aggrecan family of proteoglycans. However, differing patterns of amplification and divergence have also occurred within certain exons across species and family members, leading to the class-characteristic protein motifs in the central carbohydrate-rich region exclusively. Thus the overall domain organization strongly suggests that sequence conservation in the terminal globular domains underlies common functions, whereas differences in the central portions of the genes account for functional specialization among the members of this gene family.
Genome-wide discovery of novel and conserved microRNAs in white shrimp (Litopenaeus vannamei).

PubMed

Xi, Qian-Yun; Xiong, Yuan-Yan; Wang, Yuan-Mei; Cheng, Xiao; Qi, Qi-En; Shu, Gang; Wang, Song-Bo; Wang, Li-Na; Gao, Ping; Zhu, Xiao-Tong; Jiang, Qing-Yan; Zhang, Yong-Liang; Liu, Li

2015-01-01

In recent years, a large number of conserved and species-specific microRNAs (miRNAs) have been identified in species that are economically important but lack a full genome sequence. In this study, Solexa deep sequencing and cross-species miRNA microarray were used to detect miRNAs in white shrimp. We identified 239 conserved miRNAs, 14 miRNA* sequences and 20 novel miRNAs by bioinformatics analysis from 7,561,406 high-quality reads representing 325,370 distinct sequences. All 20 novel miRNAs were species-specific in white shrimp and had no homologs in other species. Using the conserved miRNAs from the miRBase database as a query set to search for homologs from shrimp expressed sequence tags (ESTs), 32 conserved computationally predicted miRNAs were discovered in shrimp. In addition, using microarray analysis in shrimp fed with Panax ginseng polysaccharide complex, 151 conserved miRNAs were identified, 18 of which were significantly up-regulated, while 49 were significantly down-regulated. qRT-PCR analysis was also performed for nine miRNAs in three shrimp tissues: muscle, gill and hepatopancreas. Results showed that expression of these miRNAs is tissue specific. Combining results of the three methods, we detected 20 novel and 394 conserved miRNAs. Verification with quantitative reverse transcription PCR (qRT-PCR) and Northern blot showed high reliability of the data. The study provides the first comprehensive specific miRNA profile of white shrimp, which includes useful information for future investigations into the function of miRNAs in regulation of shrimp development and immunology.
Sequence and conformational preferences at termini of α-helices in membrane proteins: role of the helix environment.

PubMed

Shelar, Ashish; Bansal, Manju

2014-12-01

α-Helices are amongst the most common secondary structural elements seen in membrane proteins and are packed in the form of helix bundles. These α-helices encounter varying external environments (hydrophobic, hydrophilic) that may influence the sequence preferences at their N and C-termini. The role of the external environment in stabilization of the helix termini in membrane proteins is still unknown. Here we analyze α-helices in a high-resolution dataset of integral α-helical membrane proteins and establish that their sequence and conformational preferences differ from those in globular proteins. We specifically examine these preferences at the N and C-termini in helices initiating/terminating inside the membrane core as well as in linkers connecting these transmembrane helices. We find that the sequence preferences and structural motifs at capping (Ncap and Ccap) and near-helical (N' and C') positions are influenced by a combination of features including the membrane environment and the innate helix initiation and termination property of residues forming structural motifs. We also find that a large number of helix termini which do not form any particular capping motif are stabilized by formation of hydrogen bonds and hydrophobic interactions contributed from the neighboring helices in the membrane protein. We further validate the sequence preferences obtained from our analysis with data from an ultradeep sequencing study that identifies evolutionarily conserved amino acids in the rat neurotensin receptor. The results from our analysis provide insights for the secondary structure prediction, modeling and design of membrane proteins. © 2014 Wiley Periodicals, Inc.

Increasing genomic diversity and evidence of constrained lifestyle evolution due to insertion sequences in Aeromonas salmonicida.

PubMed

Vincent, Antony T; Trudel, Mélanie V; Freschi, Luca; Nagar, Vandan; Gagné-Thivierge, Cynthia; Levesque, Roger C; Charette, Steve J

2016-01-12

Aeromonads make up a group of Gram-negative bacteria that includes human and fish pathogens. The Aeromonas salmonicida species has the peculiarity of including five known subspecies. However, few studies of the genomes of A. salmonicida subspecies have been reported to date. We sequenced the genomes of additional A. salmonicida isolates, including three from India, using next-generation sequencing in order to gain a better understanding of the genomic and phylogenetic links between A. salmonicida subspecies. Their relative phylogenetic positions were confirmed by a core genome phylogeny based on 1645 gene sequences. The Indian isolates, which formed a sub-group together with A. salmonicida subsp. pectinolytica, were able to grow at both 18 °C and 37 °C, unlike the A. salmonicida psychrophilic isolates that did not grow at 37 °C. Amino acid frequencies, GC content, tRNA composition, loss and gain of genes during evolution, pseudogenes as well as genes under positive selection and the mobilome were studied to explain this intraspecies dichotomy. Insertion sequences appeared to be an important driving force that locked the psychrophilic strains into their particular lifestyle in order to conserve their genomic integrity. This observation, based on comparative genomics, is in agreement with previous results showing that insertion sequence mobility induced by heat in A. salmonicida subspecies causes genomic plasticity, resulting in a deleterious effect on the virulence of the bacterium. We provide a proof-of-concept that selfish DNAs play a major role in the evolution of bacterial species by modeling genomes.
The Complete Genome of a New Betabaculovirus from Clostera anastomosis

PubMed Central

Yin, Feifei; Zhu, Zheng; Liu, Xiaoping; Hou, Dianhai; Wang, Jun; Zhang, Lei; Wang, Manli; Kou, Zheng; Wang, Hualin; Deng, Fei; Hu, Zhihong

2015-01-01

Clostera anastomosis (Lepidoptera: Notodontidae) is a defoliating forest insect pest. Clostera anastomosis granulovirus-B (ClasGV-B), belonging to the genus Betabaculovirus of the family Baculoviridae, has been used for biological control of the pest. Here we report the full genome sequence of ClasGV-B and compare it to other previously sequenced baculoviruses. The circular double-stranded DNA genome is 107,439 bp in length, with a G+C content of 37.8%, and contains 123 open reading frames (ORFs) representing 93% of the genome. ClasGV-B contains 37 baculovirus core genes, 25 lepidopteran baculovirus specific genes, 19 betabaculovirus specific genes, 39 other genes with homologues in baculoviruses and 3 ORFs unique to ClasGV-B. Hrs appear to be absent from the ClasGV-B genome; however, two non-hr repeats were found. A phylogenetic tree based on 37 core genes from 73 baculovirus genomes placed ClasGV-B in clade b of the betabaculoviruses, most closely related to Erinnyis ello GV (ErelGV). The gene arrangement of ClasGV-B also shared the strongest collinearity with ErelGV but differed from Clostera anachoreta GV (ClanGV), Clostera anastomosis GV-A (ClasGV-A, previously also called CaLGV) and Epinotia aporema GV (EpapGV) by a 20 kb inversion. The ClasGV-B genome contains three copies of the polyhedron envelope protein gene (pep), and a phylogenetic tree divides the PEPs of betabaculoviruses into three major clades: PEP-1, PEP-2 and PEP/P10. ClasGV-B also contains three homologues of P10, which all harbor an N-terminal coiled-coil domain and a C-terminal basic sequence. ClasGV-B encodes three fibroblast growth factor (FGF) homologues, which are conserved in all sequenced betabaculoviruses. Phylogenetic analysis placed these three FGFs into different groups and suggested that the FGFs evolved at an early stage of the betabaculovirus expansion. ClasGV-B differs from the previously reported ClasGV-A and ClanGV isolated from Notodontidae in sequence and gene arrangement, indicating that the virus is a new notodontid betabaculovirus. PMID:26168260
Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution

PubMed Central

2012-01-01

Background: Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results: Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species; however, higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions: The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis. PMID:23020678
A Novel Protective Vaccine Antigen from the Core Escherichia coli Genome

PubMed Central

Moriel, Danilo G.; Tan, Lendl; Goh, Kelvin G. K.; Ipe, Deepak S.; Lo, Alvin W.; Peters, Kate M.

2016-01-01

ABSTRACT Escherichia coli is a versatile pathogen capable of causing intestinal and extraintestinal infections that result in a huge burden of global human disease. The diversity of E. coli is reflected by its multiple different pathotypes and mosaic genome composition. E. coli strains are also a major driver of antibiotic resistance, emphasizing the urgent need for new treatment and prevention measures. Here, we used a large data set comprising 1,700 draft and complete genomes to define the core and accessory genome of E. coli and demonstrated the overlapping relationship between strains from different pathotypes. In combination with proteomic investigation, this analysis revealed core genes that encode surface-exposed or secreted proteins that represent potential broad-coverage vaccine antigens. One of these antigens, YncE, was characterized as a conserved immunogenic antigen able to protect against acute systemic infection in mice after vaccination. Overall, this work provides a genomic blueprint for future analyses of conserved and accessory E. coli genes. The work also identified YncE as a novel antigen that could be exploited in the development of a vaccine against all pathogenic E. coli strains—an important direction given the high global incidence of infections caused by multidrug-resistant strains for which there are few effective antibiotics. IMPORTANCE E. coli is a multifaceted pathogen of major significance to global human health and an important contributor to increasing antibiotic resistance. Given the paucity of therapies still effective against multidrug-resistant pathogenic E. coli strains, novel treatment and prevention strategies are urgently required. In this study, we defined the core and accessory components of the E. coli genome by examining a large collection of draft and completely sequenced strains available from public databases. 
This data set was mined by employing a reverse-vaccinology approach in combination with proteomics to identify putative broadly protective vaccine antigens. One such antigen was identified that was highly immunogenic and induced protection in a mouse model of bacteremia. Overall, our study provides a genomic and proteomic framework for the selection of novel vaccine antigens that could mediate broad protection against pathogenic E. coli. PMID:27904885
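The core/accessory split described in this record can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes each genome has already been reduced to a set of gene-family identifiers, and the strain names, family names, and the `core_threshold` parameter are all hypothetical.

```python
from collections import Counter

def split_pangenome(genomes, core_threshold=1.0):
    """Split gene families into core and accessory sets.

    genomes: dict mapping strain name -> set of gene-family IDs.
    core_threshold: fraction of strains that must carry a family
    for it to count as core (1.0 means present in every strain).
    """
    if not genomes:
        raise ValueError("need at least one genome")
    # Count in how many strains each family occurs.
    counts = Counter(fam for fams in genomes.values() for fam in set(fams))
    n_strains = len(genomes)
    core = {fam for fam, c in counts.items() if c / n_strains >= core_threshold}
    accessory = set(counts) - core
    return core, accessory

# Toy, made-up presence/absence data for three hypothetical strains.
toy = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam2"},
}
core, accessory = split_pangenome(toy)
print(sorted(core), sorted(accessory))
```

With draft genomes, a threshold slightly below 1.0 is sometimes preferred so that a family missing from a few incomplete assemblies still counts as core; that choice is an assumption of this sketch, not something stated in the abstract.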
Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains

DOE PAGES

Lan, Yemin; Rosen, Gail; Hershberg, Ruth

2016-05-03

The 16S rRNA gene is so far the most widely used marker for taxonomical classification and separation of prokaryotes. Since it is universally conserved among prokaryotes, it is possible to use this gene to classify a broad range of prokaryotic organisms. At the same time, it has often been noted that the 16S rRNA gene is too conserved to separate between prokaryotes at finer taxonomic levels. In this paper, we examine how well levels of similarity of 16S rRNA and 73 additional universal or nearly universal marker genes correlate with genome-wide levels of gene sequence similarity. We demonstrate that the percent identity of 16S rRNA predicts genome-wide levels of similarity very well for distantly related prokaryotes, but not for closely related ones. In closely related prokaryotes, we find that there are many other marker genes for which levels of similarity are much more predictive of genome-wide levels of gene sequence similarity. Finally, we show that the identities of the markers that are most useful for predicting genome-wide levels of similarity within closely related prokaryotic lineages vary greatly between lineages. However, the most useful markers are always those that are least conserved in their sequences within each lineage. In conclusion, our results show that by choosing markers that are less conserved in their sequences within a lineage of interest, it is possible to better predict genome-wide gene sequence similarity between closely related prokaryotes than is possible using the 16S rRNA gene. We point readers towards a database we have created (POGO-DB) that can be used to easily establish which markers show lowest levels of sequence conservation within different prokaryotic lineages.
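The percent identity of a marker gene mentioned in this record can be sketched as follows. This is a minimal illustration, assuming the two sequences are already aligned to equal length; the function name and the choice to ignore gapped columns are assumptions, and other definitions of the denominator are in common use.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns in which both sequences carry a residue.
    columns = [
        (x, y)
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x != gap and y != gap
    ]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)

# Toy aligned fragments (made up for illustration).
print(percent_identity("ACGTAA", "ACGTTA"))
print(percent_identity("ACGT-A", "ACGTTA"))
```

A highly conserved marker gives values near 100 for almost any pair of close relatives, which is why it carries little information for separating them.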
Conservation of Three-Dimensional Helix-Loop-Helix Structure through the Vertebrate Lineage Reopens the Cold Case of Gonadotropin-Releasing Hormone-Associated Peptide

PubMed Central

Pérez Sirkin, Daniela I.; Lafont, Anne-Gaëlle; Kamech, Nédia; Somoza, Gustavo M.; Vissio, Paula G.; Dufour, Sylvie

2017-01-01

GnRH-associated peptide (GAP) is the C-terminal portion of the gonadotropin-releasing hormone (GnRH) preprohormone. Although it was reported in mammals that GAP may act as a prolactin-inhibiting factor and can be co-secreted with GnRH into the hypophyseal portal blood, GAP has been practically out of the research circuit for about 20 years. Comparative studies highlighted the low conservation of GAP primary amino acid sequences among vertebrates, contributing to consider that this peptide only participates in the folding or carrying process of GnRH. Considering that the three-dimensional (3D) structure of a protein may define its function, the aim of this study was to evaluate if GAP sequences and 3D structures are conserved in the vertebrate lineage. GAP sequences from various vertebrates were retrieved from databases. Analysis of primary amino acid sequence identity and similarity, molecular phylogeny, and prediction of 3D structures were performed. Amino acid sequence comparison and phylogeny analyses confirmed the large variation of GAP sequences throughout vertebrate radiation. In contrast, prediction of the 3D structure revealed a striking conservation of the 3D structure of GAP1 (GAP associated with the hypophysiotropic type 1 GnRH), despite low amino acid sequence conservation. This GAP1 peptide presented a typical helix-loop-helix (HLH) structure in all the vertebrate species analyzed. This HLH structure could also be predicted for GAP2 in some but not all vertebrate species and in none of the GAP3 analyzed. These results allowed us to infer that selective pressures have maintained GAP1 HLH structure throughout the vertebrate lineage. The conservation of the HLH motif, known to confer biological activity to various proteins, suggests that GAP1 peptides may exert some hypophysiotropic biological functions across vertebrate radiation. PMID:28878737
Evidence for the Concerted Evolution between Short Linear Protein Motifs and Their Flanking Regions

PubMed Central

Chica, Claudia; Diella, Francesca; Gibson, Toby J.

2009-01-01

Background Linear motifs are short modules of protein sequences that play a crucial role in mediating and regulating many protein–protein interactions. The function of linear motifs strongly depends on the context, e.g. functional instances mainly occur inside flexible regions that are accessible for interaction. Sometimes linear motifs appear as isolated islands of conservation in multiple sequence alignments. However, they also occur in larger blocks of sequence conservation, suggesting an active role for the neighbouring amino acids. Results The evolution of regions flanking 116 functional linear motif instances was studied. The conservation of the amino acid sequence and order/disorder tendency of those regions was related to presence/absence of the instance. For the majority of the analysed instances, the pairs of sequences conserving the linear motif were also observed to maintain a similar local structural tendency and/or to have higher local sequence conservation when compared to pairs of sequences where one is missing the linear motif. Furthermore, those instances have a higher chance to co–evolve with the neighbouring residues in comparison to the distant ones. Those findings are supported by examples where the regulation of the linear motif–mediated interaction has been shown to depend on the modifications (e.g. phosphorylation) at neighbouring positions or is thought to benefit from the binding versatility of disordered regions. Conclusion The results suggest that flanking regions are relevant for linear motif–mediated interactions, both at the structural and sequence level. More interestingly, they indicate that the prediction of linear motif instances can be enriched with contextual information by performing a sequence analysis similar to the one presented here. This can facilitate the understanding of the role of these predicted instances in determining the protein function inside the broader context of the cellular network where they arise. 
PMID:19584925
CoSMoS: Conserved Sequence Motif Search in the proteome

PubMed Central

Liu, Xiao I; Korde, Neeraj; Jakob, Ursula; Leichert, Lars I

2006-01-01

Background With the ever-increasing number of gene sequences in the public databases, generating and analyzing multiple sequence alignments becomes increasingly time consuming. Nevertheless it is a task performed on a regular basis by researchers in many labs. Results We have now created a database called CoSMoS to find the occurrences and at the same time evaluate the significance of sequence motifs and amino acids encoded in the whole genome of the model organism Escherichia coli K12. We provide a precomputed set of multiple sequence alignments for each individual E. coli protein with all of its homologues in the RefSeq database. The alignments themselves, information about the occurrence of sequence motifs together with information on the conservation of each of the more than 1.3 million amino acids encoded in the E. coli genome can be accessed via the web interface of CoSMoS. Conclusion CoSMoS is a valuable tool to identify highly conserved sequence motifs, to find regions suitable for mutational studies in functional analyses and to predict important structural features in E. coli proteins. PMID:16433915
Identification of a Conserved Non-Protein-Coding Genomic Element that Plays an Essential Role in Alphabaculovirus Pathogenesis

PubMed Central

Kikhno, Irina

2014-01-01

Highly homologous sequences 154–157 bp in length grouped under the name of “conserved non-protein-coding element” (CNE) were revealed in all of the sequenced genomes of baculoviruses belonging to the genus Alphabaculovirus. A CNE alignment led to the detection of a set of highly conserved nucleotide clusters that occupy strictly conserved positions in the CNE sequence. The significant length of the CNE and conservation of both its length and cluster architecture were identified as a combination of characteristics that make this CNE different from known viral non-coding functional sequences. The essential role of the CNE in the Alphabaculovirus life cycle was demonstrated through the use of a CNE-knockout Autographa californica multiple nucleopolyhedrovirus (AcMNPV) bacmid. It was shown that the essential function of the CNE was not mediated by the presumed expression activities of the protein- and non-protein-coding genes that overlap the AcMNPV CNE. On the basis of the presented data, the AcMNPV CNE was categorized as a complex-structured, polyfunctional genomic element involved in an essential DNA transaction that is associated with an undefined function of the baculovirus genome. PMID:24740153
Portrayal of sustainability principles in the mission statements and on home pages of the world's largest organizations.

PubMed

Garnett, Stephen T; Lawes, Michael J; James, Robyn; Bigland, Kristen; Zander, Kerstin K

2016-04-01

Conservation can be achieved only if sustainability is embraced as core to organizational cultures. To test the extent to which the related concepts of sustainability, conservation, response to climate change, poverty alleviation, and gender equity have been incorporated into organizational culture, we compared mission statements published from 1990 to 2000 with those published in 2014 for 150 organizations, including conservation nongovernmental organizations (NGOs), aid NGOs, government development agencies, resource extraction companies, and retailers (30 in each category). We also analyzed the 2014 home web pages of each organization. Relative to the earlier period, the frequency with which mission statements mentioned poverty alleviation, biodiversity conservation, and a range of sustainable practices increased only slightly by 2014, particularly among resource extractors and retail companies. Few organizations in any sector had embedded either climate change or gender equity into their mission statements. In addition, the proportional intensity with which any of the aspirations were expressed did not change between periods. For current home pages, conservation NGOs, resource extractors, and government agencies were significantly more likely to acknowledge the importance of matters that were not part of their core business, but few aid agencies or retail companies promoted goals beyond alleviation of crises and profit maximization, respectively. Overall, there has been some progress in recognizing poverty alleviation, biodiversity conservation, and sustainable practices, but gender equity and a determination to reduce impacts on climate change are still rarely promoted as central institutional concerns. Sustainability in general, and biodiversity conservation in particular, will not be achieved unless their importance is more widely apparent in core communication products of organizations. © 2015 Society for Conservation Biology.
Genome-Wide Association Study of a Validated Case Definition of Gulf War Illness in a Population-Representative Sample

DTIC Science & Technology

2013-09-01

sequence dataset. All procedures were performed by personnel in the IIMT UT Southwestern Genomics and Microarray Core using standard protocols. More... sequencing run, samples were demultiplexed using standard algorithms in the Genomics and Microarray Core and processed into individual sample Illumina single... Sequencing (RNA-Seq), using Illumina’s multiplexing mRNA-Seq to generate full sequence libraries from the poly-A tailed RNA to a read depth of 30
Essentiality, conservation, evolutionary pressure and codon bias in bacterial genomes.

PubMed

Dilucca, Maddalena; Cimini, Giulio; Giansanti, Andrea

2018-07-15

Essential genes constitute the core of genes which cannot be mutated too much nor lost along the evolutionary history of a species. Natural selection is expected to be stricter on essential genes and on conserved (highly shared) genes, than on genes that are either nonessential or peculiar to a single or a few species. In order to further assess this expectation, we study here how essentiality of a gene is connected with its degree of conservation among several unrelated bacterial species, each one characterised by its own codon usage bias. Confirming previous results on E. coli, we show the existence of a universal exponential relation between gene essentiality and conservation in bacteria. Moreover, we show that, within each bacterial genome, there are at least two groups of functionally distinct genes, characterised by different levels of conservation and codon bias: i) a core of essential genes, mainly related to cellular information processing; ii) a set of less conserved nonessential genes with prevalent functions related to metabolism. In particular, the genes in the first group are more retained among species, are subject to a stronger purifying conservative selection and display a more limited repertoire of synonymous codons. The core of essential genes is close to the minimal bacterial genome, which is in the focus of recent studies in synthetic biology, though we confirm that orthologs of genes that are essential in one species are not necessarily essential in other species. We also list a set of highly shared genes which, reasonably, could constitute a reservoir of targets for new anti-microbial drugs. Copyright © 2018 Elsevier B.V. All rights reserved.
Comparative Sequence and X-Inactivation Analyses of a Domain of Escape in Human Xp11.2 and the Conserved Segment in Mouse

PubMed Central

Tsuchiya, Karen D.; Greally, John M.; Yi, Yajun; Noel, Kevin P.; Truong, Jean-Pierre; Disteche, Christine M.

2004-01-01

We have performed X-inactivation and sequence analyses on 350 kb of sequence from human Xp11.2, a region shown previously to contain a cluster of genes that escape X inactivation, and we compared this region with the region of conserved synteny in mouse. We identified several new transcripts from this region in human and in mouse, which defined the full extent of the domain escaping X inactivation in both species. In human, escape from X inactivation involves an uninterrupted 235-kb domain of multiple genes. Despite highly conserved gene content and order between the two species, Smcx is the only mouse gene from the conserved segment that escapes inactivation. As repetitive sequences are believed to facilitate spreading of X inactivation along the chromosome, we compared the repetitive sequence composition of this region between the two species. We found that long terminal repeats (LTRs) were decreased in the human domain of escape, but not in the majority of the conserved mouse region adjacent to Smcx in which genes were subject to X inactivation, suggesting that these repeats might be excluded from escape domains to prevent spreading of silencing. Our findings indicate that genomic context, as well as gene-specific regulatory elements, interact to determine expression of a gene from the inactive X-chromosome. PMID:15197169
Effects of a Non-Conservative Sequence on the Properties of β-glucuronidase from Aspergillus terreus Li-20

PubMed Central

Liu, Yanli; Huangfu, Jie; Qi, Feng; Kaleem, Imdad; E, Wenwen; Li, Chun

2012-01-01

We cloned the β-glucuronidase gene (AtGUS) from Aspergillus terreus Li-20 encoding 657 amino acids (aa), which can transform glycyrrhizin into glycyrrhetinic acid monoglucuronide (GAMG) and glycyrrhetinic acid (GA). Based on sequence alignment, the C-terminal non-conservative sequence showed low identity with those of other species; thus, the partial sequence AtGUS(-3t) (1–592 aa) was amplified to determine the effects of the non-conservative sequence on the enzymatic properties. AtGUS and AtGUS(-3t) were expressed in E. coli BL21, producing AtGUS-E and AtGUS(-3t)-E, respectively. At the similar optimum temperature (55°C) and pH (AtGUS-E, 6.6; AtGUS(-3t)-E, 7.0) conditions, the thermal stability of AtGUS(-3t)-E was enhanced at 65°C, and the metal ions Co2+, Ca2+ and Ni2+ showed opposite effects on AtGUS-E and AtGUS(-3t)-E, respectively. Furthermore, Km of AtGUS(-3t)-E (1.95 mM) was just nearly one-seventh that of AtGUS-E (12.9 mM), whereas the catalytic efficiency of AtGUS(-3t)-E was 3.2 fold higher than that of AtGUS-E (7.16 vs. 2.24 mM s−1), revealing that the truncation of non-conservative sequence can significantly improve the catalytic efficiency of AtGUS. Conformational analysis illustrated significant difference in the secondary structure between AtGUS-E and AtGUS(-3t)-E by circular dichroism (CD). The results showed that the truncation of the non-conservative sequence could preferably alter and influence the stability and catalytic efficiency of enzyme. PMID:22347419
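The ratios quoted in this abstract can be rechecked from the reported values. The following is a small arithmetic sketch; the variable names are illustrative, and the numbers and units are taken as given above.

```python
# Values as reported in the abstract.
km_full, km_trunc = 12.9, 1.95      # Km of AtGUS-E and AtGUS(-3t)-E, in mM
eff_full, eff_trunc = 2.24, 7.16    # catalytic efficiency, units as reported

km_ratio = km_trunc / km_full       # truncated Km relative to full-length Km
eff_fold = eff_trunc / eff_full     # fold change in catalytic efficiency

print(f"Km ratio: {km_ratio:.3f} (about 1/{1 / km_ratio:.1f})")
print(f"Catalytic efficiency fold change: {eff_fold:.1f}")
```

The Km ratio comes out a little above one-seventh, consistent with the abstract's "nearly one-seventh", and the efficiency ratio rounds to the stated 3.2-fold.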
Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura

DOE Office of Scientific and Technical Information (OSTI.GOV)

Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.

2004-08-06

Background The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. Results We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Conclusions Measuring conservation of sequence features closely linked to function - such as binding-site clustering - makes better use of comparative sequence data than commonly used methods that examine only sequence identity.
Genome analysis of smooth tubercle bacilli provides insights into ancestry and pathoadaptation of the etiologic agent of tuberculosis

PubMed Central

Supply, Philip; Marceau, Michael; Mangenot, Sophie; Roche, David; Rouanet, Carine; Khanna, Varun; Majlessi, Laleh; Criscuolo, Alexis; Tap, Julien; Pawlik, Alexandre; Fiette, Laurence; Orgeur, Mickael; Fabre, Michel; Parmentier, Cécile; Frigui, Wafa; Simeone, Roxane; Boritsch, Eva C.; Debrie, Anne-Sophie; Willery, Eve; Walker, Danielle; Quail, Michael A.; Ma, Laurence; Bouchier, Christiane; Salvignol, Grégory; Sayes, Fadel; Cascioferro, Alessandro; Seemann, Torsten; Barbe, Valérie; Locht, Camille; Gutierrez, Maria-Cristina; Leclerc, Claude; Bentley, Stephen; Stinear, Timothy P.; Brisse, Sylvain; Médigue, Claudine; Parkhill, Julian; Cruveiller, Stéphane; Brosch, Roland

2013-01-01

Global spread and genetic monomorphism are hallmarks of Mycobacterium tuberculosis, the agent of human tuberculosis. In contrast, Mycobacterium canettii, and related tubercle bacilli that also cause human tuberculosis and exhibit unusual smooth colony morphology, are restricted to East-Africa. Here, we sequenced and analyzed the genomes of five representative strains of smooth tubercle bacilli (STB) using Sanger (4-5x coverage), 454/Roche (13-18x coverage) and/or Illumina DNA sequencing (45-105x coverage). We show that STB are highly recombinogenic and evolutionary early-branching, with larger genome sizes, 25-fold more SNPs, fewer molecular scars and distinct CRISPR-Cas systems relative to M. tuberculosis. Despite the differences, all tuberculosis-causing mycobacteria share a highly conserved core genome. Mouse-infection experiments revealed that STB are less persistent and virulent than M. tuberculosis. We conclude that M. tuberculosis emerged from an ancestral, STB-like pool of mycobacteria by gain of persistence and virulence mechanisms and we provide genome-wide insights into the molecular events involved. PMID:23291586
Delineating slowly and rapidly evolving fractions of the Drosophila genome.

PubMed

Keith, Jonathan M; Adams, Peter; Stephen, Stuart; Mattick, John S

2008-05-01

Evolutionary conservation is an important indicator of function and a major component of bioinformatic methods to identify non-protein-coding genes. We present a new Bayesian method for segmenting pairwise alignments of eukaryotic genomes while simultaneously classifying segments into slowly and rapidly evolving fractions. We also describe an information criterion similar to the Akaike Information Criterion (AIC) for determining the number of classes. Working with pairwise alignments enables detection of differences in conservation patterns among closely related species. We analyzed three whole-genome and three partial-genome pairwise alignments among eight Drosophila species. Three distinct classes of conservation level were detected. Sequences comprising the most slowly evolving component were consistent across a range of species pairs, and constituted approximately 62-66% of the D. melanogaster genome. Almost all (>90%) of the aligned protein-coding sequence is in this fraction, suggesting much of it (comprising the majority of the Drosophila genome, including approximately 56% of non-protein-coding sequences) is functional. The size and content of the most rapidly evolving component was species dependent, and varied from 1.6% to 4.8%. This fraction is also enriched for protein-coding sequence (while containing significant amounts of non-protein-coding sequence), suggesting it is under positive selection. We also classified segments according to conservation and GC content simultaneously. This analysis identified numerous sub-classes of those identified on the basis of conservation alone, but was nevertheless consistent with that classification. Software, data, and results available at www.maths.qut.edu.au/-keithj/. Genomic segments comprising the conservation classes available in BED format.
CORE-SINEs: Eukaryotic short interspersed retroposing elements with common sequence motifs

PubMed Central

Gilbert, Nicolas; Labuda, Damian

1999-01-01

A 65-bp “core” sequence is dispersed in hundreds of thousands copies in the human genome. This sequence was found to constitute the central segment of a group of short interspersed elements (SINEs), referred to as mammalian-wide interspersed repeats, that proliferated before the radiation of placental mammals. Here, we propose that the core identifies an ancient tRNA-like SINE element, which survived in different lineages such as mammals, reptiles, birds, and fish, as well as mollusks, presumably for >550 million years. This element gave rise to a number of sequence families (CORE-SINEs), including mammalian-wide interspersed repeats, whose distinct 3′ ends are shared with different families of long interspersed elements (LINEs). The evolutionary success of the generic CORE-SINE element can be related to the recruitment of the internal promoter from highly transcribed host RNA as well as to its capacity to adapt to changing retropositional opportunities by sequence exchange with actively amplifying LINEs. It reinforces the notion that the very existence of SINEs depends on the cohabitation with both LINEs and the host genome. PMID:10077603

COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures

DOE PAGES

Hawkes, Emily J.; Hennelly, Scott P.; Novikova, Irina V.; ...

2016-09-20

There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. Here, we investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probing and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.
Conservation of hot regions in protein-protein interaction in evolution.

PubMed

Hu, Jing; Li, Jiarui; Chen, Nansheng; Zhang, Xiaolong

2016-11-01

The hot regions of protein-protein interactions are the active areas formed by the residues most important to the protein binding process. As research on protein interactions has advanced, many hot regions can now be predicted efficiently by intelligent computing methods, but verifying every prediction with biological experiments is rarely feasible because of the time cost and complexity of the experiments. Building on research into the conservation of hot spot residues, this study proposes a method to verify the authenticity of hot regions predicted by a machine learning algorithm that combines protein biological features with sequence conservation. The method uses multiple sequence alignment, a module substitute matrix, and sequence similarity to build a conservation scoring algorithm, and then applies a threshold module to verify the conservation tendency of hot regions in evolution. This work provides an effective method for verifying predicted hot regions in protein-protein interactions, and a useful way to investigate the functional activities of protein hot regions in more depth. Copyright © 2016. Published by Elsevier Inc.
Principles of regulatory information conservation between mouse and human

DOE PAGES

Cheng, Yong; Ma, Zhihai; Kim, Bong-Hyun; ...

2014-11-19

To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human–mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Lastly, single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.
THE GRK4 SUBFAMILY OF G PROTEIN-COUPLED RECEPTOR KINASES: ALTERNATIVE SPLICING, GENE ORGANIZATION, AND SEQUENCE CONSERVATION

EPA Science Inventory

The GRK4 subfamily of G protein-coupled receptor kinases. Alternative splicing, gene organization, and sequence conservation.

Premont RT, Macrae AD, Aparicio SA, Kendall HE, Welch JE, Lefkowitz RJ.

Department of Medicine, Howard Hughes Medical Institute, Duke Univer...
18 CFR 401.37 - Sequence of approval.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 18 Conservation of Power and Water Resources 2 2011-04-01 2011-04-01 false Sequence of approval. 401.37 Section 401.37 Conservation of Power and Water Resources DELAWARE RIVER BASIN COMMISSION ADMINISTRATIVE MANUAL RULES OF PRACTICE AND PROCEDURE Project Review Under Section 3.8 of the Compact § 401.37...
Cytotoxic T lymphocytes and CD4 epitope mutations in the pre-core/core region of hepatitis B virus in chronic hepatitis B carriers in Northeast Iran.

PubMed

Zhand, Sareh; Tabarraei, Alijan; Nazari, Amineh; Moradi, Abdolvahab

2017-07-01

Hepatitis B virus (HBV) is prone to many different mutations. Those within epitopes recognized by sensitized T cells may influence the re-emergence of the virus. This study was designed to investigate mutations in the immune epitope regions of the HBV pre-core/core among chronic HBV patients of Golestan province, Northeast Iran. In 120 chronic HBV carriers, HBV DNA was extracted from blood plasma samples and PCR was done using specific primers. Direct sequencing and alignment of the pre-core/core region were performed using a reference sequence from the GenBank database (Accession Number AB033559). The study showed 27 inferred amino acid substitutions, 9 of which (33.3%) were in CD4 epitopes and 2 (7.4%) in cytotoxic T lymphocyte (CTL) epitopes; the 16 other mutations (59.2%) were observed in other regions. CTL escape mutations were not commonly observed in pre-core/core sequences of chronic HBV carriers in the locale of study. It can be concluded that most of the inferred amino acid substitutions occur in immune epitopes other than CTL and CD4 epitopes.
Protein architecture and core residues in unwound α-helices provide insights to the transport function of plant AtCHX17

DOE Office of Scientific and Technical Information (OSTI.GOV)

Czerny, Daniel D.; Padmanaban, Senthilkumar; Anishkin, Andriy

Using Arabidopsis thaliana AtCHX17 as an example, we combine structural modeling and mutagenesis to provide insights on its protein architecture and transport function which is poorly characterized. This approach is based on the observation that protein structures are significantly more conserved in evolution than linear sequences, and mechanistic similarities among diverse transporters are emerging. Two homology models of AtCHX17 were obtained that show a protein fold similar to known structures of bacterial Na+/H+ antiporters, EcNhaA and TtNapA. The distinct secondary and tertiary structure models highlighted residues at positions potentially important for CHX17 activity. Mutagenesis showed that asparagine-N200 and aspartate-D201 inside transmembrane5 (TM5), and lysine-K355 inside TM10 are critical for AtCHX17 activity. We reveal previously unrecognized threonine-T170 and lysine-K383 as key residues at unwound regions in the middle of TM4 and TM11 α-helices, respectively. Mutation of glutamate-E111 located near the membrane surface inhibited AtCHX17 activity, suggesting a role in pH sensing. The long carboxylic tail of unknown purpose has an alternating β-sheet and α-helix secondary structure that is conserved in prokaryote universal stress proteins. Here, these results support the overall architecture of AtCHX17 and identify D201, N200 and novel residues T170 and K383 at the functional core which likely participates in ion recognition, coordination and/or translocation, similar to characterized cation/H+ exchangers. The core of AtCHX17 models according to EcNhaA and TtNapA templates faces inward and outward, respectively, which may reflect two conformational states of the alternating access transport mode for proteins belonging to the plant CHX family.
Protein architecture and core residues in unwound α-helices provide insights to the transport function of plant AtCHX17

DOE PAGES

Czerny, Daniel D.; Padmanaban, Senthilkumar; Anishkin, Andriy; ...

2016-05-11

Using Arabidopsis thaliana AtCHX17 as an example, we combine structural modeling and mutagenesis to provide insights on its protein architecture and transport function which is poorly characterized. This approach is based on the observation that protein structures are significantly more conserved in evolution than linear sequences, and mechanistic similarities among diverse transporters are emerging. Two homology models of AtCHX17 were obtained that show a protein fold similar to known structures of bacterial Na+/H+ antiporters, EcNhaA and TtNapA. The distinct secondary and tertiary structure models highlighted residues at positions potentially important for CHX17 activity. Mutagenesis showed that asparagine-N200 and aspartate-D201 inside transmembrane5 (TM5), and lysine-K355 inside TM10 are critical for AtCHX17 activity. We reveal previously unrecognized threonine-T170 and lysine-K383 as key residues at unwound regions in the middle of TM4 and TM11 α-helices, respectively. Mutation of glutamate-E111 located near the membrane surface inhibited AtCHX17 activity, suggesting a role in pH sensing. The long carboxylic tail of unknown purpose has an alternating β-sheet and α-helix secondary structure that is conserved in prokaryote universal stress proteins. Here, these results support the overall architecture of AtCHX17 and identify D201, N200 and novel residues T170 and K383 at the functional core which likely participates in ion recognition, coordination and/or translocation, similar to characterized cation/H+ exchangers. The core of AtCHX17 models according to EcNhaA and TtNapA templates faces inward and outward, respectively, which may reflect two conformational states of the alternating access transport mode for proteins belonging to the plant CHX family.
Rickettsia Phylogenomics: Unwinding the Intricacies of Obligate Intracellular Life

PubMed Central

Gillespie, Joseph J.; Williams, Kelly; Shukla, Maulik; Snyder, Eric E.; Nordberg, Eric K.; Ceraul, Shane M.; Dharmanolla, Chitti; Rainey, Daphne; Soneja, Jeetendra; Shallom, Joshua M.; Vishnubhat, Nataraj Dongre; Wattam, Rebecca; Purkayastha, Anjan; Czar, Michael; Crasta, Oswald; Setubal, Joao C.; Azad, Abdu F.; Sobral, Bruno S.

2008-01-01

Background Completed genome sequences are rapidly increasing for Rickettsia, obligate intracellular α-proteobacteria responsible for various human diseases, including epidemic typhus and Rocky Mountain spotted fever. In light of phylogeny, the establishment of orthologous groups (OGs) of open reading frames (ORFs) will distinguish the core rickettsial genes and other group specific genes (class 1 OGs or C1OGs) from those distributed indiscriminately throughout the rickettsial tree (class 2 OG or C2OGs). Methodology/Principal Findings We present 1823 representative (no gene duplications) and 259 non-representative (at least one gene duplication) rickettsial OGs. While the highly reductive (∼1.2 MB) Rickettsia genomes range in predicted ORFs from 872 to 1512, a core of 752 OGs was identified, depicting the essential Rickettsia genes. Unsurprisingly, this core lacks many metabolic genes, reflecting the dependence on host resources for growth and survival. Additionally, we bolster our recent reclassification of Rickettsia by identifying OGs that define the AG (ancestral group), TG (typhus group), TRG (transitional group), and SFG (spotted fever group) rickettsiae. OGs for insect-associated species, tick-associated species and species that harbor plasmids were also predicted. Through superimposition of all OGs over robust phylogeny estimation, we discern between C1OGs and C2OGs, the latter depicting genes either decaying from the conserved C1OGs or acquired laterally. Finally, scrutiny of non-representative OGs revealed high levels of split genes versus gene duplications, with both phenomena confounding gene orthology assignment. Interestingly, non-representative OGs, as well as OGs comprised of several gene families typically involved in microbial pathogenicity and/or the acquisition of virulence factors, fall predominantly within C2OG distributions. 
Conclusion/Significance Collectively, we determined the relative conservation and distribution of 14354 predicted ORFs from 10 rickettsial genomes across robust phylogeny estimation. The data, available at PATRIC (PathoSystems Resource Integration Center), provide novel information for unwinding the intricacies associated with Rickettsia pathogenesis, expanding the range of potential diagnostic, vaccine and therapeutic targets. PMID:19194535
[Analysis of cis-regulatory element distribution in gene promoters of Gossypium raimondii and Arabidopsis thaliana].

PubMed

Sun, Gao-Fei; He, Shou-Pu; Du, Xiong-Ming

2013-10-01

Cotton genomic studies have boomed since the release of the Gossypium raimondii draft genome. In this study, cis-regulatory elements (CREs) in the 1 kb sequences upstream of the 5' UTR of annotated genes were selected and scanned in the Arabidopsis thaliana (At) and Gossypium raimondii (Gr) genomes, based on the PLACE (Plant cis-acting Regulatory DNA Elements) database. According to the definition used in this study, 44 (12.3%) and 57 (15.5%) CREs presented a "peak-like" distribution in the 1 kb selected sequences of the two genomes, respectively. Thirty-four of them were peak-like distributed in both genomes and could be further categorized into 4 types based on their core sequences. The coincidence of the TATABOX peak position with its actual position (-30 bp) indicated that the position of a common CRE was conserved across different genes, which suggested that the peak positions of these CREs may correspond to the actual positions at which transcription factors bind. The position of a common CRE also differed between the two genomes because of stronger length variation of the 5' UTR in Gr than in At. Furthermore, most of the peak-like CREs were located in the region from -110 bp to 0 bp, which suggested that a concentrated distribution might be conducive to the interaction of transcription factors and thereby regulate the expression of downstream genes.
The Ditylenchus destructor genome provides new insights into the evolution of plant parasitic nematodes.

PubMed

Zheng, Jinshui; Peng, Donghai; Chen, Ling; Liu, Hualin; Chen, Feng; Xu, Mengci; Ju, Shouyong; Ruan, Lifang; Sun, Ming

2016-07-27

Plant-parasitic nematodes are found in 4 of the 12 clades of the phylum Nematoda. These nematodes in different clades may have originated independently from their free-living fungivorous ancestors. However, the exact evolutionary process of these parasites is unclear. Here, we sequenced the genome of a migratory plant nematode, Ditylenchus destructor. We performed comparative genomics among the free-living nematode Caenorhabditis elegans and all the plant nematodes with genome sequences available. We found that, compared with C. elegans, the core developmental control processes underwent heavy reduction, though most signal transduction pathways were conserved. We also found that D. destructor contained more homologs of the key genes in the above processes than the other plant nematodes. We suggest that Ditylenchus spp. may represent an intermediate evolutionary stage between free-living nematodes that feed on fungi and obligate plant-parasitic nematodes. Based on the facts that D. destructor can feed on fungi and has a relatively short life cycle, and that it has features similar to both C. elegans and sedentary plant-parasitic nematodes from clade 12, we propose it as a new model to study the biology and biocontrol of plant nematodes and the interaction between nematodes and plants. © 2016 The Author(s).
A surprisingly large RNase P RNA in Candida glabrata

PubMed Central

KACHOURI, RYM; STRIBINSKIS, VILIUS; ZHU, YANGLONG; RAMOS, KENNETH S.; WESTHOF, ERIC; LI, YONG

2005-01-01

We have found an extremely large ribonuclease P (RNase P) RNA (RPR1) in the human pathogen Candida glabrata and verified that this molecule is expressed and present in the active enzyme complex of this hemiascomycete yeast. A structural alignment of the C. glabrata sequence with 36 other hemiascomycete RNase P RNAs (abbreviated as P RNAs) allows us to characterize the types of insertions. In addition, 15 P RNA sequences were newly characterized by searching in the recently sequenced genomes Candida albicans, C. glabrata, Debaryomyces hansenii, Eremothecium gossypii, Kluyveromyces lactis, Kluyveromyces waltii, Naumovia castellii, Saccharomyces kudriavzevii, Saccharomyces mikatae, and Yarrowia lipolytica; and by PCR amplification for other Candida species (Candida guilliermondii, Candida krusei, Candida parapsilosis, Candida stellatoidea, and Candida tropicalis). The phylogenetic comparative analysis identifies a hemiascomycete secondary structure consensus that presents a conserved core in all species with variable insertions or deletions. The most significant variability is found in C. glabrata P RNA in which three insertions exceeding in total 700 nt are present in the Specificity domain. This P RNA is more than twice the length of any other homologous P RNAs known in the three domains of life and is eight times the size of the smallest. RNase P RNA, therefore, represents one of the most diversified noncoding RNAs in terms of size variation and structural diversity. PMID:15987816
Aliphatic peptides show similar self-assembly to amyloid core sequences, challenging the importance of aromatic interactions in amyloidosis.

PubMed

Lakshmanan, Anupama; Cheong, Daniel W; Accardo, Angelo; Di Fabrizio, Enzo; Riekel, Christian; Hauser, Charlotte A E

2013-01-08

The self-assembly of abnormally folded proteins into amyloid fibrils is a hallmark of many debilitating diseases, from Alzheimer's and Parkinson diseases to prion-related disorders and diabetes type II. However, the fundamental mechanism of amyloid aggregation remains poorly understood. Core sequences of four to seven amino acids within natural amyloid proteins that form toxic fibrils have been used to study amyloidogenesis. We recently reported a class of systematically designed ultrasmall peptides that self-assemble in water into cross-β-type fibers. Here we compare the self-assembly of these peptides with natural core sequences. These include core segments from Alzheimer's amyloid-β, human amylin, and calcitonin. We analyzed the self-assembly process using circular dichroism, electron microscopy, X-ray diffraction, rheology, and molecular dynamics simulations. We found that the designed aliphatic peptides exhibited a similar self-assembly mechanism to several natural sequences, with formation of α-helical intermediates being a common feature. Interestingly, the self-assembly of a second core sequence from amyloid-β, containing the diphenylalanine motif, was distinctly different from all other examined sequences. The diphenylalanine-containing sequence formed β-sheet aggregates without going through the α-helical intermediate step, giving a unique fiber-diffraction pattern and simulation structure. Based on these results, we propose a simplified aliphatic model system to study amyloidosis. Our results provide vital insight into the nature of early intermediates formed and suggest that aromatic interactions are not as important in amyloid formation as previously postulated. This information is necessary for developing therapeutic drugs that inhibit and control amyloid formation.
Functions of the 3′ and 5′ genome RNA regions of members of the genus Flavivirus

PubMed Central

Brinton, Margo A.; Basu, Mausumi

2015-01-01

The positive sense genomes of members of the genus Flavivirus in the family Flaviviridae are ~11 kb in length and have a 5′ type I cap but no 3′ poly A. The 5′ and 3′ terminal regions contain short conserved sequences that are proposed to be repeated remnants of an ancient sequence. However, the functions of most of these conserved sequences have not yet been determined. The terminal regions of the genome also contain multiple conserved RNA structures. Functional data for many of these structures have been obtained. Three sets of complementary 3′ and 5′ terminal region sequences, some of which are located in conserved RNA structures, interact to form a panhandle structure that is required for initiation of minus strand RNA synthesis with the 5′ terminal structure functioning as the promoter. How the switch from the terminal RNA structure base pairing to the long distance RNA-RNA interaction is triggered and regulated is not well understood, but evidence suggests involvement of a cell protein binding to three sites on the 3′ terminal RNA structures and a cis-acting metastable 3′ RNA element in the 3′ terminal structure. Cell proteins may also be involved in facilitating exponential replication of nascent genomic RNA within replication vesicles at later times in the infection cycle. Other conserved RNA structures and/or sequences in the 5′ and 3′ terminal regions have been proposed to regulate genome translation. Additional functions of the 5′ and 3′ terminal sequences have also been reported. PMID:25683510
Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

PubMed Central

Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

2012-01-01

Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full-advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086
Phylogenetic analysis of members of the Phycodnaviridae virus family, using amplified fragments of the major capsid protein gene.

PubMed

Larsen, J B; Larsen, A; Bratbak, G; Sandaa, R-A

2008-05-01

Algal viruses are considered ecologically important by affecting host population dynamics and nutrient flow in aquatic food webs. Members of the family Phycodnaviridae are also interesting due to their extraordinary genome size. Few algal viruses in the Phycodnaviridae family have been sequenced, and those that have been have few genes in common and low gene homology. It has hence been difficult to design general PCR primers that allow further studies of their ecology and diversity. In this study, we screened the nine type I core genes of the nucleocytoplasmic large DNA viruses for sequences suitable for designing a general set of primers. Sequence comparison between members of the Phycodnaviridae family, including three partly sequenced viruses infecting the prymnesiophyte Pyramimonas orientalis and the haptophytes Phaeocystis pouchetii and Chrysochromulina ericina (Pyramimonas orientalis virus 01B [PoV-01B], Phaeocystis pouchetii virus 01 [PpV-01], and Chrysochromulina ericina virus 01B [CeV-01B], respectively), revealed eight conserved regions in the major capsid protein (MCP). Two of these regions also showed conservation at the nucleotide level, and this allowed us to design degenerate PCR primers. The primers produced 347- to 518-bp amplicons when applied to lysates from algal viruses kept in culture and from natural viral communities. The aim of this work was to use the MCP as a proxy to infer phylogenetic relationships and genetic diversity among members of the Phycodnaviridae family and to determine the occurrence and diversity of this gene in natural viral communities. The results support the current legitimate genera in the Phycodnaviridae based on alga host species. 
However, while placing the mimivirus in close proximity to the type species, PBCV-1, of Phycodnaviridae along with the three new viruses assigned to the family (PoV-01B, PpV-01, and CeV-01B), the results also indicate that the coccolithoviruses and phaeoviruses are more diverged from this group. Phylogenetic analysis of amplicons from virus assemblages from Norwegian coastal waters as well as from isolated algal viruses revealed a cluster of viruses infecting members of the prymnesiophyte and prasinophyte alga divisions. Other distinct clusters were also identified, containing amplicons from this study as well as sequences retrieved from the Sargasso Sea metagenome. This shows that closely related sequences of this family are present at geographically distant locations within the marine environment.
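The step of turning nucleotide-level conservation into degenerate PCR primers can be sketched as follows. This is only an illustration of the general technique, not the primer design used in the study: the aligned sites are invented toy data, and the helper names `degenerate_consensus` and `degeneracy` are hypothetical. The only fixed ingredient is the standard IUPAC nucleotide ambiguity code.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases they cover.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}
# Reverse lookup: ambiguity code -> number of bases it stands for.
CODE_SIZE = {code: len(bases) for bases, code in IUPAC.items()}

def degenerate_consensus(aligned):
    """Collapse equal-length, gap-free aligned DNA sequences into one
    degenerate sequence, one IUPAC code per column."""
    if not aligned:
        raise ValueError("need at least one sequence")
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must have equal length")
    if any(base not in "ACGT" for seq in aligned for base in seq):
        raise ValueError("only uppercase A, C, G, T are supported")
    return "".join(IUPAC[frozenset(column)] for column in zip(*aligned))

def degeneracy(primer):
    """Number of distinct plain sequences a degenerate primer represents."""
    total = 1
    for code in primer:
        total *= CODE_SIZE[code]
    return total

# Toy data: the same hypothetical primer site in three viral sequences.
toy_sites = ["GGTGAAGC", "GGCGAGGC", "GGTGATGC"]
primer = degenerate_consensus(toy_sites)
print(primer, degeneracy(primer))
```

Regions that are conserved at the nucleotide level yield a consensus with few ambiguity codes and therefore low degeneracy, which is generally what makes them attractive as primer sites.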
Comparison of ZP3 protein sequences among vertebrate species: to obtain a consensus sequence for immunocontraception.

PubMed

Zhu, X; Naz, R K

1999-03-01

The deduced ZP3 amino acid (aa) sequences of 13 vertebrate species namely mouse, hamster, rabbit, pig, porcine, cow, dog, cat, human, bonnet, marmoset, carp, and frog were compared using the PILEUP and PRETTY alignment programs (GCG, Wisconsin, USA). The published aa sequences obtained from 13 vertebrate species indicated overall evolutionary conservation in the N-terminus, central region, and C-terminus of the ZP3 polypeptide. More variations of ZP3 polypeptide sequences were seen in the alignments of carp and frog from the 11 mammalian species making the leader sequence more prominent. The canonical furin proteolytic processing signal at the C-terminus was found in all the ZP3 polypeptide sequences except those of carp and frog. In the central region, the ZP3 deduced aa sequences of all the 13 vertebrate species aligned well, and six relatively conserved sequences were found. There are 11 conserved cysteine residues in the central region across all species including carp and frog, indicating that these residues have a longer evolutionary history. The ZP3 aa sequence similarities were examined using the GAP program (GCG). The highest aa similarities are observed between the members of the same order within the class Mammalia, and also (95.4%) between pig (ungulata) and rabbit (lagomorpha). The deduced ZP3 aa sequences per se may not be enough to build a phylogenetic tree.
A conserved RNA structural element within the hepatitis B virus post-transcriptional regulatory element enhance nuclear export of intronless transcripts and repress the splicing mechanism.

PubMed

Visootsat, Akasit; Payungporn, Sunchai; T-Thienprasert, Nattanan P

2015-12-01

Hepatitis B virus (HBV) infection is a primary cause of hepatocellular carcinoma and liver cirrhosis worldwide. To develop novel antiviral drugs, a better understanding of HBV gene expression regulation is vital. One important aspect is to understand how HBV hijacks the cellular machinery to export unspliced RNA from the nucleus. The HBV post-transcriptional regulatory element (HBV PRE) has been proposed to be the HBV RNA nuclear export element. However, the function remains controversial, and the core element is unclear. This study, therefore, aimed to identify functional regulatory elements within the HBV PRE and investigate their functions. Using bioinformatics programs based on sequence conservation and conserved RNA secondary structures, three regulatory elements were predicted, namely PRE 1151-1410, PRE 1520-1620 and PRE 1650-1684. PRE 1151-1410 significantly increased intronless and unspliced luciferase activity in both HepG2 and COS-7 cells. Likewise, PRE 1151-1410 significantly elevated intronless and unspliced HBV surface transcripts in liver cancer cells. Moreover, motif analysis predicted that PRE 1151-1410 contains several regulatory motifs. This study reported the roles of PRE 1151-1410 in intronless transcript nuclear export and the splicing mechanism. Additionally, these results provide knowledge in the field of HBV RNA regulation. Moreover, PRE 1151-1410 may be used to enhance the expression of other mRNAs in intronless reporter plasmids.
Comparative Analysis of Wolbachia Genomes Reveals Streamlining and Divergence of Minimalist Two-Component Systems

PubMed Central

Christensen, Steen; Serbus, Laura Renee

2015-01-01

Two-component regulatory systems are commonly used by bacteria to coordinate intracellular responses with environmental cues. These systems are composed of functional protein pairs consisting of a sensor histidine kinase and cognate response regulator. In contrast to the well-studied Caulobacter crescentus system, which carries dozens of these pairs, the streamlined bacterial endosymbiont Wolbachia pipientis encodes only two pairs: CckA/CtrA and PleC/PleD. Here, we used bioinformatic tools to compare characterized two-component system relays from C. crescentus, the related Anaplasmataceae species Anaplasma phagocytophilum and Ehrlichia chaffeensis, and 12 sequenced Wolbachia strains. We found the core protein pairs and a subset of interacting partners to be highly conserved within Wolbachia and these other Anaplasmataceae. Genes involved in two-component signaling were positioned differently within the various Wolbachia genomes, whereas the local context of each gene was conserved. Unlike Anaplasma and Ehrlichia, Wolbachia two-component genes were more consistently found clustered with metabolic genes. The domain architecture and key functional residues standard for two-component system proteins were well-conserved in Wolbachia, although residues that specify cognate pairing diverged substantially from other Anaplasmataceae. These findings indicate that Wolbachia two-component signaling pairs share considerable functional overlap with other α-proteobacterial systems, whereas their divergence suggests the potential for regulatory differences and cross-talk. PMID:25809075

In silico modeling of the Moniliophthora perniciosa Atg8 protein.

PubMed

Pereira, A C F; Cardoso, T H S; Brendel, M; Pungartnik, C

2013-12-11

Autophagy is defined as an intracellular system of lysosomal degradation in eukaryotic cells, and the genes involved in this process are conserved from yeast to humans. Among these genes, ATG8 encodes a ubiquitin-like protein that is conjugated to a phosphatidylethanolamine (PE) membrane by the ubiquitination system. The Atg8p-PE complex is important in initiating the formation of the autophagosome and thus plays a critical role in autophagy. In silico modeling of Atg8p of Moniliophthora perniciosa revealed its three-dimensional structure and enabled comparison with its Saccharomyces cerevisiae homologue ScAtg8p. Some common and distinct features were observed between these two proteins, including the conservation of residues required to allow the interaction of α-helix1 with the ubiquitin core. However, the electrostatic potential surfaces of these helices differ, implying particular roles in selecting specific binding partners. The proposed structure was validated by the programs PROCHECK 3.4, ANOLEA, and QMEAN, which demonstrated 100% of amino acids located in favorable regions with low total energy. Our results showed that MpAtg8p contains the same functional domains (3 α-helices and 4 β-sheets) as yeast ScAtg8p and is similar to it in structure. Both proteins have many conserved sequences in common, and therefore, their proposed three-dimensional models show similar configurations.
Conservation of Planar Polarity Pathway Function Across the Animal Kingdom.

PubMed

Hale, Rosalind; Strutt, David

2015-01-01

Planar polarity is a well-studied phenomenon resulting in the directional coordination of cells in the plane of a tissue. In invertebrates and vertebrates, planar polarity is established and maintained by the largely independent core and Fat/Dachsous/Four-jointed (Ft-Ds-Fj) pathways. Loss of function of these pathways can result in a wide range of developmental or cellular defects, including failure of gastrulation and problems with placement and function of cilia. This review discusses the conservation of these pathways across the animal kingdom. The lack of vital core pathway components in basal metazoans suggests that the core planar polarity pathway evolved shortly after, but not necessarily alongside, the emergence of multicellularity.
Computing prokaryotic gene ubiquity: rescuing the core from extinction.

PubMed

Charlebois, Robert L; Doolittle, W Ford

2004-12-01

The genomic core concept has found several uses in comparative and evolutionary genomics. Defined as the set of all genes common to (ubiquitous among) all genomes in a phylogenetically coherent group, core size decreases as the number and phylogenetic diversity of the relevant group increases. Here, we focus on methods for defining the size and composition of the core of all genes shared by sequenced genomes of prokaryotes (Bacteria and Archaea). There are few (almost certainly less than 50) genes shared by all of the 147 genomes compared, surely insufficient to conduct all essential functions. Sequencing and annotation errors are responsible for the apparent absence of some genes, while very limited but genuine disappearances (from just one or a few genomes) can account for several others. Core size will continue to decrease as more genome sequences appear, unless the requirement for ubiquity is relaxed. Such relaxation seems consistent with any reasonable biological purpose for seeking a core, but it renders the problem of definition more problematic. We propose an alternative approach (the phylogenetically balanced core), which preserves some of the biological utility of the core concept. Cores, however delimited, preferentially contain informational rather than operational genes; we present a new hypothesis for why this might be so.
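The definition of a core as the genes ubiquitous among all genomes, and the effect of relaxing that ubiquity requirement, can be made concrete with a short sketch. It is not the phylogenetically balanced core proposed in the abstract: it assumes gene families have already been assigned, and the `core_genes` helper, the `min_fraction` parameter, and the presence/absence data are hypothetical.

```python
from collections import Counter

def core_genes(genomes, min_fraction=1.0):
    """Return the gene families present in at least `min_fraction` of the
    genomes. `genomes` maps a genome name to its set of gene families.
    min_fraction=1.0 gives the strict core (plain set intersection)."""
    if not genomes:
        return set()
    counts = Counter(gene for genes in genomes.values() for gene in set(genes))
    needed = min_fraction * len(genomes)
    return {gene for gene, n in counts.items() if n >= needed}

# Invented presence/absence data for four hypothetical genomes.
toy_genomes = {
    "genome_A": {"rpoB", "recA", "gyrA", "nifH"},
    "genome_B": {"rpoB", "recA", "gyrA"},
    "genome_C": {"rpoB", "recA"},
    "genome_D": {"rpoB", "gyrA"},
}

print(sorted(core_genes(toy_genomes)))                     # strict core
print(sorted(core_genes(toy_genomes, min_fraction=0.75)))  # relaxed core
```

In this toy data, a gene missing from a single genome, whether through genuine loss or an annotation error, drops out of the strict core but survives the relaxed one. That is the trade-off the abstract describes.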
Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

PubMed Central

Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael

2013-01-01

Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. PMID:23874343
The wheat cytochrome oxidase subunit II gene has an intron insert and three radical amino acid changes relative to maize

PubMed Central

Bonen, Linda; Boer, Poppo H.; Gray, Michael W.

1984-01-01

We have determined the sequence of the wheat mitochondrial gene for cytochrome oxidase subunit II (COII) and find that its derived protein sequence differs from that of maize at only three amino acid positions. Unexpectedly, all three replacements are non-conservative ones. The wheat COII gene has a highly-conserved intron at the same position as in maize, but the wheat intron is 1.5 times longer because of an insert relative to its maize counterpart. Hybridization analysis of mitochondrial DNA from rye, pea, broad bean and cucumber indicates strong sequence conservation of COII coding sequences among all these higher plants. However, only rye and maize mitochondrial DNA show homology with wheat COII intron sequences and rye alone with intron-insert sequences. We find that a sequence identical to the region of the 5' exon corresponding to the transmembrane domain of the COII protein is present at a second genomic location in wheat mitochondria. These variations in COII gene structure and size, as well as the presence of repeated COII sequences, illustrate at the DNA sequence level, factors which contribute to higher plant mitochondrial DNA diversity and complexity. PMID:16453565
Analyses of the Stability and Core Taxonomic Memberships of the Human Microbiome

PubMed Central

Li, Kelvin; Bihan, Monika; Methé, Barbara A.

2013-01-01

Analyses of the taxonomic diversity associated with the human microbiome continue to be an area of great importance. The study of the nature and extent of the commonly shared taxa (“core”), versus those less prevalent, establishes a baseline for comparing healthy and diseased groups by quantifying the variation among people, across body habitats and over time. The National Institutes of Health (NIH) sponsored Human Microbiome Project (HMP) has provided an unprecedented opportunity to examine and better define what constitutes the taxonomic core within and across body habitats and individuals through pyrosequencing-based profiling of 16S rRNA gene sequences from oral, skin, distal gut (stool), and vaginal body habitats from over 200 healthy individuals. A two-parameter model is introduced to quantitatively identify the core taxonomic members of each body habitat’s microbiota across the healthy cohort. Using only cutoffs for taxonomic ubiquity and abundance, core taxonomic members were identified for each of the 18 body habitats and also for the 4 higher-level body regions. Although many microbes were shared at low abundance, they exhibited a relatively continuous spread in both their abundance and ubiquity, as opposed to a more discretized separation. The numbers of core taxa members in the body regions are comparatively small and stable, reflecting the relatively high, but conserved, interpersonal variability within the cohort. Core sizes increased across the body regions in the order of: vagina, skin, stool, and oral cavity. A number of “minor” oral taxonomic core members were also identified by their majority presence across the cohort, but with relatively low and stable abundances. A method for quantifying the difference between two cohorts was introduced and applied to samples collected on a second visit, revealing that over time, the oral, skin, and stool body regions tended to be more transient in their taxonomic structure than the vaginal body region. PMID:23671663
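A core defined by two cutoffs, one for ubiquity and one for abundance, can be sketched as follows. The abstract does not give the model's exact definition, so this reading is an assumption: a taxon counts as present in a sample when its relative abundance meets `min_abundance`, and it is core when the fraction of such samples meets `min_ubiquity`. The function name, parameters, and sample values are all hypothetical.

```python
def core_taxa(samples, min_ubiquity, min_abundance):
    """Return taxa that reach `min_abundance` in at least `min_ubiquity` of samples.

    samples: list of dicts mapping taxon name -> relative abundance (0 to 1).
    A taxon missing from a sample is treated as having abundance 0.
    """
    if not samples:
        return set()
    taxa = {taxon for sample in samples for taxon in sample}
    core = set()
    for taxon in taxa:
        # Number of samples in which the taxon passes the abundance cutoff.
        hits = sum(1 for sample in samples if sample.get(taxon, 0.0) >= min_abundance)
        if hits / len(samples) >= min_ubiquity:
            core.add(taxon)
    return core


# Hypothetical relative-abundance profiles for four samples from one habitat.
samples = [
    {"Streptococcus": 0.40, "Veillonella": 0.10, "Prevotella": 0.50},
    {"Streptococcus": 0.70, "Veillonella": 0.02, "Rothia": 0.28},
    {"Streptococcus": 0.55, "Prevotella": 0.45},
    {"Streptococcus": 0.30, "Veillonella": 0.05, "Prevotella": 0.65},
]

major_core = core_taxa(samples, min_ubiquity=0.75, min_abundance=0.05)
minor_core = core_taxa(samples, min_ubiquity=0.75, min_abundance=0.01)
```

Lowering the abundance cutoff while holding ubiquity fixed admits taxa that are widespread but scarce, which is one way to picture the "minor" core members mentioned in the abstract.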
Governing Portable Conservation and Development Landscapes: Reconsidering Evidence in the Context of the Mbaracayú Biosphere Reserve

ERIC Educational Resources Information Center

Elgert, Laureen

2014-01-01

Conservation-with-development landscapes, such as UNESCO's Man and Biosphere Reserves, differentiate between areas of "nature" and "society". In Paraguay's Mbaracayú Biosphere Reserve, as elsewhere, this model has been used to support governance that focuses on conservation in the "core area" and sustainable…
Ecology and conservation of the Marbled Murrelet

Treesearch

C. John Ralph; George L. Hunt; Martin G. Raphael; John F. Piatt

1995-01-01

This report on the Marbled Murrelet (Brachyramphus marmoratus) was compiled and edited by the interagency Marbled Murrelet Conservation Assessment Core Team. The 37 chapters cover both original studies and literature reviews of many aspects of the species' biology, ecology, and conservation needs. It includes new information on the forest habitat...
Conservation of Transcription Start Sites within Genes across a Bacterial Genus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shao, Wenjun; Price, Morgan N.; Deutschbauer, Adam M.

Transcription start sites (TSSs) lying inside annotated genes, on the same or opposite strand, have been observed in diverse bacteria, but the function of these unexpected transcripts is unclear. Here, we use the metal-reducing bacterium Shewanella oneidensis MR-1 and its relatives to study the evolutionary conservation of unexpected TSSs. Using high-resolution tiling microarrays and 5'-end RNA sequencing, we identified 2,531 TSSs in S. oneidensis MR-1, of which 18% were located inside coding sequences (CDSs). Comparative transcriptome analysis with seven additional Shewanella species revealed that the majority (76%) of the TSSs within the upstream regions of annotated genes (gTSSs) were conserved. Thirty percent of the TSSs that were inside genes and on the sense strand (iTSSs) were also conserved. Sequence analysis around these iTSSs showed conserved promoter motifs, suggesting that many iTSS are under purifying selection. Furthermore, conserved iTSSs are enriched for regulatory motifs, suggesting that they are regulated, and they tend to eliminate polar effects, which confirms that they are functional. In contrast, the transcription of antisense TSSs located inside CDSs (aTSSs) was significantly less likely to be conserved (22%). However, aTSSs whose transcription was conserved often have conserved promoter motifs and drive the expression of nearby genes. Overall, our findings demonstrate that some internal TSSs are conserved and drive protein expression despite their unusual locations, but the majority are not conserved and may reflect noisy initiation of transcription rather than a biological function.
Domain architecture conservation in orthologs

PubMed Central

2011-01-01

Background: As orthologous proteins are expected to retain function more often than other homologs, they are often used for functional annotation transfer between species. However, ortholog identification methods do not take into account changes in domain architecture, which are likely to modify a protein's function. By domain architecture we refer to the sequential arrangement of domains along a protein sequence. To assess the level of domain architecture conservation among orthologs, we carried out a large-scale study of such events between human and 40 other species spanning the entire evolutionary range. We designed a score to measure domain architecture similarity and used it to analyze differences in domain architecture conservation between orthologs and paralogs relative to the conservation of primary sequence. We also statistically characterized the extents of different types of domain swapping events across pairs of orthologs and paralogs. Results: The analysis shows that orthologs exhibit greater domain architecture conservation than paralogous homologs, even when differences in average sequence divergence are compensated for, for homologs that have diverged beyond a certain threshold. We interpret this as an indication of a stronger selective pressure on orthologs than paralogs to retain the domain architecture required for the proteins to perform a specific function. In general, orthologs as well as the closest paralogous homologs have very similar domain architectures, even at large evolutionary separation. The most common domain architecture changes observed in both ortholog and paralog pairs involved insertion/deletion of new domains, while domain shuffling and segment duplication/deletion were very infrequent. Conclusions: On the whole, our results support the hypothesis that function conservation between orthologs demands higher domain architecture conservation than other types of homologs, relative to primary sequence conservation. This supports the notion that orthologs are functionally more similar than other types of homologs at the same evolutionary distance. PMID:21819573
The Harvest suite for rapid core-genome alignment and visualization of thousands of intraspecific microbial genomes.

PubMed

Treangen, Todd J; Ondov, Brian D; Koren, Sergey; Phillippy, Adam M

2014-01-01

Whole-genome sequences are now available for many microbial species and clades; however, existing whole-genome alignment methods are limited in their ability to perform sequence comparisons of multiple sequences simultaneously. Here we present the Harvest suite of core-genome alignment and visualization tools for the rapid and simultaneous analysis of thousands of intraspecific microbial strains. Harvest includes Parsnp, a fast core-genome multi-aligner, and Gingr, a dynamic visual platform. Together they provide interactive core-genome alignments, variant calls, recombination detection, and phylogenetic trees. Using simulated and real data we demonstrate that our approach exhibits unrivaled speed while maintaining the accuracy of existing methods. The Harvest suite is open-source and freely available from: http://github.com/marbl/harvest.
Predictive modelling of JT-60SA high-beta steady-state plasma with impurity accumulation

NASA Astrophysics Data System (ADS)

Hayashi, N.; Hoshino, K.; Honda, M.; Ide, S.

2018-06-01

The integrated modelling code TOPICS has been extended to include core impurity transport, and applied to predictive modelling of JT-60SA high-beta steady-state plasma with the accumulation of impurity seeded to reduce the divertor heat load. In the modelling, models and conditions are selected for a conservative prediction, which considers a lower bound of plasma performance with the maximum accumulation of impurity. The conservative prediction shows the compatibility of impurity seeding with core plasma with high-beta (β_N > 3.5) and full current drive conditions, i.e. when Ar seeding reduces the divertor heat load below 10 MW m⁻², its accumulation in the core is so moderate that the core plasma performance can be recovered by additional heating within the machine capability to compensate for Ar radiation. Due to the strong dependence of accumulation on the pedestal density gradient, high separatrix density is important for the low accumulation as well as the low divertor heat load. The conservative prediction also shows that JT-60SA has enough capability to explore the divertor heat load control by impurity seeding in high-beta steady-state plasmas.
Complete Genome Viral Phylogenies Suggests the Concerted Evolution of Regulatory Cores and Accessory Satellites

PubMed Central

Zanotto, Paolo Marinho de Andrade; Krakauer, David C.

2008-01-01

We consider the concerted evolution of viral genomes in four families of DNA viruses. Given the high rate of horizontal gene transfer among viruses and their hosts, it is an open question as to how representative particular genes are of the evolutionary history of the complete genome. To address the concerted evolution of viral genes, we compared genomic evolution across four distinct, extant viral families. For all four viral families we constructed DNA-dependent DNA polymerase-based (DdDp) phylogenies and in addition, whole genome sequence, as quantitative descriptions of inter-genome relationships. We found that the history of the polymerase gene was highly predictive of the history of the genome as a whole, which we explain in terms of repeated, co-divergence events of the core DdDp gene accompanied by a number of satellite, accessory genetic loci. We also found that the rate of gene gain in baculovirus and poxviruses proceeds significantly more quickly than the rate of gene loss and that there is convergent acquisition of satellite functions promoting contextual adaptation when distinct viral families infect related hosts. The congruence of the genome and polymerase trees suggests that a large set of viral genes, including polymerase, derive from a phylogenetically conserved core of genes of host origin, secondarily reinforced by gene acquisition from common hosts or co-infecting viruses within the host. A single viral genome can be thought of as a mutualistic network, with the core genes acting as an effective host and the satellite genes as effective symbionts. Larger virus genomes show a greater departure from linkage equilibrium between core and satellites functions. PMID:18941535
The first genetic map of the American cranberry: exploration of synteny conservation and quantitative trait loci.

PubMed

Georgi, Laura; Johnson-Cicalese, Jennifer; Honig, Josh; Das, Sushma Parankush; Rajah, Veeran D; Bhattacharya, Debashish; Bassil, Nahla; Rowland, Lisa J; Polashock, James; Vorsa, Nicholi

2013-03-01

The first genetic map of cranberry (Vaccinium macrocarpon) has been constructed, comprising 14 linkage groups totaling 879.9 cM with an estimated coverage of 82.2 %. This map, based on four mapping populations segregating for field fruit-rot resistance, contains 136 distinct loci. Mapped markers include blueberry-derived simple sequence repeat (SSR) and cranberry-derived sequence-characterized amplified region markers previously used for fingerprinting cranberry cultivars. In addition, SSR markers were developed near cranberry sequences resembling genes involved in flavonoid biosynthesis or defense against necrotrophic pathogens, or conserved orthologous set (COS) sequences. The cranberry SSRs were developed from next-generation cranberry genomic sequence assemblies; thus, the positions of these SSRs on the genomic map provide information about the genomic location of the sequence scaffold from which they were derived. The use of SSR markers near COS and other functional sequences, plus 33 SSR markers from blueberry, facilitates comparisons of this map with maps of other plant species. Regions of the cranberry map were identified that showed conservation of synteny with Vitis vinifera and Arabidopsis thaliana. Positioned on this map are quantitative trait loci (QTL) for field fruit-rot resistance (FFRR), fruit weight, titratable acidity, and sound fruit yield (SFY). The SFY QTL is adjacent to one of the fruit weight QTL and may reflect pleiotropy. Two of the FFRR QTL are in regions of conserved synteny with grape and span defense gene markers, and the third FFRR QTL spans a flavonoid biosynthetic gene.
[Comparative analysis of clustered regularly interspaced short palindromic repeats (CRISPRs) loci in the genomes of halophilic archaea].

PubMed

Zhang, Fan; Zhang, Bing; Xiang, Hua; Hu, Songnian

2009-11-01

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a widespread system that provides acquired resistance against phages in bacteria and archaea. Here we aim to carry out a genome-wide analysis of CRISPR in the extremely halophilic archaea whose whole genome sequences are currently available. We used bioinformatics methods including alignment, conservation analysis, GC content and RNA structure prediction to analyze the CRISPR structures of 7 haloarchaeal genomes. We identified the CRISPR structures in 5 halophilic archaea and revealed a conserved palindromic motif in the flanking regions of these CRISPR structures. In addition, we found that the repeat sequences of large CRISPR structures in halophilic archaea were highly conserved, and that the two types of predicted RNA secondary structures derived from the repeat sequences were likely determined by the fourth base of the repeat sequence. Our results support the proposal that the leader sequence may function as a recognition site by having palindromic structures in flanking regions, and that the stem-loop secondary structure formed by repeat sequences may function in mediating the interaction between foreign genetic elements and CAS-encoded proteins.
A comprehensive analysis of three Asiatic black bear mitochondrial genomes (subspecies ussuricus, formosanus and mupinensis), with emphasis on the complete mtDNA sequence of Ursus thibetanus ussuricus (Ursidae).

PubMed

Hwang, Dae-Sik; Ki, Jang-Seu; Jeong, Dong-Hyuk; Kim, Bo-Hyun; Lee, Bae-Keun; Han, Sang-Hoon; Lee, Jae-Seong

2008-08-01

In the present paper, we describe the mitochondrial genome sequence of the Asiatic black bear (Ursus thibetanus ussuricus), with particular emphasis on the control region (CR), and compare it with other bear mitochondrial genomes to assess molecular relationships among the bears. The mitochondrial genome sequence of U. thibetanus ussuricus was 16,700 bp in size with mostly conserved structures (e.g. 13 protein-coding genes, two rRNA genes, and 22 tRNA genes). The CR consisted of several typical conserved domains such as F, E, D, and C boxes, and a conserved sequence block. Nucleotide sequences and the repeated motifs in the CR differed among the bear species, and their copy numbers also varied among populations, even within F1 generations of U. thibetanus ussuricus. Comparative analyses showed that the CR D1 region was highly informative for discrimination within the bear family. These findings suggest that nucleotide sequences of both repeated motifs and CR D1 in the bear family are good markers for species discrimination.
Ana3 is a conserved protein required for the structural integrity of centrioles and basal bodies.

PubMed

Stevens, Naomi R; Dobbelaere, Jeroen; Wainman, Alan; Gergely, Fanni; Raff, Jordan W

2009-11-02

Recent studies have identified a conserved "core" of proteins that are required for centriole duplication. A small number of additional proteins have recently been identified as potential duplication factors, but it is unclear whether any of these proteins are components of the core duplication machinery. In this study, we investigate the function of one of these proteins, Drosophila melanogaster Ana3. We show that Ana3 is present in centrioles and basal bodies, but its behavior is distinct from that of the core duplication proteins. Most importantly, we find that Ana3 is required for the structural integrity of both centrioles and basal bodies and for centriole cohesion, but it is not essential for centriole duplication. We show that Ana3 has a mammalian homologue, Rotatin, that also localizes to centrioles and basal bodies and appears to be essential for cilia function. Thus, Ana3 defines a conserved family of centriolar proteins and plays an important part in ensuring the structural integrity of centrioles and basal bodies.
Conserved sequence-specific lincRNA-steroid receptor interactions drive transcriptional repression and direct cell fate

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hudson, William H.; Pickard, Mark R.; de Vera, Ian Mitchelle S.

2014-12-23

The majority of the eukaryotic genome is transcribed, generating a significant number of long intergenic noncoding RNAs (lincRNAs). Although lincRNAs represent the most poorly understood product of transcription, recent work has shown lincRNAs fulfill important cellular functions. In addition to low sequence conservation, poor understanding of structural mechanisms driving lincRNA biology hinders systematic prediction of their function. Here we report the molecular requirements for the recognition of steroid receptors (SRs) by the lincRNA growth arrest-specific 5 (Gas5), which regulates steroid-mediated transcriptional regulation, growth arrest and apoptosis. We identify the functional Gas5-SR interface and generate point mutations that ablate the SR-Gas5 lincRNA interaction, altering Gas5-driven apoptosis in cancer cell lines. Further, we find that the Gas5 SR-recognition sequence is conserved among haplorhines, with its evolutionary origin as a splice acceptor site. This study demonstrates that lincRNAs can recognize protein targets in a conserved, sequence-specific manner in order to affect critical cell functions.
Sequence, structure and function relationships in flaviviruses as assessed by evolutive aspects of its conserved non-structural protein domains.

PubMed

da Fonseca, Néli José; Lima Afonso, Marcelo Querino; Pedersolli, Natan Gonçalves; de Oliveira, Lucas Carrijo; Andrade, Dhiego Souto; Bleicher, Lucas

2017-10-28

Flaviviruses are responsible for serious diseases such as dengue, yellow fever, and zika fever. Their genomes encode a polyprotein which, after cleavage, results in three structural and seven non-structural proteins. Homologous proteins can be studied by conservation and coevolution analysis as detected in multiple sequence alignments, usually reporting positions which are strictly necessary for the structure and/or function of all members in a protein family or which are involved in a specific sub-class feature requiring the coevolution of residue sets. This study provides a complete conservation and coevolution analysis on all flaviviruses non-structural proteins, with results mapped on all well-annotated available sequences. A literature review on the residues found in the analysis enabled us to compile available information on their roles and distribution among different flaviviruses. Also, we provide the mapping of conserved and coevolved residues for all sequences currently in SwissProt as a supplementary material, so that particularities in different viruses can be easily analyzed. Copyright © 2017 Elsevier Inc. All rights reserved.
Analysis of hepatitis C virus RNA dimerization and core–RNA interactions

PubMed Central

Ivanyi-Nagy, Roland; Kanevsky, Igor; Gabus, Caroline; Lavergne, Jean-Pierre; Ficheux, Damien; Penin, François; Fossé, Philippe; Darlix, Jean-Luc

2006-01-01

The core protein of hepatitis C virus (HCV) has been shown previously to act as a potent nucleic acid chaperone in vitro, promoting the dimerization of the 3′-untranslated region (3′-UTR) of the HCV genomic RNA, a process probably mediated by a small, highly conserved palindromic RNA motif, named DLS (dimer linkage sequence) [G. Cristofari, R. Ivanyi-Nagy, C. Gabus, S. Boulant, J. P. Lavergne, F. Penin and J. L. Darlix (2004) Nucleic Acids Res., 32, 2623–2631]. To investigate in depth HCV RNA dimerization, we generated a series of point mutations in the DLS region. We find that both the plus-strand 3′-UTR and the complementary minus-strand RNA can dimerize in the presence of core protein, while mutations in the DLS (among them a single point mutation that abolished RNA replication in a HCV subgenomic replicon system) completely abrogate dimerization. Structural probing of plus- and minus-strand RNAs, in their monomeric and dimeric forms, indicate that the DLS is the major if not the sole determinant of UTR RNA dimerization. Furthermore, the N-terminal basic amino acid clusters of core protein were found to be sufficient to induce dimerization, suggesting that they retain full RNA chaperone activity. These findings may have important consequences for understanding the HCV replicative cycle and the genetic variability of the virus. PMID:16707664

Myosin MyTH4-FERM structures highlight important principles of convergent evolution.

PubMed

Planelles-Herrero, Vicente José; Blanc, Florian; Sirigu, Serena; Sirkia, Helena; Clause, Jeffrey; Sourigues, Yannick; Johnsrud, Daniel O; Amigues, Beatrice; Cecchini, Marco; Gilbert, Susan P; Houdusse, Anne; Titus, Margaret A

2016-05-24

Myosins containing MyTH4-FERM (myosin tail homology 4-band 4.1, ezrin, radixin, moesin, or MF) domains in their tails are found in a wide range of phylogenetically divergent organisms, such as humans and the social amoeba Dictyostelium (Dd). Interestingly, evolutionarily distant MF myosins have similar roles in the extension of actin-filled membrane protrusions such as filopodia and bind to microtubules (MT), suggesting that the core functions of these MF myosins have been highly conserved over evolution. The structures of two DdMyo7 signature MF domains have been determined and comparison with mammalian MF structures reveals that characteristic features of MF domains are conserved. However, across millions of years of evolution conserved class-specific insertions are seen to alter the surfaces and the orientation of subdomains with respect to each other, likely resulting in new sites for binding partners. The MyTH4 domains of Myo10 and DdMyo7 bind to MT with micromolar affinity but, surprisingly, their MT binding sites are on opposite surfaces of the MyTH4 domain. The structural analysis in combination with comparison of diverse MF myosin sequences provides evidence that myosin tail domain features can be maintained without strict conservation of motifs. The results illustrate how tuning of existing features can give rise to new structures while preserving the general properties necessary for myosin tails. Thus, tinkering with the MF domain enables it to serve as a multifunctional platform for cooperative recruitment of various partners, allowing common properties such as autoinhibition of the motor and microtubule binding to arise through convergent evolution.
Creation of a data base for sequences of ribosomal nucleic acids and detection of conserved restriction endonucleases sites through computerized processing.

PubMed Central

Patarca, R; Dorta, B; Ramirez, J L

1982-01-01

As part of a project pertaining to the organization of ribosomal genes in Kinetoplastidae, we have created a data base for published sequences of ribosomal nucleic acids, with information in Spanish. As a first step in their processing, we have written a computer program which introduces the new feature of determining the length of the fragments produced after single or multiple digestion with any of the known restriction enzymes. With this information we have detected conserved SAU 3A sites: (i) at the 5' end of the 5.8S rRNA and at the 3' end of the small subunit rRNA, both included in similar larger sequences; (ii) in the 5.8S rRNA of vertebrates (a second one), which is not present in lower eukaryotes, showing a clear evolutionary divergence; and (iii) at the 5' terminus of the small subunit rRNA, included in a larger conserved sequence. The possible biological importance of these sequences is discussed. PMID:6278402
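The fragment-length calculation described above can be sketched in Python. This is a simplified illustration rather than a reconstruction of the authors' 1982 program: it assumes a linear molecule, scans only the given strand (adequate for palindromic sites), and uses a made-up test sequence. The function name and the `sites` mapping are hypothetical. The two enzymes shown cut as ^GATC (Sau3A) and G^AATTC (EcoRI).

```python
def fragment_lengths(seq, sites):
    """Return fragment lengths after digesting a linear sequence.

    seq:   DNA sequence as a string.
    sites: dict mapping a recognition sequence to the cut offset within it,
           e.g. {"GATC": 0} cuts immediately before the G.
    Passing several entries in `sites` models a multiple digestion.
    """
    seq = seq.upper()
    cuts = set()
    for site, offset in sites.items():
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            # Step by one so that overlapping occurrences are also found.
            start = seq.find(site, start + 1)
    # Cuts at the very ends of the molecule would give empty fragments.
    inner = sorted(c for c in cuts if 0 < c < len(seq))
    bounds = [0] + inner + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical 22-base test sequence with two GATC sites and one GAATTC site.
seq = "AAGATCTTGAATTCTTGATCCG"

single = fragment_lengths(seq, {"GATC": 0})                 # Sau3A alone
double = fragment_lengths(seq, {"GATC": 0, "GAATTC": 1})    # Sau3A + EcoRI
```

Comparing such fragment lists across aligned rRNA sequences from different organisms is one plausible way to spot sites that recur at the same position, though the abstract does not say how the comparison was done.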
Conserved antigenic sites between MERS-CoV and Bat-coronavirus are revealed through sequence analysis.

PubMed

Sharmin, Refat; Islam, Abul B M M K

2016-01-01

MERS-CoV is a newly emerged human coronavirus reported to be closely related to the HKU4 and HKU5 bat coronaviruses. Bat and MERS coronaviruses are structurally related. It is therefore of interest to identify the antigenic sites they share and to estimate the extent of their conservation, in order to understand the evolutionary dynamics of MERS-CoV. Multiple sequence alignment of the spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins was employed to identify the sequence conservation among MERS and bat (HKU4, HKU5) coronaviruses. We used various in silico tools to predict the conserved antigenic sites. We found that MERS-CoV shared 30 % of its S protein antigenic sites with HKU4 and 70 % with HKU5 bat-CoV, whereas 100 % of its E, M and N protein antigenic sites were conserved with those in HKU4 and HKU5. This sharing suggests that, in terms of pathogenicity, MERS-CoV is more closely related to HKU5 bat-CoV than to HKU4 bat-CoV. The conserved epitopes indicate their evolutionary relationship and the ancestry of pathogenicity.
Recognition of the Xenopus ribosomal core promoter by the transcription factor xUBF involves multiple HMG box domains and leads to an xUBF interdomain interaction.

PubMed

Leblanc, B; Read, C; Moss, T

1993-02-01

The interaction of the ribosomal transcription factor xUBF with the RNA polymerase I core promoter of Xenopus laevis has been studied both at the DNA and protein levels. It is shown that a single xUBF-DNA complex forms over the 40S initiation site (+1) and involves at least the DNA sequences between -20 and +60 bp. DNA sequences upstream of +10 and downstream of +18 are each sufficient to direct complex formation independently. HMG box 1 of xUBF independently recognizes the sequences -20 to -1 and +1 to +22 and the addition of the N-terminal dimerization domain to HMG box 1 stabilizes its interaction with these sequences approximately 10-fold. HMG boxes 2/3 interact with the DNA downstream of +22 and can independently position xUBF across the initiation site. The C-terminal segment of xUBF, HMG boxes 4, 5 or the acidic domain, directly or indirectly interact with HMG box 1, making the core promoter sequences between -11 and -15 hypersensitive to DNase. This interaction also requires the DNA sequences between +17 and +32, i.e. the HMG box 2/3 binding site. The data suggest extensive folding of the core promoter within the xUBF complex.
T box transcription antitermination riboswitch: Influence of nucleotide sequence and orientation on tRNA binding by the antiterminator element

PubMed Central

Fauzi, Hamid; Agyeman, Akwasi; Hines, Jennifer V.

2008-01-01

Many bacteria utilize riboswitch transcription regulation to monitor and appropriately respond to cellular levels of important metabolites or effector molecules. The T box transcription antitermination riboswitch responds to cognate uncharged tRNA by specifically stabilizing an antiterminator element in the 5′-untranslated mRNA leader region and precluding formation of a thermodynamically more stable terminator element. Stabilization occurs when the tRNA acceptor end base pairs with the first four nucleotides in the seven nucleotide bulge of the highly conserved antiterminator element. The significance of the conservation of the antiterminator bulge nucleotides that do not base pair with the tRNA is unknown, but they are required for optimal function. In vitro selection was used to determine if the isolated antiterminator bulge context alone dictates the mode in which the tRNA acceptor end binds the bulge nucleotides. No sequence conservation beyond complementarity was observed and the location was not constrained to the first four bases of the bulge. The results indicate that formation of a structure that recognizes the tRNA acceptor end in isolation is not the determinant driving force for the high phylogenetic sequence conservation observed within the antiterminator bulge. Additional factors or T box leader features more likely influenced the phylogenetic sequence conservation. PMID:19152843
Sequence conservation, HLA-E-Restricted peptide, and best-defined CTL/CD8+ epitopes in gag P24 (capsid) of HIV-1 subtype B

NASA Astrophysics Data System (ADS)

Prasetyo, Afiono Agung; Dharmawan, Ruben; Sari, Yulia; Sariyatun, Ratna

2017-02-01

Human immunodeficiency virus type 1 (HIV-1) remains a global health problem. Continuous studies of HIV-1 genetic and immunological profiles are important to find strategies against the virus. This study aimed to analyse sequence conservation, HLA-E-restricted peptide, and best-defined CTL/CD8+ epitopes in p24 (capsid) of HIV-1 subtype B worldwide. The p24-coding sequences from 3,557 HIV subtype B isolates were aligned using MUSCLE and analysed. Some highly conserved regions (sequence conservation ≥95%) were observed. Two considerably long stretches with 100% conservation were observed at bases 349-356 and 550-557 of p24 (HXB2 numbering). The consensus from all aligned isolates was precisely the same as consensus B in the Los Alamos HIV Database. The HLA-E-restricted peptide in amino acid (aa) 14-22 of HIV-1 p24 (AISPRTLNA) was found in 55.9% (1,987/3,557) of HIV-1 subtype B worldwide. Forty-four best-defined CTL/CD8+ epitopes were observed, in which VKNWMTETL epitope (aa 181-189 of p24) restricted by B*4801 was the most frequent, as found in 94.9% of isolates. The results of this study contribute information about HIV-1 subtype B and may benefit further work on developing diagnostic and therapeutic strategies against the virus.
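The abstract above reports per-position conservation from a multiple alignment but does not say how the score was computed. As a minimal sketch, assuming conservation means the fraction of non-gap sequences carrying the most common residue in a column, the following Python computes column scores and the runs that meet a threshold. The function names `column_conservation` and `conserved_runs` are hypothetical and are not taken from the study.

```python
from collections import Counter


def column_conservation(aligned, gap="-"):
    """Per-column fraction of non-gap sequences sharing the most common residue.

    Assumption: this is one common definition of conservation; the
    abstract does not state which definition the authors used.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*aligned):
        residues = [c for c in column if c != gap]
        if not residues:
            scores.append(0.0)  # column made only of gaps
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(residues))
    return scores


def conserved_runs(scores, threshold=0.95, min_len=1):
    """Return 1-based inclusive (start, end) runs with score >= threshold."""
    runs, start = [], None
    for i, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    if start is not None and len(scores) - start + 1 >= min_len:
        runs.append((start, len(scores)))
    return runs


# Toy alignment, not real p24 data.
toy = ["ACGT", "ACGA", "ACCT", "A-GT"]
toy_scores = column_conservation(toy)
toy_runs = conserved_runs(toy_scores, threshold=0.95)
```

With a real alignment, the same two calls would be applied to the aligned p24 sequences, and positions would still need to be mapped to HXB2 numbering separately.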
Huntingtin-interacting protein 1 (Hip1) and Hip1-related protein (Hip1R) bind the conserved sequence of clathrin light chains and thereby influence clathrin assembly in vitro and actin distribution in vivo.

PubMed

Chen, Chih-Ying; Brodsky, Frances M

2005-02-18

Clathrin heavy and light chains form triskelia, which assemble into polyhedral coats of membrane vesicles that mediate transport for endocytosis and organelle biogenesis. Light chain subunits regulate clathrin assembly in vitro by suppressing spontaneous self-assembly of the heavy chains. The residues that play this regulatory role are at the N terminus of a conserved 22-amino acid sequence that is shared by all vertebrate light chains. Here we show that these regulatory residues and others in the conserved sequence mediate light chain interaction with Hip1 and Hip1R. These related proteins were previously found to be enriched in clathrin-coated vesicles and to promote clathrin assembly in vitro. We demonstrate Hip1R binding preference for light chains associated with clathrin heavy chain and show that Hip1R stimulation of clathrin assembly in vitro is blocked by mutations in the conserved sequence of light chains that abolish interaction with Hip1 and Hip1R. In vivo overexpression of a fragment of clathrin light chain comprising the Hip1R-binding region affected cellular actin distribution. Together these results suggest that the roles of Hip1 and Hip1R in affecting clathrin assembly and actin distribution are mediated by their interaction with the conserved sequence of clathrin light chains.
A conserved predicted pseudoknot in the NS2A-encoding sequence of West Nile and Japanese encephalitis flaviviruses suggests NS1' may derive from ribosomal frameshifting

PubMed Central

Firth, Andrew E; Atkins, John F

2009-01-01

Japanese encephalitis, West Nile, Usutu and Murray Valley encephalitis viruses form a tight subgroup within the larger Flavivirus genus. These viruses utilize a single-polyprotein expression strategy, resulting in ~10 mature proteins. Plotting the conservation at synonymous sites along the polyprotein coding sequence reveals strong conservation peaks at the very 5' end of the coding sequence, and also at the 5' end of the sequence encoding the NS2A protein. Such peaks are generally indicative of functionally important non-coding sequence elements. The second peak corresponds to a predicted stable pseudoknot structure whose biological importance is supported by compensatory mutations that preserve the structure. The pseudoknot is preceded by a conserved slippery heptanucleotide (Y CCU UUU), thus forming a classical stimulatory motif for -1 ribosomal frameshifting. We hypothesize, therefore, that the functional importance of the pseudoknot is to stimulate a portion of ribosomes to shift -1 nt into a short (45 codon), conserved, overlapping open reading frame, termed foo. Since cleavage at the NS1-NS2A boundary is known to require synthesis of NS2A in cis, the resulting transframe fusion protein is predicted to be NS1-NS2AN-term-FOO. We hypothesize that this may explain the origin of the previously identified NS1 'extension' protein in JEV-group flaviviruses, known as NS1'. PMID:19196463
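The slippery heptanucleotide described above (Y CCU UUU, where Y is a pyrimidine and the spaces mark zero-frame codon boundaries) can be located with a simple pattern search. The sketch below is illustrative only: the function name `find_slippery_sites` and the `orf_start` parameter are assumptions, and a match says nothing about whether a downstream pseudoknot is present.

```python
import re

# Lookahead so that overlapping candidate heptamers are all reported.
# Y (pyrimidine) is C or U in RNA.
SLIPPERY = re.compile(r"(?=([CU]CCUUUU))")


def find_slippery_sites(rna, orf_start=0):
    """Return 0-based start positions of Y CCU UUU heptamers.

    A hit is kept only when CCU and UUU are codons in the zero frame,
    i.e. the leading Y is the third base of the preceding codon.
    orf_start is the 0-based index of the first base of the zero frame.
    DNA input is accepted and converted to RNA (T -> U).
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    for match in SLIPPERY.finditer(rna):
        pos = match.start()
        if (pos - orf_start) % 3 == 2:
            hits.append(pos)
    return hits


# Toy sequence: AUG GCC CCU UUU AGG (heptamer CCCUUUU starts at index 5).
example_hits = find_slippery_sites("AUGGCCCCUUUUAGG")
```

Dropping the frame check would list every occurrence of the motif, which can be useful for a first scan before the reading frame is known.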
Development of Mycoplasma synoviae (MS) core genome multilocus sequence typing (cgMLST) scheme.

PubMed

Ghanem, Mostafa; El-Gazzar, Mohamed

2018-05-01

Mycoplasma synoviae (MS) is a poultry pathogen with reported increased prevalence and virulence in recent years. MS strain identification is essential for prevention, control efforts and epidemiological outbreak investigations. Multiple multilocus based sequence typing schemes have been developed for MS, yet the resolution of these schemes could be limited for outbreak investigation. The cost of whole genome sequencing has become close to that of sequencing the seven MLST targets; however, there is no standardized method for typing MS strains based on whole genome sequences. In this paper, we propose a core genome multilocus sequence typing (cgMLST) scheme as a standardized and reproducible method for typing MS based on whole genome sequences. A diverse set of 25 MS whole genome sequences was used to identify 302 core genome genes as cgMLST targets (35.5% of MS genome) and 44 whole genome sequences of MS isolates from six countries in four continents were used for typing applying this scheme. cgMLST based phylogenetic trees displayed a high degree of agreement with core genome SNP based analysis and available epidemiological information. cgMLST allowed evaluation of two conventional MLST schemes of MS. The high discriminatory power of cgMLST allowed differentiation between samples of the same conventional MLST type. cgMLST represents a standardized, accurate, highly discriminatory, and reproducible method for differentiation between MS isolates. Like conventional MLST, it provides stable and expandable nomenclature, allowing for comparing and sharing the typing results between different laboratories worldwide. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
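In a cgMLST scheme each isolate is summarised as an allelic profile, one allele identifier per core locus, and isolates are compared by counting loci whose alleles differ. The following Python sketch illustrates that comparison under stated assumptions: profiles are plain dictionaries, a missing call is represented by `None`, and the name `allelic_distance` is hypothetical. It is not the software used in the study.

```python
def allelic_distance(profile_a, profile_b):
    """Count differing loci between two allelic profiles.

    Each profile maps a locus name to an allele identifier, or to None
    when no allele could be called (an assumed convention).
    Returns (number of differing loci, number of loci compared).
    Loci missing or uncalled in either profile are skipped.
    """
    shared = [
        locus
        for locus in profile_a
        if locus in profile_b
        and profile_a[locus] is not None
        and profile_b[locus] is not None
    ]
    differing = sum(profile_a[locus] != profile_b[locus] for locus in shared)
    return differing, len(shared)


# Toy profiles with four hypothetical loci (a real scheme here has 302).
isolate_1 = {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": None}
isolate_2 = {"locus1": 1, "locus2": 5, "locus3": 2, "locus4": 7}
example_distance = allelic_distance(isolate_1, isolate_2)
```

Returning the number of compared loci alongside the count makes it possible to flag pairs where too few loci were callable for the distance to be meaningful.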
Survey of genome sequences in a wild sweet potato, Ipomoea trifida (H. B. K.) G. Don

PubMed Central

Hirakawa, Hideki; Okada, Yoshihiro; Tabuchi, Hiroaki; Shirasawa, Kenta; Watanabe, Akiko; Tsuruoka, Hisano; Minami, Chiharu; Nakayama, Shinobu; Sasamoto, Shigemi; Kohara, Mitsuyo; Kishida, Yoshie; Fujishiro, Tsunakazu; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yoshinaga, Masaru; Takahata, Yasuhiro; Tanaka, Masaru; Tabata, Satoshi; Isobe, Sachiko N.

2015-01-01

Ipomoea trifida (H. B. K.) G. Don. is the most likely diploid ancestor of the hexaploid sweet potato, I. batatas (L.) Lam. To assist in analysis of the sweet potato genome, de novo whole-genome sequencing was performed with two lines of I. trifida, namely the selfed line Mx23Hm and the highly heterozygous line 0431-1, using the Illumina HiSeq platform. We classified the sequences thus obtained as either ‘core candidates’ (common to the two lines) or ‘line specific’. The total lengths of the assembled sequences of Mx23Hm (ITR_r1.0) was 513 Mb, while that of 0431-1 (ITRk_r1.0) was 712 Mb. Of the assembled sequences, 240 Mb (Mx23Hm) and 353 Mb (0431-1) were classified into core candidate sequences. A total of 62,407 (62.4 Mb) and 109,449 (87.2 Mb) putative genes were identified, respectively, in the genomes of Mx23Hm and 0431-1, of which 11,823 were derived from core sequences of Mx23Hm, while 28,831 were from the core candidate sequence of 0431-1. There were a total of 1,464,173 single-nucleotide polymorphisms and 16,682 copy number variations (CNVs) in the two assembled genomic sequences (under the condition of log2 ratio of >1 and CNV size >1,000 bases). The results presented here are expected to contribute to the progress of genomic and genetic studies of I. trifida, as well as studies of the sweet potato and the genus Ipomoea in general. PMID:25805887
Application of cytochrome b DNA sequences for the authentication of endangered snake species.

PubMed

Wong, Ka-Lok; Wang, Jun; But, Paul Pui-Hay; Shaw, Pang-Chui

2004-01-06

To support enforcement of the conservation program and curb the illegal trading and consumption of endangered snake species, the value of cytochrome b sequence in the authentication of snake species was evaluated. As an illustration, DNA was extracted, and selected cytochrome b DNA sequences were amplified and sequenced, from six snakes commonly consumed in Hong Kong. By combining these with publicly available sequences, a cytochrome b database containing 90 species of snakes was constructed. In this database, sequence homology between snakes ranged from 70.68 to 95.11%. On the other hand, intraspecific variation of three tested snakes was 0-0.98%. Using the database, we were able to determine the identity of six meat samples confiscated by the Agriculture, Fisheries and Conservation Department, HKSAR.
Insights into the fold organization of TIM barrel from interaction energy based structure networks.

PubMed

Vijayabaskar, M S; Vishveshwara, Saraswathi

2012-01-01

There are many well-known examples of proteins with low sequence similarity, adopting the same structural fold. This aspect of sequence-structure relationship has been extensively studied both experimentally and theoretically, however with limited success. Most of the studies consider remote homology or "sequence conservation" as the basis for their understanding. Recently "interaction energy" based network formalism (Protein Energy Networks (PENs)) was developed to understand the determinants of protein structures. In this paper we have used these PENs to investigate the common non-covalent interactions and their collective features which stabilize the TIM barrel fold. We have also developed a method of aligning PENs in order to understand the spatial conservation of interactions in the fold. We have identified key common interactions responsible for the conservation of the TIM fold, despite high sequence dissimilarity. For instance, the central beta barrel of the TIM fold is stabilized by long-range high energy electrostatic interactions and low-energy contiguous vdW interactions in certain families. The other interfaces like the helix-sheet or the helix-helix seem to be devoid of any high energy conserved interactions. Conserved interactions in the loop regions around the catalytic site of the TIM fold have also been identified, pointing out their significance in both structural and functional evolution. Based on these investigations, we have developed a novel network based phylogenetic analysis for remote homologues, which can perform better than sequence based phylogeny. Such an analysis is more meaningful from both structural and functional evolutionary perspective. We believe that the information obtained through the "interaction conservation" viewpoint and the subsequently developed method of structure network alignment, can shed new light in the fields of fold organization and de novo computational protein design.
Conservation of tubulin-binding sequences in TRPV1 throughout evolution.

PubMed

Sardar, Puspendu; Kumar, Abhishek; Bhandari, Anita; Goswami, Chandan

2012-01-01

Transient Receptor Potential Vanilloid sub type 1 (TRPV1), commonly known as the capsaicin receptor, can detect multiple stimuli, including noxious compounds, low pH, temperature and electromagnetic waves at different ranges. In addition, this receptor is involved in multiple physiological and sensory processes. Therefore, the functions of TRPV1 directly influence adaptation and further evolution. The availability of various eukaryotic genomic sequences in the public domain facilitates the study of the molecular evolution of TRPV1 protein and the respective conservation of certain domains, motifs and interacting regions that are functionally important. Using statistical and bioinformatics tools, our analysis reveals that TRPV1 evolved ∼420 million years ago (MYA). Our analysis reveals that specific regions, domains and motifs of TRPV1 have undergone different selection pressures and thus have different levels of conservation. We found that, among these, the TRP box is the most conserved and thus has functional significance. Our results also indicate that the tubulin binding sequences (TBS) have evolutionary significance as these stretches are more conserved than many other essential regions of TRPV1. The overall distribution of positively charged residues within the TBS motifs is conserved throughout evolution. In silico analysis reveals that the TBS-1 and TBS-2 of TRPV1 can form helical structures and may play an important role in TRPV1 function. Our analysis identifies the regions of TRPV1, which are important for structure-function relationship. This analysis indicates that tubulin binding sequence-1 (TBS-1) near the TRP-box forms a potential helix and the tubulin interactions with TRPV1 via TBS-1 have evolutionary significance. This interaction may be required for the proper channel function and regulation and may also have significance in the context of Taxol®-induced neuropathy.
Genetic and structural analyses of cytochrome P450 hydroxylases in sex hormone biosynthesis: Sequential origin and subsequent coevolution.

PubMed

Goldstone, Jared V; Sundaramoorthy, Munirathinam; Zhao, Bin; Waterman, Michael R; Stegeman, John J; Lamb, David C

2016-01-01

Biosynthesis of steroid hormones in vertebrates involves three cytochrome P450 hydroxylases, CYP11A1, CYP17A1 and CYP19A1, which catalyze sequential steps in steroidogenesis. These enzymes are conserved in the vertebrates, but their origin and existence in other chordate subphyla (Tunicata and Cephalochordata) have not been clearly established. In this study, selected protein sequences of CYP11A1, CYP17A1 and CYP19A1 were compiled and analyzed using multiple sequence alignment and phylogenetic analysis. Our analyses show that cephalochordates have sequences orthologous to vertebrate CYP11A1, CYP17A1 or CYP19A1, and that echinoderms and hemichordates possess CYP11-like but not CYP19 genes. While the cephalochordate sequences have low identity with the vertebrate sequences, reflecting evolutionary distance, the data show apparent origin of CYP11 prior to the evolution of CYP19 and possibly CYP17, thus indicating a sequential origin of these functionally related steroidogenic CYPs. Co-occurrence of the three CYPs in early chordates suggests that the three genes may have coevolved thereafter, and that functional conservation should be reflected in functionally important residues in the proteins. CYP19A1 has the largest number of conserved residues while CYP11A1 sequences are less conserved. Structural analyses of human CYP11A1, CYP17A1 and CYP19A1 show that critical substrate binding site residues are highly conserved in each enzyme family. The results emphasize that the steroidogenic pathways producing glucocorticoids and reproductive steroids are several hundred million years old and that the catalytic structural elements of the enzymes have been conserved over the same period of time. Analysis of these elements may help to identify when precursor functions linked to these enzymes first arose. Copyright © 2015 Elsevier Inc. All rights reserved.
Conserved Proteins of the RNA Interference System in the Arbuscular Mycorrhizal Fungus Rhizoglomus irregulare Provide New Insight into the Evolutionary History of Glomeromycota.

PubMed

Lee, Soon-Jae; Kong, Mengxuan; Harrison, Paul; Hijri, Mohamed

2018-01-01

Horizontal gene transfer (HGT) is an important mechanism in the evolution of many living organisms, particularly in Prokaryotes, where genes are frequently dispersed between taxa. Although HGT has been reported in Eukaryotes, its cumulative effect and its frequency have been questioned. Arbuscular mycorrhizal fungi (AMF) are an early diverged fungal lineage belonging to phylum Glomeromycota, whose phylogenetic position is still under debate. The history of AMF and land plant symbiosis dates back to at least 460 Ma. However, Glomeromycota are estimated to have emerged much earlier than land plants. In this study, we surveyed genomic and transcriptomic data of the model arbuscular mycorrhizal fungus Rhizoglomus irregulare (synonym Rhizophagus irregularis) and its relatives to search for evidence of HGT that occurred during AMF evolution. Surprisingly, we found a signature of putative HGT of class I ribonuclease III protein-coding genes that occurred from autotrophic cyanobacteria genomes to R. irregulare. At least one of two HGTs was conserved among AMF species with high levels of sequence similarity. Previously, an example of intimate symbiosis between AM fungus and cyanobacteria was reported in the literature. Ribonuclease III family enzymes are important in small RNA regulation in Fungi together with two additional core proteins (Argonaute/piwi and RdRP). The eukaryotic RNA interference system found in AMF was conserved and showed homology with high sequence similarity in Mucoromycotina, a group of fungi closely related to Glomeromycota. Prior to this analysis, class I ribonuclease III had not been identified in any eukaryotes. Our results indicate that a unique acquisition of class I ribonuclease III in AMF is due to a HGT event that occurred from cyanobacteria to Glomeromycota, at the latest before the divergence of the two Glomeromycota orders Diversisporales and Glomerales. © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. PMID:29329439
DoOP: Databases of Orthologous Promoters, collections of clusters of orthologous upstream sequences from chordates and plants

PubMed Central

Barta, Endre; Sebestyén, Endre; Pálfy, Tamás B.; Tóth, Gábor; Ortutay, Csaba P.; Patthy, László

2005-01-01

DoOP (http://doop.abc.hu/) is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. Up to 3000 bp DNA segments upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21 061 chordate clusters from 284 different species and 7548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species and it is also suitable to see the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP with either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically. PMID:15608291
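The promoter clusters described above are built from DNA segments of up to 3000 bp upstream of annotated first exons. A minimal sketch of that extraction step follows, assuming 0-based half-open exon coordinates on the forward strand of a sequence held in memory. The function names and the coordinate convention are assumptions made for illustration and do not describe the DoOP pipeline itself.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_segment(chrom, exon_start, exon_end, strand, max_len=3000):
    """Return up to max_len bases upstream of a first exon, 5'->3' on its strand.

    exon_start and exon_end are 0-based, half-open, forward-strand
    coordinates (an assumed convention). For a minus-strand gene the
    upstream region lies after exon_end on the forward strand and is
    reverse-complemented so it reads in the gene's own orientation.
    """
    if strand == "+":
        return chrom[max(0, exon_start - max_len):exon_start]
    if strand == "-":
        return reverse_complement(chrom[exon_end:exon_end + max_len])
    raise ValueError("strand must be '+' or '-'")


# Toy chromosome; a 4 bp window is used so the result is easy to check.
toy_chrom = "AAACCCGGGTTT"
plus_upstream = upstream_segment(toy_chrom, 6, 9, "+", max_len=4)
minus_upstream = upstream_segment(toy_chrom, 3, 6, "-", max_len=4)
```

Slicing clamps at the sequence ends, so a first exon close to a contig boundary simply yields a shorter segment, which matches the "up to 3000 bp" wording.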
Environmental globalization, organizational form, and expected benefits from protected areas in Central America

Treesearch

Max J. Pfeffer; John W. Schelhas; Catherine Meola

2006-01-01

Environmental globalization has led to the implementation of conservation efforts like the creation of protected areas that often promote the interests of core countries in poorer regions. The creation of protected areas in poor areas frequently creates tensions between human needs like food and shelter and environmental conservation. Support for such conservation...
Bioinformatics analysis of plant orthologous introns: identification of an intronic tRNA-like sequence.

PubMed

Akkuratov, Evgeny E; Walters, Lorraine; Saha-Mandal, Arnab; Khandekar, Sushant; Crawford, Erin; Zirbel, Craig L; Leisner, Scott; Prakash, Ashwin; Fedorova, Larisa; Fedorov, Alexei

2014-09-10

Orthologous introns have identical positions relative to the coding sequence in orthologous genes of different species. By analyzing the complete genomes of five plants we generated a database of 40,512 orthologous intron groups of dicotyledonous plants, 28,519 orthologous intron groups of angiosperms, and 15,726 of land plants (moss and angiosperms). Multiple sequence alignments of each orthologous intron group were obtained using the MAFFT algorithm. The number of conserved regions in plant introns appeared to be hundreds of times fewer than that in mammals or vertebrates. Approximately three quarters of conserved intronic regions among angiosperms and dicots, in particular, correspond to alternatively-spliced exonic sequences. We registered only a handful of conserved intronic ncRNAs of flowering plants. However, the most evolutionarily conserved intronic region, which is ubiquitous for all plants examined in this study, including moss, possessed multiple structural features of tRNAs, which caused us to classify it as a putative tRNA-like ncRNA. Intronic sequences encoding tRNA-like structures are not unique to plants. Bioinformatics examination of the presence of tRNA inside introns revealed an unusually long-term association of four glycine tRNAs inside the Vac14 gene of fish, amniotes, and mammals. Copyright © 2014 Elsevier B.V. All rights reserved.
High-Throughput Sequencing of Arabidopsis microRNAs: Evidence for Frequent Birth and Death of MIRNA Genes

PubMed Central

Fahlgren, Noah; Howell, Miya D.; Kasschau, Kristin D.; Chapman, Elisabeth J.; Sullivan, Christopher M.; Cumbie, Jason S.; Givan, Scott A.; Law, Theresa F.; Grant, Sarah R.; Dangl, Jeffery L.; Carrington, James C.

2007-01-01

In plants, microRNAs (miRNAs) comprise one of two classes of small RNAs that function primarily as negative regulators at the posttranscriptional level. Several MIRNA genes in the plant kingdom are ancient, with conservation extending between angiosperms and the mosses, whereas many others are more recently evolved. Here, we use deep sequencing and computational methods to identify, profile and analyze non-conserved MIRNA genes in Arabidopsis thaliana. 48 non-conserved MIRNA families, nearly all of which were represented by single genes, were identified. Sequence similarity analyses of miRNA precursor foldback arms revealed evidence for recent evolutionary origin of 16 MIRNA loci through inverted duplication events from protein-coding gene sequences. Interestingly, these recently evolved MIRNA genes have taken distinct paths. Whereas some non-conserved miRNAs interact with and regulate target transcripts from gene families that donated parental sequences, others have drifted to the point of non-interaction with parental gene family transcripts. Some young MIRNA loci clearly originated from one gene family but form miRNAs that target transcripts in another family. We suggest that MIRNA genes are undergoing relatively frequent birth and death, with only a subset being stabilized by integration into regulatory networks. PMID:17299599
Conserved noncoding sequences conserve biological networks and influence genome evolution.

PubMed

Xie, Jianbo; Qian, Kecheng; Si, Jingna; Xiao, Liang; Ci, Dong; Zhang, Deqiang

2018-05-01

Comparative genomics approaches have identified numerous conserved cis-regulatory sequences near genes in plant genomes. Despite the identification of these conserved noncoding sequences (CNSs), our knowledge of their functional importance and selection remains limited. Here, we used a combination of DNA methylome analysis, microarray expression analyses, and functional annotation to study these sequences in the model tree Populus trichocarpa. Methylation in CG contexts and non-CG contexts was lower in CNSs, particularly CNSs in the 5'-upstream regions of genes, compared with other sites in the genome. We observed that CNSs are enriched in genes with transcription and binding functions, and this enrichment was also associated with syntenic genes and those from whole-genome duplications, suggesting that cis-regulatory sequences play a key role in genome evolution. We detected a significant positive correlation between CNS number and protein interactions, suggesting that CNSs may have roles in the evolution and maintenance of biological networks. The divergence of CNSs indicates that duplication-degeneration-complementation drives the subfunctionalization of a proportion of duplicated genes from whole-genome duplication. Furthermore, population genomics confirmed that most CNSs are under strong purifying selection and only a small subset of CNSs shows evidence of adaptive evolution. These findings provide a foundation for future studies exploring these key genomic features in the maintenance of biological networks, local adaptation, and transcription.
Chromosome ends: different sequences may provide conserved functions.

PubMed

Louis, Edward J; Vershinin, Alexander V

2005-07-01

The structures of specific chromosome regions, centromeres and telomeres, present a number of puzzles. As functions performed by these regions are ubiquitous and essential, their DNA, proteins and chromatin structure are expected to be conserved. Recent studies of centromeric DNA from human, Drosophila and plant species have demonstrated that a hidden universal centromere-specific sequence is highly unlikely. The DNA of telomeres is more conserved consisting of a tandemly repeated 6-8 bp Arabidopsis-like sequence in a majority of organisms as diverse as protozoan, fungi, mammals and plants. However, there are alternatives to short DNA repeats at the ends of chromosomes and for telomere elongation by telomerase. Here we focus on the similarities and diversity that exist among the structural elements, DNA sequences and proteins, that make up terminal domains (telomeres and subtelomeres), and how organisms use these in different ways to fulfil the functions of end-replication and end-protection. Copyright (c) 2005 Wiley Periodicals, Inc.
Comparative genomic analysis of the false killer whale (Pseudorca crassidens) LMBR1 locus.

PubMed

Kim, Dae-Won; Choi, Sang-Haeng; Kim, Ryong Nam; Kim, Sun-Hong; Paik, Sang-Gi; Nam, Seong-Hyeuk; Kim, Dong-Wook; Kim, Aeri; Kang, Aram; Park, Hong-Seog

2010-09-01

The sequencing and comparative genomic analysis of LMBR1 loci in mammals or other species, including human, would be very important in understanding evolutionary genetic changes underlying the evolution of limb development. In this regard, comparative genomic annotation of the false killer whale LMBR1 locus could shed new light on the evolution of limb development. We sequenced two false killer whale BAC clones, corresponding to 156 kb and 144 kb, respectively, harboring the tightly linked RNF32, LMBR1, and NOM1 genes. Our annotation of the false killer whale LMBR1 gene showed that it consists of 17 exons (1473 bp), in contrast to 18 exons (1596 bp) in human, and it displays 93.1% and 95.6% nucleotide and amino acid sequence similarity, respectively, compared with the human gene. In particular, we discovered that exon 10, deleted in the false killer whale LMBR1 gene, is present only in primates, and this fact strongly implies that exon 10 might be crucial in determining primate-specific limb development. ZRS and TFBS sequences have been well conserved across 11 species, suggesting that these regions could be involved in an important function of limb development and limb patterning. The neighboring gene RNF32 showed several lineage-conserved exons, such as exons 2 through 9 conserved in eutherian mammals, exons 3 through 9 conserved in mammals, and exons 5 through 9 conserved in vertebrates. The other neighboring gene, NOM1, had undergone a substitution (ATG→GTA) at the start codon, giving rise to a 36 bp shorter N-terminal sequence compared with the human sequence. Our comparative analysis of the false killer whale LMBR1 genomic locus provides important clues regarding the genetic regions that may play crucial roles in limb development and patterning.
Defining and predicting structurally conserved regions in protein superfamilies

PubMed Central

Huang, Ivan K.; Grishin, Nick V.

2013-01-01

Motivation: The structures of homologous proteins are generally better conserved than their sequences. This phenomenon is demonstrated by the prevalence of structurally conserved regions (SCRs) even in highly divergent protein families. Defining SCRs requires the comparison of two or more homologous structures and is affected by their availability and divergence, and our ability to deduce structurally equivalent positions among them. In the absence of multiple homologous structures, it is necessary to predict SCRs of a protein using information from only a set of homologous sequences and (if available) a single structure. Accurate SCR predictions can benefit homology modelling and sequence alignment. Results: Using pairwise DaliLite alignments among a set of homologous structures, we devised a simple measure of structural conservation, termed structural conservation index (SCI). SCI was used to distinguish SCRs from non-SCRs. A database of SCRs was compiled from 386 SCOP superfamilies containing 6489 protein domains. Artificial neural networks were then trained to predict SCRs with various features deduced from a single structure and homologous sequences. Assessment of the predictions via a 5-fold cross-validation method revealed that predictions based on features derived from a single structure perform similarly to ones based on homologous sequences, while combining sequence and structural features was optimal in terms of accuracy (0.755) and Matthews correlation coefficient (0.476). These results suggest that even without information from multiple structures, it is still possible to effectively predict SCRs for a protein. Finally, inspection of the structures with the worst predictions pinpoints difficulties in SCR definitions. Availability: The SCR database and the prediction server can be found at http://prodata.swmed.edu/SCR. 
Contact: 91huangi@gmail.com or grishin@chop.swmed.edu Supplementary information: Supplementary data are available at Bioinformatics Online PMID:23193223
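The accuracy and Matthews correlation coefficient quoted in the record above are both functions of a binary confusion matrix (SCR versus non-SCR positions). The following is a minimal Python sketch of the two standard formulas; the function names and the counts are illustrative assumptions, not data or code from the paper.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of positions labelled correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when any marginal total is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Illustrative counts only (hypothetical): SCR = positive, non-SCR = negative.
tp, tn, fp, fn = 60, 20, 10, 10
print(round(accuracy(tp, tn, fp, fn), 3))
print(round(matthews_cc(tp, tn, fp, fn), 3))
```

Unlike accuracy, the coefficient stays near zero for a predictor that simply favours the majority class, which is presumably why the two values are reported together.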
AlignMiner: a Web-based tool for detection of divergent regions in multiple sequence alignments of conserved sequences

PubMed Central

2010-01-01

Background: Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results: AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid) obtained using any of a variety of algorithms; the choice of algorithm does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can even inspect their results on a different computer. Data can be downloaded onto a user disk, in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to design specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly".
Conclusions: AlignMiner can be used to reliably detect divergent regions via several scoring methods that provide different levels of selectivity. Its predictions have been verified by experimental means. Hence, it is expected that its usage will save researchers' time and ensure an objective selection of the best-possible divergent region when closely related sequences are analysed. AlignMiner is freely available at http://www.scbi.uma.es/alignminer. PMID:20525162
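The idea of scoring alignment columns by entropy and reporting runs of divergent columns can be sketched as follows. This is a generic Shannon-entropy illustration in Python, not AlignMiner's actual Entropy score; the function names, the threshold, the minimum run length and the toy alignment are all assumptions made for the example.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def divergent_regions(alignment, threshold=0.5, min_len=2):
    """Return half-open (start, end) column ranges whose entropy exceeds threshold."""
    ncols = len(alignment[0])
    scores = [column_entropy([seq[i] for seq in alignment]) for i in range(ncols)]
    regions, start = [], None
    for i, s in enumerate(scores + [0.0]):  # trailing sentinel closes an open run
        if s > threshold and start is None:
            start = i
        elif s <= threshold and start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None
    return regions

# Toy alignment (hypothetical): columns 4 and 5 vary, all others are conserved.
alignment = [
    "ACGTACGTAC",
    "ACGTTTGTAC",
    "ACGTGAGTAC",
    "ACGTCCGTAC",
]
print(divergent_regions(alignment))
```

Raising the threshold or the minimum run length makes the selection more restrictive, which mirrors the trade-off between scoring methods described in the abstract.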
A distal 594 bp ECR specifies Hmx1 expression in pinna and lateral facial morphogenesis and is regulated by the Hox-Pbx-Meis complex

DOE PAGES

Rosin, Jessica M.; Li, Wenjie; Cox, Liza L.; ...

2016-07-19

Hmx1 encodes a homeodomain transcription factor expressed in the developing lateral craniofacial mesenchyme, retina and sensory ganglia. Mutation or mis-regulation of Hmx1 underlies malformations of the eye and external ear in multiple species. Deletion or insertional duplication of an evolutionarily conserved region (ECR) downstream of Hmx1 has recently been described in rat and cow, respectively. Here, we demonstrate that the impact of Hmx1 loss is greater than previously appreciated, with a variety of lateral cranioskeletal defects, auriculofacial nerve deficits, and duplication of the caudal region of the external ear. Using a transgenic approach, we demonstrate that a 594 bp sequence encompassing the ECR recapitulates specific aspects of the endogenous Hmx1 lateral facial expression pattern. Moreover, we show that Hoxa2, Meis and Pbx proteins act cooperatively on the ECR, via a core 32 bp sequence, to regulate Hmx1 expression. In conclusion, these studies highlight the conserved role for Hmx1 in BA2-derived tissues and provide an entry point for improved understanding of the causes of the frequent lateral facial birth defects in humans.
Systems-level feedback regulation of cell cycle transitions in Ostreococcus tauri.

PubMed

Kapuy, Orsolya; Vinod, P K; Bánhegyi, Gábor; Novák, Béla

2018-05-01

Ostreococcus tauri is the smallest free-living unicellular organism with one copy of each core cell cycle gene in its genome. There is a growing interest in this green alga due to its evolutionary origin. Since O. tauri diverged early in the green lineage, relatively close to the ancestral eukaryotic cell, it might hold a key phylogenetic position in the eukaryotic tree of life. In this study, we focus on the regulatory network of its cell division cycle. We propose a mathematical modelling framework to integrate the existing knowledge of the cell cycle network of O. tauri. We observe that feedback loop regulation of both G1/S and G2/M transitions in O. tauri is conserved, which can make the transition bistable. This is essential to make the transition irreversible as shown in other eukaryotic organisms. By performing sequence analysis, we also predict the presence of the Greatwall/PP2A pathway in the cell cycle of O. tauri. Since the O. tauri cell cycle machinery is conserved, the exploration of the dynamical characteristics of the cell division cycle will help in further understanding the regulation of the cell cycle in higher eukaryotes. Copyright © 2018 Elsevier Masson SAS. All rights reserved.
Domain architecture of the p62 subunit from the human transcription/repair factor TFIIH deduced by limited proteolysis and mass spectrometry analysis.

PubMed

Jawhari, Anass; Boussert, Stéphanie; Lamour, Valérie; Atkinson, R Andrew; Kieffer, Bruno; Poch, Olivier; Potier, Noelle; van Dorsselaer, Alain; Moras, Dino; Poterszman, Arnaud

2004-11-16

TFIIH is a multiprotein complex that plays a central role in both transcription and DNA repair. The subunit p62 is a structural component of the TFIIH core that is known to interact with VP16, p53, Eralpha, and E2F1 in the context of activated transcription, as well as with the endonuclease XPG in DNA repair. We used limited proteolysis experiments coupled to mass spectrometry to define structural domains within the conserved N-terminal part of the molecule. The first domain identified resulted from spontaneous proteolysis and corresponds to residues 1-108. The second domain encompasses residues 186-240, and biophysical characterization by fluorescence studies and NMR analysis indicated that it is at least partially folded and thus may correspond to a structural entity. This module contains a region of high sequence conservation with an invariant FWxxPhiPhi motif (Phi representing either tyrosine or phenylalanine), which was also found in other protein families and could play a key role as a protein-protein recognition module within TFIIH. The approach used in this study is general and can be straightforwardly applied to other multidomain proteins and/or multiprotein assemblies.
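The FWxxPhiPhi motif described above (x = any residue, Phi = tyrosine or phenylalanine) maps directly onto a regular expression over one-letter amino acid codes. A minimal Python sketch follows; the function name and the toy sequence are hypothetical, and the sequence is not that of p62.

```python
import re

# F-W-x-x-Phi-Phi: x is any residue, Phi is Tyr (Y) or Phe (F).
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(FW[A-Z]{2}[YF]{2}))")

def find_motif(protein):
    """Return (1-based start, matched text) for every FWxxPhiPhi occurrence."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(protein.upper())]

# Made-up sequence for illustration only.
toy = "MKTFWAEYFGSLFWQRFFA"
print(find_motif(toy))  # [(4, 'FWAEYF'), (13, 'FWQRFF')]
```

A pattern this short will also match by chance in large databases, so hits would normally be weighed against the surrounding sequence conservation, as the abstract does.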
A distal 594 bp ECR specifies Hmx1 expression in pinna and lateral facial morphogenesis and is regulated by the Hox-Pbx-Meis complex

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rosin, Jessica M.; Li, Wenjie; Cox, Liza L.

Hmx1 encodes a homeodomain transcription factor expressed in the developing lateral craniofacial mesenchyme, retina and sensory ganglia. Mutation or mis-regulation of Hmx1 underlies malformations of the eye and external ear in multiple species. Deletion or insertional duplication of an evolutionarily conserved region (ECR) downstream of Hmx1 has recently been described in rat and cow, respectively. Here, we demonstrate that the impact of Hmx1 loss is greater than previously appreciated, with a variety of lateral cranioskeletal defects, auriculofacial nerve deficits, and duplication of the caudal region of the external ear. Using a transgenic approach, we demonstrate that a 594 bp sequence encompassing the ECR recapitulates specific aspects of the endogenous Hmx1 lateral facial expression pattern. Moreover, we show that Hoxa2, Meis and Pbx proteins act cooperatively on the ECR, via a core 32 bp sequence, to regulate Hmx1 expression. In conclusion, these studies highlight the conserved role for Hmx1 in BA2-derived tissues and provide an entry point for improved understanding of the causes of the frequent lateral facial birth defects in humans.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kawagoe, Kazuyoshi; Takeda, Junji; Kinoshita, Taroh

Many membrane proteins are anchored to the cell membrane by glycosylphosphatidylinositol (GPI). The core structure and biosynthesis of the GPI anchor are well conserved in eukaryote cells. We previously cloned a human PIGA gene that participates in GPI anchor biosynthesis. We have now cloned complementary and genomic DNA of Pig-a, the murine homologue of PIGA, and compared its function and gene structure with those of PIGA. The deduced amino acid sequence of mouse PIG-A is 88% identical with that of human PIG-A. Transfection of Pig-a cDNA complemented the defects of both a PIG-A-deficient murine cell line and a PIG-A-deficient human cell line, demonstrating that functions of mouse and human PIG-A are conserved. Like human PIGA, the chromosomal Pig-a gene has six exons and spans approximately 16 kb. Moreover, Pig-a was mapped to X-F3/4, which is syntenic to human Xp22.1, where PIGA is located. Thus, murine Pig-a provides a good animal model to study paroxysmal nocturnal hemoglobinuria, a disease caused by a somatic mutation of PIGA. Database analysis demonstrated that a yeast gene, SPT14, is homologous to Pig-a and PIGA and that these genes are members of a glycosyltransferase gene family.
Differences in community composition of bacteria in four glaciers in western China

NASA Astrophysics Data System (ADS)

An, L. Z.; Chen, Y.; Xiang, S.-R.; Shang, T.-C.; Tian, L.-D.

2010-06-01

Microbial community patterns vary in glaciers worldwide, presenting unique responses to global climatic and environmental changes. Four bacterial clone libraries were established by 16S rRNA gene amplification from four ice layers along the 42-m-long ice core MuztB drilled from the Muztag Ata Glacier. A total of 151 bacterial sequences obtained from the ice core MuztB were phylogenetically compared with the 71 previously reported sequences from three ice cores extracted from ice caps Malan, Dunde, and Puruogangri. Six phylogenetic clusters Flavisolibacter, Flexibacter (Bacteroidetes), Acinetobacter, Enterobacter (Gammaproteobacteria), Planococcus/Anoxybacillus (Firmicutes), and Propionibacter/Luteococcus (Actinobacteria) frequently occurred along the Muztag Ata Glacier profile, and their proportion varied by seasons. Sequence analysis showed that most of the sequences from the ice core clustered with those from cold environments, and the sequence clusters from the same glacier more closely grouped together than those from the geographically isolated glaciers. Moreover, bacterial communities from the same location or similarly aged ice formed a cluster, and were clearly separate from those from other geographically isolated glaciers. In summary, the findings provide preliminary evidence of zonal distribution of microbial community, and suggest biogeography of microorganisms in glacier ice.
Differences in community composition of bacteria in four deep ice sheets in western China

NASA Astrophysics Data System (ADS)

An, L.; Chen, Y.; Xiang, S.-R.; Shang, T.-C.; Tian, L.-De

2010-02-01

Microbial community patterns vary in glaciers worldwide, presenting unique responses to global climatic and environmental changes. Four bacterial clone libraries were established by 16S rRNA gene amplification from four ice layers along the 42-m-long ice core MuztB drilled from the Muztag Ata Glacier. A total of 152 bacterial sequences obtained from the ice core MuztB were phylogenetically compared with the 71 previously reported sequences from three ice cores extracted from ice caps Malan, Dunde, and Puruogangri. The six functional clusters Flavisolibacter, Flexibacter (Bacteroidetes), Acinetobacter, Enterobacter (Gammaproteobacteria), Planococcus/Anoxybacillus (Firmicutes), and Propionibacter/Luteococcus (Actinobacteria) frequently occurred along the Muztag Ata Glacier profile. Sequence analysis showed that most of the sequences from the ice core clustered with those from cold environments, and the sequences from the same glacier formed a distinct cluster. Moreover, bacterial communities from the same location or similarly aged ice formed a cluster, and were clearly separate from those from other geographically isolated glaciers. In summary, the findings provide preliminary evidence of zonal distribution of the microbial community and support our hypothesis of the spatial and temporal biogeography of microorganisms in glacial ice.
Variation in the number of nucleoli and incomplete homogenization of 18S ribosomal DNA sequences in leaf cells of the cultivated Oriental ginseng (Panax ginseng Meyer).

PubMed

Chelomina, Galina N; Rozhkovan, Konstantin V; Voronova, Anastasia N; Burundukova, Olga L; Muzarok, Tamara I; Zhuravlev, Yuri N

2016-04-01

Wild ginseng, Panax ginseng Meyer, is an endangered species of medicinal plants. In the present study, we analyzed variations within the ribosomal DNA (rDNA) cluster to gain insight into the genetic diversity of the Oriental ginseng, P. ginseng, under artificial cultivation. The roots of wild P. ginseng plants were sampled from a nonprotected natural population of the Russian Far East. The slides were prepared from leaf tissues using the squash technique for cytogenetic analysis. The 18S rDNA sequences were cloned and sequenced. The distribution of nucleotide diversity, recombination events, and interspecific phylogenies for the total 18S rDNA sequence data set was also examined. In mesophyll cells, mononucleolar nuclei were estimated to be dominant (75.7%), while the remaining nuclei contained two to four nucleoli. Among the analyzed 18S rDNA clones, 20% were identical to the 18S rDNA sequence of P. ginseng from Japan, and other clones differed in one to six substitutions. The nucleotide polymorphism was more pronounced at positions 440-640 bp and was distributed across variable regions, expansion segments, and conservative elements of the core structure. The phylogenetic analysis confirmed conspecificity of ginseng plants cultivated in different regions, with two fixed mutations between P. ginseng and other species. This study found evidence of intragenomic nucleotide polymorphism in the 18S rDNA sequences of P. ginseng. These data suggest that, in cultivated plants, the observed genome instability may influence the synthesis of biologically active compounds, which are widely used in traditional medicine.
Variation in the number of nucleoli and incomplete homogenization of 18S ribosomal DNA sequences in leaf cells of the cultivated Oriental ginseng (Panax ginseng Meyer)

PubMed Central

Chelomina, Galina N.; Rozhkovan, Konstantin V.; Voronova, Anastasia N.; Burundukova, Olga L.; Muzarok, Tamara I.; Zhuravlev, Yuri N.

2015-01-01

Background: Wild ginseng, Panax ginseng Meyer, is an endangered species of medicinal plants. In the present study, we analyzed variations within the ribosomal DNA (rDNA) cluster to gain insight into the genetic diversity of the Oriental ginseng, P. ginseng, under artificial cultivation. Methods: The roots of wild P. ginseng plants were sampled from a nonprotected natural population of the Russian Far East. The slides were prepared from leaf tissues using the squash technique for cytogenetic analysis. The 18S rDNA sequences were cloned and sequenced. The distribution of nucleotide diversity, recombination events, and interspecific phylogenies for the total 18S rDNA sequence data set was also examined. Results: In mesophyll cells, mononucleolar nuclei were estimated to be dominant (75.7%), while the remaining nuclei contained two to four nucleoli. Among the analyzed 18S rDNA clones, 20% were identical to the 18S rDNA sequence of P. ginseng from Japan, and other clones differed in one to six substitutions. The nucleotide polymorphism was more pronounced at positions 440–640 bp and was distributed across variable regions, expansion segments, and conservative elements of the core structure. The phylogenetic analysis confirmed conspecificity of ginseng plants cultivated in different regions, with two fixed mutations between P. ginseng and other species. Conclusion: This study found evidence of intragenomic nucleotide polymorphism in the 18S rDNA sequences of P. ginseng. These data suggest that, in cultivated plants, the observed genome instability may influence the synthesis of biologically active compounds, which are widely used in traditional medicine. PMID:27158239
Breaking the computational barriers of pairwise genome comparison.

PubMed

Torreno, Oscar; Trelles, Oswaldo

2015-08-11

Conventional pairwise sequence comparison software algorithms are being used to process much larger datasets than they were originally designed for. This can result in processing bottlenecks that limit software capabilities or prevent full use of the available hardware resources. Overcoming the barriers that limit the efficient computational analysis of large biological sequence datasets by retrofitting existing algorithms or by creating new applications represents a major challenge for the bioinformatics community. We have developed C libraries for pairwise sequence comparison within diverse architectures, ranging from commodity systems to high performance and cloud computing environments. Exhaustive tests were performed using different datasets of closely- and distantly-related sequences that span from small viral genomes to large mammalian chromosomes. The tests demonstrated that our solution is capable of generating high quality results with a linear-time response and controlled memory consumption, being comparable to or faster than the current state-of-the-art methods. We have addressed the problem of pairwise and all-versus-all comparison of large sequences in general, greatly increasing the limits on input data size. The approach described here is based on a modular out-of-core strategy that uses secondary storage to avoid reaching memory limits during the identification of High-scoring Segment Pairs (HSPs) between the sequences under comparison. Software engineering concepts were applied to avoid intermediate result re-calculation, to minimise the performance impact of input/output (I/O) operations and to modularise the process, thus enhancing application flexibility and extendibility. Our computationally-efficient approach allows tasks such as the massive comparison of complete genomes, evolutionary event detection, the identification of conserved synteny blocks and inter-genome distance calculations to be performed more effectively.
In vitro optimization of truncated stem-loop II variants of the hammerhead ribozyme for cleavage in low concentrations of magnesium under non-turnover conditions.

PubMed Central

Zillmann, M; Limauro, S E; Goodchild, J

1997-01-01

By truncating helix II to two base pairs in a hammerhead ribozyme having long flanking sequences (greater than 30 bases), the rate of cleavage in 1 mM magnesium can be increased roughly 100-fold. Replacing most of the nucleotides in a typical stem-loop II with 1-4 randomized nucleotides gave an RNA library that, even before selection, was more active in 1 mM magnesium than the parent ribozyme, but considerably less active than the truncated stem-loop II ribozyme. A novel, multiround selection for intermolecular cleavage was exploited to optimize this library for cleavage in low concentrations of magnesium. After three rounds of selection at sequentially lower concentrations of magnesium, the library cleaved substrate RNA 20-fold faster than the initial pool and was cloned. This pool was heavily enriched for one particular sequence (5'-CGUG-3') that represented 16 of 52 isolates (the next most common sequence was represented only six times). This sequence also represented the most active sequence, exceeding the activity of the short helix II variant under the conditions of the selection, thereby demonstrating the effectiveness of the selection technique. Analysis of the cleavage rates of RNAs made from eight isolates having different four-base insert sequences allowed assignment of highly preferred bases at each position in the insert. Analysis of pool clones having inserts of differing lengths showed that, in general, activity decreased as the length of the insert decreased from 4 to 1. This supports the suggested role of stem-loop II in stabilizing the non-Watson-Crick interactions between the conserved bases of the catalytic core. PMID:9214657
Protein sequences and redox titrations indicate that the electron acceptors in reaction centers from heliobacteria are similar to Photosystem I

NASA Technical Reports Server (NTRS)

Trost, J. T.; Brune, D. C.; Blankenship, R. E.

1992-01-01

Photosynthetic reaction centers isolated from Heliobacillus mobilis exhibit a single major protein on SDS-PAGE of 47 000 Mr. Attempts to sequence the reaction center polypeptide indicated that the N-terminus is blocked. After enzymatic and chemical cleavage, four peptide fragments were sequenced from the Heliobacillus mobilis apoprotein. Only one of these sequences showed significant specific similarity to any of the protein and deduced protein sequences in the GenBank database. This fragment is identical with 56% of the residues, including both cysteines, found in a highly conserved region that is proposed to bind iron-sulfur center Fx in the Photosystem I reaction center peptide that is the psaB gene product. The similarity to the psaA gene product in this region is 48%. Redox titrations of laser-flash-induced photobleaching with millisecond decay kinetics on isolated reaction centers from Heliobacterium gestii indicate a midpoint potential of -414 mV with n = 2 titration behavior. In membranes, the behavior is intermediate between n = 1 and n = 2, and the apparent midpoint potential is -444 mV. This is compared to the behavior in Photosystem I, where the intermediate electron acceptor A1, thought to be a phylloquinone molecule, has been proposed to undergo a double reduction at low redox potentials in the presence of viologen redox mediators. These results strongly suggest that the acceptor side electron transfer system in reaction centers from heliobacteria is indeed analogous to that found in Photosystem I. The sequence similarities indicate that the divergence of the heliobacteria from the Photosystem I line occurred before the gene duplication and subsequent divergence that led to the heterodimeric protein core of the Photosystem I reaction center.
Structural and Functional Studies of a Phosphatidic Acid-Binding Antifungal Plant Defensin MtDef4: Identification of an RGFRRR Motif Governing Fungal Cell Entry

PubMed Central

Buchko, Garry W.; Berg, Howard R.; Kaur, Jagdeep; Pandurangi, Raghu S.; Smith, Thomas J.; Shah, Dilip M.

2013-01-01

MtDef4 is a 47-amino acid cysteine-rich evolutionary conserved defensin from a model legume Medicago truncatula. It is an apoplast-localized plant defense protein that inhibits the growth of the ascomycetous fungal pathogen Fusarium graminearum in vitro at micromolar concentrations. Little is known about the mechanisms by which MtDef4 mediates its antifungal activity. In this study, we show that MtDef4 rapidly permeabilizes fungal plasma membrane and is internalized by the fungal cells where it accumulates in the cytoplasm. Furthermore, analysis of the structure of MtDef4 reveals the presence of a positively charged γ-core motif composed of β2 and β3 strands connected by a positively charged RGFRRR loop. Replacement of the RGFRRR sequence with AAAARR or RGFRAA abolishes the ability of MtDef4 to enter fungal cells, suggesting that the RGFRRR loop is a translocation signal required for the internalization of the protein. MtDef4 binds to phosphatidic acid (PA), a precursor for the biosynthesis of membrane phospholipids and a signaling lipid known to recruit cytosolic proteins to membranes. Amino acid substitutions in the RGFRRR sequence which abolish the ability of MtDef4 to enter fungal cells also impair its ability to bind PA. These findings suggest that MtDef4 is a novel antifungal plant defensin capable of entering into fungal cells and affecting intracellular targets and that these processes are mediated by the highly conserved cationic RGFRRR loop via its interaction with PA. PMID:24324798
DOE Office of Scientific and Technical Information (OSTI.GOV)

Van Dyk, Schuyler D.; De Mink, Selma E.; Zapartas, Emmanouil

Core-collapse supernovae (SNe), which mark the deaths of massive stars, are among the most powerful explosions in the universe and are responsible, e.g., for a predominant synthesis of chemical elements in their host galaxies. The majority of massive stars are thought to be born in close binary systems. To date, putative binary companions to the progenitors of SNe may have been detected in only two cases, SNe 1993J and 2011dh. We report on the search for a companion of the progenitor of the Type Ic SN 1994I, long considered to have been the result of binary interaction. Twenty years after explosion, we used the Hubble Space Telescope to observe the SN site in the ultraviolet (F275W and F336W bands), resulting in deep upper limits on the expected companion: F275W > 26.1 mag and F336W > 24.7 mag. These allow us to exclude the presence of a main sequence companion with a mass ≳10 M⊙. Through comparison with theoretical simulations of possible progenitor populations, we show that the upper limits to a companion detection exclude interacting binaries with semi-conservative (late Case A or early Case B) mass transfer. These limits tend to favor systems with non-conservative, late Case B mass transfer with intermediate initial orbital periods and mass ratios. The most likely mass range for a putative main sequence companion would be ∼5–12 M⊙, the upper end of which corresponds to the inferred upper detection limit.

Extensive recombination events and horizontal gene transfer shaped the Legionella pneumophila genomes

PubMed Central

2011-01-01

Background: Legionella pneumophila is an intracellular pathogen of environmental protozoa. When humans inhale contaminated aerosols this bacterium may cause a severe pneumonia called Legionnaires' disease. Despite the abundance of dozens of Legionella species in aquatic reservoirs, the vast majority of human disease is caused by a single serogroup (Sg) of a single species, namely L. pneumophila Sg1. To get further insights into genome dynamics and evolution of Sg1 strains, we sequenced strains Lorraine and HL 0604 1035 (Sg1) and compared them to the available sequences of Sg1 strains Paris, Lens, Corby and Philadelphia, resulting in a comprehensive multigenome analysis. Results: We show that L. pneumophila Sg1 has a highly conserved and syntenic core genome that comprises the many eukaryotic-like proteins and a conserved repertoire of over 200 Dot/Icm type IV secreted substrates. However, recombination events and horizontal gene transfer are frequent. In particular, the analysis of the distribution of nucleotide polymorphisms suggests that large chromosomal fragments of over 200 kb are exchanged between L. pneumophila strains and contribute to the genome dynamics in the natural population. The many secretion systems present might be implicated in exchange of these fragments by conjugal transfer. Plasmids also play a role in genome diversification and are exchanged among strains and circulate between different Legionella species. Conclusion: Horizontal gene transfer among bacteria and from eukaryotes to L. pneumophila as well as recombination between strains allows different clones to evolve into predominant disease clones and others to replace them subsequently within relatively short periods of time. PMID:22044686
Insights into the phylogenetic positions of photosynthetic bacteria obtained from 5S rRNA and 16S rRNA sequence data

NASA Technical Reports Server (NTRS)

Fox, G. E.

1985-01-01

Comparisons of complete 16S ribosomal ribonucleic acid (rRNA) sequences established that the secondary structure of these molecules is highly conserved. Earlier work with 5S rRNA secondary structure revealed that when structural conservation exists the alignment of sequences is straightforward. The constancy of structure implies minimal functional change. Under these conditions a uniform evolutionary rate can be expected so that conditions are favorable for phylogenetic tree construction.
Identification of protein W, the elusive sixth subunit of the Rhodopseudomonas palustris reaction center-light harvesting 1 core complex.

PubMed

Jackson, Philip J; Hitchcock, Andrew; Swainsbury, David J K; Qian, Pu; Martin, Elizabeth C; Farmer, David A; Dickman, Mark J; Canniffe, Daniel P; Hunter, C Neil

2018-02-01

The X-ray crystal structure of the Rhodopseudomonas (Rps.) palustris reaction center-light harvesting 1 (RC-LH1) core complex revealed the presence of a sixth protein component, variably referred to in the literature as helix W, subunit W or protein W. The position of this protein prevents closure of the LH1 ring, possibly to allow diffusion of ubiquinone/ubiquinol between the RC and the cytochrome bc1 complex in analogous fashion to the well-studied PufX protein from Rhodobacter sphaeroides. The identity and function of helix W have remained unknown for over 13 years; here we use a combination of biochemistry, mass spectrometry, molecular genetics and electron microscopy to identify this protein as RPA4402 in Rps. palustris CGA009. Protein W shares key conserved sequence features with PufX homologs, and although a deletion mutant was able to grow under photosynthetic conditions with no discernible phenotype, we show that a tagged version of protein W pulls down the RC-LH1 complex. Protein W is not encoded in the photosynthesis gene cluster and our data indicate that only approximately 10% of wild-type Rps. palustris core complexes contain this non-essential subunit; functional and evolutionary consequences of this observation are discussed. The ability to purify uniform RC-LH1 and RC-LH1-protein W preparations will also be beneficial for future structural studies of these bacterial core complexes. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
GalaxyTBM: template-based modeling by building a reliable core and refining unreliable local regions.

PubMed

Ko, Junsu; Park, Hahnbeom; Seok, Chaok

2012-08-10

Protein structures can be reliably predicted by template-based modeling (TBM) when experimental structures of homologous proteins are available. However, it is challenging to obtain structures more accurate than the single best templates by either combining information from multiple templates or by modeling regions that vary among templates or are not covered by any templates. We introduce GalaxyTBM, a new TBM method in which the more reliable core region is modeled first from multiple templates and less reliable, variable local regions, such as loops or termini, are then detected and re-modeled by an ab initio method. This TBM method is based on "Seok-server," which was tested in CASP9 and assessed to be amongst the top TBM servers. The accuracy of the initial core modeling is enhanced by focusing on more conserved regions in the multiple-template selection and multiple sequence alignment stages. Additional improvement is achieved by ab initio modeling of up to 3 unreliable local regions in the fixed framework of the core structure. Overall, GalaxyTBM reproduced the performance of Seok-server, with GalaxyTBM and Seok-server resulting in average GDT-TS of 68.1 and 68.4, respectively, when tested on 68 single-domain CASP9 TBM targets. For application to multi-domain proteins, GalaxyTBM must be combined with domain-splitting methods. Application of GalaxyTBM to CASP9 targets demonstrates that accurate protein structure prediction is possible by use of a multiple-template-based approach, and ab initio modeling of variable regions can further enhance the model quality.
Characteristics of the Lotus japonicus gene repertoire deduced from large-scale expressed sequence tag (EST) analysis.

PubMed

Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi

2004-02-01

To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.
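The clustering rule quoted above (95% identity over 50 bases) and the reported counts can be illustrated with a short sketch. The abstract does not name the clustering software, so the ungapped sliding-window comparison below is an assumption made purely for illustration, and the function names `window_identity`, `meets_criterion` and `percent` are hypothetical.

```python
def window_identity(a: str, b: str, start: int, window: int = 50) -> float:
    """Fraction of matching bases in one ungapped window of two pre-aligned reads."""
    seg_a = a[start:start + window]
    seg_b = b[start:start + window]
    if len(seg_a) < window or len(seg_b) < window:
        raise ValueError("window runs past the end of a sequence")
    matches = sum(x == y for x, y in zip(seg_a, seg_b))
    return matches / window


def meets_criterion(a: str, b: str, window: int = 50,
                    min_identity: float = 0.95) -> bool:
    """True if any ungapped window of the given length reaches the threshold.

    Assumption: reads are already in register (no gaps), which real EST
    clustering tools do not require.
    """
    limit = min(len(a), len(b)) - window
    return any(
        window_identity(a, b, s, window) >= min_identity
        for s in range(limit + 1)
    )


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Consistency checks on the figures reported in the abstract.
contigs, singletons = 8503, 11954
non_redundant = contigs + singletons      # total non-redundant sequences
coverage = percent(1093, 1889)            # predicted genes hit by ESTs
```

The two checks at the end recompute the abstract's own numbers: contigs plus singletons should equal the non-redundant total, and 1093 of 1889 genes should round to the stated coverage.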
Phylogeny and comparative genome analysis of a Basidiomycete fungi

DOE Office of Scientific and Technical Information (OSTI.GOV)

Riley, Robert W.; Salamov, Asaf; Grigoriev, Igor

2011-03-14

Fungi of the phylum Basidiomycota make up some 37 percent of the described fungi, and are important from the perspectives of forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, plant pathogenic rusts and smuts, and some human pathogens. To better understand these important fungi, we have undertaken a comparative genomic analysis of the Basidiomycetes with available sequenced genomes. We report a phylogeny that sheds light on previously unclear evolutionary relationships among the Basidiomycetes. We also define a 'core proteome' based on protein families conserved in all Basidiomycetes. We identify key expansions and contractions in protein families that may be responsible for the degradation of plant biomass such as cellulose, hemicellulose, and lignin. Finally, we speculate as to the genomic changes that drove such expansions and contractions.
Characterization of a novel hepadnavirus in the white sucker (Catostomus commersonii) from the Great Lakes Region of the USA

USGS Publications Warehouse

Hahn, Cassidy M.; Iwanowicz, Luke R.; Cornman, Robert S.; Conway, Carla M.; Winton, James R.; Blazer, Vicki S.

2015-01-01

The white sucker Catostomus commersonii is a freshwater teleost often utilized as a resident sentinel. Here, we sequenced the full genome of a hepatitis B-like virus that infects white suckers from the Great Lakes Region of the USA. Dideoxy sequencing confirmed that the white sucker hepatitis B virus (WSHBV) has a circular genome (3542 bp) with the prototypical codon organization of hepadnaviruses. Electron microscopy demonstrated that complete virions of approximately 40 nm were present in the plasma of infected fish. Compared to avi- and orthohepadnaviruses, sequence conservation of the core, polymerase and surface proteins was low and ranged from 16-27% at the amino acid level. An X protein homologue common to the orthohepadnaviruses was not present. The WSHBV genome included an atypical, presumptively non-coding region absent in previously described hepadnaviruses. Phylogenetic analyses confirmed WSHBV as distinct from previously documented hepadnaviruses. The level of divergence in protein sequences between WSHBV and other hepadnaviruses, and the identification of an HBV-like sequence in an African cichlid, provide evidence that a novel genus of the family Hepadnaviridae may need to be established that includes these hepatitis B-like viruses in fishes. Viral transcription was observed in 9.5% (16 of 169) of white suckers evaluated. The prevalence of hepatic tumors in these fish was 4.9%, of which only 2.4% were positive for both virus and hepatic tumors. These results are not sufficient to draw inferences regarding the association of WSHBV and carcinogenesis in white sucker.
Combining Structural Modeling with Ensemble Machine Learning to Accurately Predict Protein Fold Stability and Binding Affinity Effects upon Mutation

PubMed Central

Garcia Lopez, Sebastian; Kim, Philip M.

2014-01-01

Advances in sequencing have led to a rapid accumulation of mutations, some of which are associated with diseases. However, to draw mechanistic conclusions, a biochemical understanding of these mutations is necessary. For coding mutations, accurate prediction of significant changes in either the stability of proteins or their affinity to their binding partners is required. Traditional methods have used semi-empirical force fields, while newer methods employ machine learning of sequence and structural features. Here, we show how combining both of these approaches leads to a marked boost in accuracy. We introduce ELASPIC, a novel ensemble machine learning approach that is able to predict stability effects upon mutation in both domain cores and domain-domain interfaces. We combine semi-empirical energy terms, sequence conservation, and a wide variety of molecular details with a Stochastic Gradient Boosting of Decision Trees (SGB-DT) algorithm. The accuracy of our predictions surpasses existing methods by a considerable margin, achieving correlation coefficients of 0.77 for stability, and 0.75 for affinity predictions. Notably, we integrated homology modeling to enable proteome-wide prediction and show that accurate prediction on modeled structures is possible. Lastly, ELASPIC showed significant differences between various types of disease-associated mutations, as well as between disease and common neutral mutations. Unlike pure sequence-based prediction methods that try to predict phenotypic effects of mutations, our predictions unravel the molecular details governing the protein instability, and help us better understand the molecular causes of diseases. PMID:25243403
Discovery of magnetic A supergiants: the descendants of magnetic main-sequence B stars

NASA Astrophysics Data System (ADS)

Neiner, Coralie; Oksala, Mary E.; Georgy, Cyril; Przybilla, Norbert; Mathis, Stéphane; Wade, Gregg; Kondrak, Matthias; Fossati, Luca; Blazère, Aurore; Buysschaert, Bram; Grunhut, Jason

2017-10-01

In the context of the high resolution, high signal-to-noise ratio, high sensitivity, spectropolarimetric survey BritePol, which complements observations by the BRITE constellation of nanosatellites for asteroseismology, we are looking for and measuring the magnetic field of all stars brighter than V = 4. In this paper, we present circularly polarized spectra obtained with HarpsPol at ESO in La Silla (Chile) and ESPaDOnS at CFHT (Hawaii) for three hot evolved stars: ι Car, HR 3890 and ɛ CMa. We detected a magnetic field in all three stars. Each star has been observed several times to confirm the magnetic detections and check for variability. The stellar parameters of the three objects were determined and their evolutionary status was ascertained employing evolution models computed with the Geneva code. ɛ CMa was already known and is confirmed to be magnetic, but our modelling indicates that it is located near the end of the main sequence, i.e. it is still in a core hydrogen burning phase. ι Car and HR 3890 are the first discoveries of magnetic hot supergiants located well after the end of the main sequence on the Hertzsprung-Russell diagram. These stars are probably the descendants of main-sequence magnetic massive stars. Their current field strength (a few G) is compatible with magnetic flux conservation during stellar evolution. These results provide observational constraints for the development of future evolutionary models of hot stars including a fossil magnetic field.
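The flux-conservation statement above can be written as a simple scaling relation. This is the usual frozen-in-flux approximation, given here as a sketch and not as a formula quoted from the paper; the subscripts MS (main sequence) and SG (supergiant) are labels introduced for illustration.

```latex
\Phi \;\propto\; B\,R^{2} = \text{const}
\quad\Longrightarrow\quad
B_{\mathrm{SG}} \;=\; B_{\mathrm{MS}}
\left(\frac{R_{\mathrm{MS}}}{R_{\mathrm{SG}}}\right)^{2}
```

Under this assumption the surface field falls with the square of the radius increase, which is why a field of a few G on an expanded supergiant can be consistent with a much stronger field on its main-sequence progenitor.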
Approach to numerical safety guidelines based on a core melt criterion. [PWR; BWR

DOE Office of Scientific and Technical Information (OSTI.GOV)

Azarm, M.A.; Hall, R.E.

1982-01-01

A plausible approach is proposed for translating a single level criterion to a set of numerical guidelines. The criterion for core melt probability is used to set numerical guidelines for various core melt sequences, systems and component unavailabilities. These guidelines can be used as a means for making decisions regarding the necessity for replacing a component or improving part of a safety system. This approach is applied to estimate a set of numerical guidelines for various sequences of core melts that are analyzed in the Reactor Safety Study for the Peach Bottom Nuclear Power Plant.
Determination of the promoter region of mouse ribosomal RNA gene by an in vitro transcription system.

PubMed Central

Yamamoto, O; Takakusa, N; Mishima, Y; Kominami, R; Muramatsu, M

1984-01-01

Sequences required for a faithful and efficient transcription of a cloned mouse ribosomal RNA gene (rDNA) are determined by testing a series of deletion mutants in an in vitro transcription system utilizing two kinds of mouse cellular extract. Deletion of sequences upstream of -40 or downstream of +52 causes only slight reduction in promoter activity as compared with the "wild-type" template. For upstream deletion mutants, the removal of a sequence between -40 and -35 causes a significant decrease in the capacity to direct efficient initiation. This decrease becomes more pronounced when the deletion reaches -32 and the sequence A-T-C-T-T-T, conserved among mouse, rat, and human rDNAs, is lost. Residual template activity is further reduced as more upstream sequence is deleted and finally becomes undetectable when the deletion is extended from -22 down to -17, corresponding to the loss of the conserved sequence T-A-T-T-G. As for downstream deletion mutants, the removal of the sequence downstream of +23 causes some decrease in template activity in vitro, and further deletions up to +11 cause a more serious decrease. These deletions involve other conserved sequences downstream of the transcription start site. However, the removal of the original transcription start site does not abolish the transcription initiation completely, provided that the whole upstream sequence is intact. PMID:6320178
The Complete Sequence and Comparative Analysis of a Multidrug-Resistance and Virulence Multireplicon IncFII Plasmid pEC302/04 from an Extraintestinal Pathogenic Escherichia coli EC302/04 Indicate Extensive Diversity of IncFII Plasmids.

PubMed

Ho, Wing Sze; Yap, Kien-Pong; Yeo, Chew Chieng; Rajasekaram, Ganeswrie; Thong, Kwai Lin

2015-01-01

Extraintestinal pathogenic Escherichia coli (ExPEC) that cause extraintestinal infections often harbor plasmids encoding fitness traits such as resistance and virulence determinants that are of clinical importance. We determined the complete nucleotide sequence of plasmid pEC302/04 from a multidrug-resistant E. coli EC302/04 which was isolated from the tracheal aspirate of a patient in Malaysia. In addition, we performed comparative sequence analyses of 18 related IncFIIA plasmids to determine the phylogenetic relationship and diversity of these plasmids. The 140,232 bp pEC302/04 is a multireplicon plasmid that bears three replication systems (FII, FIA, and FIB) with subtype of F2:A1:B1. The plasmid is self-transmissible with a complete transfer region. pEC302/04 also carries antibiotic resistance genes such as blaTEM-1 and a class I integron containing sul1, cml and aadA resistance genes, conferring multidrug resistance (MDR) to its host, E. coli EC302/04. In addition, two iron acquisition systems (SitABCD and IutA-IucABCD) which are the conserved virulence determinants of ExPEC-colicin V or B and M (ColV/ColBM)-producing plasmids were identified in pEC302/04. Multiple toxin-antitoxin (TA)-based addiction systems (i.e., PemI/PemK, VagC/VagD, CcdA/CcdB, and Hok/Sok) and a plasmid partitioning system, ParAB, and PsiAB, which are important for plasmid maintenance were also found. Comparative plasmid analysis revealed only one conserved gene, the repA1 as the core genome, showing that there is an extensive diversity among the IncFIIA plasmids. The phylogenetic relationship of 18 IncF plasmids based on the core regions revealed that ColV/ColBM-plasmids and non-ColV/ColBM plasmids were separated into two distinct groups. These plasmids, which carry highly diverse genetic contents, are also mosaic in nature. The atypical combination of genetic materials, i.e., the MDR- and ColV/ColBM-plasmid-virulence encoding regions in a single ExPEC plasmid is rare but of clinical importance. Such a phenomenon is of concern when the plasmids are transmissible, facilitating the spread of virulence and resistance plasmids among pathogenic bacteria. Notably, certain TA systems are more commonly found in particular ExPEC plasmid types, indicating the possible relationships between certain TA systems and ExPEC pathogenesis.
The Blue Straggler Star Population in NGC 1261: Evidence for a Post-core-collapse Bounce State

NASA Astrophysics Data System (ADS)

Simunovic, Mirko; Puzia, Thomas H.; Sills, Alison

2014-11-01

We present a multi-passband photometric study of the Blue Straggler Star (BSS) population in the Galactic globular cluster (GC) NGC 1261, using available space- and ground-based survey data. The inner BSS population is found to have two distinct sequences in the color-magnitude diagram (CMD), similar to double BSS sequences detected in other GCs. These well defined sequences are presumably linked to single short-lived events such as core collapse, which are expected to boost the formation of BSSs. In agreement with this, we find a BSS sequence in NGC 1261 which can be well reproduced individually by a theoretical model prediction of a 2 Gyr old population of stellar collision products, which are expected to form in the denser inner regions during short-lived core contraction phases. Additionally, we report the occurrence of a group of BSSs with unusually blue colors in the CMD, which are consistent with a corresponding model of a 200 Myr old population of stellar collision products. The properties of the NGC 1261 BSS populations, including their spatial distributions, suggest an advanced dynamical evolutionary state of the cluster, but the core of this GC does not show the classical signatures of core collapse. We argue that these apparent contradictions provide evidence for a post-core-collapse bounce state seen in dynamical simulations of old GCs.
Characterization of the cyanobacteria and associated bacterial community from an ephemeral wetland in New Zealand.

PubMed

Secker, Nick H; Chua, Jocelyn P S; Laurie, Rebecca E; McNoe, Les; Guy, Paul L; Orlovich, David A; Summerfield, Tina C

2016-10-01

New Zealand ephemeral wetlands are ecologically important, containing up to 12% of threatened native plant species and frequently exhibiting conspicuous cyanobacterial growth. In such environments, cyanobacteria and associated heterotrophs can influence primary production and nutrient cycling. Wetland communities, including bacteria, can be altered by increased nitrate and phosphate due to agricultural practices. We have characterized cyanobacteria from the Wairepo Kettleholes Conservation Area and their associated bacteria. Use of 16S rRNA amplicon sequencing identified several operational taxonomic units (OTUs) representing filamentous heterocystous and non-heterocystous cyanobacterial taxa. One Nostoc OTU that formed macroscopic colonies dominated the cyanobacterial community. A diverse bacterial community was associated with the Nostoc colonies, including a core microbiome of 39 OTUs. Identity of the core microbiome associated with macroscopic Nostoc colonies was not changed by the addition of nutrients. One OTU was highly represented in all Nostoc colonies (27.6%-42.6% of reads) and phylogenetic analyses identified this OTU as belonging to the genus Sphingomonas. Scanning electron microscopy showed the absence of heterotrophic bacteria within the Nostoc colony but revealed a diverse community associated with the colonies on the external surface. © 2016 Phycological Society of America.
Ten years of barcoding at the African Centre for DNA Barcoding.

PubMed

Bezeng, B S; Davies, T J; Daru, B H; Kabongo, R M; Maurin, O; Yessoufou, K; van der Bank, H; van der Bank, M

2017-07-01

The African Centre for DNA Barcoding (ACDB) was established in 2005 as part of a global initiative to accurately and rapidly survey biodiversity using short DNA sequences. The mitochondrial cytochrome c oxidase 1 gene (CO1) was rapidly adopted as the de facto barcode for animals. Following the evaluation of several candidate loci for plants, the Plant Working Group of the Consortium for the Barcoding of Life in 2009 recommended that two plastid genes, rbcLa and matK, be adopted as core DNA barcodes for terrestrial plants. To date, numerous studies continue to test the discriminatory power of these markers across various plant lineages. Over the past decade, we at the ACDB have used these core DNA barcodes to generate a barcode library for southern Africa. To date, the ACDB has contributed more than 21 000 plant barcodes and over 3000 CO1 barcodes for animals to the Barcode of Life Database (BOLD). Building upon this effort, we at the ACDB have addressed questions related to community assembly, biogeography, phylogenetic diversification, and invasion biology. Collectively, our work demonstrates the diverse applications of DNA barcoding in ecology, systematics, evolutionary biology, and conservation.
Brewhouse-Resident Microbiota Are Responsible for Multi-Stage Fermentation of American Coolship Ale

PubMed Central

Bokulich, Nicholas A.; Bamforth, Charles W.; Mills, David A.

2012-01-01

American coolship ale (ACA) is a type of spontaneously fermented beer that employs production methods similar to traditional Belgian lambic. In spite of its growing popularity in the American craft-brewing sector, the fermentation microbiology of ACA has not been previously described, and thus the interface between production methodology and microbial community structure is unexplored. Using terminal restriction fragment length polymorphism (TRFLP), barcoded amplicon sequencing (BAS), quantitative PCR (qPCR) and culture-dependent analysis, ACA fermentations were shown to follow a consistent fermentation progression, initially dominated by Enterobacteriaceae and a range of oxidative yeasts in the first month, then ceding to Saccharomyces spp. and Lactobacillales for the following year. After one year of fermentation, Brettanomyces bruxellensis was the dominant yeast population (occasionally accompanied by minor populations of Candida spp., Pichia spp., and other yeasts) and Lactobacillales remained dominant, though various aerobic bacteria became more prevalent. This work demonstrates that ACA exhibits a conserved core microbial succession in absence of inoculation, supporting the role of a resident brewhouse microbiota. These findings establish this core microbial profile of spontaneous beer fermentations as a target for production control points and quality standards for these beers. PMID:22530036
Myxobolus cerebralis internal transcribed spacer 1 (ITS-1) sequences support recent spread of the parasite to North America and within Europe

USGS Publications Warehouse

Whipps, Christopher M.; El-Matbouli, M.; Hedrick, R.P.; Blazer, V.; Kent, M.L.

2004-01-01

Molecular approaches for resolving relationships among the Myxozoa have relied mainly on small subunit (SSU) ribosomal DNA (rDNA) sequence analysis. This region of the gene is generally used for higher phylogenetic studies, and the conservative nature of this gene may make it inadequate for intraspecific comparisons. Previous intraspecific studies of Myxobolus cerebralis based on molecular analyses reported that the sequence of SSU rDNA and the internal transcribed spacer (ITS) were highly conserved in representatives of the parasite from North America and Europe. Considering that the ITS is usually a more variable region than the SSU, we reanalyzed available sequences on GenBank and obtained sequences from other M. cerebralis representatives from the states of California and West Virginia in the USA and from Germany and Russia. With the exception of 7 base pairs, most of the sequence designated as ITS-1 in GenBank was a highly conserved portion of the rDNA near the 3-prime end of the SSU region. Nonetheless, the additional ITS-1 sequences obtained from the available geographic representatives were well conserved. It is unlikely that we would have observed virtually identical ITS-1 sequences between European and American M. cerebralis samples had it spread naturally over time, particularly when compared to the variation seen between isolates of another myxozoan (Kudoa thyrsites) that has most likely spread naturally. These data further support the hypothesis that the current distribution of M. cerebralis in North America is a result of recent introductions followed by dispersal via anthropogenic means, largely through the stocking of infected trout for sport fishing.
DsaV methyltransferase and its isoschizomers contain a conserved segment that is similar to the segment in HhaI methyltransferase that is in contact with DNA bases.

PubMed Central

Gopal, J; Yebra, M J; Bhagwat, A S

1994-01-01

The methyltransferase (MTase) in the DsaV restriction-modification system methylates within 5'-CCNGG sequences. We have cloned the gene for this MTase and determined its sequence. The predicted sequence of the MTase protein contains sequence motifs conserved among all cytosine-5 MTases and is most similar to other MTases that methylate CCNGG sequences, namely M.ScrFI and M.SsoII. All three MTases methylate the internal cytosine within their recognition sequence. The 'variable' region within the three enzymes that methylate CCNGG can be aligned with the sequences of two enzymes that methylate CCWGG sequences. Remarkably, two segments within this region contain significant similarity with the region of M.HhaI that is known to contact DNA bases. These alignments suggest that many cytosine-5 MTases are likely to interact with DNA using a similar structural framework. PMID:7971279
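The recognition sequences named above use IUPAC degenerate codes (N = any base, W = A or T). A minimal sketch of how such sites can be located in a DNA string follows; the helper names are hypothetical, only the codes needed here are included in the lookup table, and taking "internal cytosine" to mean the second C of the motif is an assumption of this example.

```python
import re

# Subset of the IUPAC nucleotide codes, limited to what CCNGG and CCWGG need.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]",   # any base
    "W": "[AT]",     # weak pair: A or T
}


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile a degenerate motif; the lookahead lets overlapping sites match."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_sites(seq: str, motif: str) -> list:
    """0-based start positions of every occurrence of the motif on this strand."""
    return [m.start() for m in motif_to_regex(motif).finditer(seq.upper())]


def internal_cytosines(seq: str, motif: str = "CCNGG") -> list:
    """0-based positions of the second C of each site (assumed methylation target)."""
    return [start + 1 for start in find_sites(seq, motif)]


example = "TTCCAGGAACCCGGTT"
ccngg_sites = find_sites(example, "CCNGG")   # matches CCAGG and CCCGG
ccwgg_sites = find_sites(example, "CCWGG")   # matches CCAGG only
```

Because every CCWGG site is also a CCNGG site, the second list is always a subset of the first, which mirrors the relationship between the two enzyme groups compared in the abstract. The sketch scans one strand only.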
Similar folds with different stabilization mechanisms: the cases of prion and doppel proteins

PubMed Central

Colacino, Stefano; Tiana, Guido; Colombo, Giorgio

2006-01-01

Background: Protein misfolding is the main cause of a group of fatal neurodegenerative diseases in humans and animals. In particular, in Prion-related diseases the normal cellular form of the Prion Protein PrP (PrPC) is converted into the infectious PrPSc through a conformational process during which it acquires a high β-sheet content. Doppel is a protein that shares a similar native fold, but lacks the scrapie isoform. Understanding the molecular determinants of these different behaviours is important both for biomedical and biophysical research. Results: In this paper, the dynamical and energetic properties of the two proteins in solution are comparatively analyzed by means of long time scale explicit solvent, all-atom molecular dynamics in different temperature conditions. The trajectories are analyzed by means of a recently introduced energy decomposition approach (Tiana et al, Prot. Sci. 2004) aimed at identifying the key residues for the stabilization and folding of the protein. Our analysis shows that Prion and Doppel have two different cores stabilizing the native state and that the relative contribution of the nucleus to the global stability of the protein for Doppel is considerably higher than for PrP. Moreover, under misfolding conditions the Doppel core is conserved, while the energy stabilization network of PrP is disrupted. Conclusion: These observations suggest that different sequences can share similar native topology with different stabilizing interactions and that the sequences of the Prion and Doppel proteins may have diverged under different evolutionary constraints resulting in different folding and stabilization mechanisms. PMID:16857062
